We have joined forces in a collaborative consortium whose aim is to develop an integrated mathematical model able to represent the syntactic and semantic aspects of language in a reduced lexicon, namely Basic English, and ultimately to develop an artificial cognition system. The fundamental concept is a mathematical means of creating a multi-dimensional graphical representation of a sentence, in a space pre-conditioned by the inter-relationship of all words in this reduced lexicon. To translate from standard English to a representation using fewer than 2,000 words, a wizard based on a vocabulary of 100,000+ words is employed; it is available at simplish. Thus, in a multidimensional space, the central low-dimensionality points describe words and their relationships, while increasingly complex phrases are represented by ideograms. The core relationships have been derived from a single individual, who is interchangeable, and then refined. This is unlike the majority of current efforts, which rely on a large corpus of text and for which ambiguity is therefore a major problem. A descriptive phrase may either represent a complex word (i.e. one not in the Basic English lexicon) or represent an abstract concept. In this way, our method is able to cope with both data-driven requirements and the concept-driven information needed for problem-solving.
Possibly the most important aspect of cognition has to do with memory. We know that the processes of acquisition, storage and retrieval of knowledge lie at the heart of human cognition. Furthermore, it has long been known that organization and memorization are inseparable and that memory is aided by meaning. Therefore, working out a way to establish the meaning of words in a general cognition engine helps organization and is a crucial step in developing these systems. Our way to achieve this objective is to assign meaning in terms of the other words in a reduced vocabulary, unlike standard natural language where 100,000 words would have to be clustered somehow and related to each other.
Another team within our Consortium is relating the core 1,000 words of Basic English to each other using multivariate methods, which also provide a means to create a multi-dimensional graphical representation of a sentence, in a space pre-conditioned by the inter-relationship of all words in this reduced lexicon. Thus, these 1,000 words enable unlimited expressiveness in a space of relatively low dimensionality. A descriptive phrase may either represent a complex word (i.e. one not in the Basic English lexicon) or represent an abstract concept. In this way, our method is able to cope with both data-driven requirements and the concept-driven information needed for problem-solving. This representation is conceptually similar to Chinese writing with ideograms. In this pre-conditioned space, two phrases that use different words but convey a similar meaning will be represented by similar ideograms, which can be compared to existing data, analyzed for the concepts being searched for and/or extrapolated. The net result will be to allow us to move from pattern-matching to concept-matching, by means of a multidimensional nearest-neighbor search.
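As a rough illustration of concept-matching by nearest-neighbor search, the following minimal sketch places a few words in a small space and compares phrases by the direction of their averaged word vectors. Everything in it is an assumption made for illustration: the three-dimensional coordinates in `WORD_VECTORS`, the use of a simple average as a stand-in for an ideogram, and cosine similarity as the measure of closeness. It is not the consortium's multivariate method.

```python
import math

# Hypothetical coordinates for a handful of words (illustrative values only).
# Real coordinates would come from the multivariate analysis of the lexicon.
WORD_VECTORS = {
    "dog": (0.9, 0.1, 0.0),
    "animal": (0.8, 0.2, 0.1),
    "run": (0.1, 0.9, 0.0),
    "go": (0.2, 0.8, 0.1),
    "quick": (0.1, 0.7, 0.3),
    "house": (0.0, 0.1, 0.9),
    "building": (0.1, 0.0, 0.8),
}


def phrase_vector(phrase):
    """Average the vectors of the known words in a phrase.

    The average is a crude stand-in for the ideogram of the phrase.
    """
    vectors = [WORD_VECTORS[w] for w in phrase.lower().split() if w in WORD_VECTORS]
    if not vectors:
        raise ValueError("phrase contains no known words: %r" % phrase)
    dims = len(vectors[0])
    return tuple(sum(v[i] for v in vectors) / len(vectors) for i in range(dims))


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


def nearest_phrase(query, stored_phrases):
    """Return the stored phrase whose vector is closest in direction to the query."""
    q = phrase_vector(query)
    return max(stored_phrases, key=lambda p: cosine_similarity(q, phrase_vector(p)))


stored = ["animal go quick", "building"]
print(nearest_phrase("dog run", stored))
```

With these made-up coordinates, the query "dog run" shares no word with "animal go quick" yet is matched to it, which is the sense in which the search compares concepts and not surface patterns.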
Core relationships have been derived from a single individual and then refined. This is unlike the majority of current efforts, which rely on a large corpus of text and have the disadvantage that ambiguity is a major problem, since the meaning of a word becomes the consensus of perhaps tens of thousands of people’s ideas. In our case, the core relationships can be recalculated for people with different outlooks and then compared with each other to establish the “meaning” of the same phrase to different people.
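One way to picture this comparison is sketched below. It assumes, purely for illustration, that each person's core relationships are stored as a relatedness score between 0 and 1 for each pair of words, and that the difference in "meaning" of a phrase is the mean absolute difference of those scores over the word pairs in the phrase. The tables `PERSON_A` and `PERSON_B`, the scores and the divergence measure are all hypothetical.

```python
from itertools import combinations

# Hypothetical relatedness scores in [0, 1] between pairs of words,
# one table per person. The values are invented for this example.
PERSON_A = {
    frozenset(("war", "peace")): 0.2,
    frozenset(("war", "money")): 0.7,
    frozenset(("peace", "money")): 0.3,
}
PERSON_B = {
    frozenset(("war", "peace")): 0.2,
    frozenset(("war", "money")): 0.1,
    frozenset(("peace", "money")): 0.5,
}


def phrase_divergence(phrase, rel_a, rel_b):
    """Mean absolute difference between two people's scores for the word pairs in a phrase.

    0.0 means both people relate the words of the phrase identically;
    larger values mean the phrase is likely to "mean" something different to each.
    """
    words = sorted(set(phrase.lower().split()))
    pairs = [frozenset(p) for p in combinations(words, 2)]
    shared = [p for p in pairs if p in rel_a and p in rel_b]
    if not shared:
        raise ValueError("no word pair in the phrase is scored by both people")
    return sum(abs(rel_a[p] - rel_b[p]) for p in shared) / len(shared)


for phrase in ("war peace", "war money", "war peace money"):
    print(phrase, round(phrase_divergence(phrase, PERSON_A, PERSON_B), 3))
```

A phrase with low divergence can be treated as having a shared meaning, while a high value flags a phrase that the two people are likely to interpret differently.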
Within this broad strategy, it is possible to assign trajectories to each user, service or specific item of information in a collaborative network. Moreover, the direction of these trajectories can be related to a planning path or to the collection of the corresponding mission-derived data. Trajectories may end on a known precursor condition that can be extrapolated to an actionable conclusion, while other users’ threads, which can be allocated degrees of priority/credibility, are used to corroborate or reject said action. In this multi-threaded space, information fusion is achievable and, moreover, users can be added or deleted as they come into and out of a theater.
The consortium is currently collaborating on a joint submission of “The Rachel Repp Site” to the Loebner Prize, a competition aimed at finding a computer system able to pass Turing’s test.
Finally, this broad approach can be helpful in the general field of human-machine interaction by enabling humans to express themselves in full-vocabulary natural language at the same time as machines only actually process semantically-filtered limited-vocabulary phrases.
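The idea of semantic filtering can be sketched as a simple word-by-word substitution. The example below is only an illustration: the tiny `BASIC_WORDS` set, the `REPLACEMENTS` table and the function `to_limited_vocabulary` are hypothetical and do not reproduce the translation wizard mentioned above, which would also need to handle grammar, word sense and multi-word expressions.

```python
# Hypothetical miniature lexicon; a real one would hold the full reduced vocabulary.
BASIC_WORDS = {"the", "a", "dog", "go", "quick", "to", "house", "big", "very"}

# Hypothetical mapping from full-vocabulary words to limited-vocabulary phrases.
REPLACEMENTS = {
    "hound": "dog",
    "sprinted": "go very quick",
    "mansion": "very big house",
}


def to_limited_vocabulary(sentence):
    """Rewrite a sentence using only limited-vocabulary words where possible.

    Returns the rewritten sentence and a list of words that could not be
    mapped, so that a caller can decide how to handle them.
    """
    output, unknown = [], []
    for word in sentence.lower().split():
        token = word.strip(".,;:!?")
        if not token:
            continue  # the word was punctuation only
        if token in BASIC_WORDS:
            output.append(token)
        elif token in REPLACEMENTS:
            output.append(REPLACEMENTS[token])
        else:
            unknown.append(token)
            output.append(token)  # left unchanged and reported as unknown
    return " ".join(output), unknown


text, unknown = to_limited_vocabulary("The hound sprinted to the mansion.")
print(text)
print(unknown)
```

In this sketch the human writes with the full vocabulary, and only the filtered output would be passed on to the machine; unmapped words are reported instead of being silently dropped.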
The Goodwill Company Ltd. (U.K.) – Provides overall system design and integration for commercial and C4ISR applications.
UP campus GDL (México) – Provides IT facilities, computer science and AI expertise.
AVNTK S.C. (México) – Provides IT facilities, computer science and AI expertise.
NaturaXalli SA de CV (México) – Responsible for the knowledge acquisition and storage modules.
Ardita Aeronautica (México) – Responsible for adapting the technology to human-machine interaction applications.
University of Saskatchewan (Advanced Computing Research Group, Information and Communications Technology, Canada) – Provides IT facilities, computer science and mathematics expertise.
Universidad Marista de Guadalajara (Guadalajara, México) – Summarizing Technologies.
M. en C. Jesus Torres, Spark UP Guadalajara, México
PhD. Juan Carlos Zuñiga Anaya, University of Saskatchewan, Canada.
M. en I. Monica Bueno Martínez, NaturaXalli SA de CV, México.
Dr. Marcelo Funes-Gallanzi, The Goodwill Company Ltd.
Ing. Ladislao Acosta Cortes, Ardita Aeronautica
Ing. Cesar Torres, AVNTK
M. en C. Adolfo Ruiz Aceves, Universidad Marista de Guadalajara, México