Not necessarily children's language, as they have their own problems and often use the wrong words and rules of grammar, but a simplified English with a reduced rule set.
Something like no compound sentences, for a start. I believe almost everything can be written without compound sentences, and that would greatly reduce the processing complexity.
Anaphora resolution could also be part of the language rules, so if you reference something in one place, it stays the same throughout the section.

It's not quite as natural, but it could be understood easily enough by humans as well as computers.
One problem I have with all of this is the super-flowery writing style that crams as many words and complex topics as possible into one sentence.
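
As a rough illustration of the "no compound sentences" rule, here is a minimal sketch of a checker. It is only a heuristic under stated assumptions: the names `COORDINATORS`, `split_sentences`, `is_compound` and `check` are made up for this example, and a sentence is treated as compound only when it contains a semicolon or a coordinating conjunction right after a comma. A real rule set would need an actual parser.

```python
import re

# Assumption: the seven English coordinating conjunctions mark a compound
# sentence when they directly follow a comma or semicolon.
COORDINATORS = ("and", "but", "or", "nor", "for", "so", "yet")
_COMPOUND = re.compile(
    r"[,;]\s+(?:" + "|".join(COORDINATORS) + r")\b", re.IGNORECASE
)


def split_sentences(text):
    """Split text after '.', '!' or '?' followed by whitespace (crude)."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def is_compound(sentence):
    """Heuristic: a semicolon, or a comma followed by a coordinator."""
    return ";" in sentence or bool(_COMPOUND.search(sentence))


def check(text):
    """Return (sentence, violates_rule) pairs for each sentence in text."""
    return [(s, is_compound(s)) for s in split_sentences(text)]


if __name__ == "__main__":
    sample = "The robot sees a ball. The robot picks up the ball, and it walks away."
    for sentence, bad in check(sample):
        print("REJECT" if bad else "ok    ", sentence)
```

The heuristic misses compound sentences written without a comma and may flag some simple ones, so it only shows the shape of such a rule, not a working grammar.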

James

Matt Mahoney <[EMAIL PROTECTED]> wrote:
----- Original Message ----
From: Ben Goertzel <[EMAIL PROTECTED]>
To: agi@v2.listbox.com
Sent: Tuesday, October 31, 2006 9:26:15 PM
Subject: Re: Re: [agi] Natural versus formal AI interface languages

>Here is how I intend to use Lojban++ in teaching Novamente. When
>Novamente is controlling a humanoid agent in the AGISim simulation
>world, the human teacher talks to it about what it is doing. I would
>like the human teacher to talk to it in both Lojban++ and English, at
>the same time. According to my understanding of Novamente's learning
>and reasoning methods, this will be the optimal way of getting the
>system to understand English. At once, the system will get a
>perceptual-motor grounding for the English sentences, plus an
>understanding of the logical meaning of the sentences. I can think of
>no better way to help a system understand English. Yes, this is not
>the way humans do it. But so what? Novamente does not have a human
>brain, it has a different sort of infrastructure with different
>strengths and weaknesses.

What about using "baby English" instead of an artificial language? By this I mean simple English at the level of a 2 or 3 year old child. Baby English has many of the properties that make artificial languages desirable, such as a small vocabulary, simple syntax and lack of ambiguity. Adult English is ambiguous because adults can use vast knowledge and context to resolve ambiguity in complex sentences. Children lack these abilities.

I don't believe it is possible to map between natural and structured language without solving the natural language modeling problem first. I don't believe that having structured knowledge or a structured language available makes the problem any easier. It is just something else to learn. Humans learn natural language without having to learn structured languages, grammar rules, knowledge representation, etc. I realize that Novamente is different from the human brain. My argument is based on the structure of natural language, which is vastly different from artificial languages used for knowledge representation. To wit:

- Artificial languages are designed to be processed (translated or compiled) in the order: lexical tokenization, syntactic parsing, semantic extraction. This does not work for natural language. The correct order is the order in which children learn: lexical, semantics, syntax. Thus we have successful language models that extract semantics without syntax (such as information retrieval and text categorization), but not vice versa.

- Artificial language has a structure optimized for serial processing. Natural language is optimized for parallel processing. We resolve ambiguity and errors using context. Context detection is a type of parallel pattern recognition. Patterns can be letters, groups of letters, words, word categories, phrases, and syntactic structures. We recognize and combine perhaps tens or hundreds of patterns simultaneously by matching to perhaps 10^5 or more from memory. Artificial languages have no such mechanism and cannot tolerate ambiguity or errors.

- Natural language has a structure that allows incremental learning. We can add words to the vocabulary one at a time. Likewise for phrases, idioms, classes of words and syntactic structures. Artificial languages must be processed by fixed algorithms. Learning algorithms are unknown.

- Natural languages evolve slowly in a social environment. Artificial languages are fixed according to some specification.

- Children can learn natural languages. Artificial languages are difficult to learn even for adults.

- Writing in an artificial language is an iterative process in which the output is checked for errors by a computer and the utterance is revised. Natural language uses both iterative and forward error correction.
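
The first point above mentions models that extract semantics without syntax, such as text categorization. A minimal sketch of that idea is a bag-of-words classifier, which discards word order entirely. The category names, keyword sets and function names below are invented for illustration and are not taken from any real system.

```python
import re
from collections import Counter

# Hypothetical categories and keyword sets, chosen only for this example.
CATEGORIES = {
    "sports": {"ball", "team", "goal"},
    "cooking": {"oven", "salt", "bake"},
}


def bag_of_words(text):
    """Count lowercase word tokens; word order is thrown away."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def score(text, keywords):
    """Number of token occurrences that belong to the keyword set."""
    bag = bag_of_words(text)
    return sum(bag[word] for word in keywords)


def categorize(text):
    """Pick the category whose keywords occur most often in the text."""
    return max(CATEGORIES, key=lambda name: score(text, CATEGORIES[name]))


if __name__ == "__main__":
    print(categorize("The team scored a goal"))
    print(categorize("goal a scored team the"))  # scrambled order, same bag
```

Because only word counts are used, a scrambled sentence gets the same category as the original, which is the sense in which such a model works without any syntactic parse.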

By "natural language" I include man-made languages like Esperanto. Esperanto was designed for communication between humans and has all the other properties of natural language. It lacks irregular verbs and such, but this is really a tiny part of a language's complexity. A natural language like English has a complexity of about 10^9 bits. How much information does it take to list all the irregularities in English, like swim-swam, mouse-mice, etc.?

-- Matt Mahoney, [EMAIL PROTECTED]




-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/[EMAIL PROTECTED]



Thank You
James Ratcliff
http://falazar.com

