On 10/23/06, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> I am interested in identifying barriers to language modeling and how
> to overcome them.

> I have no doubt that probabilistic models such as NARS and Novamente
> can adequately represent human knowledge.

NARS isn't a probabilistic model, though Novamente is, in a sense.

> Also, I have no doubt they can learn e.g. relations such as "all
> frogs are green" from examples of green frogs.  My question relates
> to solving the language problem: how to convert natural language
> statements like "frogs are green" and equivalent variants into the
> formal internal representation without the need for humans to encode
> stuff like (for all X, frog(X) => green(X)).

In NARS, "all frogs are green" can be converted into a Narsese
judgment "frog-->[green] <1,c>", where "-->" is the inheritance
relation, "1" the frequency value corresponding to "all", and "c" the
default confidence value for affirmative judgments, such as 0.9.
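
To make the notation concrete, here is a minimal sketch of such a
judgment as a data structure (the class and field names are my own
illustration, not the actual NARS implementation):

from dataclasses import dataclass

@dataclass
class Judgment:
    """A Narsese judgment: a statement plus a <frequency, confidence>
    truth value."""
    subject: str       # e.g. "frog"
    copula: str        # "-->" is the inheritance relation
    predicate: str     # e.g. "[green]"; brackets mark an adjective-derived term
    frequency: float   # 1 corresponds to "all" (relative to evidence so far)
    confidence: float  # default for affirmative judgments, e.g. 0.9

    def __str__(self) -> str:
        return (f"{self.subject}{self.copula}{self.predicate} "
                f"<{self.frequency},{self.confidence}>")

print(Judgment("frog", "-->", "[green]", 1.0, 0.9))
# frog-->[green] <1.0,0.9>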

The language learning process will build the following mappings from
English to Narsese, based on the system's experience:
*. word "frog" to term "frog" (I use the same word to keep things
simple, though it can be any Narsese identifier);
*. word "green" to term "[green]" (the same as above, plus special
treatment for adjectives, as discussed in my book);
*. sentence "X are Y" to statement "X-->Y" (here syntax and semantics
are learned together) --- let's ignore the issues of sense and number
for now;
*. the word "all" contributes to the truth value of the statement
(a toy version of these mappings is sketched below).
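
As an illustration only, the mappings above could be stored and
applied roughly as follows; the data structures, the pattern, and the
numbers are my simplification, not how NARS actually represents or
learns them:

import re

# Hypothetical learned lexicon: English word -> Narsese term.
LEXICON = {
    "frogs": "frog",     # ignoring number, as noted above
    "green": "[green]",  # adjectives get the special [...] treatment
}

# Hypothetical learned contribution of quantifier words to frequency.
QUANTIFIER_FREQUENCY = {"all": 1.0}

DEFAULT_CONFIDENCE = 0.9  # assumed default for affirmative judgments

# Learned sentence pattern: "<quantifier> X are Y" maps to "X-->Y".
PATTERN = re.compile(r"(all)\s+(\w+)\s+are\s+(\w+)")

def english_to_narsese(sentence: str) -> str:
    m = PATTERN.fullmatch(sentence.lower())
    if m is None:
        raise ValueError("no learned pattern matches this sentence")
    quantifier, subj, pred = m.groups()
    return (f"{LEXICON[subj]}-->{LEXICON[pred]} "
            f"<{QUANTIFIER_FREQUENCY[quantifier]},{DEFAULT_CONFIDENCE}>")

print(english_to_narsese("all frogs are green"))
# frog-->[green] <1.0,0.9>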

Please note that semantically it is different from "(for all X,
frog(X) => green(X))" on at least two crucial points:

(1) In NARS, "all" is interpreted, according to the
experience-grounded semantics
(http://nars.wang.googlepages.com/wang.semantics.pdf), as "according
to all evidence", not as "for all instances in the domain", which is
how the universal quantifier is defined in predicate logic.
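
To see the difference concretely: under the experience-grounded
semantics of the paper above, the truth value of a judgment is
computed from counted evidence, with frequency w+/w and confidence
w/(w+k) for a personality parameter k (commonly 1). "All" then means
"frequency 1 relative to the evidence collected so far", so a
counterexample lowers the frequency instead of falsifying the
statement, as it would under a universal quantifier. A sketch (the
function is mine; the two formulas are from the paper):

K = 1  # evidential-horizon "personality parameter"; 1 is a common choice

def truth_from_evidence(positive: float, total: float) -> tuple[float, float]:
    # Frequency f = w+/w and confidence c = w/(w+K), following the
    # experience-grounded semantics.
    return positive / total, total / (total + K)

# Ten green frogs observed, no exceptions: "all frogs are green".
print(truth_from_evidence(10, 10))  # (1.0, 0.909...)

# An eleventh frog that is not green: the frequency drops and the
# judgment is revised, rather than falsified as a universally
# quantified formula would be.
print(truth_from_evidence(10, 11))  # (0.909..., 0.916...)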

(2) The inheritance relation is defined differently from material
implication, though the two are intuitively similar.

In summary, Narsese is not "first-order predicate calculus plus
probability theory".

> This problem is hard because there might not be terms that exactly
> correspond to "frog" or "green", and also because interpreting
> natural language statements is not always straightforward, e.g. "I
> know it was either a frog or a leaf because it was green".

Sure, that is why I said in
http://nars.wang.googlepages.com/wang.roadmap.pdf that the relation
between terms in Narsese and words in English (or any natural
language) is not one-to-one, but many-to-many, and each relation is
only true to a certain degree.
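
For instance, such a graded, many-to-many mapping could be pictured
as below; the terms and the degrees are invented for illustration:

# Each English word relates to several Narsese terms, each to a degree;
# a term can likewise relate to several words.
word_to_terms = {
    "green": [("[green]", 0.9), ("[unripe]", 0.3), ("environmentalist", 0.1)],
    "frog":  [("frog", 0.9), ("bow-frog", 0.1)],  # e.g. part of a violin bow
}

def best_term(word: str) -> str:
    # Pick the most strongly related term; in NARS proper the choice
    # would also depend on context and the system's experience.
    return max(word_to_terms[word], key=lambda pair: pair[1])[0]

print(best_term("green"))  # [green]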

> Converting natural language to a formal representation requires
> language modeling at the highest level.  The levels from lowest to
> highest are: phonemes, word segmentation rules, semantics, simple
> sentences, compound sentences.  Regardless of whether your child
> learned to read at age 3 or not at all, children always learn
> language in this order.

I'm afraid you are confusing two issues: (1) in general, simple
sentences are indeed learned before compound sentences (even this may
have exceptions, as with proverbs and idioms), but (2) the
traditional learning order from syntax to semantics to pragmatics is
wrong; these three aspects of a language are learned together (along
with phonemes), as Richard argued.

> With this observation, it seems that hard-coding rules for
> inheritance, equivalence, logical, temporal, etc. relations into a
> knowledge representation will not help in learning these relations
> from text.  The language model still has to learn these relations
> from previously learned, simpler concepts.  In other words, the
> model has to learn the meanings of "is", "and", "not", "if-then",
> "all", "before", etc. without any help from the structure of the
> knowledge representation or explicit encoding.  The model has to
> first learn how to convert compound sentences into a formal
> representation and back, and only then can it start using or adding
> to the knowledge base.

Again, there are two different topics. In NARS, there are innate
notions that roughly correspond to "is", "and", "not", "if-then",
"all", "before", etc., but all these English words need to be learned
by the system and gradually linked to the innate notions. In
principle, the meaning of every word in every natural language will
be learned, not coded.
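
A toy sketch of that distinction, with the link-strength bookkeeping
invented for illustration: the Narsese copulas and connectives are
innate, while the English words that come to express them are
learned:

# Innate Narsese notions: part of the system's design, never learned.
INNATE_NOTIONS = {"-->", "<->", "==>", "<=>", "&&", "||", "--"}

# Learned links from English words to innate notions; strengths grow
# with experience (this update rule is an invented placeholder).
word_links: dict[str, dict[str, float]] = {}

def observe_usage(word: str, notion: str, weight: float = 0.1) -> None:
    assert notion in INNATE_NOTIONS
    links = word_links.setdefault(word, {})
    old = links.get(notion, 0.0)
    links[notion] = old + weight * (1.0 - old)  # saturating growth

# Repeated experience gradually links "is" to the inheritance copula.
for _ in range(5):
    observe_usage("is", "-->")
print(word_links["is"])  # {'-->': 0.40951} (approximately)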

> So my question is: what is needed to extend language models to the
> level of compound sentences?  More training data?  Different
> training data?  A new theory of language acquisition?  More
> hardware?  How much?

All of them, of course, but to me the key is a new theory of
language acquisition, one that treats the process not as stand-alone,
but as part of the thinking/cognitive process, carried out by the
general mechanism of intelligence/cognition/thinking.

For example, the current statistical learning techniques may be
adequate for limited applications, but I don't think they are the
right way to go in the long run.

Disclaimer: I haven't done much language acquisition on NARS yet, and
haven't worked out all the details. I'm not claiming that I've solved
this problem. ;-)

Pei
