Re: AW: AW: [agi] Language learning (was Re: Defining AGI)

Matt Mahoney Wed, 22 Oct 2008 17:56:26 -0700

--- On Wed, 10/22/08, Dr. Matthias Heger <[EMAIL PROTECTED]> wrote:

> You make the implicit assumption that a natural language
> understanding system will pass the Turing test. Can you prove this?


If you accept that a language model is a probability distribution over text,
then I have already proved something stronger. A language model, so defined,
exactly duplicates the distribution of answers that a human would give, so its
output is indistinguishable by any test. In fact, a judge would have some
uncertainty about other people's language models, so a judge could be expected
to attribute some errors in the model to normal human variation.

> Furthermore, it is just an assumption that the ability to have and
> to apply the rules is really necessary to pass the Turing test.
>
> For these two reasons, you still haven't shown 3a and 3b.

I suppose you are right. Instead of encoding mathematical rules as a grammar,
a model with enough training data can simply encode every instance that it is
likely to encounter. For example, instead of a grammar rule to encode the
commutative law of addition,

  5 + 3 = a + b = b + a = 3 + 5

a model with a much larger training data set could just encode instances with 
no generalization:

  12 + 7 = 7 + 12
  92 + 0.5 = 0.5 + 92
  etc.
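
To make the contrast concrete, here is a minimal sketch in Python. The names
(commutes_by_rule, commutes_by_lookup, MEMORIZED) are hypothetical and chosen
only for illustration; the table holds just the two instances listed above.

```python
# Hypothetical illustration: none of these names come from the original post.

def commutes_by_rule(a, b):
    """Generalizing model: one rule covers every pair of numbers."""
    return a + b == b + a

# Memorizing model: stores only the instances seen in training, no rule.
MEMORIZED = {
    ("12 + 7", "7 + 12"),
    ("92 + 0.5", "0.5 + 92"),
}

def commutes_by_lookup(lhs, rhs):
    """True only if this exact pair of strings was seen in training."""
    return (lhs, rhs) in MEMORIZED

seen = commutes_by_lookup("12 + 7", "7 + 12")    # instance in the table
unseen = commutes_by_lookup("5 + 3", "3 + 5")    # instance never stored
```

The lookup model recognizes only what it has stored, so covering the cases
a judge is likely to ask about takes a far larger table than the single
rule does.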

I believe this is how Google gets away with brute force n-gram statistics
instead of more sophisticated grammars. Its language model is probably 10^5
times larger than a human model (10^14 bits vs 10^9 bits). Shannon observed in
1949 that random strings generated by n-gram models of English (where n is the
number of either letters or words) look like natural language up to length 2n.
For a typical human-sized model (1 GB text), n is about 3 words. To model
strings longer than 6 words we would need more sophisticated grammar rules.
Google can model 5-grams (see
http://googleresearch.blogspot.com/2006/08/all-our-n-gram-are-belong-to-you.html),
so it is able to generate and recognize (thus appear to understand)
sentences up to about 10 words.

> By the way:
> The Turing test must convince 30% of the people.
> Today there is a system which can already convince 25%.
> 
> http://www.sciencedaily.com/releases/2008/10/081013112148.htm

It would be interesting to see a version of the Turing test where the human 
confederate, machine, and judge all have access to a computer with an internet 
connection. I wonder if this intelligence augmentation would make the test 
easier or harder to pass?

> 
> -Matthias
> 
> 
> > 3) you apply rules such as 5 * 7 = 35 -> 35 / 7 = 5 but
> > you have not shown that
> > 3a) that a language understanding system necessarily(!) has
> > these rules
> > 3b) that a language understanding system necessarily(!) can
> > apply such rules
> 
> It must have the rules and apply them to pass the Turing
> test.
> 
> -- Matt Mahoney, [EMAIL PROTECTED]


-- Matt Mahoney, [EMAIL PROTECTED]


