Matt,

Thanks for taking the time to explain your ideas in detail. As I said,
our different opinions on how to do AI come from our very different
understandings of "intelligence". I don't take passing the Turing Test
as my research goal (as explained in
http://nars.wang.googlepages.com/wang.logic_intelligence.pdf and
http://nars.wang.googlepages.com/wang.AI_Definitions.pdf). I disagree
with Hutter's approach, not because his SOLUTION is not computable,
but because his PROBLEM is too idealized and simplified to be relevant
to the actual problems of AI.

Even so, I'm glad that we can still agree on some things, such as that
semantics comes before syntax. In my plan for NLP, there won't be
separate "parsing" and "semantic mapping" stages. I'll say more when I
have concrete results to share.

Pei

On Fri, Sep 5, 2008 at 8:39 PM, Matt Mahoney <[EMAIL PROTECTED]> wrote:
> --- On Fri, 9/5/08, Pei Wang <[EMAIL PROTECTED]> wrote:
>
>> As with many existing AI works, my disagreement with you is not so
>> much about the solution you proposed (I can see the value), but about
>> the problem you specified as the goal of AI. For example, I have no
>> doubt about the theoretical and practical values of compression, but
>> I don't think it has much to do with intelligence.
>
> In http://cs.fit.edu/~mmahoney/compression/rationale.html I explain why text
> compression is an AI problem. To summarize, if you know the probability
> distribution of text, then you can compute P(A|Q) for any question Q and
> answer A, which is what you need to pass the Turing test. Compression allows
> you to precisely measure the accuracy of your estimate of P. Compression
> (actually, word perplexity) has been used since the early 1990s to measure
> the quality of language models for speech recognition, since it correlates
> well with word error rate.
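>
> To make the connection concrete, here is a minimal Python sketch. The
> logp() interface is a hypothetical placeholder for a model's probability
> estimate, not any particular compressor's API:
>
>     def logp(text):
>         # Hypothetical: return log2 P(text) under the model. In a real
>         # compressor this comes from the predictor it codes with.
>         raise NotImplementedError
>
>     def answer_logprob(question, answer):
>         # P(A|Q) = P(Q followed by A) / P(Q), so in log space:
>         return logp(question + " " + answer) - logp(question)
>
>     def ideal_code_length_bits(text):
>         # An ideal arithmetic coder outputs about -log2 P(text) bits,
>         # so compressed size directly scores the estimate of P.
>         return -logp(text)
>
>     def word_perplexity(text):
>         # The speech-recognition metric: 2 ** (bits per word).
>         bits = ideal_code_length_bits(text)
>         return 2.0 ** (bits / max(1, len(text.split())))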
>
> The purpose of this work is not to solve general intelligence, such as the
> universal intelligence proposed by Legg and Hutter [1]. That is not
> computable, so you have to make some arbitrary choice about which test
> environments, and therefore which problems, you are going to solve. I
> believe the goal of AGI should be to do useful work for humans, so I am
> making a not-so-arbitrary choice to solve a problem that is central to what
> most people regard as useful intelligence.
>
> I had hoped that my work would lead to an elegant theory of AI, but that 
> hasn't been the case. Rather, the best compression programs were developed as 
> a series of thousands of hacks and tweaks, e.g. change a 4 to a 5 because it 
> gives 0.002% better compression on the benchmark. The result is an opaque 
> mess. I guess I should have seen it coming, since it is predicted by 
> information theory (e.g. [2]).
>
> Nevertheless, the architectures of the best text compressors are consistent
> with cognitive development models, i.e. phoneme (or letter) sequences ->
> lexical -> semantics -> syntax, which are themselves consistent with layered
> neural architectures. I already described a neural semantic model in my last
> post. I also did work, supporting Hutchens and Alder, showing that lexical
> models can be learned from n-gram statistics, consistent with the observation
> that babies learn the rules for segmenting continuous speech before they
> learn any words [3].
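>
> As a rough illustration of that kind of lexical acquisition (the statistic
> and threshold here are my invention for this sketch; [3] describes the
> actual method), a segmenter can propose word boundaries where the number of
> distinct characters seen after a context spikes:
>
>     from collections import defaultdict
>
>     def segment(text, order=2, threshold=4):
>         # Successor variety: the set of distinct characters observed
>         # after each length-`order` context in the text.
>         succ = defaultdict(set)
>         for i in range(len(text) - order):
>             succ[text[i:i+order]].add(text[i+order])
>         words, start = [], 0
>         for i in range(order, len(text)):
>             # High branching after the preceding context suggests a
>             # word boundary (uncertainty spikes between words).
>             if len(succ[text[i-order:i]]) >= threshold and i > start:
>                 words.append(text[start:i])
>                 start = i
>         words.append(text[start:])
>         return words
>
>     print(segment("thecatsatonthematthecatranthedogsat"))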
>
> I agree it should also be clear that semantics is learned before grammar, 
> contrary to the way artificial languages are processed. Grammar requires 
> semantics, but not the other way around. Search engines work using semantics 
> only. Yet we cannot parse sentences like "I ate pizza with Bob", "I ate pizza 
> with pepperoni", "I ate pizza with chopsticks", without semantics.
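>
> A toy model of that disambiguation, with invented co-occurrence counts
> standing in for real corpus statistics:
>
>     from collections import Counter
>
>     # Invented counts: how often each head word co-occurs with the
>     # object of the "with"-phrase. A real model would learn these.
>     cooc = Counter({("ate", "chopsticks"): 50, ("pizza", "chopsticks"): 1,
>                     ("ate", "Bob"): 30,        ("pizza", "Bob"): 1,
>                     ("ate", "pepperoni"): 2,   ("pizza", "pepperoni"): 80})
>
>     def attach(verb, noun, pp_object):
>         # Attach the phrase to whichever head it co-occurs with more
>         # often -- a semantic decision the syntax alone cannot make.
>         if cooc[(verb, pp_object)] >= cooc[(noun, pp_object)]:
>             return verb
>         return noun
>
>     for x in ("Bob", "pepperoni", "chopsticks"):
>         print("I ate pizza with", x, "-> attaches to",
>               attach("ate", "pizza", x))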
>
> My benchmark does not prove that there aren't better language models, but it
> is strong evidence. It represents the work of about 100 researchers who have
> tried and failed to find more accurate, faster, or less memory-intensive
> models. The resource requirements seem to increase as we go up the chain from
> n-grams to grammar, contrary to symbolic approaches. This is why I think AI
> is bound by a lack of hardware, not a lack of theory.
>
> 1. Legg, Shane, and Marcus Hutter (2006), A Formal Measure of Machine
> Intelligence, Proc. Annual Machine Learning Conference of Belgium and The
> Netherlands (Benelearn-2006), Ghent, 2006.
> http://www.vetta.org/documents/ui_benelearn.pdf
>
> 2. Legg, Shane (2006), Is There an Elegant Universal Theory of Prediction?,
> Technical Report IDSIA-12-06, IDSIA / USI-SUPSI, Dalle Molle Institute for
> Artificial Intelligence, Galleria 2, 6928 Manno, Switzerland.
> http://www.vetta.org/documents/IDSIA-12-06-1.pdf
>
> 3. Mahoney, M. (2000), A Note on Lexical Acquisition in Text without Spaces,
> http://cs.fit.edu/~mmahoney/dissertation/lex1.html
>
>
> -- Matt Mahoney, [EMAIL PROTECTED]