Matt Mahoney wrote:
--- Richard Loosemore <[EMAIL PROTECTED]> wrote:
Matt Mahoney wrote:
What did your simulation actually accomplish? What were the results? What do you think you could achieve on a modern computer?
Oh, I hope there's no misunderstanding: I did not build networks to do any kind of syntactic learning, they just learned relationships between phonemic representations and graphemes. (They learned to spell). What they showed was something already known for the learning of pronunciation: that the system first learns spellings by rote, then increases its level of accuracy and at the same time starts to pick up regularities in the mapping. Then it starts to "regularize" the spellings. For example: having learned to spell "height" correctly in the early stages, it would then start to spell it incorrectly as "hite" because it had learned many other words in which the spelling of the phoneme sequence in "height" would involve "-ite". Then in the last stages it would learn the correct spellings again.
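(The "regularization" stage can be illustrated with a toy, rule-based sketch — not the original network, and the words and phoneme alignments below are hypothetical simplifications. A learner that spells each phoneme with its majority spelling across the vocabulary will produce "hite" once the regular pattern outvotes the rote-learned irregular one.)

```python
# Toy illustration of over-regularization: spell each phoneme with its
# most frequent grapheme across a (hypothetical) training vocabulary.
from collections import Counter, defaultdict

# Phoneme-to-grapheme alignments for a tiny training set.
# /aɪ/ is spelled "eigh" only in "height"; the other words use the
# regular split-digraph pattern "i_e" (as in "kite").
training = [
    [("h", "h"), ("aɪ", "eigh"), ("t", "t")],   # height
    [("k", "k"), ("aɪ", "i_e"), ("t", "t")],    # kite
    [("b", "b"), ("aɪ", "i_e"), ("t", "t")],    # bite
    [("s", "s"), ("aɪ", "i_e"), ("t", "t")],    # site
]

counts = defaultdict(Counter)
for word in training:
    for phoneme, grapheme in word:
        counts[phoneme][grapheme] += 1

def regularized_spelling(phonemes):
    """Spell by majority vote: the 'rule' stage, ignoring rote memory."""
    return [counts[p].most_common(1)[0][0] for p in phonemes]

# The regular "i_e" (3 votes) beats the rote-learned "eigh" (1 vote),
# so /h aɪ t/ comes out as h-i_e-t, i.e. "hite".
print(regularized_spelling(["h", "aɪ", "t"]))  # ['h', 'i_e', 't']
```

The early rote stage and the final recovery of "height" would both need per-word memory layered on top of this majority rule, which is exactly the interaction the networks captured.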

That's interesting, because children make similar mistakes at higher language
levels.  For example, a child will learn an irregular verb like "went", then
later generalize to "goed" before switching back to the correct form.

Uh... I forgot to mention that explaining those data about child language learning was the point of the work. It's a well known effect, and this is one of the reasons why the connectionist models got everyone excited: psychological facts started to be explained by the performance of the connectionist nets.


I am convinced that similar neural learning mechanisms are involved at the
lexical and syntactic levels, but on different scales.  For example, we learn
to classify letters into vowels and consonants by their context, just as we do
for nouns and verbs.  Then we learn sequential patterns.  Just as every word
needs a vowel, every sentence needs a verb.
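(The claim that the vowel/consonant split is learnable from context alone can be illustrated without a neural net. Sukhotin's classic algorithm — a non-neural stand-in substituted here, run on a toy corpus chosen for strict consonant-vowel alternation — recovers the split from adjacency statistics only.)

```python
# Sukhotin's algorithm: vowels are the letters that sit next to the
# most other letters, and vowels rarely sit next to each other.
from collections import defaultdict

def sukhotin_vowels(words):
    # Symmetric adjacency counts between distinct letters.
    adj = defaultdict(lambda: defaultdict(int))
    for w in words:
        for x, y in zip(w, w[1:]):
            if x != y:
                adj[x][y] += 1
                adj[y][x] += 1
    sums = {c: sum(adj[c].values()) for c in adj}
    vowels = set()
    while sums:
        v = max(sums, key=sums.get)   # best remaining vowel candidate
        if sums[v] <= 0:
            break
        vowels.add(v)
        del sums[v]
        # Discount letters that co-occur with the new vowel.
        for c in sums:
            sums[c] -= 2 * adj[v][c]
    return vowels

# Toy corpus with strict consonant-vowel alternation.
words = ["banana", "tomato", "potato", "lemon", "melon"]
print(sorted(sukhotin_vowels(words)))  # ['a', 'e', 'o']
```

On real text the same statistics are noisier, which is where a learned, graded classifier earns its keep over this discrete procedure.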

You are treading paths that could benefit from going back over the literature (basically psycholinguistics and connectionism). If you keep pursuing this line of thought you will be retreading the path that I was on back in 1987 (I'm not being patronizing: just trying to give you a heads-up).

The next problem that you will face, along this path, is to figure out how you can get such nets to elegantly represent such things as more than one token of a concept in one sentence: you can't just activate the "duck" node when you hear that phrase from the Dire Straits song Wild West End: "I go down to Chinatown ... Duck inside a doorway; Duck to Eat".

Then you'll need to represent sequential information in such a way that you can do something with it. Recurrent neural nets suck very badly if you actually try to use them for anything, so don't get fooled by their Siren Song.

Then you will need to represent layered representations: concepts learned from conjunctions of other concepts rather than layer-1 percepts. Then represent action, negation, operations, intentions, variables...

When you've done that you are in the world of Generalized Connectionism. That's what I do.


I think that learning syntax is a matter of computational power.  Children
learn the rules for segmenting continuous speech at 7-10 months, but don't
learn grammar until years later.  So you need more training data and a larger
network.  The reason I say the problem is O(n^2) is that when you double
the information content of the training data, you need to double the number of
connections to represent it.  Actually I think it is a little less than
O(n^2) (maybe O(n^2/log n)?) because of redundancy in the training data.
There are about 1000 times more words than there are letters, so this suggests
you need 100,000 times more computing power for adult-level grammar.  This
might explain why the problem is still unsolved.
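(A quick check of the arithmetic behind the 100,000 figure, assuming the log is a natural log:)

```python
import math

n = 1000                             # ~1000x more words than letters
print(n ** 2)                        # plain O(n^2) scaling: 1,000,000
print(round(n ** 2 / math.log(n)))   # O(n^2/log n): ~145,000, i.e. roughly 100,000x
```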

Your numbers contain way too many assumptions about the process. When I said that it was not O(n^2) I meant that, in practice, that is not what *we* needed. I believe it was log N, but such stuff just was not important enough for me to track it.

It is just not productive to focus on the computational complexity issues at this stage: gotta get a lot of mechanisms tried out before we can even begin to talk about such stuff (and, as I say, I don't believe we will really care even then).


Richard Loosemore.

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=4007604&user_secret=8eb45b07
