I think SHRDLU (Blocks World) would have been more interesting if the language model had been learned rather than programmed. There is an important lesson here, and Winograd knew it: this route is a dead end. Adult English has a complexity of about 10^9 bits (my estimate). SHRDLU has a complexity of less than 7 x 10^5 bits (I measured the upper bound by compressing the source code from http://hci.stanford.edu/winograd/shrdlu/code/ with paq8f). One lesson I hope we have learned is that there is no shortcut around complexity. We have tried that route for 50 years; there is no "simple" algorithm for AGI. For comparison, OpenCyc 1.0 has a download size (zip) of 147 MB.
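The upper-bound measurement above can be sketched in a few lines. This is only a hedged illustration: paq8f is not generally available as a library, so the standard-library lzma compressor stands in (yielding a looser bound than the one I quoted), and the sample "source" string is a hypothetical stand-in, not actual SHRDLU code.

```python
import lzma

def complexity_upper_bound_bits(data: bytes) -> int:
    # The compressed size (in bits) is an upper bound on the algorithmic
    # complexity of the data, up to the fixed size of the decompressor.
    return 8 * len(lzma.compress(data, preset=9))

# Hypothetical stand-in for a downloaded source tree:
source = b"(defun grasp (b) (cond ((get b 'support) nil)))\n" * 200
print(complexity_upper_bound_bits(source), "bits, vs", 8 * len(source), "bits raw")
```

A stronger compressor like paq8f would give a tighter bound; any compressor gives a valid one.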
It does not help that words in SHRDLU are grounded in an artificial world. Its failure to scale hints that approaches such as AGI-Sim will have similar problems: you cannot simulate complexity. I learned this not from studying language, but from my dissertation work in a seemingly unrelated area, network intrusion detection. In 1998 and 1999, MIT Lincoln Labs and DARPA developed a data set of simulated network traffic containing various simulated attacks and ran contests to see which intrusion detection systems were best at detecting them. They spent probably millions of dollars trying to make the traffic seem as realistic as possible, simulating hundreds of machines on a local network and thousands more on the Internet, generating fake email using word bigram models, web page downloads from public sites, and so on, based on studies of real traffic. My approach was anomaly detection: model normal traffic and flag anything unusual as suspicious. The problem turned out to be ridiculously easy: look at the first few dozen bytes of each network packet and flag any byte value you haven't seen before in that position. This easily beat every system in the original contest. If only it had worked on real traffic. The net result of my studies was to discredit the data set.

What happened can be explained in terms of algorithmic complexity. The program that generated the artificial traffic was much smaller than the "program" that generates real traffic, so inserting the attacks disproportionately increased the total complexity, making the traffic less predictable (or compressible). In a similar way, SHRDLU performed well in its artificial, simple world. But how would you measure its performance in the real world?

If we are going to study AGI, we need a way to perform tests and measure results. It is not just that we need to know what works and what doesn't; the systems we build will be too complex for us to know what we have built. How would you measure them?
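The "ridiculously easy" detector can be sketched as follows. This is a hedged reconstruction from the description in this post, not my original dissertation code; the training packets, prefix length, and scoring function are illustrative choices.

```python
# Sketch of the byte-position anomaly detector described above: remember
# which byte values appeared at each position in the first few dozen bytes
# of training packets, then count never-before-seen values at test time.

def train(packets, prefix_len=48):
    """Record the set of byte values seen at each of the first prefix_len positions."""
    seen = [set() for _ in range(prefix_len)]
    for pkt in packets:
        for i, b in enumerate(pkt[:prefix_len]):
            seen[i].add(b)
    return seen

def score(packet, seen):
    """Anomaly score: number of positions holding a byte value never seen there."""
    return sum(1 for i, b in enumerate(packet[:len(seen)]) if b not in seen[i])

# Illustrative "normal" traffic (hypothetical):
normal = [b"GET / HTTP/1.0\r\n", b"GET /index.html"]
model = train(normal, prefix_len=8)
print(score(b"GET / HT", model))                           # 0: all bytes seen before
print(score(b"\x90\x90\x90\x90\x90\x90\x90\x90", model))   # 8: every position novel
```

On simulated traffic, where the generating program is simple, the per-position byte sets stay small and almost anything injected looks novel; on real traffic they fill in quickly and the trick loses its power.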
The Turing test is the most widely accepted, but it is somewhat subjective and not really appropriate for an AGI with sensorimotor I/O. I have proposed text compression. It gives hard numbers, but it seems limited to measuring ungrounded language models. What else would you use? Suppose that in 10 years, NARS, Novamente, Cyc, and maybe several other systems all claim to have solved the AGI problem. How would you test their claims? How would you decide the winner?

-- Matt Mahoney, [EMAIL PROTECTED]

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?list_id=303