Thank you. The detailed info is appreciated. --Abram
On Sun, Sep 28, 2008 at 11:54 PM, Stephen Reed <[EMAIL PROTECTED]> wrote:
> Matt said:
> The overview claims to be able to convert natural language sentences into Cycl assertions, and to convert questions to Cycl queries. So I wonder why the knowledge base is still not being built this way. And I wonder why there is no public demo of the interface, and no papers giving verifiable experimental results.
>
> From my employment at Cycorp, 1999-2006, I can answer your questions, and also say why I am pursuing a natural language approach with Texai, extending the OpenCyc ontology. First understand that Doug Lenat is a mathematician, not a linguist or cognitive psychologist. The Cyc Project began, in the 1980s, hand-entering knowledge in a predicate calculus schema that was intentionally agnostic with respect to natural languages. To this day, increasingly sophisticated tools have enabled Cycorp's ontologists to enter knowledge and perform queries more precisely and more rapidly than they can by relying on rather incomplete and relatively poorly performing English language tools. For example, I witnessed some very long parsing times for input sentences over 15 words. In contrast, new concepts can be defined and positioned in the existing ontology very rapidly using point-and-click, non-NL screens if the ontologist is fully prepared with respect to what they want to accomplish, for example using the "Create Similar" tool.
>
> Addressing the public demo, I believe first that Cycorp does not want to expend the considerable effort to create and maintain a publicly accessible NL interface of high quality, when there are so many other areas of Cyc that sponsors are paying for that need attention. Secondly, I believe that because Cyc is proprietary, it precludes a public NL interface that enables the proprietary content to be extracted.
> Furthermore, the effort required to create some reasonable but small example public partition would be prevented by my first observation.
>
> During my tenure, Cycorp had no fewer than three PhD computational linguists continuously employed on NL interfaces. But I believe their progress has been blunted by the need to maintain a large number of legacy parsers, all of which have some dead-end (i.e. not cognitively plausible) characteristics. Furthermore, the Cyc NL parsing and NL generation systems are at least two completely different bodies of code. Therefore Cyc is not capable of understanding all of what it can say and vice versa. Moreover, Cyc's NL system, similar to its other long-standing code components, is quite large, and sadly demands that the great majority of the developers' time be spent maintaining, fixing, migrating, rewriting and tailoring the existing code rather than adding new functionality.
>
> In previous posts here and on my blog, I have described the Texai system as an English dialog system to achieve AGI via bootstrapping a small code base. By extending OpenCyc's ontology, and in particular biasing it towards the semantics of English language constructions, I hope to avoid some of the problems I saw at Cycorp.
>
> Matt also said:
> It seems to me the main limitation is that the language model has to be described formally in Cycl, as a lexicon and rules for parsing and disambiguation.
>
> Agreed. For the Texai language model, I employ Fluid Construction Grammar as the encoding, and Double-R Theory for its grammatical constructs. Although I currently hand-write these in a symbolic-expression external format, they are serialized into RDF from corresponding Java rule objects, and stored in the Texai KB as any other assertion.
> My plan is to task the dialog system first to interact with its mentors to acquire new vocabulary (e.g. mappings from word senses to Cyc concepts, argument mappings to event roles, etc.) and new grammar constructions (e.g. "on the table" as a phrase can have as one of its senses an instance of cyc:Negotiating).
>
> Those wanting to know more about Cyc should attend my Cyc tutorial at AGI-09. Or you can download OpenCyc, whose first release I lobbied for and then created with John DeOliveira while at Cycorp.
>
> -Steve
>
> Stephen L. Reed
>
> Artificial Intelligence Researcher
> http://texai.org/blog
> http://texai.org
> 3008 Oak Crest Ave.
> Austin, Texas, USA 78704
> 512.791.7860
>
> ----- Original Message ----
> From: Matt Mahoney <[EMAIL PROTECTED]>
> To: agi@v2.listbox.com
> Sent: Sunday, September 28, 2008 8:38:36 PM
> Subject: Re: [agi] universal logical form for natural language
>
> --- On Sun, 9/28/08, Ben Goertzel <[EMAIL PROTECTED]> wrote:
>
> >> FYI, Cyc has a natural language front end and a lot of folks have been working on it for the last 5+ years...
>
> It still needs work. I found this undated (2004 or later) white paper which is apparently not linked from cyc.com.
> http://www.cyc.com/doc/white_papers/KRAQ2005.pdf
>
> And also this overview.
> http://www.cyc.com/cyc/cycrandd/areasofrandd_dir/nlu
>
> The overview claims to be able to convert natural language sentences into Cycl assertions, and to convert questions to Cycl queries. So I wonder why the knowledge base is still not being built this way. And I wonder why there is no public demo of the interface, and no papers giving verifiable experimental results.
>
> It seems to me the main limitation is that the language model has to be described formally in Cycl, as a lexicon and rules for parsing and disambiguation. There seems to be no mechanism for learning natural language by example.
> For example, if Cyc receives a sentence that it cannot parse, or that is ambiguous, or that has a word not in its vocabulary or used in a different way, then there is no mechanism to update the model, which is something humans easily do. Given the complexity of English, I think this is a serious limitation with no easy solution.
>
> -- Matt Mahoney, [EMAIL PROTECTED]
>
> -------------------------------------------
> agi
> Archives: https://www.listbox.com/member/archive/303/=now
> RSS Feed: https://www.listbox.com/member/archive/rss/303/
> Modify Your Subscription: https://www.listbox.com/member/?&
> Powered by Listbox: http://www.listbox.com