This may be a better link: http://www.amazon.com/NEWCAT-Parsing-Language-Left-Associative-Computer/dp/3540167811/ref=sr_1_3?s=books&ie=UTF8&qid=1363976391&sr=1-3
NEWCAT: Parsing Natural Language Using Left-Associative Grammar (Lecture Notes in Computer Science)

From the Introduction:

This book describes a left-associative approach to the syntax and semantics of natural language. A left-associative system analyzes a sentence from left to right, first combining word 1 and word 2, then adding word 3, then adding word 4, etc., until there are no more next words. Conceptually, the left-associative approach is based on the notion of possible continuations: after word n has been added, the grammar specifies precisely what the categories of word n+1 may be. The formal description of the possible continuations at the end of a 'sentence start' may be used to choose a grammatically compatible "next word" (generation), or it may be used to decide whether a given next word is grammatically compatible with the sentence start (parsing). Left-associative grammar is suited equally well for generation and for parsing.

Analyzing a language in a linear, left-associative fashion in terms of possible continuations represents a substantial departure from contemporary linguistic analysis, which works in terms of constituent structures. Constituent structure analysis takes place in the theoretical space between the root of the constituent structure tree (usually called the S-node), representing an abstract category, and the leaves of the tree, representing the concrete words of the sentence (called the terminal symbols). Constituent structure analysis views the whole sentence by looking from the root of the tree to the terminal symbols (top-down analysis), or from the terminal symbols to the root of the tree (bottom-up analysis).

Left-associative analysis, on the other hand, takes place in the theoretical space between the first and last words of a sentence or text. The only combinations permitted are between sentence starts and next words. The resulting trees are of a completely regular, left-associative nature.
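To make the idea of possible continuations concrete, here is a minimal Python sketch. It is not NEWCAT and is not taken from the book: the toy lexicon, the state names, and the continuation table are all invented for illustration. It only shows the shape of the technique, namely that each sentence start is combined with exactly one next word, and that the state reached so far determines which categories may come next.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy lexicon (word -> category); not from NEWCAT.
LEXICON: Dict[str, str] = {
    "the": "DET", "a": "DET",
    "man": "N", "movie": "N",
    "saw": "V", "slept": "V",
}

# Hypothetical continuation table:
# state of the sentence start -> {category of next word: resulting state}
CONTINUATIONS: Dict[str, Dict[str, str]] = {
    "START": {"DET": "NEED_N"},
    "NEED_N": {"N": "SUBJ"},
    "SUBJ": {"V": "VERB"},
    "VERB": {"DET": "NEED_OBJ_N"},
    "NEED_OBJ_N": {"N": "DONE"},
}

# States in which the sentence start counts as a complete sentence.
FINAL_STATES: Set[str] = {"VERB", "DONE"}


def possible_continuations(state: str) -> Set[str]:
    """Categories that word n+1 may have, given the current sentence start."""
    return set(CONTINUATIONS.get(state, {}))


def parse(words: List[str]) -> Optional[object]:
    """Combine sentence start and next word, strictly left to right.

    Returns a left-associative tree of nested pairs, or None if some
    next word is not a permitted continuation or the input ends early.
    """
    state = "START"
    tree: Optional[object] = None
    for word in words:
        category = LEXICON.get(word)
        next_state = CONTINUATIONS.get(state, {}).get(category)
        if next_state is None:
            return None  # word is not a possible continuation here
        # Every combination yields a new 'root': (sentence start, next word).
        tree = word if tree is None else (tree, word)
        state = next_state
    return tree if state in FINAL_STATES else None


if __name__ == "__main__":
    print(parse("the man saw a movie".split()))
    print(possible_continuations("NEED_N"))
```

The same table can drive generation as well as parsing: `possible_continuations` lists the categories from which a compatible next word could be chosen.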
The 'root' of a left-associative tree is not an abstract start symbol, but the result of the last combination of a sentence start and a next word. Left-associative trees are built only from the bottom up; every combination of a sentence start and a next word results in a new 'root'....

The basic idea of left-associative parsing was implemented as a LISP-program in December 1984. After returning to CSLI in March, 1985, the linguistic scope of the parser expanded very quickly. NEWCAT (for 'NEW CATegorial approach') handles the word order of German in declarative and interrogative main clauses with and without auxiliaries, as well as in subordinate clauses. It handles all free word order variations, center embedded relative clauses of arbitrary depth, extraposed relative clauses, auxiliaries, modals, passive voice in main and subordinate clauses, multiple infinitives, conjunction, gapping, obligatory and optional adverbs, adverbial clauses, prepositional clauses, discontinuous elements, and the agreement between determiners, adjectives, nouns and verbs. In May 1985 the parser NEWCAT was demonstrated at Stanford University and the Stanford Research Institute.....

And here's another link for his follow-up book: http://www.amazon.com/gp/product/3540508821/sr=1-1/qid=1363976520/ref=olp_product_details?ie=UTF8&me=&qid=1363976520&seller=&sr=1-1

Computation of Language by Roland Hausser
Publication Date: July 25, 1989 | ISBN-10: 3540508821 | ISBN-13: 978-3540508823 | Edition: 1

This book analyzes the functioning of natural language in communication. The resulting system, called left-associative grammar (LA-grammar), incorporates the basic input-output conditions of speech as (i) a strictly time-linear (left-associative) derivation order, and (ii) a decidable, bidirectional mapping between the surface of sentences and their meaning.
The new algorithm of LA-grammar computes possible continuations in contrast to most contemporary systems (e.g., phrase structure grammar), which are based on possible substitutions. The regular, context-free, and context-sensitive languages are reconstructed in LA-grammar, and questions of generative power, decidability, and computational complexity are explored in detail. It is proven that LA-grammar generates all - and only - the recursive languages, and that LA-grammar is more efficient computationally than corresponding substitution systems.

The linguistic, mathematical, and computational analysis of natural (and formal) languages is followed by a philosophical discussion of communication. Topics are the theory of signs; the nature of reference; the role of ontology, truth, and the metalanguage; the nature of presuppositions and vagueness; the purpose of logic in meaning analysis; and the function of a semantically interpreted language in a speaking robot. An appendix illustrates the application of LA-grammar to natural language parsing with a large fragment of semantically interpreted English, implemented in LISP.

As a comprehensive theory of grammar and the foundations of communication, this book is relevant for all applications of natural language processing, such as information retrieval, database interfaces, content analysis, database maintenance and up-scaling, dialog systems, machine translation, and foreign language teaching.

I suggest starting with these two books first.

~PM

Date: Thu, 21 Mar 2013 23:29:14 -0700
Subject: Re: [agi] AGI prospects for the next decade or two...
From: [email protected]
To: [email protected]

PM,

I tried the hyperlink, got to a book, and it fell apart on the second paragraph. It states that "The man who saw a movie yesterday" and "The man and John" are grammatically identical, in that they can both be continued the same way. Now, try putting each in front of " are both idiots." e.g.
"The man who saw a movie yesterday are both idiots." doesn't make sense, whereas there is no problem with "The man and John are both idiots". It is hard to get into a book that falls apart SO quickly - on its very first example. Didn't this guy have anyone review his writings? OK, I scanned the table of contents, read a few pages, etc. I think I kinda see what he is doing, so maybe we can discuss this... Continuing... On Thu, Mar 21, 2013 at 9:45 PM, Piaget Modeler <[email protected]> wrote: I wrote several LA parsers using Hausser's methods before. GREAT. Hopefully you will make more sense than the book. For NATURAL language? About half the sentences in the wild cannot be diagrammed, because they are missing important words. "Grammatical" is a high standard that is rarely met in the real world. How does LA parsing work for these? What did your parsers do? I understand LA-parsing very well. But I'm thinking that you may not understand LA-parsing. Guilty as charged. And you may not want to bother to learn it either, even though by learning about it you can show me why I should adopt your method. I think we will have to work together on this. Is there something somewhere that can be used to compute approximate memory requirements and performance? I didn't see anything in the table of contents. With my approach it isn't hard to guesstimate, but of course that doesn't mean that it wins the race. That's okay. If you want people to adopt your invention you've gotta sell them on it. I agree. I should probably include some performance guesstimates. Not the other way around. Roland's technique is to parse character by character, It is hard to believe that any character-by-character approach wouldn't be at least an order of magnitude or two slower than hashing words to ordinals, and then computing on entire words with integer operations. Further, ONLY those rules where the least likely elements are present need even be performed. 
For a full scale implementation, this would probably be just a handful of rules per word, where each rule compiles into a few integer operations. In short, simplistic hashing is a significant part of the job.

and have semantic information in a trie based dictionary,

How does disambiguation work, as most words can have multiple meanings? How much memory is that going to take for, say, 100,000 words of English lexicon and associated rules?

so that by the time you reach the last character you have a complete parse of the sentence.

I'm not sure what you mean by a "complete parse". Depending on who is writing (or saying) it, as much as half of English sentences are hard/impossible to parse, yet you can often answer questions from the fragments that do make sense, even though they are embedded in text that doesn't make sense.

I have been in many discussions over this, some on this forum. When challenged, I start taking people's own "clear" statements and showing that they can equally well mean completely unintended things. Disambiguation seems simple to us because we have a really complex disambiguation mechanism behind our eyeballs.

I wasn't expecting to parse all input, just the input that provided information important to the application. Language translation is different, in that they really DO need the detailed structure, e.g. which adjective or noun is being modified by a particular adverb, which typically requires an ontological approach, where there is a complex representation of the characteristics of various objects that can be stated. For example, what is being talked about in the noun phrase "the red fire truck"? Is it:

1. A truck that carries red fires? Fires can certainly be red.
2. A red truck that carries fires? Trucks can carry almost anything, and I doubt that anyone is going to bother entering the exceptions into a dictionary.
3. A special kind of truck called a fire truck that is red?
This is probably the intended meaning, because we know that fire trucks ARE a special kind of truck.

No, I am not trying to be perverse. Remember the "firemen" in Fahrenheit 451? They were men who made fires.

Maybe even several parses if there is ambiguity.

Pretty much all bottom-up methods (including my method) do that.

(Do a search on NEWCAT parser. Never mind, here are the links:)
http://books.google.com/books/about/NEWCAT_Parsing_Natural_Language_Using_Le.html?id=h0zHHAP79yoC
http://books.google.com/books?id=wQse-vqdBzkC&source=gbs_similarbooks

Roland's parsing with LA grammar is rapid. My request is for you to first seek to understand how NEWCAT and similar LA parsers work, then tell me why it is slower than your method. I haven't gotten that from you yet.

I (and everyone else) need some simple information like what it does, what it does NOT do, how fast it runs, how much memory it takes, what it does with broken structure, how it handles disambiguation and ontological issues (like the red fire truck), how easy it is to program, etc.

Steve. I hope we're not talking past one another, but I think we might be.

Of course we are. Now we must get down to the nuts and bolts of both systems.

Hey, if we take our lemons and make lemonade, we might be onto something really valuable here. We need some way of rating parsers - which could be used for nearly all parsers, like something you might see in a Consumer Reports magazine of the distant future. Logan, you, and I could provide three lines in the side-by-side comparison.

I should probably work on a list of potential capabilities, features, and limitations.

Steve

Date: Thu, 21 Mar 2013 16:26:01 -0700
Subject: Re: [agi] AGI prospects for the next decade or two...
From: [email protected]
To: [email protected]

PM,

On Thu, Mar 21, 2013 at 1:58 PM, Piaget Modeler <[email protected]> wrote:

Good, but I'm not convinced your approach is the fastest.
If we are going to CONVINCINGLY compare approaches, then we need SOMEONE with detailed knowledge about the systems being compared, to be able to figure out why one might be faster than the other. Are you there with Hausser's methods, or do you know someone who is there?

How does it compare with Hausser's Left Associative parsing,

I provided an explanation. Was my explanation deficient in some way?

or using Trie trees to make dictionaries?

Why use any tree, when hashed access is faster?

Note that my method REQUIRES about a gigabyte of RAM to hold the Lexicon and all the working storage, which simply hasn't been available until fairly recently. Before then, other methods WERE faster, because my method couldn't have been crammed into smaller computers.

My understanding is that Hausser's LA parsers are the fastest mode of parsing. What does Hausser's LA parsing lack that you address?

1. It works with characters, which Matt's recent test shows to be an order of magnitude slower than ordinals.
2. While it may be better than most other strategies, still, >99% of the tests it makes will fail, and hence won't affect output in any way. I eliminate the vast majority of tests that will fail, so that some fraction approaching half will succeed. It will be the SAME tests that succeed with or without my system. I just eliminate the failures.

You are apparently working with someone else's opinion that Hausser's method is best, which it may have been when the opinion was rendered. I suggest recruiting the expert who expressed that opinion into this conversation, have him look over what I have written, and then we can have a really productive conversation about this.

Steve

Date: Wed, 20 Mar 2013 18:30:24 -0700
Subject: Re: [agi] AGI prospects for the next decade or two...
From: [email protected]
To: [email protected]

PM,

Here is the essential point: My approach is to parsing what an operating system is to programming.
You can implement ANY (that I have ever heard of) approach to parsing - it just runs 3-4 orders of magnitude faster when embedded in my structure.

When I first started programming (on vacuum tube computers like the IBM-650 and IBM-709) there were no operating systems. Then a bright young engineer at IBM named Gene Amdahl figured out that some smart canned I/O routines could outperform the usual direct addressing of peripherals, by using leftover RAM as buffers. Suddenly, 709s were running MUCH faster, because programs no longer had to wait for their I/O.

Similarly, the vast majority of things that ALL parsing methods check for are not there. This means that >99% of what they do is completely wasted, because it leads to NO output.

Hausser's methods seek to implement new rules to effectively deal with word order scrambling and other anomalies that occur in real-world English. I describe some similar rules in the patent. However, there is no need to use the sample rule types I describe, as it is easy to implement ANY sorts of rules you can imagine.

My method isn't perfect, as probably more than half of the rules will still not find what they are looking for. Of course, if a system could ever get to 100%, there would then no longer be any reason to perform the rules 8-:D>

Regarding NELL, it appears that it learns things with language, rather than learning language.

Steve
=================

On Wed, Mar 20, 2013 at 5:55 PM, Piaget Modeler <[email protected]> wrote:

There is also the NELL project which is already in flight.

http://rtw.ml.cmu.edu/rtw/
http://www.cmu.edu/homepage/computing/2010/fall/nell-computer-that-learns.shtml

Its aim is to read the web. ALL of it. How would your parsing approach be different from, enhance, or make obsolete the Never Ending Language Learning (NELL) system?

Just curious?

~PM

Date: Wed, 20 Mar 2013 15:18:22 -0700
Subject: [agi] AGI prospects for the next decade or two...
From: [email protected]
To: [email protected]

Hi all,

Has my previous posting made my implied point to everyone's satisfaction, that without my new parsing method, there is NO presently known or suspected approach to parsing full-blown English fast enough to be practical, for at least another decade or two? Hence, there seems to be no useful purpose in anyone wasting their time building yet another ad hoc or table-driven parser that doesn't use my method, beyond proving my point with yet another failed NLP project.

Things I am **NOT** saying include:

1. That more breakthroughs aren't necessary to achieve "understanding", though it is my gut feeling that present-day parsing technology would be adequate for most use, given another 3-4 orders of magnitude in both speed and rules count.
2. That this is easy. On the contrary, I suspect that ~1 man-decade of linguistic rules building would be needed. This would be needed regardless of the approach used, so this is NOT a disadvantage of my approach. Of course, this would take one person a decade of hard work to complete, but could be completed by an organized team in a year or two.

So, is anyone (else) here interested in some sort of team effort to make this happen? Once completed, we already understand that this may be the most valuable software on the planet.

If we can't get a critical mass going on this, then there would seem to be little reason to continue this forum for the next decade or two, beyond maybe discussing pie-in-the-sky plans for what might be done decades in the future, using presently unknown technologies, which would better be posted on the singularity forum and NOT here on the AGI forum.

Am I missing anything here?

Any interest?

Steve

--
Full employment can be had with the stroke of a pen. Simply institute a six hour workday. That will easily create enough new jobs to bring back full employment.
-------------------------------------------
AGI
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/21088071-f452e424
Modify Your Subscription: https://www.listbox.com/member/?member_id=21088071&id_secret=21088071-58d57657
Powered by Listbox: http://www.listbox.com
