I have written several LA parsers using Hausser's methods, and I understand LA-parsing very well. But I'm thinking that you may not understand LA-parsing. And you may not want to bother to learn it either, even though by learning about it you could show me why I should adopt your method. That's okay. If you want people to adopt your invention you've gotta sell them on it. Not the other way around.

Roland's technique is to parse character by character, and to keep semantic information in a trie-based dictionary, so that by the time you reach the last character you have a complete parse of the sentence. Maybe even several parses if there is ambiguity. (Do a search on the NEWCAT parser. Never mind, here are the links:)

http://books.google.com/books/about/NEWCAT_Parsing_Natural_Language_Using_Le.html?id=h0zHHAP79yoC
http://books.google.com/books?id=wQse-vqdBzkC&source=gbs_similarbooks

Roland's parsing with LA grammar is rapid. My request is for you to first seek to understand how NEWCAT and similar LA parsers work, then tell me why they are slower than your method. I haven't gotten that from you yet, Steve.

I hope we're not talking past one another, but I think we might be.

~PM

Date: Thu, 21 Mar 2013 16:26:01 -0700
Subject: Re: [agi] AGI prospects for the next decade or two...
From: [email protected]
To: [email protected]
PM,

On Thu, Mar 21, 2013 at 1:58 PM, Piaget Modeler <[email protected]> wrote:

> Good, but I'm not convinced your approach is the fastest.

If we are going to CONVINCINGLY compare approaches, then we need SOMEONE with detailed knowledge about the systems being compared, to be able to figure out why one might be faster than the other. Are you there with Hausser's methods, or do you know someone who is there?

> How does it compare with Hausser's Left Associative parsing,

I provided an explanation. Was my explanation deficient in some way?

> or using Trie trees to make dictionaries?

Why use any tree, when hashed access is faster? Note that my method REQUIRES about a gigabyte of RAM to hold the Lexicon and all the working storage, which simply hasn't been available until fairly recently. Before then, other methods WERE faster, because my method couldn't have been crammed into smaller computers.

> My understanding is that Hausser's LA parsers are the fastest mode of parsing. What does Hausser's LA parsing lack that you address?

1. It works with characters, which Matt's recent test shows to be an order of magnitude slower than ordinals.
2. While it may be better than most other strategies, still, >99% of the tests it makes will fail, and hence won't affect output in any way. I eliminate the vast majority of tests that will fail, so that some fraction approaching half will succeed. It will be the SAME tests that succeed with or without my system. I just eliminate the failures.

You are apparently working with someone else's opinion that Hausser's method is best, which it may have been when the opinion was rendered. I suggest recruiting the expert who expressed that opinion into this conversation and having him look over what I have written; then we can have a really productive conversation about this.

Steve

Date: Wed, 20 Mar 2013 18:30:24 -0700
Subject: Re: [agi] AGI prospects for the next decade or two...
From: [email protected]
To: [email protected]

PM,

Here is the essential point: My approach is to parsing what an operating system is to programming. You can implement ANY (that I have ever heard of) approach to parsing - it just runs 3-4 orders of magnitude faster when embedded in my structure.

When I first started programming (on vacuum tube computers like the IBM-650 and IBM-709) there were no operating systems. Then a bright young engineer at IBM named Gene Amdahl figured out that some smart canned I/O routines could outperform the usual direct addressing of peripherals, by using leftover RAM as buffers. Suddenly, 709s were running MUCH faster, because programs no longer had to wait for their I/O.

Similarly, the vast majority of things that ALL parsing methods check for are not there. This means that >99% of what they do is completely wasted, because it leads to NO output.

Hausser's methods seek to implement new rules to effectively deal with word order scrambling and other anomalies that occur in real-world English. I describe some similar rules in the patent. However, there is no need to use the sample rule types I describe, as it is easy to implement ANY sorts of rules you can imagine.

My method isn't perfect, as probably more than half of the rules will still not find what they are looking for. Of course, if a system could ever get to 100%, there would then no longer be any reason to perform the rules 8-:D>

Regarding NELL, it appears that it learns things with language, rather than learning language.

Steve

=================

On Wed, Mar 20, 2013 at 5:55 PM, Piaget Modeler <[email protected]> wrote:

There is also the NELL project which is already in flight.

http://rtw.ml.cmu.edu/rtw/
http://www.cmu.edu/homepage/computing/2010/fall/nell-computer-that-learns.shtml

Its aim is to read the web. ALL of it.

How would your parsing approach be different from, enhance, or make obsolete the Never Ending Language Learning (NELL) system?

Just curious?
~PM

Date: Wed, 20 Mar 2013 15:18:22 -0700
Subject: [agi] AGI prospects for the next decade or two...
From: [email protected]
To: [email protected]

Hi all,

Has my previous posting made my implied point to everyone's satisfaction: that without my new parsing method, there is NO presently known or suspected approach to parsing full-blown English fast enough to be practical, for at least another decade or two?

Hence, there seems to be no useful purpose in anyone wasting their time building yet another ad hoc or table-driven parser that doesn't use my method, beyond proving my point with yet another failed NLP project.

Things I am **NOT** saying include:

1. That more breakthroughs aren't necessary to achieve "understanding", though it is my gut feeling that present-day parsing technology would be adequate for most uses, given another 3-4 orders of magnitude in both speed and rules count.
2. That this is easy. On the contrary, I suspect that ~1 man-decade of linguistic rules building would be needed. This would be needed regardless of the approach used, so this is NOT a disadvantage of my approach. Of course, this would take one person a decade of hard work to complete, but could be completed by an organized team in a year or two.

So, is anyone (else) here interested in some sort of team effort to make this happen? Once completed, we already understand that this may be the most valuable software on the planet.

If we can't get a critical mass going on this, then there would seem to be little reason to continue this forum for the next decade or two, beyond maybe discussing pie-in-the-sky plans for what might be done decades in the future, using presently unknown technologies, which would better be posted on the singularity forum and NOT here on the AGI forum.

Am I missing anything here? Any interest?

Steve

--
Full employment can be had with the stroke of a pen.
Simply institute a six hour workday. That will easily create enough new jobs to bring back full employment.
