Logan,

One (and perhaps both) of us is aware of something important that the other has missed. I will respond literally to your comments, in the hope that you can see what is needed to clarify your assertions.
I suspect that your comments regarding RADP may relate to methods that use RADP to process characters in the input, rather than to methods where the input is first tokenized into word ordinals before serious parsing commences. Once you start processing characters in NL, you have lost the battle.

Continuing...

On Tue, Mar 19, 2013 at 6:59 AM, Logan Streondj <[email protected]> wrote:

> On Mon, Mar 18, 2013 at 7:05 PM, Steve Richfield <[email protected]> wrote:
>
>> Logan,
>>
>> Interesting that you noticed the similarity between this and recursive ascent-descent parsing.
>>
>> Note that my comments below apply ONLY to NL, and **NOT** to computer languages like C. NL has orders of magnitude more variation in structure than computer languages, and it is those structural variations, acting in recursion, that have been the speed trap for NLP. If I were going to write a C compiler, I would NOT use the methods I am describing here.
>>
>> In a way, the simplest use of this IS a form of recursive ascent parsing, only the evaluations of every rule that is missing its least frequently used component (which is the VAST majority of rules) are omitted, and hence have no cost in machine time.
>
> in RADP this is default behavior anyways, it only looks for things if they are required.

However, in my system, only ~twice the number of things that are actually present need even be tested for, whereas RADP must look for everything that might be present. It is the ratio between "are" and "might" that is the basis for the speedup.

>> The equivalent of the "descent" part is easily handled by having early rules set flags within appropriate scopes, to later be tested by lower-level rules. This allows unlimited zig-zagging, ascending up and descending down as needed to parse almost anything, as in recursive ascent-descent.
>
> table-driven and recursive approaches are completely different.
> table-driven simply is not and never was scalable, due to its inherent complexity and context-free nature.

My approach emulates a limited recursive approach using queues, which at first glance looks like a kludge (and might actually be a kludge if there turns out to be a better way). The simple expedient of stepping the queue numbers to emulate recursion makes my approach fully recursive, though full recursion doesn't appear to be necessary to process NL.

>> I suspect that the best implementation of this will involve coding rules as though they were to be executed in recursive ascent-descent fashion, but instead compiling those rules to actually execute as I describe.
>
> that doesn't make sense,
> since RADP code is faster by default than table driven code.

You still haven't grokked that my approach doesn't cleanly fall into either category (but is more closely RADP than table-driven), the leverage that least frequently used (LFU) word triggering provides, or that clever compilers can often recast statements made in one paradigm into a different execution paradigm. LFU word triggering is worth orders of magnitude in speed, and it can NOT be incorporated into present RADP methods because they require the rules to be performed in intricate order to guide the parsing logic, when >99% of those rules don't find what they are looking for. Granted, RADP's "batting average" on rules finding what they are looking for is better than that of other conventional methods for compiling COMPUTER languages, but LFU is MUCH better for processing NL.

>> All forms of parsing, including recursive ascent-descent, first require that every token be fully characterized - and there is no faster way than with the floating point method I described, in or out of hand-coded assembly language.
> you don't know RADP,

I have never used it, just watched from afar, and I suspect most of the people on this forum haven't used it either, so I recommend that you sprinkle some mini-tutorials and examples in with your postings, for all of our benefit.

> it doesn't require "full characterization" or any, really, if you use spaces between words. Tokenizing may be necessary for languages that don't have spaces, such as Arabic or perhaps Chinese, but only to separate them into words with spaces.
>
> otherwise the words themselves are all that's necessary.

Once you start processing characters, rather than ordinals or other higher-level tokens representing words or concepts of some sort, you discard 1-2 orders of magnitude in speed.

>> Beyond that, there is still another 2-3 orders of magnitude to be gained by omitting the evaluation of >99% of all rules that would be evaluated by other methods that evaluate all rules that are in their recursive path, rather than ONLY the rules that are known to have their least likely criteria met.
>
> See, this is a major difference: in RADP there is no "table of rules", so there is no wasting time on it either.

You misunderstand something, and I think it is at the very bottom-level rules. There is no more of a "table" in my approach than in RADP. There are syntax equations in both approaches, which RADP links together to execute in selective linked order. I also link them together, but the lowest-level rules are only executed when their least likely elements are present, whereas RADP must execute all lowest-level rules that might possibly be satisfied. RADP's advantage is at the upper levels, but the BIG opportunity for speedup is at the bottom level. Eliminating the need to perform the vast majority of bottom-level rules then goes on to GREATLY "trim the tree" of mid-level and high-level rules that need evaluation.
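To make the LFU-triggering idea concrete, here is a minimal Python sketch. The rule set, word frequencies, and function names are hypothetical illustrations of the principle, not my actual implementation: each rule is indexed once by its least frequently used word, so rules whose trigger word is absent from the input are never touched at all.

```python
from collections import defaultdict

# Hypothetical word frequencies and rules; each rule lists the word
# ordinals (here, plain words) it requires.
FREQ = {"the": 1000, "cat": 40, "sat": 25, "on": 800, "mat": 10}
RULES = [
    ("NP1", ["the", "cat"]),   # LFU word: "cat" (rarer than "the")
    ("NP2", ["the", "mat"]),   # LFU word: "mat"
    ("VP1", ["sat", "on"]),    # LFU word: "sat"
]

# Build the trigger index once: LFU word -> rules it triggers.
trigger = defaultdict(list)
for name, words in RULES:
    lfu = min(words, key=lambda w: FREQ[w])
    trigger[lfu].append((name, words))

def candidate_rules(sentence_words):
    """Evaluate only the rules whose LFU word actually appears.
    Rules whose trigger word is absent cost zero cycles: they are
    never even looked at."""
    present = set(sentence_words)
    hits = []
    for w in present:
        for name, words in trigger.get(w, []):
            if all(x in present for x in words):
                hits.append(name)
    return hits

print(sorted(candidate_rules("the cat sat on the mat".split())))
# -> ['NP1', 'NP2', 'VP1']
```

With a large grammar, the payoff is that a sentence containing none of a rule's trigger words imposes no cost for that rule, whereas a method that walks all plausible rules pays for every miss.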
> There are simply a variety of functions that process components,
> which only operate if they are requested, and find the component.

The bugaboo in the above is in "are requested", which is used instead of "already have their least frequently used components present". There is no reason to bother evaluating any rule that can be determined, for free (zero cycles), not to have its least frequently used component present, whereas RADP goes ahead and performs those rules, only to discover in the process of performing them that their least frequently used component is missing.

>> To continue this discussion, I would have to understand how any competitive system that follows a recursive path that involves evaluating every rule in its recursive path, could possibly compete with a system that can summarily omit evaluating >99% of the rules?
>
> because it doesn't evaluate such rules.
> there are no parse-trees, or abstract doohickeys,
> it's plain and simple how humans would parse things.
> read sentence, figure out verb, then the subject or object and any relevant cases to pass to it as arguments.

The first thing I had to learn when I first started working on NLP was to discard all of the things I thought I knew about grammar. Only about half of the sentences "in the wild" are even diagrammable, and most of those still have serious disambiguation issues due to omitted words. In short, people don't work at all like you suspect.

Maybe I see our disconnect here. Looking at a Google search, all of the RADP discussions I found have to do with parsing CHARACTER strings, which in Western languages is an unnecessary waste of time when characters can be directly translated into hashed words with just a few machine cycles per character, as is shown in Figure 4 of the patent. Once this translation has taken place, and the hashes have been converted to ordinals, all subsequent processing is on integers that represent whole words, not characters as in classical RADP.
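To show the shape of this character-to-ordinal translation, here is a rough Python sketch. It is NOT the algorithm of Figure 4 of the patent; the hash function, names, and collision handling are made up for illustration. The point is only that a few operations per character yield one small integer per word, after which no code ever looks at characters again.

```python
def to_ordinals(text, table=None):
    """One pass over the characters: build a cheap rolling hash per
    word, then assign each distinct hash a small integer ordinal.
    (A real implementation would also resolve hash collisions.)"""
    if table is None:
        table = {}          # hash -> ordinal, grows as new words appear
    ordinals = []
    h = 0
    for ch in text + " ":   # sentinel space flushes the final word
        if ch.isspace():
            if h:
                if h not in table:
                    table[h] = len(table)   # next unused ordinal
                ordinals.append(table[h])
                h = 0
        else:
            # a few machine operations per character
            h = (h * 31 + ord(ch.lower())) & 0xFFFFFFFF
    return ordinals, table

ords, tab = to_ordinals("the cat sat on the mat")
print(ords)   # both occurrences of "the" map to the same ordinal
```

All downstream parsing then compares integers representing whole words, never character strings.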
I suspect the above explains our disconnect, along with your apparent erroneous presumption that ALL parsing MUST consider (at non-zero cost) the possibility of everything that could be present in the context being present.

I have "stood on my head" to relate RADP to what I have done. Now, you need to do the same for me. This starts with:

1. Seeing that dealing with ordinals is MUCH more efficient than dealing with characters, regardless of the methodology applied. RADP could work directly with words, which is what I first thought you were referring to, akin to having a gigantic alphabet, but there would be some interesting challenges. Even then, RADP would have to consider all low-level rules that incorporate a particular word, rather than just the low-level rules for which a particular word is the least frequently used word in the rule. This "little" difference, though worth little in computer languages, is probably worth 2-3 orders of magnitude in speed in NL.

2. Understanding that NL is entirely different from computer languages in ways that cause serious scaling problems for conventional parsing methods. Unlike computer languages, every word in NL is a "reserved word" having "complex behavior". RADP is better than other conventional methods, but it still has the same serious scaling issues.

3. Understanding how it is possible to "perform" >99% of the syntax equations that will evaluate to false, without expending a single machine cycle. This applies to any set of them you wish to consider, whether in RADP or other methods. Granted, RADP greatly reduces the number of syntax equations that need evaluation, but then eliminating >99% of the remainder is worth a LOT.

If your world model contains the element "RADP is great", akin to "Allahu Akbar", it won't be possible to discuss the places where RADP would needlessly waste >99% of its time when processing NL. You really need to understand what I am doing.
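For what it's worth, here is one guess at what "stepping the queue numbers to emulate recursion" might look like in Python; the function names and the toy tree are invented for illustration, and this is only a sketch of the general idea, not my code. Instead of a recursive call stack, pending work items are filed into numbered queues, one per nesting level, and the levels are drained in order; pushing a child item to the next queue number stands in for a recursive call one level deeper.

```python
from collections import deque

def run_levels(initial_items, expand):
    """Process items level by level using numbered queues.
    expand(item) returns (result, list_of_child_items); children are
    queued at level + 1, emulating a recursive descent."""
    queues = [deque(initial_items)]
    results = []
    level = 0
    while level < len(queues):
        while queues[level]:
            item = queues[level].popleft()
            result, children = expand(item)
            results.append((level, result))
            if children:
                if level + 1 == len(queues):
                    queues.append(deque())      # open the next level
                queues[level + 1].extend(children)
        level += 1
    return results

# Toy example: walk a nested [label, child, child, ...] structure.
tree = ["S", ["NP", ["Det"], ["N"]], ["VP", ["V"]]]
print(run_levels([tree], lambda n: (n[0], n[1:])))
# -> [(0, 'S'), (1, 'NP'), (1, 'VP'), (2, 'Det'), (2, 'N'), (2, 'V')]
```

The queues give bounded, iterative control flow in place of unbounded recursion, which is the trade-off described above.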
Steve

-------------------------------------------
AGI Archives: https://www.listbox.com/member/archive/303/=now
