Logan,

One (and perhaps both) of us is aware of something important that the other
has missed. I will respond literally to your comments, in the hope that you
can see what is needed to clarify your assertions.

I suspect that your comments regarding RADP may relate to methods that use
RADP to process characters in the input, rather than to methods where the
input is first tokenized into word ordinals before serious parsing
commences. Once you start processing characters in NL, you have lost the
battle.

Continuing...
On Tue, Mar 19, 2013 at 6:59 AM, Logan Streondj <[email protected]> wrote:

>
> On Mon, Mar 18, 2013 at 7:05 PM, Steve Richfield <
> [email protected]> wrote:
>
>> Logan,
>>
>> Interesting you noticed the similarity between this and recursive
>> ascent-descent parsing.
>>
>> Note that my comments below apply ONLY to NL, and **NOT** to computer
>> languages like C. NL has orders of magnitude more variation in structure
>> than computer languages, and it is those structural variations, acting in
>> recursion, that have been the speed trap for NLP. If I were going to write
>> a C compiler, I would NOT use the methods I am describing here.
>>
>> In a way, the simplest use of this IS a form of recursive ascent parsing,
>> only the evaluation of every rule that is missing its least frequently
>> used component (which is the VAST majority of rules) is omitted, and hence
>> costs no machine time.
>>
> in RADP this is default behavior anyway; it only looks for things if
> they are required.
>

However, in my system, only ~twice the number of things that are actually
present need even be tested for, whereas RADP must look for everything that
might be present. It is the ratio between "are" and "might" that is the
basis for the speedup.

>
>
>> The equivalent of the "descent" part is easily handled by having early
>> rules set flags within appropriate scopes, to later be tested by
>> lower-level rules. This allows unlimited zig-zagging, ascending up and
>> descending down as needed to parse almost anything, as in recursive
>> ascent-descent.
>>
> table-driven and recursive approaches are completely different.
> table-driven simply is not and never was scalable, due to its inherent
> complexity and context-free nature.
>

My approach emulates a limited recursive approach using queues, which at
first glance looks like a kludge (and might actually be a kludge if there
turns out to be a better way). The simple expedient of stepping the queue
numbers to emulate recursion makes my approach fully recursive, though full
recursion doesn't appear to be necessary to process NL.
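To make the queue trick concrete, here is a toy sketch in Python (the names and the nested-list example are purely illustrative, not my parser): each work item is tagged with a queue number, and stepping that number takes the place of the call stack.

```python
from collections import deque

# Toy sketch: traverse a nested structure without the call stack by
# tagging each work item with a queue number (its nesting level) and
# stepping that number instead of recursing.
def flatten(nested):
    out = []
    work = deque([(nested, 0)])        # (item, queue number)
    while work:
        item, level = work.popleft()
        if isinstance(item, list):
            # "descend": re-enqueue the children one queue number deeper
            for child in item:
                work.append((child, level + 1))
        else:
            out.append((item, level))
    return out

print(flatten([1, [2, [3]], 4]))       # → [(1, 1), (4, 1), (2, 2), (3, 3)]
```

Items come out grouped by queue number rather than in strict recursive order, which is the sort of reordering that appears not to matter for NL.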

>
>
>> I suspect that the best implementation of this will involve coding rules
>> as though they were to be executed in recursive ascent-descent fashion, but
>> instead compiling those rules to actually execute as I describe.
>>
>
> that doesn't make sense,
> since RADP code is faster by default than table driven code.
>

You still haven't grokked that my approach doesn't cleanly fall into either
category (though it is closer to RADP than to table-driven), the leverage that
least frequently used (LFU) word triggering provides, or that clever
compilers can often recast statements made in one paradigm into a different
execution paradigm.

LFU word triggering is worth orders of magnitude in speed, and this can NOT
be incorporated into present RADP methods because they require the rules to
be performed in intricate order to guide the parsing logic, when >99% of
those rules don't find what they are looking for. Granted, RADP's
"batting average" on rules finding what they are looking for is better than
that of other conventional methods for compiling COMPUTER languages, but
LFU is MUCH better for processing NL.
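To illustrate what LFU triggering buys, here is a toy Python sketch (the rules and the frequency numbers are invented for illustration; my real tables look nothing like this). Rules are indexed ONLY under their least frequently used word, so rules whose LFU word is absent are never even looked at:

```python
from collections import defaultdict

# Invented word frequencies (occurrences per million) and invented rules;
# each rule lists the words it needs.
FREQ = {"the": 60000, "on": 7000, "cat": 20, "sat": 15, "mat": 10}
RULES = [
    ("NP -> the cat", ["the", "cat"]),
    ("VP -> sat on NP", ["sat", "on"]),
    ("PP -> on mat", ["on", "mat"]),
]

# Index each rule under its least frequently used word only.
trigger = defaultdict(list)
for name, words in RULES:
    lfu = min(words, key=lambda w: FREQ[w])
    trigger[lfu].append((name, words))

def candidate_rules(sentence_words):
    """Only rules whose LFU word is actually present get examined;
    every other rule costs zero cycles."""
    present = set(sentence_words)
    found = []
    for w in present:
        for name, words in trigger.get(w, []):
            if present.issuperset(words):   # cheap final check
                found.append(name)
    return found

print(candidate_rules("the cat sat on the mat".split()))
```

Note that "the", the most frequent word, triggers nothing at all, even though a rule above mentions it.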

>
>> All forms of parsing, including recursive ascent-descent, first require
>> that every token be fully characterized - and there is no faster way than
>> with the floating point method I described, in or out of hand-coded
>> assembly language.
>>
>
> you don't know RADP,
>

I have never used it, just watched from afar, and I suspect most of the
people on this forum haven't used it, so I recommend that you sprinkle some
mini-tutorials and examples in with your postings, for all of our benefits.

> it doesn't require "full characterization" or any, really, if you use
> spaces between words. Tokenizing may be necessary for languages that
> don't have spaces, such as Arabic or perhaps Chinese, but only to
> separate them into words with spaces.
>

> otherwise the words themselves are all that's necessary.
>

Once you start processing characters, rather than ordinals or other
higher-level tokens representing words or concepts of some sort, you
discard 1-2 orders of magnitude in speed.

>
>
>> Beyond that, there is still another 2-3 orders of magnitude to be gained
>> by omitting the evaluation of >99% of all rules that would be evaluated by
>> other methods that evaluate all rules that are in their recursive path,
>> rather than ONLY the rules that are known to have their least likely
>> criteria met.
>>
>
> See this is a major difference in RADP there is no "table of rules", so
> there is no wasting time on it either.
>

You misunderstand something, and I think it is at the very bottom-level
rules. There is no more of a "table" in my approach than in RADP. Both
approaches have syntax equations, which RADP links together to execute in
selective linked order. I also link them together, but the lowest-level
rules are only executed when their least likely elements are present,
whereas RADP must execute all lowest-level rules that might possibly be
satisfied. RADP's advantage is at the upper levels, but the BIG opportunity
for speedup is at the bottom level. Eliminating the need to perform the
vast majority of bottom-level rules then goes on to GREATLY "trim the tree"
of mid-level and high-level rules that need evaluation.
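A toy sketch of that tree-trimming (the grammar and names are invented for illustration): index each higher-level rule under the constituents it consumes, and only examine a rule when one of its inputs has actually been produced.

```python
from collections import defaultdict, deque

# Invented toy grammar: parent constituent -> possible bodies.
GRAMMAR = {
    "NP": [("Det", "Noun")],
    "VP": [("Verb", "NP")],
    "S":  [("NP", "VP")],
}

# Invert the grammar: constituent -> rules that mention it.
watchers = defaultdict(list)
for parent, bodies in GRAMMAR.items():
    for body in bodies:
        for part in body:
            watchers[part].append((parent, body))

def derive(found):
    """found: constituents produced by the bottom-level rules.
    Rules whose inputs never appear are never looked at."""
    agenda = deque(found)
    while agenda:
        sym = agenda.popleft()
        for parent, body in watchers[sym]:      # only rules watching sym
            if parent not in found and all(p in found for p in body):
                found.add(parent)
                agenda.append(parent)           # may trigger still-higher rules
    return found

print(derive({"Det", "Noun", "Verb"}))
```

With only "Det" present, nothing above the bottom level is ever touched.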


> There are simply a variety of functions that process components,
> which only operate if they are requested, and find the component.
>

The bugaboo in the above is "are requested", which is used instead of
"already have their least frequently used components present". There is no
reason to bother evaluating any rule that can be determined, for free (zero
cycles), not to have its least frequently used component present, whereas
RADP goes ahead and performs those rules, only to discover in the process
of performing them that their least frequently used component is missing.

>
> To continue this discussion, I would have to understand how any
>> competitive system that follows a recursive path that involves evaluating
>> every rule in its recursive path, could possibly compete with a system that
>> can summarily omit evaluating >99% of the rules?
>>
>
> because it doesn't evaluate such rules.
> there are no parse-trees, or abstract doohickeys,
> it's plain and simple how humans would parse things.
> read sentence, figure out verb, then the subject or object and any
> relevant cases to pass to it as arguments.
>

The first thing I had to learn when I first started working on NLP was to
discard all of the things I thought I knew about grammar. Only about half
of the sentences "in the wild" are even diagrammable, and most of those
still have serious disambiguation issues due to omitted words. In short,
people don't work at all like you suspect.

Maybe I see our disconnect here. Looking at a Google search, all of the
RADP discussions I found have to do with parsing CHARACTER strings, which
in Western languages is an unnecessary waste of time when characters can be
directly translated into hashed words with just a few machine cycles per
character, as is shown in Figure 4 of the patent. Once this translation has
taken place, and the hashes have been converted to ordinals, all subsequent
processing is on integers that represent whole words, not characters as in
classical RADP.
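In toy Python form (an illustration of the principle only, NOT the scheme in Figure 4): one pass over the characters hashes each word as it is scanned and maps the hash to a small integer ordinal, after which parsing never touches a character again.

```python
# Toy sketch: translate characters into word ordinals in a single pass,
# spending only a few operations per character on an incremental hash.
# The hash constants and the lexicon layout are invented for illustration.
def to_ordinals(text, lexicon):
    ordinals, word_hash = [], 0
    for ch in text + " ":
        if ch == " ":
            if word_hash:
                # first sighting of a hash assigns the next free ordinal
                ordinals.append(lexicon.setdefault(word_hash, len(lexicon)))
                word_hash = 0
        else:
            word_hash = (word_hash * 31 + ord(ch)) & 0xFFFFFFFF
    return ordinals

print(to_ordinals("the cat sat on the mat", {}))   # → [0, 1, 2, 3, 0, 4]
```

Both occurrences of "the" map to the same ordinal, so all later rules compare single integers instead of character strings.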

I suspect the above is our disconnect, along with your apparent erroneous
presumption that ALL parsing MUST consider (at non-zero cost) the
possibility that everything which could be present in the context is
present.

I have "stood on my head" to relate RADP to what I have done. Now, you need
to do the same for me. This starts with:

1.  Seeing that dealing with ordinals is MUCH more efficient than dealing
with characters, regardless of the methodology applied. RADP could work
directly with words, which is what I first thought you were referring to,
akin to having a gigantic alphabet, but there would be some interesting
challenges. Even then, RADP would have to consider all low-level rules that
incorporate a particular word, rather than just the low-level rules for
which a particular word is the least frequently used word in the rule. This
"little" difference, though worth little in computer languages, is probably
worth 2-3 orders of magnitude in speed in NL.

2.  Understanding that NL is entirely different from computer languages in
ways that cause serious scaling problems for conventional parsing methods.
Unlike computer languages, every word in NL is a "reserved word" having
"complex behavior". RADP is better than other conventional methods, but it
still has the same serious scaling issues.

3.  Understanding how it is possible to "perform" >99% of the syntax
equations that will evaluate to be false, without expending a single
machine cycle. This applies to any set of them you wish to consider,
whether in RADP or other methods. Granted RADP greatly reduces the number
of syntax equations that need evaluation, but then eliminating >99% of the
remainder is worth a LOT.

If your world model contains the element "RADP is great" akin to Allah
Akbar, it won't be possible to discuss the places where RADP would
needlessly waste >99% of its time when processing NL. You really need to
understand what I am doing.

Steve



-------------------------------------------
AGI
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/21088071-f452e424
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=21088071&id_secret=21088071-58d57657
Powered by Listbox: http://www.listbox.com
