RE: [agi] AGI prospects for the next decade or two...

Piaget Modeler Fri, 22 Mar 2013 11:48:50 -0700

This may be a better link: 
http://www.amazon.com/NEWCAT-Parsing-Language-Left-Associative-Computer/dp/3540167811/ref=sr_1_3?s=books&ie=UTF8&qid=1363976391&sr=1-3



NEWCAT: Parsing Natural Language Using Left-Associative Grammar (Lecture Notes
in Computer Science)

From the Introduction:

This book describes a left-associative approach to the syntax and semantics of
natural language. A left-associative system analyzes a sentence from left to
right, first combining word 1 and word 2, then adding word 3, then adding word
4, etc., until there are no more next words. Conceptually, the left-associative
approach is based on the notion of possible continuations: after word n has been
added, the grammar specifies precisely what the categories of word n+1 may be.

The formal description of the possible continuations at the end of a
'sentence start' may be used to choose a grammatically compatible "next word"
(generation), or it may be used to decide whether a given next word is
grammatically compatible with the sentence start (parsing).  Left-associative
grammar is suited equally well for generation and for parsing.

Analyzing a language in a linear, left-associative fashion in terms of possible
continuations represents a substantial departure from contemporary linguistic
analysis, which works in terms of constituent structures.  Constituent
structure analysis takes place in the theoretical space between the root of the
constituent structure tree (usually called the S-node), representing an abstract
category, and the leaves of the tree, representing the concrete words of the
sentence (called the terminal symbols).  Constituent structure analysis views
the whole sentence by looking from the root of the tree to the terminal symbols
(top-down analysis), or from the terminal symbols to the root of the tree
(bottom-up analysis).

Left-associative analysis, on the other hand, takes place in the theoretical
space between the first and last words of a sentence or text.  The only
combinations permitted are between sentence starts and next words. The
resulting trees are of a completely regular, left-associative nature.  The
'root' of a left-associative tree is not an abstract start symbol, but the
result of the last combination of a sentence start and a next word.
Left-associative trees are built only from the bottom up; every combination of
a sentence start and a next word results in a new 'root'....

The basic idea of left-associative parsing was implemented as a LISP program in
December 1984.  After returning to CSLI in March 1985, the linguistic scope of
the parser expanded very quickly.  NEWCAT (for 'NEW CATegorial approach')
handles the word order of German in declarative and interrogative main clauses
with and without auxiliaries, as well as in subordinate clauses.  It handles
all free word order variations, center-embedded relative clauses of arbitrary
depth, extraposed relative clauses, auxiliaries, modals, passive voice in main
and subordinate clauses, multiple infinitives, conjunction, gapping, obligatory
and optional adverbs, adverbial clauses, prepositional clauses, discontinuous
elements, and the agreement between determiners, adjectives, nouns and verbs.
In May 1985 the parser NEWCAT was demonstrated at Stanford University and the
Stanford Research Institute.....
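
To make the "possible continuations" idea above concrete, here is a minimal
sketch in Python. It is only an illustration: the lexicon, the state names and
the continuation table are made-up assumptions, and a plain state table is a
simplification that is not the rule format used in NEWCAT. It shows a sentence
start being combined with one next word at a time, giving the regular
left-associative nesting the introduction describes.

```python
# Toy left-associative (time-linear) recognizer. The categories, lexicon and
# continuation table below are illustrative assumptions, not Hausser's rules.
LEXICON = {
    "the": "DET",
    "a": "DET",
    "man": "NOUN",
    "movie": "NOUN",
    "saw": "VERB",
    "slept": "VERB",
}

# State of a sentence start -> {category of next word: resulting state}.
CONTINUATIONS = {
    "START": {"DET": "NEED_NOUN"},
    "NEED_NOUN": {"NOUN": "HAVE_NP"},
    "HAVE_NP": {"VERB": "HAVE_VERB"},
    "HAVE_VERB": {"DET": "NEED_OBJ_NOUN"},
    "NEED_OBJ_NOUN": {"NOUN": "COMPLETE"},
}
FINAL_STATES = {"HAVE_VERB", "COMPLETE"}


def possible_next_categories(state):
    """Categories that may continue a sentence start in the given state."""
    return sorted(CONTINUATIONS.get(state, {}))


def parse(words):
    """Combine the sentence start with one next word at a time, left to right.

    Returns (accepted, derivation); the derivation is the left-associative
    nesting ((((w1 w2) w3) w4) ...).
    """
    state, tree = "START", None
    for word in words:
        category = LEXICON.get(word)
        next_state = CONTINUATIONS.get(state, {}).get(category)
        if next_state is None:
            return False, tree  # word is not a possible continuation
        tree = word if tree is None else (tree, word)
        state = next_state
    return state in FINAL_STATES, tree
```

Calling parse("the man saw a movie".split()) accepts the sentence, and
possible_next_categories("HAVE_NP") lists what may follow "the man". The same
table could be read the other way round to pick a compatible next word, which
is the generation use mentioned in the introduction.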

And here's another link for his follow up book: 
http://www.amazon.com/gp/product/3540508821/sr=1-1/qid=1363976520/ref=olp_product_details?ie=UTF8&me=&qid=1363976520&seller=&sr=1-1


Computation of Language by Roland Hausser
Publication Date: July 25, 1989 | ISBN-10: 3540508821 | ISBN-13: 978-3540508823 
| Edition: 1

This book analyzes the functioning of natural language in 
communication. The resulting system, called left-associative grammar 
(LA-grammar), incorporates the basic input-output conditions of speech as (i) a 
strictly time-linear (left-associative) derivation order, and (ii) a decidable, 
bidirectional mapping between the surface of sentences and their meaning. The 
new algorithm of LA-grammar computes possible continuations in contrast to most 
contemporary systems (e.g., phrase structure grammar), which are based on 
possible substitutions. The regular, context-free, and context-sensitive 
languages are reconstructed in LA-grammar, and questions of generative power, 
decidability, and computational complexity are explored in detail. It is proven 
that LA-grammar generates all - and only - the recursive languages, and that 
LA-grammar is more efficient computationally than corresponding substitution 
systems. The linguistic, mathematical, and computational analysis of natural 
(and formal) languages is followed by a philosophical discussion of 
communication. Topics are the theory of signs; the nature of reference; the 
role of ontology, truth, and the metalanguage; the nature of presuppositions 
and vagueness; the purpose of logic in meaning analysis; and the function of a 
semantically interpreted language in a speaking robot. An appendix illustrates 
the application of LA-grammar to natural language parsing with a large fragment 
of semantically interpreted English, implemented in LISP. As a comprehensive 
theory of grammar and the foundations of communication, this book is relevant 
for all applications of natural language processing, such as information 
retrieval, database interfaces, content analysis, database maintenance and 
up-scaling, dialog systems, machine translation, and foreign language teaching.


I suggest starting with these two books first.
~PM
Date: Thu, 21 Mar 2013 23:29:14 -0700
Subject: Re: [agi] AGI prospects for the next decade or two...
From: [email protected]
To: [email protected]

PM,

I tried the hyperlink, got to a book, and it fell apart on the second 
paragraph. It states that "The man who saw a movie yesterday" and "The man and 
John" are grammatically identical, in that they can both be continued the same 
way. Now, try putting each in front of " are both idiots." e.g. "The man who 
saw a movie yesterday are both idiots." doesn't make sense, whereas there is no 
problem with "The man and John are both idiots".


It is hard to get into a book that falls apart SO quickly - on its very first 
example. Didn't this guy have anyone review his writings?

OK, I scanned the table of contents, read a few pages, etc. I think I kinda see 
what he is doing, so maybe we can discuss this...


Continuing...
On Thu, Mar 21, 2013 at 9:45 PM, Piaget Modeler <[email protected]> 
wrote:




I wrote several LA parsers using Hausser's methods before.
GREAT. Hopefully you will make more sense than the book.

For NATURAL language? About half the sentences in the wild cannot be 
diagrammed, because they are missing important words. "Grammatical" is a high 
standard that is rarely met in the real world. How does LA parsing work for 
these? 


What did your parsers do?

I understand LA-parsing very well.

But I'm thinking that you may not understand LA-parsing.
Guilty as charged.
 

And you may not want  to bother to learn it either, even though by learning 
about it you can show me why I should adopt your method.
I think we will have to work together on this. Is there something somewhere 
that can be used to compute approximate memory requirements and performance? I 
didn't see anything in the table of contents. With my approach it isn't hard to 
guesstimate, but of course that doesn't mean that it wins the race.


That's okay.  If you want people to adopt your invention you've gotta sell them 
on it.

I agree. I should probably include some performance guesstimates.
 

Not the other way around.
Roland's technique is to parse character by character,
It is hard to believe that any character-by-character approach wouldn't be at 
least an order of magnitude or two slower than hashing words to ordinals, and 
then computing on entire words with integer operations. Further, ONLY those 
rules where the least likely elements are present need even be performed. For a 
full scale implementation, this would probably be just a handful of rules per 
word, where each rule compiles into a few integer operations. In short, 
simplistic hashing is a significant part of the job.
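
As a rough illustration of "hashing words to ordinals", here is a minimal
Python sketch. The class and function names (Lexicon, encode, contains_bigram)
are hypothetical and are not taken from any system discussed in this thread;
the sketch makes no claim about relative speed. It only shows each word being
looked up once in a hash table, after which rule tests can be plain integer
comparisons.

```python
# Hypothetical sketch: intern each word once, then work on small integers.
class Lexicon:
    def __init__(self):
        self._ordinal = {}   # word -> ordinal (Python dicts are hash tables)
        self._words = []     # ordinal -> word

    def ordinal(self, word):
        """Return the ordinal for a word, assigning the next free one if new."""
        key = word.lower()
        if key not in self._ordinal:
            self._ordinal[key] = len(self._words)
            self._words.append(key)
        return self._ordinal[key]

    def word(self, ordinal):
        """Map an ordinal back to its word, e.g. for printing results."""
        return self._words[ordinal]


def encode(lexicon, text):
    """Turn a sentence into a list of ordinals with one hash lookup per word."""
    return [lexicon.ordinal(w) for w in text.split()]


def contains_bigram(ordinals, first, second):
    """Example rule test done purely with integer comparisons."""
    return any(a == first and b == second
               for a, b in zip(ordinals, ordinals[1:]))
```

For example, after ids = encode(lex, "the red fire truck is red"), a test such
as contains_bigram(ids, lex.ordinal("fire"), lex.ordinal("truck")) never
touches the characters of the words again.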

 
and have semantic information in a trie based dictionary,

How does disambiguation work, as most words can have multiple meanings? 

How much memory is that going to take for, say, 100,000 words of English 
lexicon and associated rules?
 

so that by the time you reach the last character you have a complete parse of 
the sentence.
I'm not sure what you mean by a "complete parse". Depending on who is writing 
(or saying) it, as much as half of English sentences are hard/impossible to 
parse, yet you can often answer questions from the fragments that do make 
sense, even though they are embedded in text that doesn't make sense.


I have been in many discussions over this, some on this forum. When challenged, 
I start taking people's own "clear" statements and showing that they can 
equally well mean completely unintended things. Disambiguation seems simple to 
us because we have a really complex disambiguation mechanism behind our 
eyeballs.


I wasn't expecting to parse all input, just the input that provided information 
important to the application.

Language translation is different, in that they really DO need the detailed 
structure, e.g. which adjective or noun is being modified by a particular 
adverb, which typically requires an ontological approach, where there is a 
complex representation of the characteristics of various objects that can be 
stated. For example, what is being talked about in the noun phrase "the red fire 
truck"? Is it:


1.  A truck that carries red fires?  Fires can certainly be red.

2.  A red truck that carries fires?  Trucks can carry almost anything, and I 
doubt that anyone is going to bother entering the exceptions into a dictionary.


3.  A special kind of truck called a fire truck that is red? This is probably 
the intended meaning, because we know that fire trucks ARE a special kind of 
truck.

No, I am not trying to be perverse. Remember the "firemen" in Fahrenheit 451? 
They were men who made fires.
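
The structural part of the "red fire truck" ambiguity can be sketched in a few
lines of Python. This is only an illustration under stated assumptions: the
KNOWN_COMPOUNDS set stands in for the ontology mentioned above and holds a
single made-up entry, and preferring bracketings that use a known compound is
one possible heuristic, not a method anyone in this thread has described.
Readings 2 and 3 share the same bracketing and differ only in whether "fire
truck" is treated as a lexical unit, which is exactly what the compound set
decides.

```python
# Hypothetical compound lexicon; a real system would need a far larger ontology.
KNOWN_COMPOUNDS = {("fire", "truck")}


def bracketings(words):
    """Yield every binary bracketing of a word sequence as nested tuples."""
    if len(words) == 1:
        yield words[0]
        return
    for split in range(1, len(words)):
        for left in bracketings(words[:split]):
            for right in bracketings(words[split:]):
                yield (left, right)


def compound_count(tree):
    """Count how many nodes of a bracketing are known lexical compounds."""
    if isinstance(tree, str):
        return 0
    left, right = tree
    here = 1 if tree in KNOWN_COMPOUNDS else 0
    return here + compound_count(left) + compound_count(right)


def rank_readings(words):
    """Order bracketings so those using known compounds come first."""
    return sorted(bracketings(tuple(words)), key=compound_count, reverse=True)
```

rank_readings(["red", "fire", "truck"]) puts ("red", ("fire", "truck")) ahead
of (("red", "fire"), "truck"). Nothing here settles which reading is intended;
it only makes the candidate structures explicit.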


Maybe even several parses if there is ambiguity.


Pretty much all bottom-up methods (including my method) do that. 


(Do a search on NEWCAT parser.  Never mind, here are the links:)
http://books.google.com/books/about/NEWCAT_Parsing_Natural_Language_Using_Le.html?id=h0zHHAP79yoC

http://books.google.com/books?id=wQse-vqdBzkC&source=gbs_similarbooks

Roland's parsing with LA grammar is rapid.  My request is for you to first seek 
to understand how NEWCAT and similar LA parsers work, then tell me why it is 
slower than your method.  I haven't gotten that from you yet.

I (and everyone else) need some simple information like what it does, what it 
does NOT do, how fast it runs, how much memory it takes, what it does with 
broken structure, how it handles disambiguation and ontological issues (like 
the red fire truck), how easy it is to program, etc.

 
Steve.  I hope we're not talking past one another, but I think we might be.

Of course we are. Now we must get down to the nuts and bolts of both systems.

Hey, if we take our lemons and make lemonade, we might be onto something really 
valuable here. We need some way of rating parsers - which could be used for 
nearly all parsers, like something you might see in a Consumer Reports 
magazine of the distant future. Logan, you, and I could provide three lines in 
the side-by-side comparison.


I should probably work on a list of potential capabilities, features, and 
limitations.

Steve


Date: Thu, 21 Mar 2013 16:26:01 -0700
Subject: Re: [agi] AGI prospects for the next decade or two...
From: [email protected]

To: [email protected]

PM,

On Thu, Mar 21, 2013 at 1:58 PM, Piaget Modeler <[email protected]> 
wrote:





Good, but I'm not convinced your approach is the fastest.
If we are going to CONVINCINGLY compare approaches, then we need SOMEONE with 
detailed knowledge about the systems being compared, to be able to figure out 
why one might be faster than the other.



Are you there with Hausser's methods, or do you know someone who is there? 


How does it compare with Hausser's Left Associative parsing,
I provided an explanation. Was my explanation deficient in some way?
 


or using Trie trees to make dictionaries?
Why use any tree, when hashed access is faster?

Note that my method REQUIRES about a gigabyte of RAM to hold the Lexicon and 
all the working storage, which simply hasn't been available until fairly 
recently.



Before then, other methods WERE faster, because my method couldn't have been 
crammed into smaller computers.


My understanding is that Hausser's LA parsers are the fastest mode of parsing.  
What does Hausser's LA parsing lack that you address?


1.  It works with characters, which Matt's recent test shows to be an 
order-of-magnitude slower than ordinals.

2.  While it may be better than most other strategies, still, >99% of the tests 
it makes will fail, and hence won't affect output in any way. I eliminate the 
vast majority of tests that will fail, so that some fraction approaching half 
will succeed. It will be the SAME tests that succeed with or without my system. 
I just eliminate the failures.
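
One way to read point 2 is as an index keyed on each rule's least frequent
required word, so that a rule is only tested when that word actually occurs.
The Python sketch below assumes exactly that reading; the rules, the frequency
numbers and all function names are made up for illustration and are not taken
from the patent or from any implementation discussed here. It compares how
many rules get tested with and without the index, and the rules that fire are
the same either way.

```python
from collections import defaultdict

# Assumed relative word frequencies (made up for illustration); lower = rarer.
FREQUENCY = {"the": 1000, "is": 800, "red": 50, "fire": 30, "truck": 20,
             "idiots": 2}

# Each hypothetical rule fires when all of its required words are present.
RULES = {
    "fire_truck": {"fire", "truck"},
    "red_thing": {"the", "red"},
    "insult": {"is", "idiots"},
}


def build_index(rules, frequency):
    """File each rule under its least frequent required word only."""
    index = defaultdict(list)
    for name, required in rules.items():
        rarest = min(required, key=lambda w: (frequency.get(w, 0), w))
        index[rarest].append(name)
    return index


def match_naive(rules, words):
    """Test every rule against the sentence; return (fired, tests made)."""
    present = set(words)
    fired = sorted(n for n, req in rules.items() if req <= present)
    return fired, len(rules)


def match_indexed(rules, index, words):
    """Test only rules whose rarest required word occurs in the sentence."""
    present = set(words)
    candidates = {n for w in present for n in index.get(w, [])}
    fired = sorted(n for n in candidates if rules[n] <= present)
    return fired, len(candidates)
```

On "the truck is big" the naive matcher tests all three rules and the indexed
matcher tests one; both report that nothing fired. With only three toy rules
the saving is trivial, so the numbers say nothing about real-world speedups.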



You are apparently working with someone else's opinion that Hausser's method is 
best, which it may have been when the opinion was rendered. I suggest 
recruiting the expert who expressed that opinion into this conversation, have 
him look over what I have written, and then we can have a really productive 
conversation about this.



Steve

Date: Wed, 20 Mar 2013 18:30:24 -0700
Subject: Re: [agi] AGI prospects for the next decade or two...


From: [email protected]
To: [email protected]

PM,

Here is the essential point: My approach is to parsing what an operating system 
is to programming. You can implement ANY (that I have ever heard of) approach 
to parsing - it just runs 3-4 orders of magnitude faster when embedded in my 
structure.




When I first started programming (on vacuum tube computers like the IBM-650 and 
IBM-709) there were no operating systems. Then a bright young engineer at IBM 
named Gene Amdahl figured out that some smart canned I/O routines could 
outperform the usual direct addressing of peripherals, by using leftover RAM as 
buffers. Suddenly, 709s were running MUCH faster, because programs no longer 
had to wait for their I/O.




Similarly, the vast majority of things that ALL parsing methods check for are 
not there. This means that >99% of what they do is completely wasted, because 
it leads to NO output. Hausser's methods seek to implement new rules to 
effectively deal with word order scrambling and other anomalies that occur in 
real-world English. I describe some similar rules in the patent. However, there 
is no need to use the sample rule types I describe, as it is easy to implement 
ANY sorts of rules you can imagine.




My method isn't perfect, as probably more than half of the rules will still not 
find what they are looking for. Of course, if a system could ever get to 100%, 
there would then no longer be any reason to perform the rules   8-:D>




Regarding NELL, it appears that it learns things with language, rather than 
learning language.

Steve
=================
On Wed, Mar 20, 2013 at 5:55 PM, Piaget Modeler <[email protected]> 
wrote:






There is also the NELL project which is already in flight.  
http://rtw.ml.cmu.edu/rtw/
 http://www.cmu.edu/homepage/computing/2010/fall/nell-computer-that-learns.shtml



Its aim is to read the web.  
ALL of it.
How would your parsing approach be different from, enhance, or make obsolete 
the Never Ending Language Learning (NELL) system?



Just curious? 
~PM
Date: Wed, 20 Mar 2013 15:18:22 -0700

Subject: [agi] AGI prospects for the next decade or two...
From: [email protected]
To: [email protected]




Hi all,

Has my previous posting made my implied point to everyone's satisfaction: that 
without my new parsing method, there is NO presently known or suspected 
approach to parsing full-blown English fast enough to be practical, for at 
least another decade or two?





Hence, there seems to be no useful purpose in anyone wasting their time 
building yet another ad hoc or table-driven parser that doesn't use my method, 
beyond proving my point with yet another failed NLP project.





Things I am **NOT** saying include:

1.  That more breakthroughs aren't necessary to achieve "understanding", though 
it is my gut feeling that present-day parsing technology would be adequate for 
most use, given another 3-4 orders of magnitude in both speed and rules count.





2.  That this is easy. On the contrary, I suspect that ~1 man-decade of 
linguistic rules building would be needed. This would be needed regardless of 
the approach used, so this is NOT a disadvantage of my approach. Of course, 
this would take one person a decade of hard work to complete, but could be 
completed by an organized team in a year or two.





So, is anyone (else) here interested in some sort of team effort to make this 
happen?

Once completed, we already understand that this may be the most valuable 
software on the planet.

If we can't get a critical mass going on this, then there would seem to be 
little reason to continue this forum for the next decade or two, beyond maybe 
discussing pie-in-the-sky plans for what might be done decades in the future, 
using presently unknown technologies, which would better be posted on the 
singularity forum and NOT here on the AGI forum.





Am I missing anything here?

Any interest?

Steve





  
    
      




-- 
Full employment can be had with the stroke of a pen. Simply institute a six hour 
workday. That will easily create enough new jobs to bring back full employment.








  
    
      
-------------------------------------------
AGI
Archives: https://www.listbox.com/member/archive/303/=now
RSS Feed: https://www.listbox.com/member/archive/rss/303/21088071-f452e424
Modify Your Subscription: 
https://www.listbox.com/member/?member_id=21088071&id_secret=21088071-58d57657
Powered by Listbox: http://www.listbox.com
