I'll answer you point by point; others who find this tedious can just scroll
down to "conclusions" at the bottom.

On 4/27/07, Mark Waser <[EMAIL PROTECTED]> wrote:
I am NOT suggesting a rule-based system at this level.  First I figure out
a good representation for the minimal Basic English grammar that
fundamentally has the simplest grammatical rules embedded into its
structure rather than expressed as rules (i.e. I *AM* hand-crafting the
design of the initial structure -- nouns, verbs, adjectives, adverbs,
prepositions, noun clauses, verb clauses, prepositional phrases, simple SV
sentences, simple SVO sentences, etc.  I am also setting up different types
of inheritance and analogy links between words/terms/structures).  But I am
definitely not going to be hand-coding grammar "rules" per se.  After the
fact, you obviously could say that certain rules define the structure (and
you could obviously design an analogous rule-based system -- with *a lot*
more effort) but, at the lowest level, my version of the system is not going
to be run as a rule-based system.

So you are indeed trying to hand-code certain aspects of English grammar
into your KR structure.  But clearly you cannot embed *full* English grammar
into your KR.  So what about the *rest* of the grammar knowledge?  You
must learn it via machine learning.  Then what do you call
this "machine-learned stuff"?  I call it "grammar rules", but you may call
it "declarative linguistic knowledge" or whatever.  It's the same thing.

Can you explain more about "inheritance and analogy links"?  It seems that
you're trying to represent grammar structures using these "links", and it
seems to me that they are computationally equivalent to production rules.
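
To make that concrete, here is a minimal Python sketch of what I mean.  The
structure names and the "is_a" / "parts" link types are my own hypothetical
placeholders, not taken from your design; the point is only that links of
this kind can be read off mechanically as production rules.

```python
# Hypothetical toy KR: each structure has an optional "is_a" (inheritance)
# link and an optional ordered list of parts. All names are illustrative.
STRUCTURES = {
    "Sentence":   {"is_a": None,         "parts": None},
    "SV":         {"is_a": "Sentence",   "parts": ["NounPhrase", "Verb"]},
    "SVO":        {"is_a": "Sentence",   "parts": ["NounPhrase", "Verb", "NounPhrase"]},
    "NounPhrase": {"is_a": None,         "parts": None},
    "BareNoun":   {"is_a": "NounPhrase", "parts": ["Noun"]},
    "AdjNoun":    {"is_a": "NounPhrase", "parts": ["Adjective", "Noun"]},
}


def links_to_productions(structures):
    """Read each linked structure as a production rule: parent -> parts."""
    rules = []
    for spec in structures.values():
        if spec["is_a"] and spec["parts"]:
            rules.append((spec["is_a"], tuple(spec["parts"])))
    return rules


for lhs, rhs in links_to_productions(STRUCTURES):
    print(lhs, "->", " ".join(rhs))
```

This only covers a toy case and only one direction; analogy links may well
carry more information than a plain rewrite rule does.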

You're also missing the fact that language is both grammar and
vocabulary.  The system needs to be able to "understand" from the beginning
(Understand, in this case, meaning being able to translate anything to
"seed-only" form -- and thus, to be able, via the built in structure, to
know how to transform between alternative forms and to know what aspects and
data attach where).

I agree and I am well aware of this.

3. You then need to add more complex grammar rules to handle "real"
English, but such rules are difficult to hand-craft and thus will probably
require machine learning.

At this point, the tools I mentioned come into play to extend the
structure until it can handle "real" English.  This extension is clearly in
the realm of machine learning but, I believe, is structured and limited
enough to be feasible -- particularly if I start by pointing it
at "well-behaved/well-defined" sources like dictionaries, encyclopedias,
etc.

Yes, it is 100% feasible, but it requires complex machine learning.  All I'm
saying is that I prefer a route where we collect commonsense knowledge with
Basic English *before* we go into the realm of full English.

4. Only at this stage can you digest the web or newspapers.

Actually, it can probably start *attempting* to digest such sources fairly
early.  All it really has to do is to be able to tell when it's pretty
sure that it's correct and when it needs to try again after it's learned
more (and discard the data until then).  In particular though, it can always
go to a dictionary when it runs across a new word (or when it seems that a
known word has another, unknown definition) and it can also go to a trusted
human if it's really not sure about how to parse a sentence (at which point
it gives that human a list of alternatives to choose from -- which doesn't
require extensive training to handle).

The vocab is only part of the problem.  The problem that I'm pointing out is
how you handle full English grammar.  If your system doesn't have
*comprehensive* grammatical knowledge, it will have problems *interpreting*
complex or "irregular" English sentences, to the point that the "facts" you
acquire would be incorrect.
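
As I read it, the triage you describe amounts to something like the sketch
below (Python).  The threshold value, the action names and the Parse type
are my own placeholders, not anything from your design.

```python
from dataclasses import dataclass


@dataclass
class Parse:
    reading: str       # one candidate interpretation of the sentence
    confidence: float  # the parser's own estimate, 0.0 to 1.0


def triage(parses, unknown_words, threshold=0.9):
    """Decide what to do with one sentence.

    parses:        candidate Parse objects, in any order
    unknown_words: set of words not in the current vocabulary
    threshold:     hypothetical cutoff for "pretty sure"
    """
    if unknown_words:
        # New word: consult a dictionary before trying to parse again.
        return ("lookup_dictionary", sorted(unknown_words))
    if not parses:
        # Nothing usable yet: discard and retry after more learning.
        return ("defer", None)
    best = max(parses, key=lambda p: p.confidence)
    if best.confidence >= threshold:
        return ("accept", best.reading)
    if len(parses) > 1:
        # Not sure: give a trusted human the alternatives, best first.
        ranked = sorted(parses, key=lambda p: p.confidence, reverse=True)
        return ("ask_human", [p.reading for p in ranked])
    return ("defer", None)
```

Everything here hinges on the confidence number.  If the parser lacks the
grammar to notice that a sentence is irregular, it may report high
confidence on a wrong reading, and the gate lets the bad "fact" through.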

>> I guess (3) and (4) won't happen immediately.  And after (2) we can
start collecting commonsense facts via Basic English.  So it seems to me
that a viable "first product" could be a commonsense engine using Basic
English, without going to 3 & 4.

I disagree.  Building the extension/learning tools is a fundamental part
of the initial design.  If you start collecting commonsense facts without a
good data structure and the "understanding" described above, you end up with
Cyc (the same facts encoded multiple ways, almost all facts inaccessible
unless you access them in almost *exactly* the form in which they were
encoded, etc., etc.).  I don't find *any* value in that at all.  Facts are
only useful if you can access and use them.

I see your point, but you misunderstand my approach.  My KR scheme is
minimalistic.  Grammar is represented as additional rules (whereas you embed
some linguistic structure into your KR).  I plan to hand-code enough grammar
for Basic English first, and the system will later use machine learning to
learn more complex grammar rules, until it knows full English.

You assume that my system doesn't have true "understanding" because it
cannot inter-relate the same idea expressed in different sentences (i.e.,
paraphrasing).  That's a false accusation because my system can do this by
logical reasoning.

Conclusions:

1.  The only difference between your approach and mine is that you
hand-code *some* linguistic knowledge directly into your KR while I leave
that to grammar rules.

2.  You have not answered the question of how you can easily digest
the web/newspapers without adult-level grammar knowledge, which is not easy
to acquire/learn.

3.  The issue really is which route is easier:
  (A) build the system to the point of understanding Basic English, then
start collecting commonsense facts.
  (B) build the system to understand real English, and crawl the
web/newspapers.

Don't forget that (A) can eventually grow into a full AGI;  the issue here
is the learning pathway.  (B) is harder because it requires adult-level
grammar AND a lot of commonsense knowledge.  If you do not follow an
INCREMENTAL LEARNING PATHWAY then your learning speed will be unfeasibly
slow - that's my theory.

YKY

-----
This list is sponsored by AGIRI: http://www.agiri.org/email