> Not really.  Semantics is an easier problem.

If so, then why do you say "When you write a compiler, you develop it in this order: lexical, syntax, semantics"?

> Information retrieval and text classification systems work pretty well by
> ignoring word order.

Semantics is defined as the study of meaning. Information retrieval and text classification systems do not understand the meaning of what they return. They do work ... but their job isn't semantics.

----- Original Message -----
From: "Matt Mahoney" <[EMAIL PROTECTED]>
To: <agi@v2.listbox.com>
Sent: Tuesday, May 01, 2007 12:49 PM
Subject: Re: [agi] rule-based NL system


--- Mark Waser <[EMAIL PROTECTED]> wrote:

> we want computers to
> understand natural language because we think: if you know the syntax,
> the semantics follow easily

Huh?  "We" don't think anything of the sort.  Syntax is relatively easy.
Semantics is AGI.

Not really.  Semantics is an easier problem.  Information retrieval and text
classification systems work pretty well by ignoring word order.  (I know there
are exceptions where word order is important, like "Bob hit Alice" vs. "Alice
hit Bob".)  On the other hand, parsing natural language is an unsolved problem.
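
As a minimal sketch of the bag-of-words idea (plain Python; an illustration,
not anyone's actual system): once word order is discarded, "Bob hit Alice" and
"Alice hit Bob" become the same data point, which is exactly why such systems
work well in general and fail on that exception.

    import collections

    def bag_of_words(text):
        # Reduce a sentence to word counts, discarding order entirely.
        return collections.Counter(text.lower().split())

    # Both orderings produce the same feature vector, so an order-blind
    # classifier cannot distinguish who hit whom.
    print(bag_of_words("Bob hit Alice") == bag_of_words("Alice hit Bob"))  # True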

Word order is extremely important in artificial languages such as C. Almost any rearrangement of the words in a program will change its meaning. When you write a compiler, you develop it in this order: lexical, syntax, semantics.
This approach does not work for natural language.  Syntax depends on
semantics.  The parse of "I ate pizza with X" depends on the meaning of X.
Children learn language in the order: lexical, semantics, syntax.
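
To make the "I ate pizza with X" example concrete, here is a rough sketch (the
tiny lexicon and tree shapes are illustrative assumptions, not a real parser):
the prepositional phrase cannot be attached until the meaning of X is known.

    # "I ate pizza with X": the prepositional phrase attaches to the verb
    # (instrument reading) or to the noun (topping reading) depending on
    # what X means. This lexicon is a made-up stand-in for real semantics.
    INSTRUMENTS = {"fork", "spoon"}
    TOPPINGS = {"anchovies", "mushrooms"}

    def parse(x):
        if x in INSTRUMENTS:
            # (I (ate (pizza) (with X))) -- "with X" modifies "ate"
            return ("S", "I", ("VP", "ate", ("NP", "pizza"), ("PP", "with", x)))
        if x in TOPPINGS:
            # (I (ate (pizza (with X)))) -- "with X" modifies "pizza"
            return ("S", "I", ("VP", "ate", ("NP", "pizza", ("PP", "with", x))))
        return None  # meaning unknown -> the parse stays ambiguous

    print(parse("fork"))
    print(parse("anchovies"))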

Artificial languages were designed to be processed efficiently on fast,
sequential computers in a clean environment.  Thus we have strictly
unambiguous parsing, a stack-based short term memory (context free grammar),
long chains of sequential dependencies, and a lack of error recovery
capabilities. On the other hand, natural language evolved to be processed on
massively parallel computers with slow, unreliable components in a noisy
environment with a small short term memory (about 100 bits), and a slow long term learning rate (a few bits per second). Thus, we have noisy, ambiguous messages whose resolution requires integrating thousands of learned lexical, semantic, syntactic, and pragmatic constraints simultaneously, and a language structure that allows for gradual but efficient learning and updating of these
constraints in the course of normal communication.
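
One way to see the "stack-based short term memory (context free grammar)"
point: recognizing nested structure, the textbook context-free pattern, takes
a stack of unbounded depth, which machines have cheaply and human short-term
memory does not. A toy sketch:

    def balanced(s, pairs={"(": ")", "[": "]"}):
        # Context-free structure: each opener must be closed in LIFO order,
        # so the recognizer needs a stack. The stack can grow without bound,
        # unlike human short-term memory, which fails on deep nesting.
        stack = []
        for ch in s:
            if ch in pairs:
                stack.append(pairs[ch])
            elif ch in pairs.values():
                if not stack or stack.pop() != ch:
                    return False
        return not stack

    print(balanced("([()[]])"))  # True
    print(balanced("([)]"))      # False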

This is not to say that a neural architecture like the brain is the best way to process natural language. Rather, natural language is the most efficient
language for communication between the neural computers we already have.


-- Matt Mahoney, [EMAIL PROTECTED]


-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=231415&user_secret=fabd7936
