--- Linas Vepstas <[EMAIL PROTECTED]> wrote:

> Thus, I find that my interests are now turning to representing
> conversational state. How does novamente deal with it? What
> about Pei Wang's NARS? It seems that NARS is a reasoning system;
> great; but what is holding me back right now is not an ability
> to reason per-se, but the ability to maintain a conversational 
> state.

In a statistical or neural model, conversational state is a type of short-term
memory.  You represent what a conversation is "about" as a bag of recently
used words.  For example, if the words "snow" and "rain" were used recently,
then you increase the probability of using words like "weather" or "wet" in
your next response.
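
To illustrate, here is a rough Python sketch of this kind of short-term
memory.  The association scores are made up for illustration; in a real
system they would be learned from a corpus, as described below:

from collections import deque

# Hypothetical association scores.  In a real system these would be
# learned from co-occurrence counts in a training corpus.
ASSOC = {
    ("snow", "weather"): 0.8, ("rain", "weather"): 0.9,
    ("rain", "wet"): 0.9, ("snow", "wet"): 0.6,
}

class ConversationState:
    # Short-term memory: a bag of the N most recently used words.
    def __init__(self, size=50):
        self.recent = deque(maxlen=size)

    def observe(self, words):
        self.recent.extend(words)

    def boost(self, candidate):
        # Extra weight for a candidate word, based on its association
        # with recently used words.
        return sum(ASSOC.get((w, candidate), 0.0) for w in self.recent)

state = ConversationState()
state.observe(["snow", "rain"])
print(state.boost("weather"))  # 1.7 -> "weather" is now more likely
print(state.boost("stock"))    # 0.0 -> unrelated word, no boost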

Word associations can take the form of a word-word matrix.  For example, there
would be a numerical value associating "rain" and "wet" that depends on their
co-occurrence frequency in a corpus of training text.
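
For example, a simple co-occurrence count over a sliding window might look
like this (a toy sketch; the window size and corpus are arbitrary):

import numpy as np

def cooccurrence_matrix(corpus, window=2):
    # Count how often each pair of words appears within `window`
    # positions of each other in the corpus.
    vocab = sorted(set(corpus))
    index = {w: i for i, w in enumerate(vocab)}
    A = np.zeros((len(vocab), len(vocab)))
    for i, w in enumerate(corpus):
        lo, hi = max(0, i - window), min(len(corpus), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                A[index[w], index[corpus[j]]] += 1
    return vocab, A

corpus = "the rain wet the street and the rain wet my shoes".split()
vocab, A = cooccurrence_matrix(corpus)
print(A[vocab.index("rain"), vocab.index("wet")])  # 2.0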

Because the matrix is large and sparse, it can be compressed using singular
value decomposition (SVD).  A word-word matrix A is factored as A = USV^T,
where U and V have orthonormal columns and S is diagonal.  Only the few
hundred largest singular values in S are retained, so the corresponding
columns of U and V are all that need to be kept.  This representation is
called "latent semantic analysis" (LSA) because the retained dimensions
capture latent semantic structure, which has the effect of making transitive
inferences.  For example, if there are associations rain-wet and wet-water,
the compressed matrix will show a rain-water association even if the original
value in A was 0 (due to a small training corpus).  A truncated SVD is
equivalent to a linear neural network with one hidden layer.
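
Here is a toy demonstration of that inference with numpy.  The counts are
invented; the point is only that the rank-reduced matrix assigns a positive
association to a word pair that never co-occurred:

import numpy as np

words = ["rain", "wet", "water", "weather"]
# Invented co-occurrence counts.  Note rain-water is 0: the pair was
# never seen together in the (imaginary) training corpus.
A = np.array([[0., 3., 0., 2.],
              [3., 0., 1., 1.],
              [0., 1., 0., 0.],
              [2., 1., 0., 0.]])

U, S, Vt = np.linalg.svd(A)
k = 1                                  # keep only the largest singular value
A_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k]

i, j = words.index("rain"), words.index("water")
print(A_k[i, j])  # about 0.4 > 0: the compressed model infers rain-water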

This is not a model you can tack onto a structured knowledge base.  As I said,
language has to be an integral part of it.  Your approach has been tried
hundreds of times.  There is a great temptation to insert knowledge directly,
but the result is always the same.  Natural language is a complicated beast. 
You cannot hand code all the language rules.  After 23 years of developing the
Cyc database, Doug Lenat guesses it is between 0.1% and 10% finished.

I don't claim that LSA is a solution either.  So far there are no good
statistical models of grammar.  What I can say about statistical models is
that they have been used successfully (e.g. by Google), and that they work
bottom-up, more like the way children learn semantics before syntax.

AFAIK, neither NARS nor Novamente is capable of processing natural language
(yet).  I am sure Pei or Ben could tell you more.


-- Matt Mahoney, [EMAIL PROTECTED]
