--- Linas Vepstas <[EMAIL PROTECTED]> wrote:
> Thus, I find that my interests are now turning to representing
> conversational state. How does novamente deal with it? What
> about Pei Wang's NARS? It seems that NARS is a reasoning system;
> great; but what is holding me back right now is not an ability
> to reason per se, but the ability to maintain a conversational
> state.
In a statistical or neural model, conversational state is a type of short-term memory. You represent what a conversation is "about" as a bag of recently used words. For example, if the words "snow" and "rain" were used recently, then you increase the probability of using words like "weather" or "wet" in your next response. Word associations can take the form of a word-word matrix: there would be a numerical value associating "rain" and "wet" that depends on their co-occurrence frequency in a corpus of training text.

Because most of the information in such a matrix lies in a few dominant directions, it can be compressed using singular value decomposition (SVD). A matrix A is factored into A = USV^T, where U and V are orthonormal and S is diagonal. Only the few hundred largest singular values in S are retained, which allows the corresponding columns of U and V to be kept and the rest discarded. This representation is called "latent semantic analysis" (LSA) because it makes inferences using, roughly, a transitive property of semantics. For example, if there are associations rain-wet and wet-water, then it will infer rain-water even if the original value in A was 0 (due to a small training corpus). SVD is equivalent to a linear neural network with one hidden layer.

This is not a model you can tack onto a structured knowledge base. As I said, language has to be an integral part of it. Your approach has been tried hundreds of times. There is a great temptation to insert knowledge directly, but the result is always the same. Natural language is a complicated beast, and you cannot hand-code all the language rules. After 23 years of developing the Cyc database, Doug Lenat guesses it is between 0.1% and 10% finished.

I don't claim that LSA is a solution either. So far there are no good statistical models of grammar. What I can say about statistical models is that they have been used successfully (e.g. Google), and they are bottom-up, more like the way children learn semantics before syntax.
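The rain-wet-water inference above can be sketched on a toy example. This is only an illustration, not a real training setup: the four words, three tiny "documents," and the choice k=1 are all invented here to make the effect visible in a matrix small enough to inspect by hand.

```python
import numpy as np

# Toy word-by-document count matrix (rows: words, cols: tiny "documents").
# "rain" and "water" never co-occur directly; both co-occur with "wet".
words = ["rain", "wet", "water", "snow"]
docs = ["rain wet", "wet water", "snow wet"]
A = np.array([[1.0 if w in d.split() else 0.0 for d in docs] for w in words])

# Truncated SVD: factor A = U S V^T, then keep only the k largest
# singular values and the corresponding columns of U and V.
U, S, Vt = np.linalg.svd(A, full_matrices=False)
k = 1  # aggressively truncated, since this example is so small
A_k = U[:, :k] @ np.diag(S[:k]) @ Vt[:k, :]

rain = words.index("rain")
# In the raw counts, "rain" never appears in the "wet water" document,
# but the low-rank reconstruction assigns it a positive weight there:
print(A[rain, 1])              # 0.0  (original count)
print(round(A_k[rain, 1], 3))  # 0.333 (inferred association)
```

The nonzero entry appears because truncation forces the reconstruction to use shared directions (here, the one dominated by "wet"), which is exactly the transitive inference LSA performs. A word-word association matrix for scoring candidate next words, as in the bag-of-words memory above, could then be taken as A_k @ A_k.T.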
AFAIK, neither NARS nor Novamente is capable of processing natural language (yet). I am sure Pei or Ben could tell you more.

-- Matt Mahoney, [EMAIL PROTECTED]

-----
This list is sponsored by AGIRI: http://www.agiri.org/email
To unsubscribe or change your options, please go to:
http://v2.listbox.com/member/?member_id=8660244&id_secret=60139818-b3253c