Francisco,

Your work is very cool. Do you think it would be possible to make your word SDRs (or a sufficient subset of them) available for experimentation? I imagine there would be interest in the NuPIC community in training a CLA on text using your word SDRs. You might get some useful results more quickly. You could do this under a research-only license or something like that.

Jeff

-----Original Message-----
From: nupic [mailto:[email protected]] On Behalf Of Francisco Webber
Sent: Wednesday, August 21, 2013 1:01 PM
To: NuPIC general mailing list.
Subject: Re: [nupic-dev] HTM in Natural Language Processing

Hello,

I am one of the founders of CEPT Systems and lead researcher of our retina algorithm.

We have developed a method to represent each word by a bitmap pattern that captures most of its "lexical semantics" (a text sensor). Our word-SDRs fulfill all the requirements for "good" HTM input data:

- Words with similar meaning "look" similar
- If you drop random bits in the representation, the semantics remain intact
- Only a small number (up to 5%) of bits are set in a word-SDR
- Every bit in the representation corresponds to a specific semantic feature of the language used
- The Retina (sensory organ for an HTM) can be trained on any language
- The Retina training process is fully unsupervised

We have found that the word-SDR by itself (without using any HTM yet) can improve results on many NLP problems that are only poorly solved by traditional statistical approaches. We use the SDRs to:

- Create fingerprints of text documents, which allows us to compare them for semantic similarity using simple (Euclidean) similarity measures
- Automatically detect polysemy and disambiguate multiple meanings
- Characterize any text with context terms for automatic search-engine query expansion

We hope to successfully link up our Retina to an HTM network to go beyond lexical semantics into the field of "grammatical semantics". This would hopefully lead to improved abstracting, conversation, question-answering, and translation systems.

Our correct web address is www.cept.at (no kangaroos in Vienna ;-)

I am interested in any form of cooperation to apply HTM technology to text.

Francisco

On 21.08.2013, at 20:16, Christian Cleber Masdeval Braz wrote:

>
> Hello.
>
> As many of you here, I am pretty new to HTM technology.
>
> I am a researcher in Brazil and I am going to start my PhD program soon. My field of interest is NLP and the extraction of knowledge from text. I am thinking of using the ideas behind the Memory Prediction Framework to investigate semantic information retrieval from the Web and to answer questions in natural language. I intend to use the HTM implementation as the basis for this.
>
> I would appreciate it a lot if someone could answer some questions:
>
> - Is there any research related to HTM and NLP? Could you point me to it?
>
> - Is HTM suitable for addressing this problem? Could it learn, without supervision, the grammar of a language, or would it only help in some aspects, such as Named Entity Recognition?
>
>
>
> Regards,
>
> Christian
>
>
> _______________________________________________
> nupic mailing list
> [email protected]
> http://lists.numenta.org/mailman/listinfo/nupic_lists.numenta.org
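
The word-SDR properties listed in Francisco's message (sparsity, overlap between similar words, tolerance to dropped bits, Euclidean comparison) can be illustrated with a minimal Python sketch. This is not CEPT's Retina algorithm, which the thread does not describe: the SDRs below are synthetic random bit sets, and the vector size (16384), the sparsity (about 2%), the shared fraction (0.6), and all function and variable names are assumptions made only for illustration.

```python
import math
import random

SDR_SIZE = 16384    # assumed vector length; the thread does not state one
ACTIVE_BITS = 328   # about 2% of SDR_SIZE, inside the "up to 5%" bound


def random_sdr(rng, size=SDR_SIZE, active=ACTIVE_BITS):
    """Synthetic SDR, stored as the set of indices of its active bits."""
    return frozenset(rng.sample(range(size), active))


def related_sdr(rng, base, shared_fraction, size=SDR_SIZE):
    """Synthetic stand-in for a 'similar word': reuse a fraction of base's bits."""
    keep = rng.sample(sorted(base), int(len(base) * shared_fraction))
    pool = [i for i in range(size) if i not in base]
    fresh = rng.sample(pool, len(base) - len(keep))
    return frozenset(keep) | frozenset(fresh)


def drop_bits(rng, sdr, fraction):
    """Remove a random fraction of the active bits to simulate noise."""
    n_keep = len(sdr) - int(len(sdr) * fraction)
    return frozenset(rng.sample(sorted(sdr), n_keep))


def overlap(a, b):
    """Number of active bits the two SDRs have in common."""
    return len(a & b)


def euclidean_distance(a, b):
    """Euclidean distance between binary vectors: sqrt of the differing bits."""
    return math.sqrt(len(a ^ b))


rng = random.Random(42)
word_a = random_sdr(rng)                      # hypothetical word
word_b_similar = related_sdr(rng, word_a, 0.6)  # hypothetical similar word
word_c_unrelated = random_sdr(rng)            # hypothetical unrelated word
noisy_a = drop_bits(rng, word_a, 0.3)         # word_a with 30% of bits dropped

print("overlap a/similar:   ", overlap(word_a, word_b_similar))
print("overlap a/unrelated: ", overlap(word_a, word_c_unrelated))
print("overlap noisy/similar:", overlap(noisy_a, word_b_similar))
print("distance a/similar:  ", euclidean_distance(word_a, word_b_similar))
print("distance a/unrelated:", euclidean_distance(word_a, word_c_unrelated))
```

Because so few bits are active, two unrelated random SDRs share only a handful of bits by chance, so even the noisy copy of word_a stays much closer to its similar word than to the unrelated one. Storing only the active indices also keeps each comparison proportional to the number of set bits rather than to the full vector length.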
