As yet, no.

There is, however, an active project underway to implement LDA.  This will
give you "semantic" representations for words, which could then be
clustered.  We already have several clustering algorithms that would be
entirely sufficient for that step.
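
To make that concrete, here is a toy Java sketch (deliberately not Mahout's
API) of what "cluster the semantic representations" might look like: words
are compared by the cosine similarity of their per-word topic distributions,
which is the kind of vector LDA would hand you.  The words, vectors, and
the 0.9 threshold are all invented for illustration.

    import java.util.*;

    // Toy sketch: given per-word topic distributions (the kind of vectors
    // LDA would produce), group words whose distributions are close in
    // cosine similarity.  All names, vectors, and the 0.9 threshold are
    // invented for illustration.
    public class TopicClusterSketch {
      static double cosine(double[] a, double[] b) {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.length; i++) {
          dot += a[i] * b[i];
          na  += a[i] * a[i];
          nb  += b[i] * b[i];
        }
        return dot / (Math.sqrt(na) * Math.sqrt(nb));
      }

      public static void main(String[] args) {
        Map<String, double[]> topics = new LinkedHashMap<>();
        topics.put("red",     new double[]{0.90, 0.05, 0.05});
        topics.put("crimson", new double[]{0.85, 0.10, 0.05});
        topics.put("banana",  new double[]{0.05, 0.05, 0.90});

        // Greedy single-pass grouping: a word joins the first cluster
        // whose representative (first member) it is similar enough to.
        List<List<String>> clusters = new ArrayList<>();
        for (Map.Entry<String, double[]> e : topics.entrySet()) {
          List<String> home = null;
          for (List<String> c : clusters) {
            if (cosine(topics.get(c.get(0)), e.getValue()) > 0.9) {
              home = c;
              break;
            }
          }
          if (home == null) { home = new ArrayList<>(); clusters.add(home); }
          home.add(e.getKey());
        }
        System.out.println(clusters);   // [[red, crimson], [banana]]
      }
    }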

It would be a very interesting addition to have some other language modeling
implementation as well.  The one that I would find most interesting would be
something like an embedding-based neural model such as the one used in this
article by Ronan Collobert:
<http://ronan.collobert.com/pub/matos/2008_nlp_icml.pdf>.
I would be very willing to advise on the implementation of such a beast, but
due to my usual level of over-commitment I could only provide minimal direct
code contributions.
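
For a sense of what such a model involves, here is a minimal Java sketch of
its two central ingredients: a trainable lookup table mapping each word to a
dense vector, and a scoring function applied to a window of words.  The
dimensions, initialization, and single linear scoring layer are arbitrary
assumptions of mine; the paper actually trains a deeper network with a
ranking loss against corrupted windows, which is omitted here.

    import java.util.*;

    // Minimal sketch of the two central pieces of an embedding-based
    // language model: a per-word vector table and a window scorer.
    // DIM, the random init, and the single linear layer are arbitrary;
    // training is omitted entirely.
    public class EmbeddingSketch {
      static final int DIM = 4;
      static final Random RNG = new Random(42);
      static final Map<String, double[]> TABLE = new HashMap<>();

      // Look up a word's vector, creating a small random one on first use.
      static double[] embed(String word) {
        return TABLE.computeIfAbsent(word, w -> {
          double[] v = new double[DIM];
          for (int i = 0; i < DIM; i++) v[i] = RNG.nextGaussian() * 0.1;
          return v;
        });
      }

      // Score a window of words: concatenate their vectors (implicitly)
      // and apply one linear layer with weights w.
      static double score(String[] window, double[] w) {
        double s = 0;
        for (int j = 0; j < window.length; j++) {
          double[] v = embed(window[j]);
          for (int i = 0; i < DIM; i++) s += w[j * DIM + i] * v[i];
        }
        return s;
      }

      public static void main(String[] args) {
        String[] window = {"the", "red", "car"};
        double[] w = new double[window.length * DIM];
        Arrays.fill(w, 0.5);
        System.out.println("score = " + score(window, w));
      }
    }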

On Tue, Jun 16, 2009 at 7:04 PM, Paul Jones <[email protected]> wrote:

> 1. Take a set of words
> 2. Build clusters of these words, i.e. work out the semantic relationships
> between these words (I guess I could use WordNet as a starter), i.e. the
> inter-relationships
> 3. Once clusters have been formed of words, also work out the relationships
> between the clusters themselves.
>
> So in essence I could work out that red was similar to crimson, and hence a
> search on red would produce docs with crimson in them even though red was
> not mentioned.
>
> Would Mahout work here?
>
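
As a small illustration of the payoff described above (a search on "red"
matching documents that only contain "crimson"), here is a hedged Java
sketch of query expansion over word clusters.  The cluster contents are
hard-coded stand-ins for whatever the clustering step would actually
produce.

    import java.util.*;

    // Sketch of query expansion over word clusters: expand a query term
    // to its whole cluster so "red" also matches documents containing
    // "crimson".  The clusters below are hard-coded stand-ins for real
    // clustering output.
    public class QueryExpansionSketch {
      public static void main(String[] args) {
        List<Set<String>> clusters = List.of(
            Set.of("red", "crimson", "scarlet"),
            Set.of("happy", "glad"));

        String query = "red";
        Set<String> expanded = new TreeSet<>();
        expanded.add(query);
        for (Set<String> c : clusters) {
          if (c.contains(query)) expanded.addAll(c);  // pull in cluster-mates
        }
        System.out.println(expanded);   // [crimson, red, scarlet]
      }
    }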
