The clustering component built into Solr/Lucene is primarily meant to
cluster search results, so it's not a matter of "not implemented yet" but
of "not intended for this functionality". Full-index document clustering
algorithms may be implemented in the future, but if your document count
is high you may also use Apache Mahout or another large-scale text
clustering package.

For smaller sets of documents, try using the Carrot2 API directly. If the
documents and associated data structures fit in memory, this may yield
sensible output. See, for example:

http://download.carrot2.org/head/manual/index.html#section.java-api
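To make the "small set, everything in memory" case concrete, here is a toy sketch in plain Java. It is not the Carrot2 API and not one of its algorithms: the class name TinyClusterer, the stop-word list and the "label by most widely shared term" rule are all assumptions made up for illustration. It only shows the shape of the task (documents in, labeled groups out); a real run would go through the Carrot2 Java API described in the manual linked above.

```java
import java.util.*;

// Hypothetical toy clusterer; NOT Carrot2 and not its Lingo/STC algorithms.
public class TinyClusterer {
    // Illustrative stop-word list, assumed for this sketch only.
    private static final Set<String> STOP = new HashSet<>(Arrays.asList(
        "the", "a", "an", "of", "and", "in", "for", "to", "with", "is", "on"));

    /** Groups each document under the term it shares with the most other documents. */
    public static Map<String, List<String>> cluster(List<String> docs) {
        Map<String, Integer> docFreq = new HashMap<>();   // term -> number of docs containing it
        List<Set<String>> termSets = new ArrayList<>();   // per-document distinct terms
        for (String doc : docs) {
            Set<String> terms = new TreeSet<>();          // sorted, so ties resolve deterministically
            for (String t : doc.toLowerCase().split("[^a-z0-9]+")) {
                if (!t.isEmpty() && !STOP.contains(t)) {
                    terms.add(t);
                }
            }
            termSets.add(terms);
            for (String t : terms) {
                docFreq.merge(t, 1, Integer::sum);
            }
        }

        Map<String, List<String>> clusters = new TreeMap<>();
        for (int i = 0; i < docs.size(); i++) {
            String label = "(other)";
            int best = 1; // a term must occur in at least two documents to serve as a label
            for (String t : termSets.get(i)) {
                int f = docFreq.get(t);
                if (f > best) {
                    label = t;
                    best = f;
                }
            }
            clusters.computeIfAbsent(label, k -> new ArrayList<>()).add(docs.get(i));
        }
        return clusters;
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList(
            "Solr clustering component",
            "Clustering search results with Carrot2",
            "Apache Mahout for large collections",
            "Mahout k-means tutorial");
        cluster(docs).forEach((label, members) ->
            System.out.println(label + " -> " + members));
    }
}
```

Everything here (the term sets and the frequency map) lives on the heap, which is why this style of processing only suits collections that fit in memory; for a full index, a package such as Mahout is the more likely fit.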

Dawid

Carrot2 clustering algorithms

On Wed, Sep 28, 2011 at 10:56 AM, Oleksandr Gamanjuk
<[email protected]> wrote:

> Hi,
>
> Is it possible to implement automatic document categorization using
> Solr/Lucene?
>
> As far as I understand, it is not implemented yet, according to the wiki
> page <http://wiki.apache.org/solr/ClusteringComponent#Document_Clustering.>
>
> PS: The same question has already been asked
> here <http://stackoverflow.com/questions/7574492/how-to-implement-auto-categorization-with-solr-lucene>,
> but with no results.
>
> *Oleksandr Gamanjuk*
>
> *Abiliton Senior Software Engineer*
>
> 1, Barykadna St.
> Dnipropetrovsk, 49044, Ukraine
>
> [email protected]
>
