The clustering component built into Solr/Lucene is primarily meant to cluster search results, so it's not a matter of "not implemented yet" but of "not intended for this functionality". Full-index document clustering algorithms may be implemented in the future, but if your document count is high you can also use Apache Mahout or another large-scale text clustering package.
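To make the distinction concrete: search-result clustering works on the handful of hits returned for one query, which easily fits in memory. The sketch below is only an illustration of that idea using the Python standard library. It is not the algorithm used by Solr's clustering component or by Carrot2; the greedy single-pass grouping, the stopword list, the `threshold` value and all function names are assumptions made up for this example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from any library).
STOPWORDS = {"a", "an", "and", "the", "of", "for", "in", "to", "with", "on", "is"}


def vectorize(text):
    """Turn a text into a bag-of-words term-frequency vector."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(t for t in tokens if t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(docs, threshold=0.3):
    """Greedy single-pass clustering over an in-memory list of texts.

    Each document joins the first cluster whose seed (first) document is
    at least `threshold` similar; otherwise it starts a new cluster.
    Returns a list of clusters, each a list of document indices.
    """
    clusters = []  # list of (seed_vector, member_indices)
    for index, text in enumerate(docs):
        vector = vectorize(text)
        for seed, members in clusters:
            if cosine(vector, seed) >= threshold:
                members.append(index)
                break
        else:
            clusters.append((vector, [index]))
    return [members for _, members in clusters]


if __name__ == "__main__":
    hits = [
        "Apache Solr search server",
        "Solr search result clustering",
        "Mahout large scale machine learning",
        "Machine learning with Mahout",
    ]
    for members in cluster(hits):
        print([hits[i] for i in members])
```

Everything here lives in plain Python lists and dictionaries, which is fine for a few hundred search hits but is exactly what stops scaling when you try to cluster a whole index.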
For smaller sets of documents, try using the Carrot2 API directly. If the documents and their associated data structures fit in memory, this may yield sensible output. More here, for example:

http://download.carrot2.org/head/manual/index.html#section.java-api

Dawid

On Wed, Sep 28, 2011 at 10:56 AM, Oleksandr Gamanjuk <[email protected]> wrote:
> Hi,
>
> Is it possible to implement automatic document categorization using
> Solr/Lucene?
>
> As far as I understand, it is not implemented yet, according to the wiki
> page <http://wiki.apache.org/solr/ClusteringComponent#Document_Clustering>.
>
> PS: The same question has already been asked here
> <http://stackoverflow.com/questions/7574492/how-to-implement-auto-categorization-with-solr-lucene>,
> but it got no answers.
>
> Oleksandr Gamanjuk
> Abiliton Senior Software Engineer
>
> 1, Barykadna St.
> Dnipropetrovsk, 49044, Ukraine
>
> [email protected]
