BTW, Carrot2 has a very impressive Clustering Workbench (based on Eclipse) that has built-in integration with Solr. If you have a Solr service running, it is just a matter of pointing the workbench at it. The clustering results and visualization are amazing (http://project.carrot2.org/download.html).

Yao Ge wrote:
>
> FYI. I did a direct integration of Carrot2 with Solrj, with a separate
> Ajax call from the UI that sends the top 100 hits to cluster terms in the
> two text fields. Its performance is comparable to other facets in terms of
> response time.
>
> In terms of algorithms, they list two, "Lingo" and "STC", which I don't
> recognize. But I think at least one of them might have used SVD
> (http://en.wikipedia.org/wiki/Singular_value_decomposition).
>
> -Yao
>
>
> Otis Gospodnetic wrote:
>>
>> I'd call it related (their application in search encourages exploration),
>> but also distinct enough to never mix them up. I think your assessment
>> below is correct, although I'm not familiar with the details of Carrot2
>> any more (was once), so I can't tell you exactly which algo is used under
>> the hood.
>>
>> Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>
>>
>> ----- Original Message ----
>>> From: Michael Ludwig <m...@as-guides.com>
>>> To: solr-user@lucene.apache.org
>>> Sent: Wednesday, June 10, 2009 9:41:54 AM
>>> Subject: Re: Faceting on text fields
>>>
>>> Otis Gospodnetic wrote:
>>> >
>>> > Solr can already cluster top N hits using Carrot2:
>>> > http://wiki.apache.org/solr/ClusteringComponent
>>>
>>> Would it be fair to say that clustering as detailed on the page you're
>>> referring to is a kind of dynamic faceting? The faceting not being done
>>> based on distinct values of certain fields, but on the presence (and
>>> frequency) of terms in one field?
>>>
>>> The main difference seems to be that with faceting, grouping criteria
>>> (facets) are known beforehand, while with clustering, grouping criteria
>>> (the significant terms which create clusters - the cluster keys) have
>>> yet to be determined. Is that a correct assessment?
>>>
>>> Michael Ludwig
>>
>
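
To make the faceting vs. clustering distinction from the thread concrete, here is a rough Python sketch. It is only an illustration under stated assumptions: the documents, the field names, the stopword list, and the function names are all made up, and the "clustering" part is a naive shared-term grouping, not Lingo, STC, or anything Carrot2 or Solr actually does.

```python
from collections import Counter, defaultdict

# Hypothetical sample documents; fields and values are invented for illustration.
DOCS = [
    {"id": 1, "category": "search", "text": "solr faceting on text fields"},
    {"id": 2, "category": "search", "text": "clustering search results with carrot2"},
    {"id": 3, "category": "math", "text": "singular value decomposition for clustering"},
    {"id": 4, "category": "math", "text": "matrix decomposition basics"},
]

# Tiny assumed stopword list, just enough for the sample data.
STOPWORDS = {"on", "with", "for", "the", "of"}


def facet_counts(docs, field):
    """Faceting: group by the distinct values of a field known beforehand."""
    return Counter(doc[field] for doc in docs)


def term_clusters(docs, field, min_docs=2):
    """Naive stand-in for clustering: the group keys are terms found in the text.

    Only terms that occur in at least `min_docs` documents become cluster keys.
    This is NOT Lingo or STC, just a shared-term grouping.
    """
    by_term = defaultdict(list)
    for doc in docs:
        # Use a set so a term counts once per document.
        for term in set(doc[field].lower().split()) - STOPWORDS:
            by_term[term].append(doc["id"])
    return {term: ids for term, ids in by_term.items() if len(ids) >= min_docs}


if __name__ == "__main__":
    print(facet_counts(DOCS, "category"))
    print(term_clusters(DOCS, "text"))
```

With faceting the keys ("search", "math") exist in the index before any query runs; with the term grouping the keys only emerge from the text of the hits, which is the difference Michael describes.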