Hi: I recently added the Clustering component to Solr and updated the requestHandler accordingly (in solrconfig.xml). Snippet of the config for the clustering component:

<searchComponent
    name="clusteringComponent"
    enable="${solr.clustering.enabled:false}"
    class="org.apache.solr.handler.clustering.ClusteringComponent" >
  <!-- Declare an engine -->
  <lst name="engine">
    <!-- The name, only one can be named "default" -->
    <str name="name">default</str>
    <!-- Class name of Carrot2 clustering algorithm.
         Currently available algorithms are:
         * org.carrot2.clustering.lingo.LingoClusteringAlgorithm
         * org.carrot2.clustering.stc.STCClusteringAlgorithm
         See http://project.carrot2.org/algorithms.html for the
         algorithm's characteristics.
    -->
    <str name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
    <!-- Overriding values for Carrot2 default algorithm attributes.
         For a description of all available attributes, see:
         http://download.carrot2.org/stable/manual/#chapter.components.
         Use attribute key as name attribute of str elements below.
         These can be further overridden for individual requests by
         specifying attribute key as request parameter name and
         attribute value as parameter value.
    -->
    <str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str>
  </lst>
  <lst name="engine">
    <str name="name">stc</str>
    <str name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str>
  </lst>
</searchComponent>

Snippet of the config for the requestHandler:

<requestHandler name="standard" class="solr.SearchHandler" default="true">
  <!-- default values for query parameters -->
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <!--
    <int name="rows">10</int>
    <str name="fl">*</str>
    <str name="version">2.1</str>
    -->
    <bool name="clustering">true</bool>
    <str name="clustering.engine">default</str>
    <bool name="clustering.results">true</bool>
    <!-- The title field -->
    <str name="carrot.title">headline</str>
    <str name="carrot.url">pi</str>
    <!-- The field to cluster on -->
    <str name="carrot.snippet">headline</str>
    <!-- produce summaries -->
    <bool name="carrot.produceSummary">true</bool>
    <!-- the maximum number of labels per cluster -->
    <!--<int name="carrot.numDescriptions">5</int>-->
    <!-- produce sub clusters -->
    <bool name="carrot.outputSubClusters">false</bool>
  </lst>
  <arr name="last-components">
    <str>clusteringComponent</str>
  </arr>
</requestHandler>

When I perform a search, the clusters section of the Solr results is not quite consistent: there are two documents that are reported in two different clusters. Are there parameters that can be set to prevent this from happening?

Thanks much,
Ramdev