Hi:
  I recently included the CLustering component into Solr and updated the 
requestHandler accordingly (in solrconfig.xml).
Snippet of the Config for the CLuserting:

  <searchComponent
    name="clusteringComponent"
    enable="${solr.clustering.enabled:false}"
    class="org.apache.solr.handler.clustering.ClusteringComponent" >
    <!-- Declare an engine -->
    <lst name="engine">
      <!-- The name, only one can be named "default" -->
      <str name="name">default</str>
      <!-- 
           Class name of Carrot2 clustering algorithm. Currently available 
algorithms are:
           
           * org.carrot2.clustering.lingo.LingoClusteringAlgorithm
           * org.carrot2.clustering.stc.STCClusteringAlgorithm
           
           See http://project.carrot2.org/algorithms.html for the algorithm's 
characteristics.
        -->
      <str 
name="carrot.algorithm">org.carrot2.clustering.lingo.LingoClusteringAlgorithm</str>
      <!-- 
           Overriding values for Carrot2 default algorithm attributes. For a 
description
           of all available attributes, see: 
http://download.carrot2.org/stable/manual/#chapter.components.
           Use attribute key as name attribute of str elements below. These can 
be further
           overridden for individual requests by specifying attribute key as 
request
           parameter name and attribute value as parameter value.
        -->
      <str name="LingoClusteringAlgorithm.desiredClusterCountBase">20</str>
    </lst>
    <lst name="engine">
      <str name="name">stc</str>
      <str 
name="carrot.algorithm">org.carrot2.clustering.stc.STCClusteringAlgorithm</str>
    </lst>
  </searchComponent>

snippet of the Config for requestHandler
  <requestHandler name="standard" class="solr.SearchHandler" default="true">
    <!-- default values for query parameters -->
     <lst name="defaults">
       <str name="echoParams">explicit</str>
       <!--
       <int name="rows">10</int>
       <str name="fl">*</str>
       <str name="version">2.1</str>
        -->
       <bool name="clustering">true</bool>
       <str name="clustering.engine">default</str>
       <bool name="clustering.results">true</bool>
       <!-- The title field -->
       <str name="carrot.title">headline</str>
       <str name="carrot.url">pi</str>
       <!-- The field to cluster on -->
       <str name="carrot.snippet">headline</str>
       <!-- produce summaries -->
       <bool name="carrot.produceSummary">true</bool>
       <!-- the maximum number of labels per cluster -->
       <!--<int name="carrot.numDescriptions">5</int>-->
       <!-- produce sub clusters -->
       <bool name="carrot.outputSubClusters">false</bool>
     </lst>
    <arr name="last-components">
      <str>clusteringComponent</str>
    </arr>
  </requestHandler>


When I perform a search, I see that the Cluster section within the Solr results
shows me results that are not quite consistent. There are two documents that 
are reported in two different documents

Are there parameters that can be set that will prevent this from happening ?


Thanks much

Ramdev

Reply via email to