Hi,

I'm using Solr 5.2.1, and I've indexed about 1GB of data into Solr.

However, clustering has become exceedingly slow since I indexed this 1GB of
data. It takes almost 30 seconds to return the cluster results when I set it
to cluster the top 1000 records, and still takes more than 3 seconds when I
set it to cluster the top 100 records.

Is this speed normal? As I understand it, Solr can index terabytes of data
without performance being impacted this much, but now the collection is
slowing down even with just 1GB of data.

Below is my clustering configuration in solrconfig.xml.

<requestHandler name="/clustering"
                startup="lazy"
                enable="${solr.clustering.enabled:true}"
                class="solr.SearchHandler">
  <lst name="defaults">
    <str name="echoParams">explicit</str>
    <int name="rows">1000</int>
    <str name="wt">json</str>
    <str name="indent">true</str>
    <str name="df">text</str>
    <str name="fl">null</str>

    <bool name="clustering">true</bool>
    <bool name="clustering.results">true</bool>
    <str name="carrot.title">subject content tag</str>
    <bool name="carrot.produceSummary">true</bool>

    <int name="carrot.fragSize">20</int>
    <!-- the maximum number of labels per cluster -->
    <int name="carrot.numDescriptions">20</int>
    <!-- produce sub clusters -->
    <bool name="carrot.outputSubClusters">false</bool>
    <str name="LingoClusteringAlgorithm.desiredClusterCountBase">7</str>

    <!-- Configure the remaining request handler parameters. -->
    <str name="defType">edismax</str>
  </lst>
  <arr name="last-components">
    <str>clustering</str>
  </arr>
</requestHandler>


Regards,
Edwin
