Solr grouping performance problem

Shamik Bandopadhyay Wed, 30 Oct 2013 22:38:05 -0700

Hi,

I've recently upgraded to SolrCloud (4.4) from Master-Slave mode. One of
the changes I made in the queries was to add grouping to remove
duplicate results. The grouping is done on a specific field. But the change
seems to have a huge effect on query performance. The "group" option
makes queries about 10 times slower. For example, this query takes 1 sec to
execute and matches around 105387 results:


http://localhost:8083/solr/browse?fq=language:(english)&wt=xml&rows=10&start=0&fq=(ContentGroup-local:"Learn
& Explore" OR ADSKContentGroup-local:"Getting Started")&q=line&sort=score
desc&group=true&group.field=dedup&group.ngroups=true

If I exclude the group options, it comes down to 190 ms:

http://localhost:8083/solr/browse?fq=language:(english)&wt=xml&rows=10&start=0&fq=(ContentGroup-local:"Learn
& Explore" OR ADSKContentGroup-local:"Getting Started")&q=line
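For anyone trying to reproduce the comparison, here is a minimal sketch in
Python that builds both requests from the same parameters and times them. The
helper names (build_params, build_url, time_query) are made up for
illustration, and the measured wall-clock time includes network and response
rendering, so it will not match QTime exactly. Note that the "&" inside
"Learn & Explore" has to be URL-encoded, otherwise it splits the fq parameter.

```python
import time
from urllib.parse import urlencode
from urllib.request import urlopen

# Handler URL taken from the queries above.
BASE_URL = "http://localhost:8083/solr/browse"


def build_params(grouped):
    """Return the request parameters, with or without the grouping options."""
    params = [
        ("q", "line"),
        ("fq", "language:(english)"),
        ("fq", '(ContentGroup-local:"Learn & Explore" '
               'OR ADSKContentGroup-local:"Getting Started")'),
        ("wt", "xml"),
        ("rows", "10"),
        ("start", "0"),
    ]
    if grouped:
        params += [
            ("sort", "score desc"),
            ("group", "true"),
            ("group.field", "dedup"),
            ("group.ngroups", "true"),
        ]
    return params


def build_url(grouped):
    """URL-encode the parameters (the '&' in the phrase becomes %26)."""
    return BASE_URL + "?" + urlencode(build_params(grouped))


def time_query(url, runs=5):
    """Return the fastest of several runs, in seconds (hypothetical helper)."""
    timings = []
    for _ in range(runs):
        started = time.perf_counter()
        with urlopen(url) as response:
            response.read()
        timings.append(time.perf_counter() - started)
    return min(timings)
```

Calling time_query(build_url(True)) and time_query(build_url(False)) against
a running instance gives the two numbers to compare. Dropping only
group.ngroups from the grouped request is another variation worth timing.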

I'm running this query against an 8 million doc index. I have 2 shards with 1
replica each, each running on a m1x.large EC2 instance with 8 GB of allocated
memory.

Is this a known issue, or am I missing something that is making this query
expensive?

I bumped into this JIRA -->
https://issues.apache.org/jira/browse/SOLR-5027 which
talks about CollapsingQParserPlugin as an alternative to grouping, but that
seems to be available only in 4.6. I'm wondering whether it could be an
alternative in my case, and whether it's possible to apply it as a patch to
version 4.4.
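For reference, a rough sketch of how the same request might look with the
collapse filter in place of the group parameters, assuming a Solr version that
ships CollapsingQParserPlugin. The function name collapse_params is made up
for illustration. I have not checked that the result is equivalent to
group=true on this index, and on SolrCloud collapsing expects documents with
the same "dedup" value to live on the same shard.

```python
from urllib.parse import urlencode


def collapse_params(field="dedup"):
    """Same query as above, but deduplicated with a collapse filter query."""
    return [
        ("q", "line"),
        ("fq", "language:(english)"),
        ("fq", '(ContentGroup-local:"Learn & Explore" '
               'OR ADSKContentGroup-local:"Getting Started")'),
        # Replaces group=true&group.field=dedup&group.ngroups=true.
        ("fq", "{!collapse field=%s}" % field),
        ("wt", "xml"),
        ("rows", "10"),
        ("start", "0"),
    ]


# URL-encoded query string to append to the handler URL.
query_string = urlencode(collapse_params())
```

With collapsing, the response is a flat result list with one document per
"dedup" value rather than a grouped response, so any client code that parses
the grouped format would need adjusting.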

Any pointers will be appreciated.

- Thanks,
Shamik
