
The CollapsingQParserPlugin will be available in Solr 4.6, and it should
perform much better than grouping when collapsing on a high-cardinality
field. The 4.6 code doesn't port directly back to Solr 4.4, though, because
of some changes in the build for 4.6. The JIRA ticket has a conversation
about this, so you may be able to follow it and create a patch for 4.4.
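
For reference, here is a rough sketch of what your query could look like
once the plugin is available (4.6, or a patched 4.4). It swaps the group.*
parameters for a single {!collapse field=dedup} filter query. The helper
name build_collapse_query is made up for illustration; the host, handler
path and field names are taken from your message, and I have left out your
second fq for brevity.

```python
from urllib.parse import urlencode, parse_qs


def build_collapse_query(base_url, query, collapse_field, filters=(), rows=10, start=0):
    """Build a Solr request URL that collapses on collapse_field.

    Hypothetical helper for illustration only; it is not part of Solr
    or of any Solr client library.
    """
    params = [("q", query), ("rows", rows), ("start", start), ("wt", "xml")]
    # Keep the existing filter queries as they are.
    params += [("fq", f) for f in filters]
    # The collapse filter stands in for group=true&group.field=<field>.
    # Assumption: the server has the CollapsingQParserPlugin (Solr 4.6+).
    params.append(("fq", "{!collapse field=%s}" % collapse_field))
    return base_url + "?" + urlencode(params)


url = build_collapse_query(
    "http://localhost:8083/solr/browse",
    "line",
    "dedup",
    filters=["language:(english)"],
)
print(url)
# Decode the query string again to show the filter queries actually sent.
print(parse_qs(url.split("?", 1)[1])["fq"])
```

As far as I know, numFound then reflects the collapsed result count, so
group.ngroups is no longer needed. In SolrCloud, documents that share the
same dedup value would also need to live on the same shard for the
collapse to be correct across shards, so that is worth checking first.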


> Hi,
>    I've recently upgraded to SolrCloud (4.4) from Master-Slave mode. One of
> the changes I made to the queries is to add grouping to remove duplicate
> results. The grouping is done on a specific field. But the change seems
> to have had a huge effect on query performance. The "group" option
> decreased the performance by 10 times. For example, this query takes
> 1 sec to execute. The number of results is around 105387.
> http://localhost:8083/solr/browse?fq=language:(english)&wt=xml&rows=10&start=0&fq=(ContentGroup-local:"Learn & Explore" OR ADSKContentGroup-local:"Getting Started")&q=line&sort=score desc&group=true&group.field=dedup&group.ngroups=true
> If I exclude the group option, it comes down to 190 ms:
> http://localhost:8083/solr/browse?fq=language:(english)&wt=xml&rows=10&start=0&fq=(ContentGroup-local:"Learn & Explore" OR ADSKContentGroup-local:"Getting Started")&q=line
> I'm running this query against an 8 million doc index. I have 2 shards
> with 1 replica each, running on an m1x.large EC2 instance, each having
> 8 GB of allocated memory.
> Is this a known issue, or am I missing something that is making this query
> expensive?
> I bumped into this JIRA -->
> which
> talks about CollapsingQParserPlugin as an alternative to grouping, but that
> seems to be available in 4.6. Just wondering whether it can be an
> alternative in my case and whether it's possible to apply it as a patch
> to version 4.4.
> Any pointers will be appreciated.
> - Thanks,
> Shamik

