Re: How to speed up field collapsing on large number of groups

2016-07-14 Thread Joel Bernstein
The top_fc hint doesn't come into play until Solr 5. With Solr 4.x the CollapsingQParserPlugin always uses a top level field cache. Joel Bernstein http://joelsolr.blogspot.com/ On Mon, Jul 4, 2016 at 9:20 AM, Alessandro Benedetti wrote: > Have you tried with docValues for the fields involved in
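For readers following the top_fc point, here is a minimal sketch of how a collapse filter with that hint might be assembled on the client side. The helper name collapse_fq and the q=*:* query are illustrative assumptions; the groupId field comes from the original question, and per the message above the hint only has an effect from Solr 5 onward.

```python
from urllib.parse import urlencode


def collapse_fq(field, hint=None, min_field=None, max_field=None):
    """Build a {!collapse ...} filter query string (hypothetical helper)."""
    parts = ["field=" + field]
    if min_field:
        parts.append("min=" + min_field)  # group head = document with the lowest value
    if max_field:
        parts.append("max=" + max_field)  # group head = document with the highest value
    if hint:
        parts.append("hint=" + hint)      # e.g. top_fc, honoured from Solr 5 onward
    return "{!collapse " + " ".join(parts) + "}"


# Request parameters for a collapsed search on the groupId field.
params = {"q": "*:*", "fq": collapse_fq("groupId", hint="top_fc")}
query_string = urlencode(params)
```

The resulting query_string can be appended to a /select request URL; on Solr 4.x the hint should simply make no difference, since a top level field cache is always used there.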

Re: How to speed up field collapsing on large number of groups

2016-07-13 Thread Jichi Guo
Hi everyone, Is it possible to optimize collapsing on a large index through parallelization without sharding? Or can we conclude that sharding is currently the only approach to geometrically speed up slow collapsing queries? I tried manually parallelizing CollapsingQParserPlugin by diff

Re: How to speed up field collapsing on large number of groups

2016-07-04 Thread Alessandro Benedetti
Have you tried with docValues for the fields involved in the collapse group head selection? With a group head selection of "min", "max" and "sort" it should work quite well. Of course it depends on your formula. Does your index change often? If the warming time is not a problem you could try with:

Re: How to speed up field collapsing on large number of groups

2016-06-28 Thread Jichi Guo
Thanks for the quick response, Joel! I am hoping to delay sharding if possible, which might involve more things to consider :) 1) What is the size of the result set before the collapse? When searching with q=*:* for example, before collapse numFound is around 5 million, and that after col

Re: How to speed up field collapsing on large number of groups

2016-06-28 Thread Joel Bernstein
Sharding will help, but you'll need to co-locate documents by group ID. A few questions / suggestions: 1) What is the size of the result set before the collapse? 2) Have you tested without the long formula, just using a field for the min/max? It would be good to understand the impact of the formul
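The co-location requirement above can be sketched as follows, assuming SolrCloud's compositeId convention of prefixing the document id with a routing key and "!". The function names are hypothetical, and zlib.crc32 is only a stand-in for Solr's own hash; the point is that every document in a group maps to the same shard, so each shard can collapse its groups independently.

```python
import zlib


def routed_id(group_id, doc_id):
    """Prefix the document id with its group id, compositeId-style."""
    return f"{group_id}!{doc_id}"


def shard_for(group_id, num_shards):
    """Pick a shard from the group id only (crc32 is a stand-in hash)."""
    return zlib.crc32(str(group_id).encode("utf-8")) % num_shards


# Illustrative documents as (groupId, docId) pairs.
docs = [("g1", "a"), ("g1", "b"), ("g2", "c")]

# Map each routed id to its shard; same group means same shard.
placement = {routed_id(g, d): shard_for(g, 4) for g, d in docs}
```

Because the shard is derived from the group id alone, no group is ever split across shards, which is what collapsing per shard depends on.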

How to speed up field collapsing on large number of groups

2016-06-28 Thread jichi
Hi everyone, I am using Solr 4.10 to index 20 million documents without sharding. Each document has a groupId field, and there are about 2 million groups. I found the search with collapsing on groupId significantly slower than without collapsing, especially when combined with facet queries