Hey guys,
I've been noticing for quite a long time that using the mm (minimum match) parameter with
a value below 100%
alongside the dismax qparser seriously degrades performance. My particular
use case involves
running dismax over a set of 4-6 textual fields, about half of which do *not*
filter stop words (so yes,
these queries do end up iterating over a large portion of my index in some cases).
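
For reference, the relevant request parameters look roughly like this (the
field names and boosts are placeholders for illustration, not my actual schema):

  defType=dismax
  q=some user query terms
  qf=title^2 body comments description    (4-6 textual fields)
  mm=50%                                  (minimum match below 100%)
  rows=10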

This is somewhat understandable, since constructing the result sets is
no longer simply intersection based;
however, I do wonder what work-arounds / standard solutions exist for this
problem and which of them are applicable
in the Solr/Lucene environment (e.g. dividing the index into 'primary' /
'secondary' sections, using n-gram indices, tuning the caching configuration,
sharding... might any of these help?).
I'm working with a not-so-large corpus (~20 million documents), and query
processing time is way too long
to my mind: my goal is for the 90th-percentile QTime to be around 200ms, and
I can say that currently it's more than double that.
Can anyone please share some of their knowledge? What is practiced at, e.g.,
Google or Yahoo? Are there any plans to address this issue in Solr/Lucene,
or am I just using it wrongly?

Any feedback is appreciated.

Thanks,
-Chak
