On Mon, 2019-04-08 at 09:58 +1000, Ash Ramesh wrote:
> We have a corpus of 50+ million documents in our collection. I've
> noticed that some queries with specific keywords tend to be extremely
> slow.
> E.g. q='photography' or q='background'. After digging into the
> raw documents, I could see that these two terms appear in greater
> than 90% of all documents, which means Solr has to score each of
> those documents.

That is known behaviour, which can be remedied somewhat. Stop words are
a common approach, but your examples do not seem to fit that well.
Instead you can look at Common Grams, where your high-frequency
words get concatenated with the surrounding words. This only helps
with phrase queries though. There's a nice article at

https://www.hathitrust.org/blogs/large-scale-search/slow-queries-and-common-words-part-2
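
To make the Common Grams idea concrete, here is a minimal Python sketch of
the index-time behaviour. It is an illustration only, not Solr's actual
filter code: the function name common_grams, the "_" separator and the
sample word list are assumptions made for the example. Every original token
is kept, and an extra two-word token is emitted whenever either word of an
adjacent pair is on the high-frequency list.

```python
def common_grams(tokens, common_words, sep="_"):
    """Return tokens plus bigrams for pairs that contain a common word.

    Hypothetical helper for illustration; not part of Solr or Lucene.
    """
    out = []
    for i, tok in enumerate(tokens):
        # Always keep the original single-word token.
        out.append(tok)
        if i + 1 < len(tokens):
            nxt = tokens[i + 1]
            # Emit a combined token if either neighbour is high-frequency.
            if tok in common_words or nxt in common_words:
                out.append(tok + sep + nxt)
    return out


# Assumed high-frequency terms, taken from the examples in this thread.
COMMON = {"photography", "background"}

grams = common_grams(["stock", "photography", "background"], COMMON)
print(grams)
```

A phrase query such as "stock photography" can then be matched against the
much rarer combined term stock_photography instead of the very long postings
list for photography. A single-word query for photography still has to use
the plain term, which is why this only helps phrase queries.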

- Toke Eskildsen, Royal Danish Library

