Re: How to trace one query？the debug/debugQuery info are not enough to find out why a query is slow

Shawn Heisey Thu, 23 Aug 2018 03:04:17 -0700

On 8/23/2018 3:41 AM, zhenyuan wei wrote:

Thank you very much for the answer, @Jan Høydahl.
My query is simple: it just wildcards the last 2 characters (I have other
queries to optimize as well).


  curl "http://emr-worker-1:8983/solr/collection005/query?q=v10_s:OOOOOOOOVVVVVVVVYY*&rows=10&&fl=id&echoParams=all"

I think that's the answer right there -- wildcard query. Wildcard queries have a tendency to be slow, because of how they work. What is the nature of your v10_s field? Does that wildcard query match a lot of terms? When a wildcard query executes, Solr asks the index for all terms that match it, and then constructs a query with all of those terms in it. If there are ten million terms that match the wildcard, the query will *quite literally* have ten million entries inside it. Every one of the terms will need to be separately searched against the index. Each term will be fast, but it adds up if there are a lot of them. This query had a numFound larger than one hundred thousand, which suggests that there were at least that many terms in the query. So basically, in the time it took, Solr first gathered a huge list of terms, and then internally executed over one hundred thousand individual queries.
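To make the expansion step described above concrete, here is a minimal toy model in Python. It is only a sketch of the idea, not how Solr or Lucene is implemented internally: the `postings` dictionary, the function names, and the sample terms are all made up for illustration.

```python
from bisect import bisect_left


def expand_prefix(sorted_terms, prefix):
    """Return every term in a sorted term list that starts with prefix."""
    start = bisect_left(sorted_terms, prefix)
    matches = []
    for term in sorted_terms[start:]:
        if not term.startswith(prefix):
            break  # terms are sorted, so no later term can match
        matches.append(term)
    return matches


def wildcard_search(postings, prefix):
    """Toy model: expand the prefix into terms, then look up each term."""
    expanded = expand_prefix(sorted(postings), prefix)
    doc_ids = set()
    for term in expanded:  # one lookup per matching term
        doc_ids.update(postings[term])
    return expanded, doc_ids


# Hypothetical term -> document-id postings for a field like v10_s.
postings = {
    "OOOOOOOOVVVVVVVVYYAA": [1],
    "OOOOOOOOVVVVVVVVYYAB": [2, 3],
    "OOOOOOOOVVVVVVVVZZAA": [4],
}
terms, docs = wildcard_search(postings, "OOOOOOOOVVVVVVVVYY")
print(len(terms), sorted(docs))
```

The cost of the loop grows with the number of matching terms, which is the point being made: each lookup is cheap, but a prefix that matches a hundred thousand terms means a hundred thousand lookups.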

Changing your field definition so you can avoid wildcard queries will go a long way towards speeding things up. Typically this involves some kind of ngram tokenizer or filter. It will make the index much larger, but tends to speed things up.
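As a rough illustration of that trade-off, the sketch below uses edge n-grams (prefixes only), which is one possible variant of the ngram approach and an assumption on my part about what fits this query. It is a toy in-memory index, not Solr's analysis chain; the function names, the `min_gram` parameter, and the sample values are hypothetical.

```python
def edge_ngrams(value, min_gram=1):
    """Return every prefix of value that is at least min_gram characters long."""
    return [value[:n] for n in range(min_gram, len(value) + 1)]


def build_index(docs, min_gram=1):
    """Index each document under all of its prefixes (toy inverted index)."""
    index = {}
    for doc_id, value in docs.items():
        for gram in edge_ngrams(value, min_gram):
            index.setdefault(gram, set()).add(doc_id)
    return index


# Hypothetical field values keyed by document id.
docs = {1: "abcde", 2: "abcxy", 3: "zzzzz"}
index = build_index(docs, min_gram=3)

# A prefix search is now a single term lookup, with no expansion step.
hits = index.get("abc", set())
print(sorted(hits), len(index))
```

Three original values turn into eight indexed terms here, which shows why the index grows, while the prefix lookup itself touches exactly one term.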

Your example says the QTime is 125 milliseconds, and your message talks about times of 40 milliseconds. This is NOT slow. If you're trying to maximize queries per second, you need to know that handling a high query load requires multiple servers handling multiple replicas of your index, and some kind of load balancing.

Configuring caches cannot speed up the first time a query runs. That speeds up later runs. To speed up the first time will require two things:

1) Ensuring that there is enough memory in the system for the operating system to effectively cache the index. This is memory *beyond* the Java heap that is not allocated to any program.

2) Changing the query to a type that executes faster, and adjusting the schema to allow the new type to work. Wildcard queries are one of the worst options.

In a later message, you indicated that your cache autowarmCount values are mostly set to zero. This means that any time you make a change to the index, your caches are completely gone. The one cache with a nonzero setting is using NoOpRegenerator, so it's not actually doing any warming. With autowarming, the most recent entries in the cache will be re-executed to warm the new caches. This can help with performance, but if you make autowarmCount too large, it will make commits take a very long time. Note that documentCache actually doesn't do warming, so that setting is irrelevant on that cache.
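The warming behavior described above can be sketched as a small toy model. This is not Solr's cache code; `warm_new_cache`, `run_query`, and the sample cache contents are hypothetical names used only to show why a zero count leaves the new cache empty and why a large count makes the warm-up (and therefore the commit) slow.

```python
from collections import OrderedDict


def warm_new_cache(old_cache, autowarm_count, run_query):
    """Toy model: re-run the most recent keys of the old cache after a commit."""
    new_cache = OrderedDict()
    if autowarm_count <= 0:
        return new_cache  # no warming: the new cache starts out empty
    recent_keys = list(old_cache)[-autowarm_count:]
    for key in recent_keys:
        # Each re-execution costs time, so a large count slows the commit.
        new_cache[key] = run_query(key)
    return new_cache


executed = []


def run_query(query):
    """Stand-in for executing a query against the new index view."""
    executed.append(query)
    return "result for " + query


# Old cache in insertion order, oldest entry first.
old_cache = OrderedDict([("q1", "r1"), ("q2", "r2"), ("q3", "r3")])

cold = warm_new_cache(old_cache, 0, run_query)
warm = warm_new_cache(old_cache, 2, run_query)
print(len(cold), list(warm), executed)
```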


Thanks,
Shawn
