Thanks for replying.
My config:

   - 40 dedicated servers, dual-core each
   - Solr running in a Tomcat servlet container on Linux
   - 12 GB RAM per server, split evenly between the OS and Solr
   - Complex queries (up to 30 conditions on different fields), 1 qps rate

I sharded my index for two reasons, based on tests with 2 servers (4
shards):

   1. As the index grew beyond a few million docs, qTime rose sharply,
   while sharding the index into smaller pieces (about 0.5M docs each)
   gave much better results, so I capped every shard at 0.5M docs.
   2. Tests showed I was CPU-bound during queries. As I have a low qps
   rate (to emphasize: low enough that concurrent queries are rare) and a
   query runs single-threaded on each shard, it made sense to dedicate a
   CPU core to each shard.

For the same number of docs per shard I do expect some rise in total
qTime, for two reasons:

   1. The response must wait for the slowest shard (see the toy
   simulation below).
   2. Merging the responses from 40 different shards takes time.
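
To illustrate reason 1, here is a toy simulation in plain Java (not Solr
code; the exponential latency model with a 100 ms mean is just an
assumption) of how long the coordinator waits for the slowest of N shards:

    import java.util.Random;

    public class SlowestShardDemo {
        public static void main(String[] args) {
            Random rnd = new Random(42);
            int trials = 10000;
            for (int shards : new int[] {4, 12, 24, 40}) {
                double sumOfMax = 0;
                for (int t = 0; t < trials; t++) {
                    double slowest = 0;
                    for (int s = 0; s < shards; s++) {
                        // Assumed model: exponential latency, 100 ms mean.
                        double latency = -100.0 * Math.log(1.0 - rnd.nextDouble());
                        if (latency > slowest) slowest = latency;
                    }
                    sumOfMax += slowest;
                }
                System.out.printf("%2d shards -> avg wait for slowest: %.0f ms%n",
                        shards, sumOfMax / trials);
            }
        }
    }

Under this model the average wait roughly doubles between 4 and 40 shards,
even though each shard does the same amount of work.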

What I understand from your explanation is that it's the merging that
takes time, and since qTime ends only after the second retrieval phase,
the qTime on each shard grows longer. Meaning that for a significant part
of the query (right after the [id,score] pairs are retrieved), all CPUs
are idle except the one running the response-merger thread. I had thought
of the merge as a simple sort of [id,score] pairs, far too simple to
account for an additional 300 ms of CPU time.
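
For reference, this is the kind of merge I had pictured, as a minimal
sketch in plain Java (illustrative only, not Solr's actual QueryComponent
code):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;
    import java.util.PriorityQueue;

    public class MergeSketch {
        static class Hit {
            final String id;
            final float score;
            Hit(String id, float score) { this.id = id; this.score = score; }
        }

        // Keep the global best n of all per-shard (id, score) lists using
        // a min-heap of size n: the root is always the current worst hit.
        static List<Hit> mergeTopN(List<List<Hit>> perShard, int n) {
            PriorityQueue<Hit> best = new PriorityQueue<Hit>(n + 1,
                    new Comparator<Hit>() {
                        public int compare(Hit a, Hit b) {
                            return Float.compare(a.score, b.score);
                        }
                    });
            for (List<Hit> shardHits : perShard) {
                for (Hit h : shardHits) {
                    best.add(h);
                    if (best.size() > n) best.poll(); // evict current worst
                }
            }
            List<Hit> top = new ArrayList<Hit>(best);
            Collections.sort(top, new Comparator<Hit>() {
                public int compare(Hit a, Hit b) {
                    return Float.compare(b.score, a.score); // best first
                }
            });
            return top;
        }
    }

For 40 shards returning 10 rows each, that is only a few hundred heap
operations, which is why I expected the merge cost to be negligible.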

Why would a RAM increase improve my performance, if the bottleneck is the
"response merge" (a CPU resource)?

Thanks in advance,
Manu


On Mon, Apr 8, 2013 at 10:19 PM, Shawn Heisey <s...@elyograg.org> wrote:

> On 4/8/2013 12:19 PM, Manuel Le Normand wrote:
>
>> It seems that sharding my collection into many shards slowed queries
>> down unreasonably, and I'm trying to investigate why.
>>
>> First, I created "collection1": a 4 shards * replicationFactor=1
>> collection on 2 servers. Second, I created "collection2": a 48 shards *
>> replicationFactor=2 collection on 24 servers, keeping the same config
>> and the same number of documents per shard.
>>
>
> The primary reason to use shards is for index size, when your index is so
> big that a single index cannot give you reasonable performance. There are
> also sometimes performance gains when you break a smaller index into
> shards, but there is a limit.
>
> Going from 2 shards to 3 shards will have more of an impact than going
> from 8 shards to 9 shards.  At some point, adding shards makes things
> slower, not faster, because of the extra work required for combining
> multiple queries into one result response.  There is no reasonable way to
> predict when that will happen.
>
>>  Observations showed the following:
>>
>>    1. Total qTime for the same query set is 5 times higher in
>>    collection2 (150 ms -> 700 ms).
>>    2. Adding the *shards.info=true* param to queries on collection2
>>    shows that each shard is much slower than it was in collection1
>>    (about 4 times slower).
>>    3. Querying only specific shards on collection2 (by adding the
>>    shards=shard1,shard2,...,shard12 param) gave much better qTime per
>>    shard (only 2 times higher than in collection1).
>>    4. I have a low qps rate, so I don't suspect the replication factor
>>    of being the major cause of this.
>>    5. The avg. CPU load on servers during querying was much higher in
>>    collection1 than in collection2, and I didn't catch any other
>>    bottleneck.
>>
>
> A distributed query actually consists of up to two queries per shard. The
> first query just requests the uniqueKey field, not the entire document.  If
> you are sorting the results, then the sort field(s) are also requested,
> otherwise the only additional information requested is the relevance score.
>  The results are compiled into a set of unique keys, then a second query is
> sent to the proper shards requesting specific documents.
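
To check my understanding of those two phases, this is how I would
reproduce them by hand against a single shard, as a SolrJ sketch (the
host and core names are made up, and this is not the exact internal
shard protocol):

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.common.SolrDocument;

    public class TwoPhaseSketch {
        public static void main(String[] args) throws Exception {
            // Made-up host/core: query one shard directly, no distribution.
            HttpSolrServer shard = new HttpSolrServer(
                    "http://server01:8080/solr/collection2_shard1");

            // Phase 1: request only the uniqueKey and score, no stored fields.
            SolrQuery phase1 = new SolrQuery("field1:foo AND field2:bar");
            phase1.setFields("id", "score");
            phase1.setRows(10);
            phase1.set("distrib", "false");
            QueryResponse top = shard.query(phase1);

            // Phase 2: fetch the full documents for the winning ids only.
            StringBuilder ids = new StringBuilder();
            for (SolrDocument d : top.getResults()) {
                if (ids.length() > 0) ids.append(" OR ");
                ids.append(d.getFieldValue("id"));
            }
            SolrQuery phase2 = new SolrQuery("id:(" + ids + ")");
            phase2.set("distrib", "false");
            QueryResponse docs = shard.query(phase2);
            System.out.println(docs.getResults().size() + " docs fetched");
        }
    }

Timing each phase separately this way should show where the extra time
goes.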
>
>
>  Q:
>> 1. Why does the number of shards affect the qTime of each shard?
>> 2. How can I bring the qTime of each shard back down?
>>
>
> With more shards, it takes longer for the first phase to compile the
> results, so the second phase (document retrieval) gets delayed, and the
> QTime goes up.
>
> One way to reduce the total time is to reduce the number of shards.
>
> You haven't said anything about how complex your queries are, your index
> size(s), or how much RAM you have on each server and how it is allocated.
>  Can you provide this information?
>
> Getting good performance out of Solr requires plenty of RAM in your OS
> disk cache.  Query times of 150 to 700 milliseconds seem very high, which
> could be due to query complexity or a lack of server resources (especially
> RAM), or possibly both.
>
> Thanks,
> Shawn
>
>
