Right, you’re running into the “laggard” problem: you can’t get the overall
result back until every shard has responded. There’s a useful parameter,
shards.info=true, that will give you some information about the time taken
by the sub-search on each shard.
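For example (the collection name below is just a placeholder, not one of
yours), tack it onto an otherwise normal request:

http://localhost:8983/solr/yourcollection/select?q=*:*&rows=100&shards.info=true

That should add a per-shard section to the response with the elapsed time and
hit count for each sub-request, which makes a consistently slow shard easy to
spot.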

But given your numbers, I think your root problem is that
your hardware is overmatched. In total you have 14B documents, correct?
With a replication factor of 2, that means each of your 10 machines holds
2.8 billion doc copies in 128G of total memory. Another way to look at it:
you have 7 x 20 x 2 = 280 replicas, each with a 40G index, so each node
hosts 28 replicas and handles over a terabyte of index in aggregate. At
first blush, you’ve overloaded your hardware. My guess is that one node or
another has to do a lot of swapping/GC/whatever quite regularly when you
query. Given that you’re on HDDs, this can be quite expensive.

I think you can simplify your analysis problem a lot by concentrating on
a single machine: load it up and analyze it heavily. Monitor I/O, and analyze
GC and CPU utilization. My first guess is you’ll see heavy I/O. Once the index
is read into memory, a well-sized Solr installation won’t see much I/O,
_especially_ if you simplify the query to only ask for, say, some docValues
field or rows=0. I think your OS is swapping significant parts of the
Lucene index in and out to satisfy your queries.
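A stripped-down comparison is a quick way to check that. For instance (the
field name here is just a placeholder for one of your docValues fields, and
the q should be one of your real queries):

http://localhost:8983/solr/yourcollection/select?q=<real query>&rows=0
http://localhost:8983/solr/yourcollection/select?q=<real query>&rows=100&fl=some_docvalues_field

If those come back fast while your normal queries are slow, most of the time
is going into pulling stored fields off the HDDs rather than into the search
itself.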

GC is always a place to look. You should have GC logs available to see
whether you’re spending lots of CPU cycles in GC. I doubt that GC tuning will
fix your performance issues, but it’s worth ruling out.
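Recent Solr releases turn GC logging on by default and write solr_gc.log*
under the logs directory, though the exact name and location vary by version
and config. A quick real-time check with a standard JDK is:

jstat -gcutil <solr-pid> 5000

Watch the FGC/FGCT columns; if they climb steadily while queries are in
flight, GC is at least part of the story. Feeding the GC logs to a viewer
such as GCViewer or gceasy.io gives a better picture than eyeballing them.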

A quick-n-dirty way to see if it’s swapping, as I suspect, is to monitor
CPU utilization. A red flag is CPU sitting low while your queries _still_
take a long time; that’s a “smoking gun” that you’re swapping. It’s
indicative rather than definitive, though: I’ve also seen CPU pegged because
of GC, so if the CPUs are running hot, you have to dig deeper...
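On the I/O side, a couple of stock commands are usually enough to tell
(assuming you’re on Linux, with the sysstat package installed for iostat;
column names vary a bit between versions):

iostat -x 5     # watch %util and await (or r_await/w_await) on the index disks
vmstat 5        # si/so show swap activity, wa shows time stuck in I/O wait

High await/%util on the HDDs combined with low user CPU during queries points
the same direction: the box is disk-bound, not compute-bound.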


So to answer your questions:

1> I doubt decreasing the number of shards will help. I think you
     simply have too many docs per node, and changing the number of
     shards isn’t going to change that.

2> I strongly suspect that increasing replicas will make the
     problem _worse_ since it’ll just add more docs per node.

3> It Depends (tm). I wrote “the sizing blog” a long time ago because
     this question comes up so often and is impossible to answer in the
     abstract. Well, it’s possible in theory, but in practice there are just
     too many variables. What you need to do to fully answer this in your
     situation with your data is set up a node and “test it to destruction”.

     
https://lucidworks.com/post/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

    I want to emphasize that you _must_ have realistic queries when
    you test, as in that blog, and you _must_ have a bunch of them
    in order to not get fooled by hitting, say, your queryResultCache. I
    had one client who “stress tested” with the same query over and over
    and was getting 3ms response times because, after the first one, Solr
    never needed to do any searching at all; everything was in the caches.
    Needless to say, that didn’t hold up when they used a realistic query mix.
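To make that concrete, here is a minimal sketch of the sort of harness I
mean (Python, using the requests library). The URL, collection name, and
queries.txt file are placeholders for your setup, and it is deliberately
simple, not a substitute for a real load-testing tool:

# Minimal sketch, not a real benchmark harness. Assumes a queries.txt file
# with one realistic q= value per line, sampled from actual user traffic.
import random
import time

import requests  # assumption: the 'requests' package is installed

SOLR_URL = "http://localhost:8983/solr"  # adjust host/port for your node
COLLECTION = "test"                      # placeholder collection name

with open("queries.txt") as f:
    queries = [line.strip() for line in f if line.strip()]

latencies_ms = []
for _ in range(1000):
    # Pick a different query each time so the queryResultCache can't flatter you.
    params = {"q": random.choice(queries), "rows": 100}
    start = time.time()
    resp = requests.get("%s/%s/select" % (SOLR_URL, COLLECTION), params=params)
    resp.raise_for_status()
    latencies_ms.append((time.time() - start) * 1000.0)

latencies_ms.sort()
n = len(latencies_ms)
print("p50=%.0fms  p95=%.0fms  p99=%.0fms"
      % (latencies_ms[n // 2], latencies_ms[int(n * 0.95)], latencies_ms[int(n * 0.99)]))

Run a few of these in parallel to approximate your real concurrency, and
watch the node with the I/O and GC tools above while it runs.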

Best,
Erick

> On May 29, 2020, at 4:53 AM, Anshuman Singh <singhanshuma...@gmail.com> wrote:
> 
> Thanks for your reply, Erick. You helped me in improving my understanding
> of how Solr distributed requests work internally.
> 
> Actually my ultimate goal is to improve search performance in one of our
> test environments, where the queries are taking up to 60 seconds to execute.
> We want to fetch at least the top 100 rows within a few seconds (< 5 seconds).
> 
> Right now, we have 7 Collections across 10 Solr nodes, each Collection
> having approx 2B records equally distributed across 20 shards with rf 2.
> Each replica/core is ~40GB in size. The number of users is very small
> (<10). We are using HDDs and each host has 128 GB RAM. The Solr JVM heap size
> is 24GB. In the actual production environment, we are planning for 100 such
> machines and we will be ingesting ~2B records on a daily basis. We will
> retain data for up to 3 months.
> 
> I followed your suggestion of not querying more than 100 rows and this is
> my observation. I ran queries with the debugQuery param and found that the
> query response time depends on the worst-performing shard, as some shards
> take longer to execute the query than others.
> 
> Here are my questions:
> 
>   1.  Is decreasing the number of shards going to help us, since there
>   will be fewer shards to query?
>   2.  Is increasing the number of replicas going to help us, since the
>   load will be balanced across them?
>   3.  How many records should we keep in each Collection or in each
>   replica/core? Will we face performance issues if the core size becomes too
>   big?
> 
> Any other suggestions are appreciated.
> 
> On Wed, May 27, 2020 at 9:23 PM Erick Erickson <erickerick...@gmail.com>
> wrote:
> 
>> First of all, asking for that many rows will spend a lot of time
>> gathering the document fields. Assuming you have stored fields,
>> returning them requires:
>> 1> the aggregator node getting the candidate 100000 docs from each shard
>> 
>> 2> The aggregator node sorting those 100000 docs from each shard into the
>> true top 100000 based on the sort criteria (score by default)
>> 
>> 3> the aggregator node going back to the shards and asking them for those
>> docs of that 100000 that are resident on that shard
>> 
>> 4> the aggregator node assembling the final docs to be sent to the client
>> and sending them.
>> 
>> So my guess is that when you fire requests at a particular replica that
>> has to get them from the other shard’s replica on another host, the network
>> back-and-forth is killing your perf. It’s not that your network is having
>> problems, just that you’re pushing a lot of data back and forth in your
>> poorly-performing cases.
>> 
>> So first of all, specifying 100K rows is an anti-pattern. Outside of
>> streaming, Solr is built on the presumption that you’re after the top few
>> rows (< 100, say). The times vary a lot depending on whether you need to
>> read stored fields BTW.
>> 
>> Second, I suspect your test is bogus. If you run the tests in the order
>> you gave, the first one will read the necessary data from disk and probably
>> have it in the OS disk cache for the second and subsequent. And/or you’re
>> getting results from your queryResultCache (although you’d have to have a
>> big one). Specifying the exact same query when trying to time things is
>> usually a mistake.
>> 
>> If your use-case requires 100K rows, you should be using streaming or
>> cursorMark. While that won’t make the end-to-end time shorter, it won’t
>> put such a strain on the system.
>> 
>> Best,
>> Erick
>> 
>>> On May 27, 2020, at 10:38 AM, Anshuman Singh <singhanshuma...@gmail.com>
>>> wrote:
>>> 
>>> I have a Solr cloud setup (Solr 7.4) with a collection "test" having two
>>> shards on two different nodes. There are 4M records equally distributed
>>> across the shards.
>>> 
>>> If I query the collection like below, it is slow.
>>> http://localhost:8983/solr/test/select?q=*:*&rows=100000
>>> QTime: 6930
>>> 
>>> If I query a particular shard like below, it is also slow.
>>> http://localhost:8983/solr/test_shard1_replica_n2/select?q=*:*&rows=100000&shards=shard2
>>> QTime: 5494
>>> Notice shard2 in the shards parameter while the core being queried belongs to shard1.
>>> 
>>> But this is faster:
>>> http://localhost:8983/solr/test_shard1_replica_n2/select?q=*:*&rows=100000&shards=shard1
>>> QTime: 57
>>> 
>>> This is also faster:
>>> http://localhost:8983/solr/test_shard2_replica_n4/select?q=*:*&rows=100000&shards=shard2
>>> QTime: 71
>>> 
>>> I don't think it is the network as I performed similar tests with a single
>>> node setup as well. If you query a particular core and the corresponding
>>> logical shard, it is much faster than querying a different shard or core.
>>> 
>>> Why do I see this behaviour, and how can I make the first two queries
>>> as fast as the last two?
>> 
>> 
