Re: solrcloud load balancing

Jay Potharaju Sat, 22 Oct 2016 22:38:10 -0700

Thanks Erick & Shawn for the response.

In the case of non-distributed queries (a single shard with replicas), is
there a way for me to determine how long it takes to retrieve the documents
and send the response?


In my load test, I see that the response time at the client API is in
seconds, but I do not see any high response times in the Solr logs.
Is it possible that under high load it takes a long time to retrieve
and send the documents?
If I run the same query individually in a browser, it comes back quickly.
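
To make the gap concrete, here is a minimal sketch of the comparison I have in mind. It assumes the JSON response format (wt=json), where QTime is reported in milliseconds under responseHeader. The function name and the sample numbers are hypothetical and only for illustration; they are not real measurements.

```python
import json


def transfer_overhead_ms(response_body: str, client_elapsed_ms: float) -> float:
    """Return the client-observed time that QTime does not account for.

    response_body: raw JSON text of a Solr response (wt=json).
    client_elapsed_ms: wall-clock time measured by the client for the request.
    """
    header = json.loads(response_body)["responseHeader"]
    # QTime is reported by Solr in milliseconds.
    return client_elapsed_ms - header["QTime"]


# Illustrative response body; the values are made up for this example.
sample = (
    '{"responseHeader": {"status": 0, "QTime": 12},'
    ' "response": {"numFound": 0, "start": 0, "docs": []}}'
)

print(transfer_overhead_ms(sample, client_elapsed_ms=2500.0))
```

A large difference would suggest that the time is going somewhere other than query processing, for example document retrieval, response writing, the network, or the client itself.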

Thanks
Jay

On Sat, Oct 22, 2016 at 6:14 PM, Shawn Heisey <apa...@elyograg.org> wrote:

> On 10/22/2016 6:19 PM, Jay Potharaju wrote:
> > I am trying to understand how load balancing works in solrcloud.
> >
> > As per my understanding solrcloud provides load balancing when querying
> > using an http endpoint.  When a query is sent to any of the nodes , solr
> > will intelligently decide which server can fulfill the request and will be
> > processed by one of the nodes in the cluster.
>
> Erick already responded, but I had this mostly written before I saw his
> response.  I decided to send it anyway.
>
> > 1) Does the logic change when there is only 1 shard vs multiple shards?
>
> The way I understand it, each shard is independently load balanced.  You
> might have a situation where one shard has more replicas than another
> shard, and I believe that even in that situation, all replicas should
> be used.
>
> > 2) Is the QTime displayed the sum of the processing time for the query
> > request + latency (if processed by another node) + the time to decide
> > which node will process the request (which I am guessing is minimal and
> > can be ignored)?
>
> There are three phases in a distributed (multi-shard) query.
>
> 1) Each shard is sent the query, with the field list set to include the
> score, the unique key field, and if there is a sort parameter, whichever
> fields are used for sorting.  These requests happen in parallel.
> Whichever request takes the longest will determine the total time for
> this phase.
>
> 2) The responses from the subqueries are combined to determine which
> documents will make up the final result.
>
> 3) Additional queries are sent to the individual shards to retrieve the
> matching documents.  These requests are also in parallel, so the slowest
> such request will determine the time for this whole phase.
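
The three phases described above can be sketched as a simple timing model. This is only an illustration of the description, not Solr's actual implementation; the function name and the per-shard timings are hypothetical.

```python
def distributed_query_time_ms(phase1_ms, merge_ms, phase3_ms):
    """Estimate total time for a distributed query under the three-phase model.

    phase1_ms: per-shard times for the initial (id/score/sort field) requests.
    merge_ms:  time spent combining the shard responses.
    phase3_ms: per-shard times for the document retrieval requests.
    """
    # Requests within a phase run in parallel, so the slowest one
    # determines the duration of that phase.
    return max(phase1_ms) + merge_ms + max(phase3_ms)


# Made-up timings for three shards, in milliseconds.
print(distributed_query_time_ms([10, 40, 25], 5, [8, 12, 30]))
```

Under this model a single slow shard in either parallel phase is enough to slow down the whole request.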
>
> > 3) In my Solr logs I display the "slow" queries. Does the QTime displayed
> > take all of the above into account and show the correct time taken?
>
> For non-distributed queries, QTime includes the time required to process
> the query, but not the time to retrieve the documents and send the
> response.  I *think* that when the query is distributed, QTime will be
> the sum of the first two phases that I mentioned above, but I'm not 100%
> sure.
>
> Thanks,
> Shawn
>
>


-- 
Thanks
Jay Potharaju
