No matter where you initially send the update, it gets forwarded to the 
leader of the shard first.  The leader parses it to ensure it can be indexed, 
then sends it to all the replicas in parallel.  The replicas do their own 
parsing and report back once they have persisted the data to their tlogs.  
Once the leader hears back from all the replicas, it replies that the update 
is complete, and your client receives its HTTP response for the update.

At least that's the general case flow.
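
To make that flow concrete, here is a toy Python model of it.  This is not 
Solr code: the Node class, its method names, and the three-node single-shard 
layout are all made up for illustration, and "parsing" is reduced to checking 
that the document has an id field.

```python
from concurrent.futures import ThreadPoolExecutor


class Node:
    """Toy stand-in for one Solr node hosting a copy of a single shard."""

    def __init__(self, name, is_leader=False):
        self.name = name
        self.is_leader = is_leader
        self.tlog = []    # stands in for the transaction log
        self.peers = []   # every other node hosting the same shard

    def parse(self, doc):
        # Stand-in for "can this document be indexed?"
        if "id" not in doc:
            raise ValueError("document has no id")
        return doc

    def persist(self, doc):
        # Replica side: parse, append to the tlog, acknowledge by name.
        self.tlog.append(self.parse(doc))
        return self.name

    def receive_update(self, doc, hops=0):
        if not self.is_leader:
            # Not the leader: forward the update, costing one extra hop.
            leader = next(p for p in self.peers if p.is_leader)
            return leader.receive_update(doc, hops + 1)
        # Leader side: parse and log locally first.
        self.tlog.append(self.parse(doc))
        replicas = [p for p in self.peers if not p.is_leader]
        # Fan out in parallel and wait for every replica to acknowledge.
        with ThreadPoolExecutor(max_workers=max(1, len(replicas))) as pool:
            acks = list(pool.map(lambda r: r.persist(doc), replicas))
        return {"status": "ok", "acks": sorted(acks), "extra_hops": hops}


def make_shard():
    """Build one leader and two replicas that all know about each other."""
    nodes = [Node("node1", is_leader=True), Node("node2"), Node("node3")]
    for n in nodes:
        n.peers = [p for p in nodes if p is not n]
    return nodes
```

Calling make_shard()[1].receive_update({"id": "1"}) sends the update to a 
replica, which forwards it to node1; the result reports extra_hops of 1 and 
acks from node2 and node3.  Sending the same call to node1 directly gives 
extra_hops of 0.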

So it really won't matter how your load balancing is handled above the cloud.  
All the work is done the same way, with the leader having to do slightly more 
work than the replicas.

If you can manage to send all the updates to the correct leader in the first 
place, you skip one hop before the work starts, which may buy you a small 
performance boost compared to randomly picking a node to send the request to.  
But you'll need to be taxing the cloud pretty heavily before that difference 
becomes noticeable.
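
As a rough way to size that extra hop, here is a small Python sketch.  It 
assumes a single shard with a copy on every node and a client that picks a 
node uniformly at random; both function names are made up for this example, 
and it says nothing about how long a hop actually takes.

```python
import random


def expected_extra_hops(num_nodes):
    """Average extra hops per update when the target node is picked at random.

    Assumes one shard with one leader, so a random pick misses the leader
    with probability (num_nodes - 1) / num_nodes.
    """
    return (num_nodes - 1) / num_nodes


def simulated_extra_hops(num_nodes, trials=100_000, seed=0):
    """Estimate the same figure by simulating random node picks."""
    rng = random.Random(seed)
    leader = 0  # arbitrary: node 0 plays the leader
    misses = sum(1 for _ in range(trials) if rng.randrange(num_nodes) != leader)
    return misses / trials
```

For the three-node setup in this thread, expected_extra_hops(3) works out to 
two forwards for every three updates, versus none if the client always 
targets the leader.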

-----Original Message-----
From: Sadheera Vithanage [mailto:sadhee...@gmail.com] 
Sent: Thursday, October 20, 2016 5:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Load balancing with solr cloud

Thank you very much John and Garth,

I've tested it out and it works fine, I can send the updates to any of the solr 
nodes.

If I am not using a zookeeper aware client and if I always direct all my 
queries (read queries) to the leader of the solr instances, does it 
automatically load balance between the replicas?

Or do I have to hit each instance in a round robin way and have the load 
balanced through the code?

Please advise the best way to do so..

Thank you very much again..



On Fri, Oct 21, 2016 at 9:18 AM, Garth Grimm < 
garthgr...@averyranchconsulting.com> wrote:

> Actually, zookeeper really won't participate in the update process at all.
>
> If you're using a "zookeeper aware" client like SolrJ, the SolrJ 
> library will read the cloud configuration from zookeeper, but will 
> send all the updates to the leader of the shard that the document is meant to 
> go to.
>
> If you're not using a "zookeeper aware" client, you can send the 
> update to any of the solr nodes, and they will evaluate the cloud 
> configuration information they've already received from zookeeper, and 
> then forward the document to leader of the shard that will handle the 
> document update.
>
> In general, Zookeeper really only provides the cloud configuration 
> information once (at most) during all the updates, the actual document 
> update only gets sent to solr nodes.  There's definitely no need to 
> distribute load between zookeepers for this situation.
>
> Regards,
> Garth Grimm
>
> -----Original Message-----
> From: Sadheera Vithanage [mailto:sadhee...@gmail.com]
> Sent: Thursday, October 20, 2016 5:11 PM
> To: solr-user@lucene.apache.org
> Subject: Load balancing with solr cloud
>
> Hi again Experts,
>
> I have a question related to load balancing in solr cloud.
>
> If we have 3 zookeeper nodes and 3 solr instances (1 leader, 2 
> secondary replicas and 1 shard), when the traffic comes in the primary 
> zookeeper server will be hammered, correct?
>
> I understand (or is it wrong) that zookeeper will load balance between 
> solr nodes but if we want to distribute the load between zookeeper 
> nodes as well, what is the best approach.
>
> Cost is a concern for us too.
>
> Thank you very much, in advance.
>
> --
> Regards
>
> Sadheera Vithanage
>



--
Regards

Sadheera Vithanage
