I just realized that I made an assumption about your initial question that may not be true.
Everything I've said has been based on handling requests to add/update documents during the indexing process. That process involves the "leader first" concept I've been mentioning. So to answer your original question on the query side....

> Actually, zookeeper really won't participate in the query process at all.
> And the leader role for a core in a shard has no bearing whatsoever.
>
> ;-) Read ymonad's answer. ;-) The CloudSolrServer class has been renamed to
> CloudSolrClient (or something similar) recently, but otherwise, I think his
> answer is still basically correct.

It's worth noting that even if the node that receives the request has a core that could participate in generating results, it might ask some other core of that same shard to return the results for that shard. The preferLocalShards parameter can be used to avoid that (see near the bottom of https://cwiki.apache.org/confluence/display/solr/Distributed+Requests).

In any case, if you have many shards, load balancing on the query side is definitely more important than on the indexing side. The query controller has to merge the result sets (one from each shard), initiate the second pass of requests to get stored fields, and then marshal all that data back through the HTTP response. That's more extra work than the controller has to do for an update request, which is basically just passing along whatever information the shard leader responded with. And load balancing for reliability purposes is always a good thing.

>>> Also, for indexing, I think it's possible to control how many replicas need
>>> to confirm to the leader before the response is supplied to the client, as
>>> you can with say MongoDB replicas.

Yes, that's possible. It's what I was thinking about when I mentioned "...general case flow". That capability is relatively new, and not the default, which is why I didn't mention it.

-----Original Message-----
From: hairymccla...@yahoo.com.INVALID [mailto:hairymccla...@yahoo.com.INVALID]
Sent: Friday, October 21, 2016 4:07 AM
To: solr-user@lucene.apache.org
Subject: Re: Load balancing with solr cloud

As I understand it, for non-SolrCloud-aware clients you have to manually load balance your searches; see ymonad's answer here:
http://stackoverflow.com/questions/22523588/loadbalancer-and-solrcloud
This is from 2014, so maybe this has changed by now - would be interested to know as well.

Also, for indexing, I think it's possible to control how many replicas need to confirm to the leader before the response is supplied to the client, as you can with say MongoDB replicas.

On Friday, October 21, 2016 1:18 AM, Garth Grimm <garthgr...@averyranchconsulting.com> wrote:

No matter where you send the update to initially, it will get sent to the leader of the shard first. The leader does a parsing of it to ensure it can be indexed, then it will send it to all the replicas in parallel. The replicas will do their parsing and report back that they have persisted the data to their tlogs. Once the leader hears back from all the replicas, the leader will reply back that the update is complete, and your client will receive its HTTP response on the transaction. At least that's the general case flow.

So it really won't matter how your load balancing is handled above the cloud. All the work is done the same way, with the leader having to do slightly more work than the replicas.
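For a concrete picture of that flow, here's a minimal SolrJ sketch of a ZooKeeper-aware update. It assumes the SolrJ 6.x CloudSolrClient builder API and made-up ZooKeeper hosts, collection, and field names; the exact builder methods vary between SolrJ versions. The client reads the cluster state from ZooKeeper and sends each document straight to its shard leader, which then fans it out to the replicas as described above.

import org.apache.solr.client.solrj.impl.CloudSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class CloudUpdateSketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical ZooKeeper ensemble; substitute your own hosts.
        String zkHosts = "zk1:2181,zk2:2181,zk3:2181";

        // The client pulls the cluster state from ZooKeeper and caches it,
        // so ZooKeeper is not involved in each individual update.
        try (CloudSolrClient client = new CloudSolrClient.Builder()
                .withZkHost(zkHosts)
                .build()) {
            client.setDefaultCollection("mycollection");

            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "doc-1");
            doc.addField("title_s", "load balancing example");

            // Routed directly to the leader of the shard that owns "doc-1";
            // the leader forwards it to the replicas and replies once they ack.
            client.add(doc);
            client.commit(); // or rely on autoCommit/autoSoftCommit instead
        }
    }
}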
If you can manage to initially send all the updates to the correct leader, you can skip one hop before the work starts, which may buy you a small performance boost compared to randomly picking a node to send the request to. But you'll need to be taxing the cloud pretty heavily before that difference becomes noticeable.

-----Original Message-----
From: Sadheera Vithanage [mailto:sadhee...@gmail.com]
Sent: Thursday, October 20, 2016 5:55 PM
To: solr-user@lucene.apache.org
Subject: Re: Load balancing with solr cloud

Thank you very much, John and Garth.

I've tested it out and it works fine; I can send the updates to any of the solr nodes.

If I am not using a zookeeper-aware client and I direct all my queries (read queries) always to the leader of the solr instances, does it automatically load balance between the replicas? Or do I have to hit each instance in a round-robin way and have the load balanced through the code?

Please advise the best way to do so.

Thank you very much again.

On Fri, Oct 21, 2016 at 9:18 AM, Garth Grimm <garthgr...@averyranchconsulting.com> wrote:

> Actually, zookeeper really won't participate in the update process at all.
>
> If you're using a "zookeeper aware" client like SolrJ, the SolrJ
> library will read the cloud configuration from zookeeper, but will
> send all the updates to the leader of the shard that the document is
> meant to go to.
>
> If you're not using a "zookeeper aware" client, you can send the
> update to any of the solr nodes, and they will evaluate the cloud
> configuration information they've already received from zookeeper, and
> then forward the document to the leader of the shard that will handle
> the document update.
>
> In general, Zookeeper really only provides the cloud configuration
> information once (at most) during all the updates; the actual document
> updates only get sent to solr nodes. There's definitely no need to
> distribute load between zookeepers for this situation.
>
> Regards,
> Garth Grimm
>
> -----Original Message-----
> From: Sadheera Vithanage [mailto:sadhee...@gmail.com]
> Sent: Thursday, October 20, 2016 5:11 PM
> To: solr-user@lucene.apache.org
> Subject: Load balancing with solr cloud
>
> Hi again Experts,
>
> I have a question related to load balancing in solr cloud.
>
> If we have 3 zookeeper nodes and 3 solr instances (1 leader, 2
> secondary replicas and 1 shard), when the traffic comes in, the primary
> zookeeper server will be hammered, correct?
>
> I understand (or is it wrong) that zookeeper will load balance between
> solr nodes, but if we want to distribute the load between zookeeper
> nodes as well, what is the best approach?
>
> Cost is a concern for us too.
>
> Thank you very much, in advance.
>
> --
> Regards
>
> Sadheera Vithanage
>

--
Regards

Sadheera Vithanage
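As a rough illustration of the round-robin question above: when a ZooKeeper-aware client isn't an option, SolrJ's LBHttpSolrClient is one way to spread read queries across the nodes from the client side. The sketch below assumes hypothetical node URLs and a collection named "mycollection" (the exact constructor/builder API varies between SolrJ versions); it simply rotates requests across the listed URLs and temporarily skips nodes that stop responding, but it knows nothing about shards or leaders. The preferLocalShards parameter mentioned earlier in the thread is just a request parameter, so it can be set on the query either way.

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.LBHttpSolrClient;
import org.apache.solr.client.solrj.response.QueryResponse;

public class RoundRobinQuerySketch {
    public static void main(String[] args) throws Exception {
        // Hypothetical Solr node URLs; requests are round-robined across them.
        try (LBHttpSolrClient client = new LBHttpSolrClient(
                "http://solr1:8983/solr/mycollection",
                "http://solr2:8983/solr/mycollection",
                "http://solr3:8983/solr/mycollection")) {

            SolrQuery query = new SolrQuery("*:*");
            // Ask each shard to be served by a core local to the node that
            // receives the request, where possible (see the Distributed
            // Requests page linked earlier in the thread).
            query.set("preferLocalShards", true);

            QueryResponse rsp = client.query(query);
            System.out.println("hits: " + rsp.getResults().getNumFound());
        }
    }
}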