And they do the right thing far faster than a load balancer would. In one test I ran, ZooKeeper updated the cluster state within 200ms. It may well have been less than that; I didn't check. I had requests going to a cluster in a loop, and my client (pysolr with PR #138) retried on connection failure after a 200ms wait. Requests always succeeded whenever I killed a node.
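For anyone wanting to reproduce that kind of test, here is a minimal sketch of the retry-after-a-fixed-wait idea. It is not the code from the pysolr PR; the helper name `call_with_retry` and its parameters are made up for illustration, and `ConnectionError` stands in for whatever exception your client actually raises on a connection failure.

```python
import time


def call_with_retry(request, wait_seconds=0.2, max_attempts=5,
                    retry_on=(ConnectionError,)):
    """Call request() and retry after a fixed wait on connection failure.

    Hypothetical helper: `request` is any zero-argument callable, for
    example a lambda wrapping a search call on your Solr client.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except retry_on as error:
            last_error = error
            # Give the cluster state time to update before the next
            # attempt; no point sleeping after the final failure.
            if attempt < max_attempts:
                time.sleep(wait_seconds)
    # Every attempt failed: surface the last connection error.
    raise last_error
```

With a client object named `solr` (an assumption here), the loop body would be something like `call_with_retry(lambda: solr.search("*:*"))`, passing the client's own error type via `retry_on` if it does not raise `ConnectionError`.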
Load balancers I have worked with take 20s or so to spot a down server.

Upayavira

On Fri, Dec 18, 2015, at 03:37 PM, Erick Erickson wrote:
> You're over-complicating it, the complexity is already in Solr ;)...
>
> First, if you're using a SolrJ client (assuming you're accessing
> Solr from your app), use the CloudSolrClient class. This takes
> a ZK ensemble and does its own load balancing via a software
> load balancer.
>
> If you're not using SolrJ, then Markus' comment of
> just using a load balancer is the way to go.
>
> Internally, for all the shard requests that your top-level
> query generates, _those_ are load balanced as well
> via a software load balancer by the individual nodes
> receiving the top-level request.
>
> SolrJ and the nodes register themselves as listeners for changes in
> cluster state and get notified by ZooKeeper if a Solr node
> goes down. At that point it is taken out of the rotation.
> Likewise if a down node comes back up (or a new Solr instance
> powers up) all listeners get a notification and "do the right thing".
>
> Best,
> Erick
>
> On Fri, Dec 18, 2015 at 4:42 AM, Markus Jelsma
> <markus.jel...@openindex.io> wrote:
> > Hello - a simple load balancer will do just fine. Or more sophisticated
> > tools such as Varnish, HAProxy or Nginx, which we use. A hardware
> > load balancer would obviously also do the job.
> > Markus
> >
> > -----Original message-----
> >> From:Andrej van der Zee <andrejvander...@gmail.com>
> >> Sent: Friday 18th December 2015 13:20
> >> To: solr-user@lucene.apache.org
> >> Subject: Load-balancing Solr instances
> >>
> >> Hi,
> >>
> >> Could someone please inform me about best practices when load-balancing
> >> queries over Solr instances? We will have many shards each with multiple
> >> replications.
> >>
> >> I understand that sending my request to one particular Solr instance will
> >> be routed appropriately, but requests will still be sent to this single
> >> instance. This instance might be busy routing while it should serve its
> >> core. Moreover, it might even go down.
> >>
> >> Would it make sense to have multiple "dummy" Solr instances that only route
> >> requests and do not serve any core, in order to minimize internal routing?
> >>
> >> Or, would it be better to install a reverse proxy in front of the Solr
> >> Cloud? Moreover, would it make sense to update the load-balancer
> >> dynamically watching Zookeeper, in order to minimize internal routing of
> >> requests?
> >>
> >> Thanks,
> >> Andrej