And they do the right thing far faster than a load balancer would. In
one test I ran, ZooKeeper updated the cluster state within 200ms. It may
well have been faster than that; I didn't measure it any more finely. I
had requests going to a cluster in a loop, and my client (pysolr with
PR #138) retried on connection failure after a 200ms wait. Every time I
killed a node, the retried request succeeded.
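
For anyone who wants to repeat that test, the retry behaviour looks
roughly like the sketch below. This is not the code from the pysolr PR,
just an illustration of the idea: request_with_retry, send, wait and
max_attempts are all names I made up, and a real client would catch
whatever exception its HTTP library raises rather than the built-in
ConnectionError.

```python
import time


def request_with_retry(send, wait=0.2, max_attempts=5, sleep=time.sleep):
    """Call send(); on ConnectionError, wait and try again.

    send         -- hypothetical zero-argument callable that issues one request
    wait         -- seconds to pause between attempts (200ms, as in the test)
    max_attempts -- total number of tries before giving up (assumed value)
    sleep        -- injectable so the pause can be stubbed out in tests
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return send()
        except ConnectionError as err:
            last_error = err
            # Pause only if another attempt is coming; by then the client
            # should have seen the updated cluster state and picked a live node.
            if attempt < max_attempts - 1:
                sleep(wait)
    raise last_error
```

The wait gives the cluster state time to propagate, so the next attempt
goes to a node that is still up.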

Load balancers I have worked with take 20s or so to spot a down server.

Upayavira

On Fri, Dec 18, 2015, at 03:37 PM, Erick Erickson wrote:
> You're over-complicating it, the complexity is already in Solr ;)...
> 
> First, if you're using a SolrJ client (assuming you're accessing
> Solr from your app), use the CloudSolrClient class. This takes
> a ZK ensemble and does its own load balancing via a software
> load balancer.
> 
> If you're not using SolrJ, then Markus' comment of
> just using a load balancer is the way to go.
> 
> Internally, for all the shard requests that your top-level
> query generates, _those_ are load balanced as well
> via a software load balancer by the individual nodes
> receiving the top-level request.
> 
> SolrJ and the nodes register themselves as listeners for changes in
> cluster state and get notified by ZooKeeper if a Solr node
> goes down. At that point it is taken out of the rotation.
> Likewise if a down node comes back up (or a new Solr instance
> powers up) all listeners get a notification and "do the right thing".
> 
> Best,
> Erick
> 
> 
> 
> On Fri, Dec 18, 2015 at 4:42 AM, Markus Jelsma
> <markus.jel...@openindex.io> wrote:
> > Hello - a simple load balancer will do just fine. Or more sophisticated 
> > tools such as Varnish, HAProxy or Nginx, which we use. A hardware 
> > load balancer would obviously also do the job.
> > Markus
> >
> >
> > -----Original message-----
> >> From:Andrej van der Zee <andrejvander...@gmail.com>
> >> Sent: Friday 18th December 2015 13:20
> >> To: solr-user@lucene.apache.org
> >> Subject: Load-balancing Solr instances
> >>
> >> Hi,
> >>
> >> Could someone please inform me about best practices when load-balancing
> >> queries over Solr instances? We will have many shards, each with multiple
> >> replicas.
> >>
> >> I understand that sending my request to one particular Solr instance will
> >> be routed appropriately, but requests will still be sent to this single
> >> instance. This instance might be busy routing when it should be serving its
> >> own core. Moreover, it might even go down.
> >>
> >> Would it make sense to have multiple "dummy" Solr instances that only route
> >> requests and do not serve any core, in order to minimize internal routing?
> >>
> >> Or, would it be better to install a reverse proxy in front of the Solr
> >> Cloud? Moreover, would it make sense to update the load-balancer
> >> dynamically watching Zookeeper, in order to minimize internal routing of
> >> requests?
> >>
> >> Thanks,
> >> Andrej
> >>
