Re Amazon ELB: this is not exactly true. The ELB does load-balance internal IPs, but the ELB's own IP address must be external. That is still a major issue unless you use authentication. Nginx and others can also do load balancing.
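
For what it's worth, since search is stateless, the balancing itself is simple whichever tool ends up doing it. Below is a rough Python sketch of round-robin with failover over several front-end Solr instances. The class name, host names, and port are made up for illustration and are not from the setup described in this thread; it is only meant to show the idea, not how ELB or Nginx implement it.

```python
import urllib.request
from itertools import cycle
from typing import Callable, Iterable


def http_get(url: str) -> str:
    """Plain HTTP GET using the standard library; raises OSError subclasses on failure."""
    with urllib.request.urlopen(url, timeout=5) as resp:
        return resp.read().decode("utf-8")


class RoundRobinBalancer:
    """Round-robin over several front-end Solr base URLs (hypothetical hosts).

    No session state is kept between requests, which matches the
    "we don't need to maintain sessions" requirement for search traffic.
    """

    def __init__(self, backends: Iterable[str]):
        self.backends = list(backends)
        if not self.backends:
            raise ValueError("at least one backend is required")
        self._ring = cycle(self.backends)

    def query(self, path: str, fetch: Callable[[str], str] = http_get) -> str:
        """Send the request to the next backend; on failure try the others once each."""
        last_error = None
        for _ in range(len(self.backends)):
            backend = next(self._ring)
            try:
                return fetch(backend + path)
            except OSError as exc:  # connection refused, timeout, URLError, ...
                last_error = exc
        raise RuntimeError("all backends failed") from last_error
```

Usage would be something like `RoundRobinBalancer(["http://solr-fe1:8983", "http://solr-fe2:8983"]).query("/solr/select?q=*:*")`, again with invented host names. This only covers the query side; indexing needs separate thought.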

Bill Bell
Sent from mobile

On Jun 8, 2011, at 3:32 AM, "Upayavira" <u...@odoko.co.uk> wrote:

> On Wed, 08 Jun 2011 10:42 +0300, "Dmitry Kan" <dmitry....@gmail.com> wrote:
>> Hello list,
>>
>> Thanks for attending to my previous questions so far, have learnt a lot.
>> Here is another one, I hope it will be interesting to answer.
>>
>> We run our SOLR shards and front end SOLR on the Amazon high-end machines.
>> Currently we have 6 shards with around 200GB in each. Currently we have only
>> one front end SOLR which, given a client query, redirects it to all the
>> shards. Our shards are constantly growing, data is at times reindexed (in
>> batches, which is done by removing a decent chunk before replacing it with
>> updated data), constant stream of new data is coming every hour (usually
>> hits the latest shard in time, but can also hit other shards, which have
>> older data). Since the front end SOLR has started to be a SPOF, we are
>> thinking about setting up some sort of load balancer.
>>
>> 1) do you think ELB from Amazon is a good solution for starters? We don't
>> need to maintain sessions between SOLR and client.
>> 2) What other load balancers have been used specifically with SOLR?
>>
>> Overall: does SOLR scale to such size (200GB in an index) and what can be
>> recommended as next step -- resharding (cutting existing shards to smaller
>> chunks), replication?
>
> Really, it is going to be up to you to work out what works in your
> situation. You may be reaching the limit of what a Lucene index can
> handle, don't know. If your query traffic is low, you might find that
> two 100Gb cores in a single instance performs better. But then, maybe
> not! Or two 100Gb shards on smaller Amazon hosts. But then, maybe not!
> :-)
>
> The principal issue with Amazon's load balancers (at least when I was
> using them last year) is that the ports that they balance need to be
> public. You can't use an Amazon load balancer as an internal service
> within a security group. For a service such as Solr, that can be a bit
> of a killer.
>
> If they've fixed that issue, then they'd work fine (I used them quite
> happily in another scenario).
>
> When looking at resolving single points of failure, handling search is
> pretty easy (as you say, stateless load balancer). You will need to give
> more attention though to how you handle it regarding indexing.
>
> Hope that helps a bit!
>
> Upayavira
>
> ---
> Enterprise Search Consultant at Sourcesense UK,
> Making Sense of Open Source