Re: master/slave failure scenario

Otis Gospodnetic Thu, 21 May 2009 10:02:11 -0700

Hi,

You should be able to do the following:
Put the masters behind a load balancer (LB).
Create an LB VIP and a pool with 2 masters, masterA and masterB, with a rule
that all requests always go to A unless A is down. If A is down, they go to B.
Bring up master instances A and B on 2 servers and point both at the same
shared storage.
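
As a rough illustration of that rule (not any particular load balancer's
configuration), the decision amounts to "first healthy master in priority
order". The Python sketch below assumes a hypothetical is_up health check
supplied by the caller and reuses the masterA/masterB names from above:

```python
from typing import Callable, Optional, Sequence

# Priority order: masterA is preferred, masterB is the standby.
MASTERS = ["masterA", "masterB"]


def pick_active(masters: Sequence[str],
                is_up: Callable[[str], bool]) -> Optional[str]:
    """Return the first master, in priority order, that passes the health check.

    `is_up` is a hypothetical health probe (for example an HTTP ping);
    it is an assumption of this sketch, not part of Solr or any LB product.
    """
    for host in masters:
        if is_up(host):
            return host
    return None  # no master available; indexing has to wait
```

In practice the LB expresses this as pool configuration rather than code; the
function only shows the decision it makes on each request.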


masterA ----\
             \--> shared storage
             /
masterB ----/

Your indexing client doesn't talk to the servers directly. It talks through the
VIP you created in the LB.
At any one time only one of the masters is active.
If A goes down, the LB detects it and makes B active.
Your indexer may have to reconnect if it detects a failure. It may need to
reindex some documents if they didn't make it to disk before A died, and some
lock file cleanup might be needed, but the above should be doable with little
effort.
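
A minimal sketch of the indexer side, assuming the client sends documents in
batches and commits after each batch. The send and commit callables are
hypothetical stand-ins for whatever client code posts to the VIP, and the
sketch assumes that re-sending a document is harmless (it overwrites the
earlier copy with the same unique key):

```python
from typing import Callable, Iterable, Sequence


def index_batches(batches: Iterable[Sequence[dict]],
                  send: Callable[[dict], None],
                  commit: Callable[[], None],
                  max_retries: int = 3) -> int:
    """Index batches through the VIP, replaying a batch if the master dies.

    A batch only counts as indexed once commit() returns. If a
    ConnectionError occurs before that, the whole batch is sent again,
    by which time the LB is expected to have switched to the standby.
    Returns the number of documents committed.
    """
    indexed = 0
    for batch in batches:
        for attempt in range(max_retries + 1):
            try:
                for doc in batch:
                    send(doc)
                commit()
                indexed += len(batch)
                break  # batch is safely on disk, move on
            except ConnectionError:
                if attempt == max_retries:
                    raise  # give up after the configured number of retries
    return indexed
```

Lock file cleanup on the shared storage is not handled here; that would have
to happen on the server side before B starts writing.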

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: nk 11 <nick.cass...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Thursday, May 21, 2009 12:44:55 PM
> Subject: Re: master/slave failure scenario
> 
> You are right... I just don't like the idea of stopping the indexing process
> if the master fails until a new one is started (more or less by hand).
> 
> On Thu, May 21, 2009 at 6:49 PM, Bryan Talbot wrote:
> 
> > Indexing is usually much more expensive than replication so it won't scale
> > well as you add more servers.  Also, what would a client do if it was able
> > to send the update to only some of the servers because others were down (for
> > maintenance, etc)?
> >
> >
> >
> > -Bryan
> >
> >
> >
> >
> >
> > On May 21, 2009, at 6:04 AM, nk 11 wrote:
> >
> >> Just curious. What would be the disadvantages of a no replication / multi
> >> master (no slave) setup?
> >> The client code should do the updates for every master of course, but if one
> >> machine would fail then I can immediately continue the indexing process and
> >> also I can query the index on any machine for a valid result.
> >> I might be missing something...
> >> On Thu, May 14, 2009 at 4:19 PM, nk 11 wrote:
> >>
> >>> wow! that was just a couple of days old!
> >>> thanks a lot!
> >>> 2009/5/14 Noble Paul നോബിള്‍ नोब्ळ्
> >>>
> >>>> yeah there is a hack
> >>>>
> >>>> https://issues.apache.org/jira/browse/SOLR-1154?focusedCommentId=12708316&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12708316
> >>>>
> >>>> On Thu, May 14, 2009 at 6:07 PM, nk 11 wrote:
> >>>>
> >>>>> sorry for the mail. I wanted to hit reply :(
> >>>>>
> >>>>> On Thu, May 14, 2009 at 3:37 PM, nk 11 wrote:
> >>>>>
> >>>>>>
> >>>>>> oh, so the configuration must be manually changed?
> >>>>>> Can't something be passed at (re)start time?
> >>>>>>
> >>>>>> 2009/5/14 Noble Paul നോബിള്‍ नोब्ळ् 
> >>>>>>
> >>>>>>>
> >>>>>>> On Thu, May 14, 2009 at 4:07 PM, nk 11 wrote:
> >>>>>>>
> >>>>>>>> Ok so the VIP will point to the new master. But what makes a slave
> >>>>>>>> promoted to a master? Only the fact that it will receive add/update
> >>>>>>>> requests? And I suppose that this "hot" promotion is possible only
> >>>>>>>> if the slave is configured as master also...
> >>>>>>>>
> >>>>>>> right.. By default you can set up all slaves to be master also. It
> >>>>>>> does not cost anything if it is not serving any requests.
> >>>>>>>
> >>>>>>> so, if you have such a setting you will have to disable that slave
> >>>>>>> from being a slave, restart it, and make the VIP point to this
> >>>>>>> slave as the new master.
> >>>>>>>
> >>>>>>> so hot promotion is still not possible.
> >>>>>>>
> >>>>>>>>
> >>>>>>>> 2009/5/14 Noble Paul നോബിള്‍ नोब्ळ् 
> >>>>>>>>
> >>>>>>>>>
> >>>>>>>>> ideally, we don't do that.
> >>>>>>>>> you can just keep the master host behind a VIP so if you wish to
> >>>>>>>>> change the master, make the VIP point to the new host
> >>>>>>>>>
> >>>>>>>>> On Wed, May 13, 2009 at 10:52 PM, nk 11 wrote:
> >>>>>>>>>
> >>>>>>>>>> This is more interesting. Such a procedure would involve taking
> >>>>>>>>>> down and reconfiguring the slave?
> >>>>>>>>>>
> >>>>>>>>>> On Wed, May 13, 2009 at 7:55 PM, Bryan Talbot
> >>>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> Or ...
> >>>>>>>>>>>
> >>>>>>>>>>> 1. Promote existing slave to new master
> >>>>>>>>>>> 2. Add new slave to cluster
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> -Bryan
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>> On May 13, 2009, at 9:48 AM, Jay Hill wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> - Migrate configuration files from old master (or backup) to new
> >>>>>>>>>>>>   master.
> >>>>>>>>>>>> - Replicate from a slave to the new master.
> >>>>>>>>>>>> - Resume indexing to new master.
> >>>>>>>>>>>>
> >>>>>>>>>>>> -Jay
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, May 13, 2009 at 4:26 AM, nk 11 wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Nice.
> >>>>>>>>>>>>> What if the master fails permanently (like a disk crash...) and
> >>>>>>>>>>>>> the new master is a clean machine?
> >>>>>>>>>>>>> 2009/5/13 Noble Paul നോബിള്‍ नोब्ळ्
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Wed, May 13, 2009 at 12:10 PM, nk 11 <nick.cass...@gmail.com> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Hello
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> I'm kind of new to Solr and I've read about replication, and the
> >>>>>>>>>>>>>>> fact that a node can act as both master and slave.
> >>>>>>>>>>>>>>> If a replica fails and then comes back on line I suppose that it
> >>>>>>>>>>>>>>> will resync with the master.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> right
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> But what happens if the master fails? A slave that is configured
> >>>>>>>>>>>>>>> as master will kick in? What if that slave is not yet fully
> >>>>>>>>>>>>>>> sync'ed with the failed master and has old data?
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> if the master fails you can't index the data. but the slaves will
> >>>>>>>>>>>>>> continue serving the requests with the last index. You can bring
> >>>>>>>>>>>>>> the master back up and resume indexing.
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> What happens when the original master comes back on line? He will
> >>>>>>>>>>>>>>> remain a slave because there is another node with the master role?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Thank you!
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> --
> >>>>>>>>>>>>>> -----------------------------------------------------
> >>>>>>>>>>>>>> Noble Paul | Principal Engineer| AOL | http://aol.com
