Currently I use an application connected to a queue of incoming data, which my indexer app turns into Solr docs. I log everything to a log table and have never had an issue with losing anything. I can trace incoming docs exactly, and I keep timing data in there as well. If I added a second Solr URL for a second master and resent to master02 the same doc that I sent to master01, I would expect near 100% synchronization. The problem is how to get the slave farm to start replicating from the second master if and when the first goes down. I can only see that being a manual operation: repointing the slaves to master02 and restarting or reloading them, etc.
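To make the indexer-as-tee idea concrete, here is a minimal sketch, not the actual indexer app. It assumes each master is reached through a "sender" callable that raises on failure. The class name `TeeIndexer`, the sender functions, and the in-memory log standing in for the log table are all hypothetical. It sends each doc to every master, records the outcome per master, and can list the doc ids a given master is still missing.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# Hypothetical type: a sender posts one doc to one master and raises on failure.
# In a real indexer this would wrap an HTTP update request to that master's URL.
Sender = Callable[[dict], None]


@dataclass
class TeeIndexer:
    """Sends every doc to all masters and logs the outcome per master."""
    senders: Dict[str, Sender]
    # Stand-in for the log table: (doc id, master name, success flag).
    log: List[Tuple[str, str, bool]] = field(default_factory=list)

    def index(self, doc: dict) -> Dict[str, bool]:
        results = {}
        for name, send in self.senders.items():
            try:
                send(doc)
                results[name] = True
            except Exception:
                # A failure on one master must not stop delivery to the others.
                results[name] = False
            self.log.append((doc["id"], name, results[name]))
        return results

    def missing(self, master: str) -> List[str]:
        """Doc ids whose most recent send to `master` failed."""
        latest: Dict[str, bool] = {}
        for doc_id, name, ok in self.log:
            if name == master:
                latest[doc_id] = ok
        return [doc_id for doc_id, ok in latest.items() if not ok]
```

Because every attempt is logged per master, a doc that reached master01 but not master02 shows up in `missing("master02")` and can be resent later. That addresses the out-of-sync concern raised in the quoted reply, but it does nothing for repointing the slaves.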

-----Original Message-----
From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
Sent: Wednesday, March 09, 2011 8:52 AM
To: solr-user@lucene.apache.org
Subject: Re: True master-master fail-over without data gaps

Hi,

----- Original Message ----
> From: Robert Petersen <rober...@buy.com>
> To: solr-user@lucene.apache.org
> Sent: Wed, March 9, 2011 11:40:56 AM
> Subject: RE: True master-master fail-over without data gaps
>
> If you have a wrapper, like an indexer app which prepares solr docs and
> sends them into solr, then it is simple. The wrapper is your 'tee' and
> it can send docs to both (or N) masters.

Doesn't this make it too easy for 2 masters to get out of sync even if the
problem is not with them?
e.g. something happens in this "tee" component and it indexes a doc to
master A, but not master B.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

> -----Original Message-----
> From: Michael Sokolov [mailto:soko...@ifactory.com]
> Sent: Wednesday, March 09, 2011 4:14 AM
> To: solr-user@lucene.apache.org
> Cc: Jonathan Rochkind
> Subject: Re: True master-master fail-over without data gaps
>
> Yes, I think this should be pushed upstream - insert a "tee" in the
> document stream so that all documents go to both masters.
> Then use a load balancer to make requests of the masters.
>
> The "tee" itself then becomes a possible single point of failure, but
> you didn't say anything about the architecture of the document feed. Is
> that also fault-tolerant?
>
> -Mike
>
> On 3/9/2011 1:06 AM, Jonathan Rochkind wrote:
> > I'd honestly think about buffer the incoming documents in some store
> > that's actually made for fail-over persistence reliability, maybe
> > CouchDB or something. And then that's taking care of not losing
> > anything, and the problem becomes how we make sure that our solr master
> > indexes are kept in sync with the actual persistent store; which I'm
> > still not sure about, but I'm thinking it's a simpler problem. The right
> > tool for the right job, that kind of failover persistence is not solr's
> > specialty.
> > ________________________________________
> > From: Otis Gospodnetic [otis_gospodne...@yahoo.com]
> > Sent: Tuesday, March 08, 2011 11:45 PM
> > To: solr-user@lucene.apache.org
> > Subject: True master-master fail-over without data gaps
> >
> > Hello,
> >
> > What are some common or good ways to handle indexing (master) fail-over?
> > Imagine you have a continuous stream of incoming documents that you have to
> > index without losing any of them (or with losing as few of them as possible).
> > How do you set up you masters?
> > In other words, you can't just have 2 masters where the secondary is the
> > Repeater (or Slave) of the primary master and replicates the index periodically:
> > you need to have 2 masters that are in sync at all times!
> > How do you achieve that?
> >
> > * Do you just put N masters behind a LB VIP, configure them both to point to the
> > index on some shared storage (e.g. SAN), and count on the LB to fail-over to the
> > secondary master when the primary becomes unreachable?
> > If so, how do you deal with index locks? You use the Native lock and count on
> > it disappearing when the primary master goes down? That means you count on the
> > whole JVM process dying, which may not be the case...
> >
> > * Or do you use tools like DRBD, Corosync, Pacemaker, etc. to keep 2 masters
> > with 2 separate indices in sync, while making sure you write to only 1 of them
> > via LB VIP or otherwise?
> >
> > * Or ...
> >
> >
> > This thread is on a similar topic, but is inconclusive:
> > http://search-lucene.com/m/aOsyN15f1qd1
> >
> > Here is another similar thread, but this one doesn't cover how 2 masters are
> > kept in sync at all times:
> > http://search-lucene.com/m/aOsyN15f1qd1
> >
> > Thanks,
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/