Hi,
----- Original Message ----

> Currently I use an application connected to a queue containing incoming
> data which my indexer app turns into solr docs. I log everything to a
> log table and have never had an issue with losing anything.

Yeah, if everything goes through some storage that can be polled (either a DB or
a durable JMS Topic or some such), then N masters could connect to it, not miss
anything, and be more or less in near real-time sync.

> I can trace
> incoming docs exactly, and keep timing data in there also. If I added a
> second solr url for a second master and resent the same doc to master02
> that I sent to master01, I would expect near 100% synchronization. The
> problem here is how to get the slave farm to start replicating from the
> second master if and when the first goes down. I can only see that as
> being a manual operation, repointing the slaves to master02 and
> restarting or reloading them etc...

Actually, you can configure a LB to handle that, so that's less of a problem, I
think.

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/

> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> Sent: Wednesday, March 09, 2011 8:52 AM
> To: solr-user@lucene.apache.org
> Subject: Re: True master-master fail-over without data gaps
>
> Hi,
>
> > ----- Original Message ----
> > From: Robert Petersen <rober...@buy.com>
> > To: solr-user@lucene.apache.org
> > Sent: Wed, March 9, 2011 11:40:56 AM
> > Subject: RE: True master-master fail-over without data gaps
> >
> > If you have a wrapper, like an indexer app which prepares solr docs and
> > sends them into solr, then it is simple. The wrapper is your 'tee' and
> > it can send docs to both (or N) masters.
>
> Doesn't this make it too easy for 2 masters to get out of sync even if the
> problem is not with them?
> e.g.
> something happens in this "tee" component and it indexes a doc to master A,
> but not master B.
>
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
> > -----Original Message-----
> > From: Michael Sokolov [mailto:soko...@ifactory.com]
> > Sent: Wednesday, March 09, 2011 4:14 AM
> > To: solr-user@lucene.apache.org
> > Cc: Jonathan Rochkind
> > Subject: Re: True master-master fail-over without data gaps
> >
> > Yes, I think this should be pushed upstream - insert a "tee" in the
> > document stream so that all documents go to both masters.
> > Then use a load balancer to make requests of the masters.
> >
> > The "tee" itself then becomes a possible single point of failure, but
> > you didn't say anything about the architecture of the document feed. Is
> > that also fault-tolerant?
> >
> > -Mike
> >
> > On 3/9/2011 1:06 AM, Jonathan Rochkind wrote:
> > > I'd honestly think about buffering the incoming documents in some store
> > > that's actually made for fail-over persistence reliability, maybe
> > > CouchDB or something. And then that's taking care of not losing
> > > anything, and the problem becomes how we make sure that our solr master
> > > indexes are kept in sync with the actual persistent store; which I'm
> > > still not sure about, but I'm thinking it's a simpler problem. The right
> > > tool for the right job, that kind of failover persistence is not solr's
> > > specialty.
> > > ________________________________________
> > > From: Otis Gospodnetic [otis_gospodne...@yahoo.com]
> > > Sent: Tuesday, March 08, 2011 11:45 PM
> > > To: solr-user@lucene.apache.org
> > > Subject: True master-master fail-over without data gaps
> > >
> > > Hello,
> > >
> > > What are some common or good ways to handle indexing (master) fail-over?
> > > Imagine you have a continuous stream of incoming documents that you have to
> > > index without losing any of them (or with losing as few of them as possible).
> > > How do you set up your masters?
> > > In other words, you can't just have 2 masters where the secondary is the
> > > Repeater (or Slave) of the primary master and replicates the index periodically:
> > > you need to have 2 masters that are in sync at all times!
> > > How do you achieve that?
> > >
> > > * Do you just put N masters behind a LB VIP, configure them both to point to the
> > > index on some shared storage (e.g. SAN), and count on the LB to fail-over to the
> > > secondary master when the primary becomes unreachable?
> > > If so, how do you deal with index locks? You use the Native lock and count on
> > > it disappearing when the primary master goes down? That means you count on the
> > > whole JVM process dying, which may not be the case...
> > >
> > > * Or do you use tools like DRBD, Corosync, Pacemaker, etc. to keep 2 masters
> > > with 2 separate indices in sync, while making sure you write to only 1 of them
> > > via LB VIP or otherwise?
> > >
> > > * Or ...
> > >
> > >
> > > This thread is on a similar topic, but is inconclusive:
> > > http://search-lucene.com/m/aOsyN15f1qd1
> > >
> > > Here is another similar thread, but this one doesn't cover how 2 masters are
> > > kept in sync at all times:
> > > http://search-lucene.com/m/aOsyN15f1qd1
> > >
> > > Thanks,
> > > Otis
> > > ----
> > > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > > Lucene ecosystem search :: http://search-lucene.com/
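
For anyone skimming the thread, here is a rough sketch of the "pollable storage" idea, where each master reads the same durable log at its own pace, so a failure in the feeder or in one master leaves a gap that can be caught up rather than a lost doc. Everything in it is an assumption made for illustration: `DurableLog` is an in-memory stand-in for a DB table or durable JMS topic, `MasterFeeder` and `make_fake_master` are hypothetical names, and the sender callables stand in for whatever actually posts a doc to each master. It is not anyone's production setup.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Doc = dict
Sender = Callable[[Doc], None]  # hypothetical: raises ConnectionError if the master is down


@dataclass
class DurableLog:
    """In-memory stand-in for a pollable durable store (DB table, JMS topic, ...)."""
    entries: List[Doc] = field(default_factory=list)

    def append(self, doc: Doc) -> None:
        self.entries.append(doc)

    def read_from(self, offset: int) -> List[Doc]:
        return self.entries[offset:]


@dataclass
class MasterFeeder:
    """Feeds every master from the same log, tracking one offset per master."""
    log: DurableLog
    senders: Dict[str, Sender]
    offsets: Dict[str, int] = field(default_factory=dict)

    def pump(self) -> None:
        for name, send in self.senders.items():
            offset = self.offsets.get(name, 0)
            for doc in self.log.read_from(offset):
                try:
                    send(doc)
                except ConnectionError:
                    break  # stop here; this master resumes from the same doc next time
                offset += 1  # advance only after a successful send
            self.offsets[name] = offset


def make_fake_master(received: List[str], state: Dict[str, bool]) -> Sender:
    """Test double for a master: records doc ids while state['up'] is True."""
    def send(doc: Doc) -> None:
        if not state["up"]:
            raise ConnectionError("master unreachable")
        received.append(doc["id"])
    return send


if __name__ == "__main__":
    log = DurableLog()
    got1, got2 = [], []
    state1, state2 = {"up": True}, {"up": False}  # master02 starts out down
    feeder = MasterFeeder(log, {
        "master01": make_fake_master(got1, state1),
        "master02": make_fake_master(got2, state2),
    })
    for i in range(3):
        log.append({"id": f"doc{i}"})
    feeder.pump()          # only master01 receives the docs
    state2["up"] = True
    feeder.pump()          # master02 catches up from its own offset
    print(got1, got2)
```

In a real deployment the per-master offsets would have to be stored durably too, otherwise a feeder restart would resend everything (harmless if docs are keyed by a unique id, but slow).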