Hi Otis,

Have you considered using Solandra with quorum writes to achieve
master/master with CA semantics?
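The "CA" part is just the usual quorum-overlap argument: with N replicas,
QUORUM writes (W) and QUORUM reads (R) give W + R > N, so every read set
intersects the latest write set. A toy illustration of the arithmetic
(plain Java, not Solandra code):

public class QuorumMath {

    // QUORUM for N replicas: a strict majority.
    static int quorum(int n) { return n / 2 + 1; }

    public static void main(String[] args) {
        int n = 3;          // replication factor
        int w = quorum(n);  // QUORUM write: 2 of 3 replicas
        int r = quorum(n);  // QUORUM read: 2 of 3 replicas
        // W + R > N means any read quorum overlaps any write quorum,
        // so at least one replica in every read has the latest write.
        System.out.println("overlap guaranteed: " + (w + r > n)); // true
    }
}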
-Jake

On Wed, Mar 9, 2011 at 2:48 PM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
> Hi,
>
> ----- Original Message ----
> > From: Robert Petersen <rober...@buy.com>
> >
> > Can't you skip the SAN and keep the indexes locally? Then you would
> > have two redundant copies of the index and no lock issues.
>
> I could, but then I'd have the issue of keeping them in sync, which seems
> more fragile. I think the SAN makes things simpler overall.
>
> > Also, can't master02 just be a slave to master01 (in the master farm
> > and separate from the slave farm) until such time as master01 fails?
> > Then
>
> No, because it wouldn't be in sync. It would always be N minutes behind,
> and when the primary master fails, the secondary would not have all the
> docs - data loss.
>
> > master02 would start receiving the new documents with an index
> > complete up to the last replication at least, and the other slaves
> > would be directed by the LB to poll master02 also...
>
> Yeah, "complete up to the last replication" is the problem. It's a data
> gap that now needs to be filled somehow.
>
> Otis
> ----
> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> Lucene ecosystem search :: http://search-lucene.com/
>
> > -----Original Message-----
> > From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com]
> > Sent: Wednesday, March 09, 2011 9:47 AM
> > To: solr-user@lucene.apache.org
> > Subject: Re: True master-master fail-over without data gaps (choosing CA in CAP)
> >
> > Hi,
> >
> > ----- Original Message ----
> > > From: Walter Underwood <wun...@wunderwood.org>
> > >
> > > On Mar 9, 2011, at 9:02 AM, Otis Gospodnetic wrote:
> > >
> > > > You mean it's not possible to have 2 masters that are in nearly
> > > > real-time sync? How about with DRBD? I know people use DRBD to
> > > > keep 2 Hadoop NNs (their edit logs) in sync to avoid the current
> > > > NN SPOF, for example, so I'm thinking this could be doable with
> > > > Solr masters, too, no?
> > >
> > > If you add fault tolerance, you run into the CAP theorem.
> > > Consistency, availability, partition tolerance: choose two. You
> > > cannot have it all.
> >
> > Right, so I'll take Consistency and Availability, and I'll put my 2
> > masters in the same rack (which has redundant switches, power
> > supplies, etc.) and thus minimize/avoid partitioning.
> > Assuming the above actually works, I think my Q remains:
> >
> > How do you set up 2 Solr masters so they are in near real-time sync?
> > DRBD?
> >
> > But here is maybe a simpler scenario that more people may be
> > considering:
> >
> > Imagine 2 masters on 2 different servers in 1 rack, pointing to the
> > same index on shared storage (SAN) that also happens to live in the
> > same rack. The 2 Solr masters are behind 1 LB VIP that the indexer
> > talks to. The VIP is configured so that all requests always get
> > routed to the primary master (because only 1 master can be modifying
> > an index at a time), except when this primary is down, in which case
> > the requests are sent to the secondary master.
> >
> > So in this case my Q is around the automation of this, around Lucene
> > index locks, around the need for manual intervention, and such.
> > Concretely, if you have these 2 master instances, the primary master
> > holds the Lucene index lock in the index dir. When the secondary
> > master needs to take over (i.e., when it starts receiving documents
> > via the LB), it needs to be able to write to that same index.
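This takeover step is the crux, and it anticipates the stale-lock question
just below. A minimal sketch of what the secondary could do, assuming
Lucene 3.x and NativeFSLockFactory (the class name and error handling here
are mine, purely illustrative, not from Otis's setup):

import java.io.File;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;
import org.apache.lucene.store.NativeFSLockFactory;
import org.apache.lucene.util.Version;

public class TakeOver {

    public static IndexWriter openSharedIndex(File indexDir) throws Exception {
        // Native (OS-level) locks are released by the OS when the holding
        // JVM dies, so a crashed primary should leave no stale write.lock.
        Directory dir = FSDirectory.open(indexDir, new NativeFSLockFactory());

        if (IndexWriter.isLocked(dir)) {
            // Lock still held: either the primary's JVM is alive (maybe
            // just unresponsive), or native locks don't work on this shared
            // filesystem. IndexWriter.unlock(dir) would force-clear it, but
            // doing that while the primary may still be writing can corrupt
            // the index -- only safe once the primary is known to be fenced.
            throw new IllegalStateException(
                "write.lock still held -- is the primary really dead?");
        }
        return new IndexWriter(dir, new StandardAnalyzer(Version.LUCENE_30),
                               IndexWriter.MaxFieldLength.UNLIMITED);
    }
}

One caveat: whether native locks actually behave on a SAN-shared filesystem
(NFS in particular) is worth verifying before trusting this -- that is
exactly the case where a lock can linger, or not be seen at all.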
> > But what if that lock is still around? One could use the native lock
> > factory to make the lock disappear if the primary master's JVM exited
> > unexpectedly, and in that case everything *should* work and be
> > completely transparent, right? That is, the secondary will start
> > getting new docs, it will use its IndexWriter to write to that same
> > shared index, which won't be locked for writes because the lock is
> > gone, and everyone will be happy. Did I miss something important here?
> >
> > Assuming the above is correct, what if the lock is *not* gone because
> > the primary master's JVM is actually not dead, although maybe
> > unresponsive, so the LB thinks the primary master is dead? Then the
> > LB will route indexing requests to the secondary master, which will
> > attempt to write to the index, but be denied because of the lock. So
> > a human needs to jump in, remove the lock, and manually reindex the
> > failed docs if the upstream component doesn't buffer docs that failed
> > to get indexed and doesn't retry indexing them automatically. Is this
> > correct, or is there a way to avoid humans here?
> >
> > Thanks,
> > Otis
> > ----
> > Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
> > Lucene ecosystem search :: http://search-lucene.com/

--
http://twitter.com/tjake
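P.S. On the "way to avoid humans" part: the lock itself still needs
fencing, but the reindex-by-hand step goes away if the upstream indexer
buffers adds that fail and retries them later. A rough SolrJ sketch (the
class and method names are mine, just to show the shape of it):

import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class RetryingIndexer {

    private final SolrServer solr;
    // Docs that failed to index are parked here instead of being dropped.
    private final BlockingQueue<SolrInputDocument> pending =
        new LinkedBlockingQueue<SolrInputDocument>();

    public RetryingIndexer(String masterVipUrl) throws Exception {
        this.solr = new CommonsHttpSolrServer(masterVipUrl); // the LB VIP
    }

    public void index(SolrInputDocument doc) {
        try {
            solr.add(doc);
        } catch (Exception e) {
            pending.offer(doc); // buffer instead of losing the doc
        }
    }

    // Drain the buffer periodically, e.g. from a background thread.
    public void retryPending() {
        SolrInputDocument doc;
        while ((doc = pending.poll()) != null) {
            try {
                solr.add(doc);
            } catch (Exception e) {
                pending.offer(doc); // still failing; keep it, try later
                break;
            }
        }
    }
}

With something like this in front of the VIP, the failover window turns
into delayed indexing rather than a data gap.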