Yes, in this situation there's no guarantee that A has the latest data. I think this is just an inherent limitation of quorum-based writes, though. Unless you have three separate machines at geographically redundant sites, I don't think you have true redundancy.

cheers
Cam
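To make the "A may not have the latest data" point concrete, here is a minimal sketch of a quorum-based write. It is a toy model, not ZooKeeper code: the `ENSEMBLE` dictionary, the `write` helper and the transaction names are all made up for illustration, and it assumes instance 1 runs on machine A while instances 2 and 3 run on machine B.

```python
# Toy model (not ZooKeeper code): each instance id maps to the set of
# transactions it has logged. Instance 1 is on machine A; 2 and 3 are on B.
ENSEMBLE = {1: set(), 2: set(), 3: set()}


def majority(ensemble_size):
    """Smallest number of instances that is a strict majority."""
    return ensemble_size // 2 + 1


def write(ensemble, txn, reachable):
    """Log txn on the reachable instances.

    Returns True (committed) only if the reachable instances form a majority;
    otherwise nothing is logged and False is returned.
    """
    if len(reachable) < majority(len(ensemble)):
        return False
    for sid in reachable:
        ensemble[sid].add(txn)
    return True


# A write acknowledged by instances 2 and 3 alone is committed...
committed = write(ENSEMBLE, "txn-1", reachable=[2, 3])
# ...yet instance 1 on machine A has never seen it.
missing_on_a = committed and "txn-1" not in ENSEMBLE[1]
```

If machine B is then lost and instance 1 is promoted to a single-instance cluster, `txn-1` is simply not there, which is the limitation described above.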
On Wed, Nov 6, 2013 at 9:17 AM, Alexander Shraer <[email protected]> wrote:

> I don't think reconfiguration will help you here, as it requires a
> quorum of the old and a quorum of the new ensembles, and here you're
> missing a quorum of the old one.
>
> The problem is that you may have some committed operations on the B
> servers that A doesn't know about (writes are done to a quorum).
> Moreover, B may just be slow and may still be operational.
>
> To solve the problem here I think you need either a tie breaker, a
> reliable failure detection mechanism (such as when you're manually
> doing this because you're sure that B is down), or some kind of
> stronger synchrony assumption (e.g., if A didn't hear from B for 3
> sec, it means that B has crashed). This is something that ZK doesn't
> do, in order to be more robust to network delays.
>
> Since this scenario seems very common, it may be interesting to
> implement some kind of tie-breaker quorum system in ZooKeeper.
>
> Alex
>
> On Tue, Nov 5, 2013 at 12:44 PM, Cameron McKenzie
> <[email protected]> wrote:
> > I have a similar problem to you. I have more than 2 machines, but only 2
> > geographically redundant sites.
> >
> > In your situation, you could get some redundancy by running 2 instances
> > on one host, and 1 instance on the other host. This would protect you
> > from temporary network glitches (because the machine with 2 instances
> > can still form a quorum), and will protect you from failure of the
> > machine with the single instance. It will not help you if the machine
> > with 2 instances crashes.
> >
> > In this situation, where the 2 instance machine dies, you can
> > temporarily configure the 1 instance machine to be a single instance
> > cluster, and then when the 2 instance machine is recovered, you can
> > reconfigure the single instance machine to be part of the 3 instance
> > cluster again. This process is manual, and slightly dangerous, because
> > if you restart nodes in the wrong order, you have the potential to lose
> > data. This is the approach that I have tested and it seems to work, but
> > I'd recommend testing it yourself as well.
> >
> > Machine A has ZK instance 1
> > Machine B has ZK instances 2 and 3
> >
> > Machine B dies
> > Reconfigure ZK instance 1 so that it only has itself in the cluster.
> > This means that there is no redundancy at this point, but it can form a
> > quorum as it's the only instance in the cluster.
> > Restart ZK instance 1 to pick up config changes
> > Fix up Machine B
> > Reconfigure ZK instance 1 to have ZK instances 2 and 3 in its
> > configuration
> > Restart ZK instance 1 to pick up config changes
> > Start ZK instance 2 on Machine B.
> > Wait for ZK instance 1 on Machine A and ZK instance 2 on Machine B to
> > form a quorum. This is vitally important. If you start instance 3
> > before a quorum is formed, it is possible that instances 2 and 3 will
> > form a quorum. This will cause any updates that have occurred via
> > instance 1 during the outage of Machine B to be lost.
> > Start ZK instance 3 on Machine B
> >
> > This process should become easier once dynamic reconfiguration is
> > implemented (in ZK 3.5 I believe?) because restarts won't be required.
> >
> > cheers
> > Cam
> >
> > On Tue, Nov 5, 2013 at 6:05 PM, erolagnab <[email protected]> wrote:
> >
> >> Thanks, I got the idea now. So is it fair to say that it is not
> >> possible to create a ZK cluster providing some redundancy with 2
> >> physical machines? If so, is there a way to make it happen?
> >>
> >> --
> >> View this message in context:
> >> http://zookeeper-user.578899.n2.nabble.com/Running-Zookeeper-in-2-machines-tp7579232p7579237.html
> >> Sent from the zookeeper-user mailing list archive at Nabble.com.
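The restart-order warning in the quoted procedure can be illustrated with a small sketch. This is a deliberately simplified model, not ZooKeeper's actual leader election: it assumes the first quorum to form settles on the history of its most up-to-date member, and the zxid values and the `surviving_zxid` helper are invented for the example.

```python
def surviving_zxid(last_zxid, first_quorum):
    """Return the highest zxid among the instances that form the first quorum.

    Simplifying assumption: the ensemble continues from the most up-to-date
    member of whichever majority forms first.
    """
    if len(first_quorum) < len(last_zxid) // 2 + 1:
        raise ValueError("these instances do not form a majority")
    return max(last_zxid[sid] for sid in first_quorum)


# Hypothetical state after Machine B is repaired: instance 1 kept taking
# writes during the outage (up to zxid 120); instances 2 and 3 stopped at 100.
LAST_ZXID = {1: 120, 2: 100, 3: 100}

# Instance 2 started first, so instances 1 and 2 form the quorum: 120 survives.
safe = surviving_zxid(LAST_ZXID, [1, 2])

# Instances 2 and 3 form a quorum on their own: the ensemble continues from
# 100, and the updates made via instance 1 during the outage are lost.
unsafe = surviving_zxid(LAST_ZXID, [2, 3])
```

Under these assumptions `safe` is 120 and `unsafe` is 100, which is why instance 3 should only be started after instances 1 and 2 have formed a quorum.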
