It doesn’t matter that communication between the service and the client takes time. The client can determine that it is no longer the lock holder independently of the server.

-Jordan
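A minimal sketch of that client-side check, using Apache Curator; the connect string, lock path, and the lockValid flag are illustrative, not a prescribed recipe. Curator reports SUSPENDED as soon as heartbeats stop succeeding, before the server-side session timeout expires, which is exactly the client deciding on its own that it may no longer hold the lock:

    import java.util.concurrent.atomic.AtomicBoolean;

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.CuratorFrameworkFactory;
    import org.apache.curator.framework.recipes.locks.InterProcessMutex;
    import org.apache.curator.framework.state.ConnectionState;
    import org.apache.curator.retry.ExponentialBackoffRetry;

    public class SelfFencingLockHolder {
        public static void main(String[] args) throws Exception {
            CuratorFramework client = CuratorFrameworkFactory.newClient(
                    "localhost:2181", new ExponentialBackoffRetry(1000, 3));

            // Cleared the moment the client suspects it lost its session;
            // checked before every operation on the shared resource.
            AtomicBoolean lockValid = new AtomicBoolean(false);

            // Fired locally when heartbeats fail, independently of the server.
            client.getConnectionStateListenable().addListener((c, state) -> {
                if (state == ConnectionState.SUSPENDED || state == ConnectionState.LOST) {
                    lockValid.set(false);
                }
            });
            client.start();

            InterProcessMutex lock = new InterProcessMutex(client, "/locks/shared-resource");
            lock.acquire();
            lockValid.set(true);
            try {
                // Re-check lockValid before every unit of work on the shared
                // resource; stop immediately once it goes false.
                if (lockValid.get()) {
                    // ... act on the shared resource ...
                }
            } finally {
                lock.release();
                client.close();
            }
        }
    }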
On July 15, 2015 at 3:55:40 PM, Ivan Kelly ([email protected]) wrote:

> “at any snapshot in time no two clients think they hold the same lock”

According to the ZK service. But communication between the service and the
client takes time.

-Ivan

On Wed, Jul 15, 2015 at 10:54 PM Ivan Kelly <[email protected]> wrote:

> Jordan, imagine you have a node which is the leader, using the hbase
> example. A client makes some request to the leader, which processes the
> request, lines up a write to the state in hbase, and promptly goes into a
> 30 second gc pause just before it flushes the socket. During the 30 second
> pause another node takes over as leader and starts writing to the state.
> Now, when the pause ends, what will stop the write from the first leader
> being flushed to the socket and then hitting hbase?
>
> -Ivan
>
> On Wed, Jul 15, 2015 at 10:26 PM Jordan Zimmerman <[email protected]> wrote:
>
>> I think we may be talking past each other here. My contention (and the
>> ZK docs agree, BTW) is that, properly written and configured, “at any
>> snapshot in time no two clients think they hold the same lock”. How your
>> application acts on that fact is another thing. You might need sequence
>> numbers, you might not.
>>
>> -Jordan
>>
>> On July 15, 2015 at 3:15:16 PM, Alexander Shraer ([email protected]) wrote:
>>
>> Jordan, as Camille suggested, please read Sec 2.4 of the Chubby paper:
>> <http://static.googleusercontent.com/media/research.google.com/en//archive/chubby-osdi06.pdf>
>>
>> It suggests two ways in which the storage can support lock generations,
>> and proposes an alternative for the case where the storage can't be made
>> aware of lock generations.
>>
>> On Wed, Jul 15, 2015 at 1:08 PM, Jordan Zimmerman <[email protected]> wrote:
>>
>>> Ivan, I just read the blog and I still don’t see how this can happen.
>>> Sorry if I’m being dense. I’d appreciate a discussion on this. In your
>>> blog you state: “when ZooKeeper tells you that you are leader, there’s
>>> no guarantee that there isn’t another node that ‘thinks’ it’s the
>>> leader.” However, given a long enough session timeout (I usually
>>> recommend 30–60 seconds), I don’t see how this can happen. The client
>>> itself determines that there is a network partition when there is no
>>> heartbeat success. The heartbeat is a fraction of the session timeout.
>>> Once the heartbeat fails, the client must assume it no longer has the
>>> lock. Another client cannot take over the lock until, at minimum, the
>>> session timeout has elapsed. So how, then, can there be two leaders?
>>>
>>> -Jordan
>>>
>>> On July 15, 2015 at 2:23:12 PM, Ivan Kelly ([email protected]) wrote:
>>>
>>> I blogged about this exact problem a couple of weeks ago [1]. I give an
>>> example of how split brain can happen in a resource under a zk lock
>>> (Hbase in this case). As Camille says, sequence numbers ftw. I'll add
>>> that the data store has to support them, though, which not all do (in
>>> fact I've yet to see one in the wild that does). I've implemented a
>>> prototype that works with hbase [2] if you want to see what it looks
>>> like.
>>>
>>> -Ivan
>>>
>>> [1] https://medium.com/@ivankelly/reliable-table-writer-locks-for-hbase-731024295215
>>> [2] https://github.com/ivankelly/hbase-exclusive-writer
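To make “the data store has to support them” concrete, a hedged sketch of a store-side lock-generation check. The FencedStore class and its write API are hypothetical, not HBase's interface or the prototype in [2]; in ZooKeeper, the lock znode's czxid is a convenient monotonically increasing generation number:

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical store-side fencing check. Each elected leader is handed a
    // monotonically increasing generation and stamps it on every write.
    public class FencedStore {
        private final Map<String, byte[]> data = new HashMap<>();
        private long highestGeneration = -1;

        // The store runs this check on every incoming write.
        public synchronized void write(long generation, String key, byte[] value) {
            if (generation < highestGeneration) {
                // A deposed leader waking from a GC pause arrives here with a
                // stale generation; its late write is rejected instead of
                // clobbering the new leader's state.
                throw new IllegalStateException("stale lock generation " + generation);
            }
            highestGeneration = generation;
            data.put(key, value);
        }
    }

This is what answers Ivan's GC-pause question above: the first leader's write does hit the store after the pause, but it carries the old generation and is refused.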
>>> On Wed, Jul 15, 2015 at 9:16 PM Vikas Mehta <[email protected]> wrote:
>>>
>>>> Jordan, I mean the client gives up the lock and stops working on the
>>>> shared resource. So when zookeeper is unavailable, no one is working
>>>> on any shared resource (because they cannot distinguish a network
>>>> partition from a zookeeper DEAD scenario).
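Vikas's parenthetical shows up directly in the raw ZooKeeper Java client's session events: Disconnected is inherently ambiguous (partition or dead ensemble), while Expired is only delivered after a reconnect succeeds. A sketch, with a placeholder connect string:

    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;
    import org.apache.zookeeper.ZooKeeper;

    public class SessionStateWatcher {
        public static void main(String[] args) throws Exception {
            // 30 second session timeout, as recommended upthread.
            ZooKeeper zk = new ZooKeeper("localhost:2181", 30_000, (WatchedEvent event) -> {
                switch (event.getState()) {
                    case Disconnected:
                        // Heartbeats are failing. From here the client cannot
                        // tell a network partition from a dead ensemble, so it
                        // must pessimistically stop touching the shared
                        // resource right away.
                        break;
                    case Expired:
                        // Delivered only after reconnecting: the session, and
                        // any ephemeral lock znodes it owned, are definitively
                        // gone.
                        break;
                    case SyncConnected:
                        // Connected, or reconnected within the session
                        // timeout; work may resume once the lock is
                        // re-verified.
                        break;
                    default:
                        break;
                }
            });
            // ... create the lock znode under this session and do guarded work ...
            zk.close();
        }
    }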
