Thanks Jordan,

> If client1's heartbeat fails its main watcher will get a Disconnect event.

Suppose the network link between client1 and the server is of very low
quality (a high packet-loss rate, say) but still functional. Client1 may be
happily sending heartbeat messages to the server without noticing anything
wrong, yet the ZK server may fail to receive client1's heartbeats for a long
period of time, which leads the ZK server to time out client1's session and
delete the ephemeral node. Thus, client1's session can be expired by the ZK
server without a Disconnect event ever being triggered on the client.

> Well behaving ZK applications must watch for this and assume that it no
> longer holds the lock and, thus, should delete its node. If client1 needs
> the lock again it should try to re-acquire it from step 1 of the recipe.
> Further, well behaving ZK applications must re-try node deletes if there is
> a connection problem. Have a look at Curator's implementation for details.

Thanks for pointing me to Curator's implementation; I will dig into the
source code. But I still feel that, no matter how well a ZK application
behaves, if we use ephemeral nodes in the lock recipe we cannot guarantee
"at any snapshot in time no two clients think they hold the same lock",
which is the fundamental requirement for a lock.

Mr. Andrey Stepachev suggested that I use a client-side timer to track the
session timeout. That sounds reasonable, but I think it implicitly assumes
some bound on clock drift, which I did not expect in a solution based on
ZooKeeper (ZK is supposed to keep the animals well). A rough sketch of that
idea is at the bottom of this mail.

On Sat, Jan 12, 2013 at 4:20 AM, Jordan Zimmerman
<jor...@jordanzimmerman.com> wrote:

> If client1's heartbeat fails its main watcher will get a Disconnect event.
> Well behaving ZK applications must watch for this and assume that it no
> longer holds the lock and, thus, should delete its node. If client1 needs
> the lock again it should try to re-acquire it from step 1 of the recipe.
> Further, well behaving ZK applications must re-try node deletes if there is
> a connection problem. Have a look at Curator's implementation for details.
>
> -JZ
>
> On Jan 11, 2013, at 5:46 AM, Zhao Boran <hulunb...@gmail.com> wrote:
>
> > While reading the zookeeper's recipe for lock
> > <http://zookeeper.apache.org/doc/trunk/recipes.html#sc_recipes_Locks>,
> > I get confused:
> >
> > It seems that this recipe for a distributed lock can not guarantee *"at
> > any snapshot in time no two clients think they hold the same lock"*.
> >
> > But since ZooKeeper is so widely adopted, if there were such a mistake
> > in the reference doc, someone should have pointed it out long ago.
> >
> > So, what did I misunderstand? Please help me!
> >
> > Recipe for distributed locks (from
> > http://zookeeper.apache.org/doc/trunk/recipes.html#sc_recipes_Locks):
> >
> > Locks
> >
> > Fully distributed locks that are globally synchronous, *meaning at any
> > snapshot in time no two clients think they hold the same lock*. These
> > can be implemented using ZooKeeper. As with priority queues, first
> > define a lock node.
> >
> > 1. Call create( ) with a pathname of "*locknode*/guid-lock-" and the
> >    sequence and ephemeral flags set.
> > 2. Call getChildren( ) on the lock node without setting the watch flag
> >    (this is important to avoid the herd effect).
> > 3. If the pathname created in step 1 has the lowest sequence number
> >    suffix, the client has the lock and the client exits the protocol.
> > 4. The client calls exists( ) with the watch flag set on the path in
> >    the lock directory with the next lowest sequence number.
> > 5. If exists( ) returns false, go to step 2. Otherwise, wait for a
> >    notification for the pathname from the previous step before going
> >    to step 2.
> >
> > Consider the following case:
> >
> > - Client1 successfully acquires the lock (in step 3), with ZK node
> >   "locknode/guid-lock-0".
> > - Client2 creates node "locknode/guid-lock-1", fails to acquire the
> >   lock, and watches "locknode/guid-lock-0".
> > - Later, for some reason (network congestion?), client1 fails to send
> >   heartbeat messages to the ZK cluster on time, but client1 is still
> >   working perfectly and assumes it still holds the lock.
> > - ZooKeeper, however, may decide that client1's session has timed out,
> >   and then:
> >   1. deletes "locknode/guid-lock-0",
> >   2. sends a notification to client2 (or does it send the notification
> >      first?),
> >   3. but cannot deliver a "session timeout" notification to client1 in
> >      time (due to network congestion?).
> > - Client2 gets the notification, goes to step 2, sees the only node
> >   "locknode/guid-lock-1", which it created itself; thus, client2
> >   assumes it holds the lock.
> > - But at the same time, client1 also assumes it holds the lock.
> >
> > Is this a valid scenario?
> >
> > Thanks a lot!
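P.S. Below is a minimal, untested sketch of that client-side timer idea,
against the plain ZooKeeper Java API. SessionAwareWatcher, LockHolder and
mayStillHoldLock( ) are hypothetical names of mine, not ZooKeeper or Curator
types; only Watcher, WatchedEvent and KeeperState come from the library. The
idea is to stop trusting the lock once a full session timeout has elapsed
since we lost contact with the server, rather than waiting for an Expired
event that may arrive late or never.

import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.Watcher.Event.KeeperState;

/**
 * Sketch only, not a complete lock implementation: track connection state
 * and pessimistically assume the lock is lost once the negotiated session
 * timeout has passed without the session being confirmed alive.
 */
public class SessionAwareWatcher implements Watcher {

    /** Hypothetical application callback, not a ZooKeeper/Curator type. */
    public interface LockHolder {
        void assumeLockLost();
    }

    private final long sessionTimeoutMs;          // negotiated session timeout
    private final LockHolder lockHolder;
    private volatile long disconnectedAtMs = -1;  // -1 means "connected"

    public SessionAwareWatcher(long sessionTimeoutMs, LockHolder lockHolder) {
        this.sessionTimeoutMs = sessionTimeoutMs;
        this.lockHolder = lockHolder;
    }

    @Override
    public void process(WatchedEvent event) {
        switch (event.getState()) {
            case Disconnected:
                // Lost contact: start the local countdown pessimistically.
                disconnectedAtMs = System.currentTimeMillis();
                break;
            case SyncConnected:
                // Reconnected within the same session; the ephemeral node
                // (and therefore the lock) still exists.
                disconnectedAtMs = -1;
                break;
            case Expired:
                // The server has already deleted our ephemeral node.
                lockHolder.assumeLockLost();
                break;
            default:
                break;
        }
    }

    /**
     * The lock holder should call this before every piece of work done
     * under the lock, and stop as soon as it returns false.
     */
    public boolean mayStillHoldLock() {
        long t = disconnectedAtMs;
        return t < 0 || System.currentTimeMillis() - t < sessionTimeoutMs;
    }
}

The watcher would be wired up as the main watcher when the session is
created, e.g. new ZooKeeper(connectString, sessionTimeout, watcher). Note
that even this only narrows the window: it still assumes the client's clock
advances at roughly the same rate as the server's during the outage, which
is exactly the clock-drift constraint I was complaining about above.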