Actually... I'm looking at LockInternals and it already sets a watcher on 
the previous node before it calls wait(). That watcher will also be called 
when there's a connection problem, so it would be easy to have the watcher 
interrupt the waiting thread. This could be a fairly simple change.

Watcher set just before waiting: 
https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/locks/LockInternals.java#L301
 
The Watcher: 
https://github.com/apache/curator/blob/master/curator-recipes/src/main/java/org/apache/curator/framework/recipes/locks/LockInternals.java#L64
 

So, off the top of my head, an AtomicReference<Thread> field could be added to 
LockInternals. It gets set to the current thread just before waiting. If the 
Watcher is called with a connection event and the AtomicReference isn't empty, 
interrupt the thread. This will cause wait() to throw InterruptedException and 
then the lock node will be deleted.
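A rough sketch of that idea, just to make it concrete (class, field and method 
names below are illustrative only, not the actual LockInternals code):

    import java.util.concurrent.atomic.AtomicReference;

    import org.apache.zookeeper.WatchedEvent;
    import org.apache.zookeeper.Watcher;

    // Illustrative sketch only - not the real LockInternals, just the shape of the change
    class LockInternalsInterruptSketch
    {
        // set to the waiting thread just before wait(), cleared right after
        private final AtomicReference<Thread> waitingThread = new AtomicReference<>();

        private final Watcher watcher = new Watcher()
        {
            @Override
            public void process(WatchedEvent event)
            {
                // connection events arrive with type None (Disconnected/Expired state)
                if ( event.getType() == Watcher.Event.EventType.None )
                {
                    Thread thread = waitingThread.get();
                    if ( thread != null )
                    {
                        // wait() throws InterruptedException and the lock node gets deleted
                        thread.interrupt();
                    }
                }
                // ...the existing notify-the-waiter behaviour stays as-is
            }
        };

        void beforeWait()
        {
            waitingThread.set(Thread.currentThread());
        }

        void afterWait()
        {
            waitingThread.set(null);
        }
    }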

-Jordan

> On Jul 30, 2021, at 9:11 AM, Jordan Zimmerman <[email protected]> 
> wrote:
> 
> Making the lock code watch the lock node it has created
> 
> I don't think this is useful. Once the lock method returns, the lock code 
> can't notify the caller/user that the lock node has disappeared. We'd need 
> some kind of additional class, a "lock watcher" or something, that client code 
> would have to periodically call to ensure that it still has the lock. tbh - 
> this has been discussed a lot over the years on the ZooKeeper/Curator 
> channels. Just because you have a ZNode doesn't mean it's 100% valid. You 
> should still practice optimistic locking, etc. But maybe we can reduce the 
> edge cases with this lock watcher or some other mechanism.
> 
> If the session ID changes after wait exits, it suggests the lock node should 
> no longer be trusted and should be deleted and re-created
> 
> This is a simple change, but I don't know if it would help much. It would be 
> better to interrupt waiting lockers when the connection is lost. If an 
> internal mechanism forced any locks blocked in wait() to throw an exception, 
> those lock threads would already delete their ZNodes and try again. This 
> might be the best solution. So, before the lock code goes to wait(), it 
> sets a ConnectionStateListener (or something similar) that will interrupt the 
> thread when there are connection problems.
> 
> Leader Latch code is well protected
> 
> Yes - the leader recipes are better at this. Maybe there's an opportunity to 
> merge/change code. We could consider deprecating the lock recipes in favor of 
> these?
> 
> -Jordan
> 
>> On Jul 29, 2021, at 2:35 AM, Viswanathan Rajagopal <[email protected]> wrote:
>> 
>> Hi Jordan,
>>  
>> Thanks for the suggestions. Much helpful.
>>  
>> Quick summary of my comments below:
>>   *   Added a test case simulator to reproduce the issue (though artificially)
>>   *   Added proposed code just to get your review and suggestions on whether 
>> that approach would work
>> Please find my detailed comments inline.
>>  
>> > I think there's generally a hole in Curator's lock recipe. The lock code 
>> > does not watch the lock node it has created. So, another process or (as 
>> > you found) a race with the server might cause the lock node to disappear 
>> > underneath the lock instance after it thinks it has the lock. One thing we 
>> > can do is to check the session ID before waiting in 
>> > LockInternals.internalLockLoop(). If the session ID changes after wait 
>> > exits, it suggests the lock node should no longer be trusted and it should 
>> > be deleted and re-created. That's one thought, anyway.
>>  
>> Yes, you are right. Currently the lock code doesn’t watch the lock node it has 
>> created. That’s where we would like your feedback on the proposal mentioned in 
>> our first mail thread: 
>> “Curator lock code has the makeRevocable() API that enables the application to 
>> revoke the lock at any time by triggering a NODE_CHANGE event through the 
>> Revoker.attemptToRevoke() utility. 
>> Proposal: Would it be nice to extend the makeRevocable() API to handle the Node 
>> delete event, which would allow the application to register a watcher for Node 
>> deletion and react to it by revoking the lock?” 
>> We tried a code snippet 
>> (https://github.com/ViswaNXplore/curator/blob/curator-2.5.0/patch/double_locking_test_simulator/curator-recipes/src/main/java/org/apache/curator/framework/recipes/locks/LockInternals.java#L60) 
>> to see whether the approach “making the lock code watch the lock node it has 
>> created” works, and we would like your thoughts on it - do you see any other 
>> impacts there? We may need your advice.
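For illustration, an application-level take on “watch the lock node we created” 
could look roughly like the sketch below (all names are hypothetical; this is 
not the code from the linked snippet):

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.zookeeper.Watcher;

    // Hypothetical sketch: after acquiring, set an exists-watcher on our own lock
    // node so we can react if it is deleted underneath us.
    public class OwnLockNodeWatcher
    {
        public static void watchOwnLockNode(CuratorFramework client, String ourLockNodePath, Runnable onLockNodeDeleted) throws Exception
        {
            Watcher watcher = event -> {
                if ( event.getType() == Watcher.Event.EventType.NodeDeleted )
                {
                    onLockNodeDeleted.run();    // e.g. revoke/clear the local lock state
                }
            };
            // note: ZooKeeper watchers are one-shot; a real implementation would
            // re-register the watcher after each event
            client.checkExists().usingWatcher(watcher).forPath(ourLockNodePath);
        }
    }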
>>  
>> I agree that the session id approach would protect us from this issue. The 
>> reason I thought your first approach (“making the lock code watch the node it 
>> has created”) could be more protective is that it would guard against any 
>> inconsistency between the original lock node and the local lock state.
>>   
>> >  It would be nice to generate a test/simulation for this case so that it 
>> > can be properly dealt with.
>>  
>> Created a test case simulator 
>> https://github.com/ViswaNXplore/curator/blob/curator-2.5.0/patch/double_locking_test_simulator/curator-recipes/src/test/java/org/apache/curator/framework/recipes/CuratorRecipeTest.java
>>  
>>  
>> Approach #1 (artificial way of reproducing the issue):
>> Run the test case with SIMULATE_ISSUE_BY_EXPLICIT_DELETE set to true. This 
>> deletes the lock node after a pause, just to simulate the state where the 
>> zkClient has reconnected and still happens to see the ephemeral node just 
>> before the server deletes it (its session has expired, but the node is only 
>> deleted afterwards by the server).
>> Approach #2 (a little manual interruption needed to reproduce the issue):
>>   *   Run the test case in debug mode (SIMULATE_ISSUE_BY_EXPLICIT_DELETE set to 
>> false)
>>   *   Artificially delay / pause the deletion of the ephemeral lock nodes during 
>> the session cleanup process in the server code (ZookeeperServer.close() method)
>>   *   After a pause (say 5s) to let one of the instances acquire the lock, 
>> artificially break the socket connection between client and server for 30s 
>> (by keeping a breakpoint in ClientCnxnSocketNIO.doIO()). After 30s, we would 
>> see session closing logs in the server code
>>   *   After 1 min, remove the breakpoint in ClientCnxnSocketNIO.doIO() and resume 
>> both Thread 2 and Thread 3
>>   *   After that, resume the server thread (thread name would be "SessionTracker")
>> Proposals discussed so far (3rd added now for your review):
>>   *   Making the lock code watch the lock node it has created
>>   *   If the session ID changes after wait exits, it suggests the lock node should 
>> no longer be trusted and should be deleted and re-created
>>   *   Leader Latch code is well protected against this zookeeper race condition, 
>> because the Leader Latch code internally handles connection events (which it 
>> uses to interrupt the latch-acquire state and reset its latch every time the 
>> connection is re-established), meaning it will explicitly release the latch and 
>> recreate a new one if there is a connection disconnect (maybe this is the 
>> approach the lock recipe could use to protect itself?)
>> Many Thanks,
>> Viswa.
>>  
>> From: Jordan Zimmerman <[email protected]>
>> Date: Tuesday, 20 July 2021 at 21:11
>> To: Viswanathan Rajagopal <[email protected]>
>> Cc: Sean Gibbons <[email protected]>, [email protected], 
>> [email protected], [email protected], Donal Arundel <[email protected]>, 
>> Francisco Goncalves <[email protected]>, Zak Szadowski <[email protected]>, 
>> Dandie Beloy <[email protected]>, Marcelo Cenerino <[email protected]>
>> Subject: Re: [External Sender] Double Locking Issue
>> 
>> In our case, it wouldn’t throw any exception because it had gone past 
>> “creating lock nodes” and was blocked on wait(), which would only be 
>> interrupted when the Curator watcher was notified of the previous sequence 
>> node’s delete event.
>>  
>> So, you're using the version of acquire() without a timeout? In any event, 
>> this is a problem. When you receive SUSPENDED you really should interrupt 
>> any threads that are waiting on Curator. The Curator docs imply this even 
>> though it might not be obvious. This is likely the source of your problems. 
>> A simple solution is to use the version of acquire that has a timeout and 
>> repeatedly call it until success (though your problem may still occur in 
>> this case). Maybe we could improve the lock recipe (or something similar) so 
>> that locks inside of acquire are interrupted on a network partition. 
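For reference, a minimal sketch of the “acquire with a timeout in a loop” 
workaround (the path and the 5-second timeout are arbitrary values for the 
example):

    import java.util.concurrent.TimeUnit;

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.recipes.locks.InterProcessMutex;

    public class TimedAcquireSketch
    {
        public static void doWithLock(CuratorFramework client, Runnable work) throws Exception
        {
            InterProcessMutex mutex = new InterProcessMutex(client, "/locks/example");
            while ( !mutex.acquire(5, TimeUnit.SECONDS) )
            {
                // timed out without getting the lock; a real application would re-check
                // connection state (SUSPENDED/LOST) here before trying again
            }
            try
            {
                work.run();
            }
            finally
            {
                mutex.release();
            }
        }
    }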
>>  
>> [snip of your remaining thoughtful analysis]
>>  
>> I think there's generally a hole in Curator's lock recipe. The lock code 
>> does not watch the lock node it has created. So, another process or (as you 
>> found) a race with the server might cause the lock node to disappear 
>> underneath the lock instance after it thinks it has the lock. 
>>  
>> One thing we can do is to check the session ID before waiting in 
>> LockInternals.internalLockLoop(). If the session ID changes after wait exits, 
>> it suggests the lock node should no longer be trusted and it should be 
>> deleted and re-created. That's one thought, anyway. It would be nice to 
>> generate a test/simulation for this case so that it can be properly dealt with.
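A minimal illustration of that session-id comparison (method and variable 
names are made up, not LockInternals code):

    import org.apache.curator.framework.CuratorFramework;

    public class SessionCheckSketch
    {
        // compare the session id captured before wait() with the current one
        public static boolean sessionChanged(CuratorFramework client, long sessionIdBeforeWait) throws Exception
        {
            long sessionIdNow = client.getZookeeperClient().getZooKeeper().getSessionId();
            // a changed session id means the lock node was created under the old
            // session and should be deleted and re-created
            return sessionIdNow != sessionIdBeforeWait;
        }
    }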
>>  
>> -Jordan
>> 
>> 
>> On Jul 20, 2021, at 3:53 PM, Viswanathan Rajagopal <[email protected]> wrote:
>>  
>> Thanks Jordan for coming back on this.
>>  
>> Please find my inline comments. Also provided below is some additional version 
>> info that we are using:
>> Curator version – 2.5.0
>> Zookeeper version – 3.5.3
>>  
>> >  I don't see how this is the case. If the client has received a network 
>> > partition it shouldn't consider any locks as being currently held (see 
>> > the Tech Notes I mentioned in my previous email). If there is a partition 
>> > during a call to acquire(), acquire() would throw an exception (once the 
>> > retry policy has expired). BTW, what retry policy are you using?
>>  
>> In our case, it wouldn’t throw any exception because it had gone past 
>> “creating lock nodes” and was blocked on wait(), which would only be 
>> interrupted when the Curator watcher was notified of the previous sequence 
>> node’s delete event.
>>  
>> > This isn't correct. Curator watches the previous node but lock acquisition 
>> > is always based on calling ZK's getChildren, sorting them, and seeing if 
>> > the lock's node is the first node in the sequence. If Ephemeral nodes 
>> > aren't cleaned up it wouldn't be a problem. Any phantom ephemeral nodes 
>> > would sort first and prevent Curator from believing it holds the lock.
>>  
>> Totally agreed on your point about lock acquisition. The problem I was trying 
>> to explain is that the network partition occurs while the client is waiting 
>> for the lock (inside acquire()) but after the acquire code has already written 
>> out the lock node successfully, so the client code will be blocked in 
>> acquire() and will not get a connection exception. If the previous lock node 
>> is then deleted but not this client's own lock node, this client will leave 
>> acquire() thinking it has the lock, while at this point the ZK server has 
>> actually expired the session and is in the process of deleting our client's 
>> lock node.
>>  
>> Laying out the timeline of events below, in case that helps understanding:
>> [ CONNECTED ] :
>>   *   Client A & Client B call acquire(), creating nodes N1 & N2 respectively
>>   *   Client A acquire() -> holds the lock as its node N1 is the first node in 
>> the sequence
>>   *   Client B acquire() -> created node N2, watches the previous sequence node 
>> (blocked on wait())
>>  
>> [ SUSPENDED CONNECTION OCCURS ] :
>>   *   Network partition happens
>>   *   Curator fires SUSPENDED
>>   *   Client A, on receiving the SUSPENDED event, attempts to release the lock
>>   *   Client B, on receiving the SUSPENDED event, does nothing as it is not 
>> holding any lock (still blocked on acquire() -> wait())
>>  
>> [ SESSION EXPIRED ] :
>>   *   Client A attempts to release the lock (with retries)
>>   *   Client B does nothing as it is not holding any lock (and is still blocked 
>> on acquire() -> wait())
>>   *   Server prepares to clean up the previous client sessions and delete the 
>> lock nodes created as part of those sessions
>>  
>> [ RECONNECTED ] :
>>   *   Client A and Client B reconnect to another server with new session ids 
>> (note: this happens before the server has managed to clean up / delete the 
>> ephemeral nodes of the previous client sessions)
>>   *   Client A releases the lock successfully (i.e. it deletes its lock node N1) 
>> and attempts to acquire the lock again by creating lock node N3 and watching 
>> the previous sequence node (N2)
>>   *   Client B, who was blocked on acquire() -> wait(), is notified of the 
>> previous sequence node (N1) deletion -> getChildren, sort them, and see if the 
>> lock's node is the first node in the sequence. So Client B still sees its lock 
>> node N2 (which I call the about-to-be-deleted node) and thereby acquires the lock
>>  
>> [ AFTER A FEW SECONDS ] :
>>   *   Server manages to delete the ephemeral node N2 as part of the previous 
>> client session cleanup
>>   *   Client A, who was blocked on acquire() -> wait(), is notified of the 
>> previous sequence node (N2) deletion -> getChildren, sort them, and see if the 
>> lock's node is the first node in the sequence, and thereby acquires the lock
>>   *   Client B – its local lock thread data has gone stale (as its lock path N2 
>> has now been deleted by the server)
>>  
>> >  SUSPENDED in Curator means only that the connection has been lost, not that 
>> > the session has ended. LOST is the state that means the session has ended.
>> > Be aware of how GCs can affect Curator. See the Tech Note here: 
>> > https://cwiki.apache.org/confluence/display/CURATOR/TN10
>> > Also read this Tech Note on session handling: 
>> > https://cwiki.apache.org/confluence/display/CURATOR/TN14
>>  
>> Very good information in the tech notes. Thanks for that. Agreed, it’s always 
>> recommended to clear the locks on seeing a SUSPENDED event, and we are already 
>> following your recommendation: our application clears the lock when it sees a 
>> SUSPENDED event.
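A rough sketch of what such SUSPENDED handling could look like (names are made 
up; since InterProcessMutex.release() must be called by the thread that 
acquired the lock, the listener interrupts that thread rather than releasing 
directly):

    import java.util.concurrent.atomic.AtomicReference;

    import org.apache.curator.framework.CuratorFramework;
    import org.apache.curator.framework.state.ConnectionState;
    import org.apache.curator.framework.state.ConnectionStateListener;

    public class InterruptOnSuspendListener implements ConnectionStateListener
    {
        // the thread currently holding (or waiting for) the lock, if any
        private final AtomicReference<Thread> lockThread = new AtomicReference<>();

        public void setLockThread(Thread thread)
        {
            lockThread.set(thread);
        }

        @Override
        public void stateChanged(CuratorFramework client, ConnectionState newState)
        {
            if ( (newState == ConnectionState.SUSPENDED) || (newState == ConnectionState.LOST) )
            {
                Thread thread = lockThread.get();
                if ( thread != null )
                {
                    thread.interrupt();   // the interrupted thread releases the mutex itself
                }
            }
        }
    }

Such a listener would be registered via 
client.getConnectionStateListenable().addListener(...).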
>>  
>> To summarize my thoughts:
>>   *   The ephemeral node created by the previous session is still seen by a 
>> client that has reconnected with a new session id, until the server cleans up 
>> that previous-session ephemeral node. This can happen if the client manages to 
>> reconnect to the server with a new session id before the server cleans up the 
>> previous session. How does it affect the Curator lock recipe? The race 
>> condition explained above makes acquire() hold the lock (as its own lock node 
>> is still seen), eventually leading to an inconsistent state (i.e. the Curator 
>> local lock state goes stale) when that lock node is cleaned up by the server 
>> as part of the previous session's cleanup.
>>   *   I am NOT claiming a Curator bug here, but looking for suggestions / 
>> recommendations on handling this zookeeper race condition - either a feature 
>> added to Curator to cover this case, or recommendations that clients should 
>> follow.
>>  
>> Many Thanks,
>> Viswa.
>>  
>> From: Sean Gibbons <[email protected]>
>> Date: Tuesday, 20 July 2021 at 12:19
>> To: Viswanathan Rajagopal <[email protected]>
>> Subject: FW: [External Sender] Re: Double Locking Issue
>> 
>> 
>> 
>> On 20/07/2021, 10:12, "Jordan Zimmerman" <[email protected]> wrote:
>> 
>>     A few more things...
>> 
>>     > Based on our initial analysis and few test runs, we saw that Curator 
>> acquire() method acquires the lock based on “about to be deleted lock node 
>> of previous session”. Explanation : Ephemeral node created by previous 
>> session was  still seen by client that reconnected with new session id until 
>> server cleans that up. If this happens, Curator acquire() would hold the 
>> lock.
>> 
>>     This isn't correct. Curator watches the previous node but lock 
>> acquisition is always based on calling ZK's getChildren, sorting them, and 
>> seeing if the lock's node is the first node in the sequence. If Ephemeral 
>> nodes aren't cleaned up it wouldn't be a problem. Any phantom ephemeral 
>> nodes would sort first and prevent Curator from believing it holds the lock.
>> 
>>     >  *   On the above mentioned race condition, if client manage to 
>> reconnect to server with new session id before server cleans up the 
>> ephemeral nodes of client’s previous session,  Curator lock acquire() who is 
>> trying to acquire the lock will hold the lock as it still sees the lock node 
>> in zookeeper directory. Eventually server would be cleaning up the ephemeral 
>> nodes leaving the Curator local lock thread data stale giving the illusion 
>> that it still hold the lock while its ephemeral node is gone
>> 
>>     I don't see how this is the case. If the client has received a network 
>> partition it shouldn't consider any locks as being currently held (see 
>> the Tech Notes I mentioned in my previous email). If there is a partition 
>> during a call to acquire(), acquire() would throw an exception (once the 
>> retry policy has expired). BTW, what retry policy are you using?
>> 
>>     So, to reiterate, I don't see how phantom/undeleted ephemeral nodes 
>> would cause a problem. The only problem they could cause is that a given 
>> Curator client takes longer to acquire a lock as it waits for those 
>> ephemerals to finally get deleted.
>> 
>>     -JZ
>> 
>>     > On Jul 19, 2021, at 6:45 PM, Viswanathan Rajagopal <[email protected]> wrote:
>>     > 
>>     > Hi Team,
>>     > 
>>     > Good day.
>>     > 
>>     > Recently came across “Double Locking Issue (i.e. two clients acquiring 
>> lock)” using Curator code ( InterProcessMutex lock APIs ) in our application
>>     > 
>>     > Our use case:
>>     > 
>>     >  *   Two clients attempt to acquire the zookeeper lock using Curator 
>> InterProcessMutex, and whoever owns it releases it once it sees a connection 
>> disconnect (on receiving Connection.SUSPENDED / Connection.LOST Curator 
>> connection events from the connection listener)
>>     > 
>>     > Issue we noticed:
>>     > 
>>     >  *   After the session expired & reconnected with a new session, both 
>> clients seem to have acquired the lock. The interesting thing we found is that 
>> one of the clients still holds the lock while its lock node (ephemeral) is gone
>>     > 
>>     > Things we found:
>>     > 
>>     >  *   Based on our initial analysis and few test runs, we saw that 
>> Curator acquire() method acquires the lock based on “about to be deleted 
>> lock node of previous session”. Explanation : Ephemeral node created by 
>> previous session was  still seen by client that reconnected with new session 
>> id until server cleans that up. If this happens, Curator acquire() would 
>> hold the lock.
>>     > 
>>     > 
>>     > 
>>     >  *   Clearly we could see the race condition (in zookeeper code) 
>> between 1). Client reconnecting to server with new session id and 2). server 
>> deleting the ephemeral nodes of client’s previous session. We were able to 
>> reproduce this issue using the following approach,
>>     >     *   Artificially break the socket connection between client and 
>> server for 30s
>>     >     *   Artificially pausing the set of server codes for a min and 
>> unpause
>>     > 
>>     > 
>>     >  *   On the above mentioned race condition, if client manage to 
>> reconnect to server with new session id before server cleans up the 
>> ephemeral nodes of client’s previous session,  Curator lock acquire() who is 
>> trying to acquire the lock will hold the lock as it still sees the lock node 
>> in zookeeper directory. Eventually server would be cleaning up the ephemeral 
>> nodes leaving the Curator local lock thread data stale giving the illusion 
>> that it still hold the lock while its ephemeral node is gone
>>     > 
>>     > 
>>     >  *   Timeline events described below for better understanding,
>>     >     *   At t1, Client A and Client B establishes zookeeper session 
>> with session id A1 and B1 respectively
>>     >     *   At t2, Client A creates the lock node N1 & acquires the lock
>>     >     *   At t3, Client B creates the lock node N2 & blocked on 
>> acquire() to acquire the lock
>>     >     *   At t4, the session timed out for both clients & the server is 
>> about to clean up the old sessions; Client A is trying to release the lock
>>     >     *   At t5, Client A and Client B reconnects to server with new 
>> session id A2 and B2 respectively before server deletes the ephemeral node 
>> N1 & N2 of previous client session. Client A releases the lock, deleting N1 
>> and trying to acquire it again by creating N3 node and Client B who is 
>> blocked on acquire() acquires the lock based on N2 (about to be deleted node 
>> created by previous session)
>>     >     *   At  t6, server cleans up the ephemeral node N2 created by 
>> Client B’s previous session. Client A acquires the lock with node N3 as its 
>> previous sequence N2 gets deleted, whereas Client B who incorrectly acquired 
>> the lock at t5 timeline still holds the lock
>>     > 
>>     > 
>>     > Note:
>>     > 
>>     >  *   We are not certain whether this race condition that we noticed in 
>> the zookeeper code is an intentional design.
>>     > 
>>     > Questions:
>>     > 
>>     >  *   Given this race condition seen in zookeeper code, we would like 
>> to hear your recommendations / suggestions to avoid this issue while using 
>> Curator lock code?
>>     > 
>>     > 
>>     > 
>>     >  *   We also see that InterProcessMutex has the makeRevocable() API that 
>> enables the application to revoke the lock, but that handles the Node change 
>> event only, not the Node deleted event. I understand that it makes sense to 
>> handle the Node change event alone, so as to enable the application user to 
>> revoke the lock externally from the application side. But would it also be 
>> okay to have one for the Node delete event, so that the application can 
>> register a listener for Node deletion? I would like to hear your thoughts.
>>     > 
>>     > 
>>     > Many Thanks,
>>     > Viswa
>> 
> 
