Re: [External Sender] Re: Double Locking Issue

Viswanathan Rajagopal Tue, 20 Jul 2021 07:53:15 -0700

Thanks Jordan for coming back on this.

Please find my inline comments. Also provided below few additional version info 
that we are using,


  *   Curator version – 2.5.0
  *   Zookeeper version – 3.5.3

>  I don't see how this is the case. If the client has received a network 
> partition it shouldn't not consider any locks as being currently held (see 
> the Tech Notes I mentioned in my previous email). If there is a partition 
> during a call to acquire(), acquire() would throw an exception (once the 
> retry policy has expired). BTW, what retry policy are you using?

In our case, it wouldn’t throw any exception because it had gone past “creating 
lock nodes” and was blocked on wait(), which would only then be interrupted 
when curator watcher notified on previous sequence node delete event.

> This isn't correct. Curator watches the previous node but lock acquisition is 
> always based on calling ZK's getChildren, sorting them, and seeing if the 
> lock's node is the first node in the sequence. If Ephemeral nodes aren't 
> cleaned up it wouldn't be a problem. Any phantom ephemeral nodes would sort 
> first and prevent Curator from believing it holds the lock.

Totally agreed on your point on lock acquisition. The problem that I was trying 
to explain is that the network partition occurs while the client is waiting for 
the lock (in the acquire()) but after the acquire code block has already 
written out the lock node successfully, so the client code will be blocking in 
the acquire and not get a connection exception. Once the previous lock node is 
deleted but not the lock node for this client then this client will leave the 
acquire thinking it has the lock but actually at this point the ZK server has 
expired the session and is in the process of deleting our clients lock node.

Trying to categorize the timeline events below to see if that would help us for 
better understanding,
[ CONNECTED ] :

  *   Client A & Client B calls acquire creating Node N1 & N2 respectively
  *   Client A acquire() -> holds the lock as its Node N1 is first node in 
sequence
  *   Client B acquire() -> created Node N2, watches for previous sequence node 
(blocked on wait())

[ SUSPENDED CONNECTION OCCURS ] :

  *   Network partition happens
  *   Curator fires SUSPENDED
  *   Client A, on receiving SUSPENDED event, attempts to release the lock
  *   Client B, on receiving SUSPENDED event, does nothing as it was not 
holding any lock (as blocked on acquire() -> wait() )

[ SESSION EXPIRED ] :

  *   Client A attempts to release the lock (with retries)
  *   Client B does nothing as it was not holding any lock (and is still 
blocked on acquire() -> wait() )
  *   Server prepared to cleanup previous client session and deleting lock 
nodes created as part of previous client session

[ RECONNECTED ] :

  *   Client A and Client B reconnected to another server with new session id 
(Note : This happens before server managed to cleanup / delete ephemeral nodes 
of previous client session)
  *   Client A released the lock successfully (means it would delete its lock 
node N1) and attempts to acquire lock by creating lock node N3 and watches for 
previous sequence node (N2)
  *   Client B who was blocked on acquire() -> wait() would be notified with 
previous sequence node (N1) deletion -> getChildren, sorting them, and seeing 
if the lock's node is the first node in the sequence. So, Client B sees its 
lock node N2 still (which I call it as about to be deleted node by server) and 
thereby acquires the lock

[ AFTER FEW SECONDS ] :

  *   Server managed to delete the ephemeral node N2 as part of previous client 
session cleanup
  *   Client A who was blocked on acquire() -> wait() would be notified with 
previous sequence node (N2) deletion -> getChildren, sorting them, and seeing 
if the lock's node is the first node in the sequence and thereby acquires the 
lock
  *   Client B – its local lock thread data went stale (as its lock path N2 not 
has been deleted by Server)


>  SUSPENDED in Curator means only that the connect has been lost, not that the 
> session has ended.
LOST is the state that means the session has ended
Be aware of how GCs can affect Curator. See the Tech Note here: 
https://cwiki.apache.org/confluence/display/CURATOR/TN10
<https://cwiki.apache.org/confluence/display/CURATOR/TN10>
Also read this Tech Note on session handling: 
https://cwiki.apache.org/confluence/display/CURATOR/TN14
<https://cwiki.apache.org/confluence/display/CURATOR/TN14>

Very good information on the tech notes. Thanks for that. Agreed, it’s always 
recommended to clear the locks when it sees SUSPENDED event. And we are already 
following your recommendation. Our application would clear the lock when it 
sees SUSPENDED event.

To summarize my thoughts,

  *   Ephemeral node created by previous session was  still seen by client that 
reconnected with new session id until server cleans that up the previous 
session ephemeral node. This could happen if client manage to reconnect with 
server with new session id before server cleans up the previous session. How it 
affects the curator lock recipe ? Explanation : The above explained race 
condition would make acquire() to hold the lock ( as its own lock node still 
seen ), eventually leading to inconsistent state (i.e. curator local lock state 
stale) when that lock node is being cleaned up by the server as part of 
previous session cleanup activities.
  *   I am NOT seeing a Curator bug here, but looking out for suggestions / 
recommendations in handling this zookeeper race condition. Either can a feature 
be added to Curator to cover this case / any recommendations that clients 
should follow.


Many Thanks,
Viswa.

From: Sean Gibbons <[email protected]>
Date: Tuesday, 20 July 2021 at 12:19
To: Viswanathan Rajagopal <[email protected]>
Subject: FW: [External Sender] Re: Double Locking Issue


On 20/07/2021, 10:12, "Jordan Zimmerman" <[email protected]> wrote:

    A few more things...

    > Based on our initial analysis and few test runs, we saw that Curator 
acquire() method acquires the lock based on “about to be deleted lock node of 
previous session”. Explanation : Ephemeral node created by previous session was 
 still seen by client that reconnected with new session id until server cleans 
that up. If this happens, Curator acquire() would hold the lock.

    This isn't correct. Curator watches the previous node but lock acquisition 
is always based on calling ZK's getChildren, sorting them, and seeing if the 
lock's node is the first node in the sequence. If Ephemeral nodes aren't 
cleaned up it wouldn't be a problem. Any phantom ephemeral nodes would sort 
first and prevent Curator from believing it holds the lock.

    >  *   On the above mentioned race condition, if client manage to reconnect 
to server with new session id before server cleans up the ephemeral nodes of 
client’s previous session,  Curator lock acquire() who is trying to acquire the 
lock will hold the lock as it still sees the lock node in zookeeper directory. 
Eventually server would be cleaning up the ephemeral nodes leaving the Curator 
local lock thread data stale giving the illusion that it still hold the lock 
while its ephemeral node is gone

    I don't see how this is the case. If the client has received a network 
partition it shouldn't not consider any locks as being currently held (see the 
Tech Notes I mentioned in my previous email). If there is a partition during a 
call to acquire(), acquire() would throw an exception (once the retry policy 
has expired). BTW, what retry policy are you using?

    So, to reiterate, I don't see how phantom/undeleted ephemeral nodes would 
cause a problem. The only problem it could case is that a given Curator client 
takes longer to acquire a lock as it waits for those ephemerals to finally get 
deleted.

    -JZ

    > On Jul 19, 2021, at 6:45 PM, Viswanathan Rajagopal 
<[email protected]> wrote:
    >
    > Hi Team,
    >
    > Good day.
    >
    > Recently came across “Double Locking Issue (i.e. two clients acquiring 
lock)” using Curator code ( InterProcessMutex lock APIs ) in our application
    >
    > Our use case:
    >
    >  *   Two clients attempts to acquire the zookeeper lock using Curator 
InterProcessMutex and whoever owns it would release it once sees the connection 
disconnect ( on receiving Connection.SUSPENDED / Connection.LOST Curator 
Connection Events from Connection listener)
    >
    > Issue we noticed:
    >
    >  *   After session expired & reconnected with new session, both client 
seems to have acquired the lock. Interesting thing that we found is that one of 
the clients still holds the lock while its lock node (ephemeral) was gone
    >
    > Things we found:
    >
    >  *   Based on our initial analysis and few test runs, we saw that Curator 
acquire() method acquires the lock based on “about to be deleted lock node of 
previous session”. Explanation : Ephemeral node created by previous session was 
 still seen by client that reconnected with new session id until server cleans 
that up. If this happens, Curator acquire() would hold the lock.
    >
    >
    >
    >  *   Clearly we could see the race condition (in zookeeper code) between 
1). Client reconnecting to server with new session id and 2). server deleting 
the ephemeral nodes of client’s previous session. We were able to reproduce 
this issue using the following approach,
    >     *   Artificially break the socket connection between client and 
server for 30s
    >     *   Artificially pausing the set of server codes for a min and unpause
    >
    >
    >  *   On the above mentioned race condition, if client manage to reconnect 
to server with new session id before server cleans up the ephemeral nodes of 
client’s previous session,  Curator lock acquire() who is trying to acquire the 
lock will hold the lock as it still sees the lock node in zookeeper directory. 
Eventually server would be cleaning up the ephemeral nodes leaving the Curator 
local lock thread data stale giving the illusion that it still hold the lock 
while its ephemeral node is gone
    >
    >
    >  *   Timeline events described below for better understanding,
    >     *   At t1, Client A and Client B establishes zookeeper session with 
session id A1 and B1 respectively
    >     *   At t2, Client A creates the lock node N1 & acquires the lock
    >     *   At t3, Client B creates the lock node N2 & blocked on acquire() 
to acquire the lock
    >     *   At t4, session timed out for both clients & server is about to 
clean up the old session • Client A trying to release the lock
    >     *   At t5, Client A and Client B reconnects to server with new 
session id A2 and B2 respectively before server deletes the ephemeral node N1 & 
N2 of previous client session. Client A releases the lock, deleting N1 and 
trying to acquire it again by creating N3 node and Client B who is blocked on 
acquire() acquires the lock based on N2 (about to be deleted node created by 
previous session)
    >     *   At  t6, server cleans up the ephemeral node N2 created by Client 
B’s previous session. Client A acquires the lock with node N3 as its previous 
sequence N2 gets deleted, whereas Client B who incorrectly acquired the lock at 
t5 timeline still holds the lock
    >
    >
    > Note:
    >
    >  *   We are not certain that this race condition that we noticed in 
zookeeper code is intentional design.
    >
    > Questions:
    >
    >  *   Given this race condition seen in zookeeper code, we would like to 
hear your recommendations / suggestions to avoid this issue while using Curator 
lock code?
    >
    >
    >
    >  *   We also see that Interprocess Mutex has the makeRevocable() API that 
enables application to revoke lock, but that handles Node change event only but 
not for Node deleted event. I understand that this makes sense to handle Node 
change event alone, as to enable application user to revoke lock externally 
from application side. But would it be also to okay to have one for Node delete 
event, so as application can register the listener for Node delete event. I 
would like to hear your thoughts.
    >
    >
    > Many Thanks,
    > Viswa

Re: [External Sender] Re: Double Locking Issue

Reply via email to