That really isn't the problem.

First, the statement that "no two machines ..." is probably a bit strong.  I
would say that if two clients call lock() at the same time, then only one
will proceed.  Locks can be lost subsequently.  When this happens, Client1
should normally get two notifications: first connection loss, then session
expiration.  The client should treat connection loss as a signal to stop
acting as if it has the lock until either the connection is reestablished or
a session expiration event is received.  Given that some time has to pass
between the connection loss and the session expiration, you should be safe
if you stop acting as if you hold the lock after connection loss.  GC can
change timings a lot, of course.
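
As a rough sketch of that rule (not a definitive implementation), the
bookkeeping could look something like the following.  The class name, the
LockState values and the local SessionEvent enum are all made up for
illustration; SessionEvent only stands in for the Disconnected, SyncConnected
and Expired states that a real ZooKeeper Watcher would report, so the sketch
runs without the ZooKeeper jar.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical helper: tracks whether it is safe to act as the lock holder.
class LockStateTracker {
    // Assumption: a local stand-in for the connection states a ZooKeeper
    // Watcher would see (Disconnected, SyncConnected, Expired).
    enum SessionEvent { DISCONNECTED, SYNC_CONNECTED, EXPIRED }

    enum LockState { HELD, SUSPENDED, LOST }

    // Starts in HELD: the tracker is created right after lock() succeeds.
    private final AtomicReference<LockState> state =
            new AtomicReference<>(LockState.HELD);

    // Call this from the watcher callback for every connection event.
    void onSessionEvent(SessionEvent event) {
        switch (event) {
            case DISCONNECTED:
                // Connection loss: stop acting as the holder, but the
                // session (and so the lock) may still be alive.
                state.compareAndSet(LockState.HELD, LockState.SUSPENDED);
                break;
            case SYNC_CONNECTED:
                // Reconnected within the same session: the lock is still ours.
                state.compareAndSet(LockState.SUSPENDED, LockState.HELD);
                break;
            case EXPIRED:
                // Session expired: the lock is gone for good.
                state.set(LockState.LOST);
                break;
        }
    }

    // Workers check this before every step that relies on the lock.
    boolean maySafelyAct() {
        return state.get() == LockState.HELD;
    }

    LockState currentState() {
        return state.get();
    }
}
```

Once the state is LOST it never goes back to HELD, so a late SyncConnected
from a brand new session cannot resurrect the old lock; the client has to
call lock() again.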

Codewise, you are correct.  All you can do is interrupt a thread holding a
lock or trust it to check often that it still holds the lock.  Interrupting
is safest, of course, but not very convenient.  This is definitely not as
nice as a synchronized block, but I don't think that you can make a
synchronized block work correctly in a distributed setting subject to time
skew, node failures and partitions.
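
To make the interrupt option a bit more concrete, here is a minimal sketch
under some assumptions: the class and method names are hypothetical, the loop
of sleeps is just a placeholder for the real doWork(), and onLockLost() is
whatever your connection-loss handling ends up calling.  It only shows the
plain Java interruption mechanics, nothing ZooKeeper-specific.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical wrapper around the thread that works while holding the lock.
class InterruptibleLockHolder {
    private final AtomicBoolean completed = new AtomicBoolean(false);
    private final Thread worker;

    InterruptibleLockHolder() {
        worker = new Thread(() -> {
            try {
                // Placeholder critical section: many short blocking steps.
                for (int step = 0; step < 1000; step++) {
                    Thread.sleep(10); // throws InterruptedException on interrupt
                }
                completed.set(true);
            } catch (InterruptedException e) {
                // Lock was lost: abandon the work without finishing it.
            }
        });
    }

    void start() {
        worker.start();
    }

    // Assumed to be called when connection loss or expiration is observed.
    void onLockLost() {
        worker.interrupt();
    }

    // Waits up to the given time for the worker to stop; true if it stopped.
    boolean awaitStop(long millis) {
        try {
            worker.join(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
        return !worker.isAlive();
    }

    boolean completedWork() {
        return completed.get();
    }
}
```

Note that interruption only takes effect promptly if the work blocks on
interruptible calls or polls Thread.currentThread().isInterrupted() itself,
which is the "check often" option in another form.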



On Wed, Jul 20, 2011 at 2:07 PM, Yang <[email protected]> wrote:

> could be the limitation that I discussed a few days ago:
>
> http://zookeeper-user.578899.n2.nabble.com/help-on-Zookeeper-code-walk-through-tp6589163p6595442.html
>
> On Wed, Jul 20, 2011 at 2:03 PM, Will Johnson
> <[email protected]> wrote:
> > The Lock recipe has an overview description of "Fully distributed locks
> > that are globally synchronous, meaning at any snapshot in time no two
> > clients think they hold the same lock."  We've implemented this pattern
> > but we've run into an issue handling zookeeper errors that seem to
> > violate the semantics of 'no two clients think they have the lock.'  for
> > example:
> >
> > Thread1.Client1.lock();
> > Thread2.Client2.lock();
> >
> > // client1 gets the lock so he starts some work
> > Thread1.client1.doWork();
> >
> > // but now i get a session timeout
> > // in the worst case it's because the doWork() method caused a full GC
> > // that took > sessionTimeout
> > // my client then has to reconnect with a new session ID
> > Thread1.client1.reconnect();
> >
> > But now my question is, how have people handled this case to notify
> > Thread1.client1 that he is no longer holding the lock?  Without a lot of
> > pedantic calls to Thread1.client1.doIStillHaveTheLock() inside the
> > doWork() method it seems like 2 clients both think they have the lock.
> > Even if you make repeated calls to check the state of your lock you still
> > have small windows of time where 2 clients are in the lock.  i could
> > interrupt Thread1 when reconnecting but if you're using the lock for
> > multithreaded synchronization that won't help.
> >
> > I realize the limitations of zookeeper in this case but i also hope
> > someone else has solved this problem intelligently before.
> >
> > - will
> >
>
