Thanks Ted,

> And in general, you can't have precise distributed lock control. There
> will always be a bit of slop.

Yes, I agree with you.

> So decide which penalty is easier to pay. Do you want "at-most-one" or
> "at-least-one" or something in between? You can't have "exactly-one" and
> still deal with expected problems like partition or node failure.

Yes again, I feel the same way.

IMHO, a lock (a basic lock, not an R/W lock) should be exclusive by nature.

*If* there really is such a flaw in the recipe, then IMHO they should not
claim "at any snapshot in time no two clients think they hold the same lock",
at least not without some notes; it is ... misleading.

On Tue, Jan 15, 2013 at 12:05 AM, Ted Dunning <ted.dunn...@gmail.com> wrote:

> Yes.
>
> And in general, you can't have precise distributed lock control. There
> will always be a bit of slop.
>
> So decide which penalty is easier to pay. Do you want "at-most-one" or
> "at-least-one" or something in between? You can't have "exactly-one" and
> still deal with expected problems like partition or node failure.
>
>
> On Mon, Jan 14, 2013 at 7:38 AM, Vitalii Tymchyshyn <tiv...@gmail.com> wrote:
>
>> There are two events: disconnected and session expired. The ephemeral nodes
>> are removed after the second one. The client receives both. So to
>> implement an "at most one lock holder" scheme, the client owning the lock
>> must assume it has lost lock ownership as soon as it has received the
>> disconnected event. So, there is a period of time between disconnect and
>> session expired when no one should have the lock. It's "safety" time to
>> accommodate time shifts, network latencies, and the lock ownership recheck
>> interval (in case the client can't stop using the resource immediately and
>> simply checks regularly whether it still holds the lock).
>>
>>
>> 2013/1/14 Hulunbier <hulunb...@gmail.com>
>>
>> > Hi Vitalii,
>> >
>> > > I don't see why clock must be in sync.
>> >
>> > I don't see any reason to precisely sync the clocks either (but if we
>> > could ... that would be wonderful.).
>> >
>> > By *some constraints of clock drift*, I mean:
>> >
>> > "Every node has a clock, and all clocks increase at the same rate"
>> > or
>> > "the server’s clock advances no faster than a known constant factor
>> > faster than the client’s."
>> >
>> > > Also note the difference between disconnected and session
>> > > expired events. This time difference is when client knows "something's
>> > > wrong", but another client did not get a lock yet.
>> >
>> > Sorry, but I failed to get your idea; would you please give me
>> > some further explanation?
>> >
>> >
>> > On Mon, Jan 14, 2013 at 6:37 PM, Vitalii Tymchyshyn <tiv...@gmail.com>
>> > wrote:
>> > > I don't see why clock must be in sync. They are counting time periods
>> > > (timeouts). Also note the difference between disconnected and session
>> > > expired events. This time difference is when client knows "something's
>> > > wrong", but another client did not get a lock yet. You will have problems
>> > > if the client can't react (and release resources) between these two events.
>> > >
>> > > Best regards, Vitalii Tymchyshyn
>> > >
>> > >
>> > > 2013/1/13 Hulunbier <hulunb...@gmail.com>
>> > >
>> > >> Thanks Jordan,
>> > >>
>> > >> > Assuming the clocks are in sync between all participants…
>> > >>
>> > >> imho, perfect clock synchronization in a distributed system is very
>> > >> hard (if it can be done at all).
>> > >>
>> > >> > Someone with better understanding of ZK internals can correct me, but
>> > >> > this is my understanding.
>> > >>
>> > >> I think I might have missed some very important and subtle (or
>> > >> obvious?) points of the recipe / ZK protocol.
>> > >>
>> > >> I just cannot believe that there could be such a flaw in the
>> > >> lock-recipe, for such a long time, without anybody having pointed it out.
>> > >>
>> > >> On Sun, Jan 13, 2013 at 9:31 AM, Jordan Zimmerman
>> > >> <jor...@jordanzimmerman.com> wrote:
>> > >> > On Jan 12, 2013, at 2:30 AM, Hulunbier <hulunb...@gmail.com> wrote:
>> > >> >
>> > >> >> Suppose the network link between client1 and the server is of very low
>> > >> >> quality (high packet loss rate?) but still fully functional.
>> > >> >>
>> > >> >> Client1 may be happily sending heart-beat messages to the server without
>> > >> >> noticing anything; but the ZK server could be unable to receive
>> > >> >> heart-beat messages from client1 for a long period of time, which
>> > >> >> leads the ZK server to time out client1's session and delete the
>> > >> >> ephemeral node
>> > >> >
>> > >> > I believe the heartbeats go both ways. Thus, if the client doesn't hear
>> > >> > from the server it will post a Disconnected event.
>> > >> >
>> > >> >> But I still feel that, no matter how well a ZK application behaves,
>> > >> >> if we use an ephemeral node in the lock-recipe, we cannot guarantee "at
>> > >> >> any snapshot in time no two clients think they hold the same lock",
>> > >> >> which is the fundamental requirement/constraint for a lock.
>> > >> >
>> > >> > Assuming the clocks are in sync between all participants… The server and
>> > >> > the client that holds the lock should determine that there is a
>> > >> > disconnection at nearly the same time. I imagine that there is a certain
>> > >> > amount of overlap (a few milliseconds) here. But, the next client
>> > >> > wouldn't get the notification immediately anyway. Further, when the next
>> > >> > client gets the notification, it still needs to execute a getChildren()
>> > >> > command, process the results, etc. before it can determine that it has the
>> > >> > lock. That two clients would think they have the lock at the same time is a
>> > >> > vanishingly small possibility. Even if it did happen it would only be for a
>> > >> > few milliseconds at most.
>> > >> >
>> > >> > Someone with better understanding of ZK internals can correct me, but
>> > >> > this is my understanding.
>> > >> >
>> > >> > -Jordan
>> > >>
>> > >
>> > > --
>> > > Best regards,
>> > > Vitalii Tymchyshyn
>> >
>>
>> --
>> Best regards,
>> Vitalii Tymchyshyn
>>
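
To make the disconnected / session expired distinction from Vitalii's message concrete, here is a minimal sketch in Python of a client-side guard for the "at most one lock holder" scheme. It is only a model under stated assumptions: `SessionEvent`, `LockOwnershipGuard` and its methods are hypothetical names, not part of the ZooKeeper API; only the event names are borrowed from ZooKeeper's connection states.

```python
from enum import Enum


class SessionEvent(Enum):
    # Local stand-in; the values only mirror ZooKeeper's connection state names.
    SYNC_CONNECTED = "SyncConnected"
    DISCONNECTED = "Disconnected"
    EXPIRED = "Expired"


class LockOwnershipGuard:
    """Hypothetical helper: tracks whether this client may still act as lock holder."""

    def __init__(self):
        self.acquired = False   # True once the lock recipe says we own the lock
        self.suspended = False  # True after Disconnected, until reconnect or expiry

    def on_lock_acquired(self):
        self.acquired = True
        self.suspended = False

    def on_event(self, event):
        if event is SessionEvent.DISCONNECTED:
            # We can no longer prove the session is alive: stop using the resource.
            self.suspended = True
        elif event is SessionEvent.SYNC_CONNECTED:
            # Reconnected before expiry: the session survived, so the
            # ephemeral node should still exist.
            self.suspended = False
        elif event is SessionEvent.EXPIRED:
            # Session is gone and the ephemeral node is deleted:
            # the lock has to be acquired again from scratch.
            self.acquired = False
            self.suspended = False

    def may_use_resource(self):
        return self.acquired and not self.suspended
```

The guard gives up the resource as soon as Disconnected arrives, so the window between Disconnected and Expired is the "safety" time in which no one should hold the lock. It does not help if the client cannot react between the two events.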