Ted,

Sorry to trouble you on this one. I do understand the difference, but at some point I did not. :)
Your question inspired me to look deeper at our code (to see if we were confused), and I found one case that was triggering our reconnect response from a Disconnected event. Everywhere else we only do this in response to a SessionExpiredException.

Thanks for the quick response and your work on ZooKeeper in general! I have also run into the "can't create ephemeral yet" case, and our code generally loops until successful.

-Martin

> -----Original Message-----
> From: Ted Dunning [mailto:[email protected]]
>
> Martin,
>
> From your email, it sounds like there might be a bit of confusion between
> disconnection and session expiration. Are you sure you are clear on the
> difference between these?
>
> Also, I have seen cases in my own code where I confused myself by trying to
> re-create ephemeral files after a client program crashed. I knew that the
> client had crashed as soon as it happened, but the Zookeeper servers could
> only determine this after a bit of time. My new program tried to recreate the
> ephemerals to indicate that it was back but since the old ephemerals were
> still there, that failed. Then a short time later when the ZK cluster
> understood that the old client was gone, the ephemerals disappeared even
> though the new client was humming along nicely. My solution was to delete
> the ephemerals when creating them.
>
> Is it possible you have a similar confusion?
>
> On Tue, Sep 13, 2011 at 11:25 AM, Martin Serrano <[email protected]>
> wrote:
>
> > Hi,
> >
> > We have added code to our application to reconnect and re-establish
> > watches when we receive a Disconnected event. I am running tests on a
> > heavily loaded system where the zookeeper server and clients are all
> > impacted. On this test system we regularly experience session
> > timeouts and appropriately react to reconnect and set up our watches.
> > There is an uncommon case that I am having trouble puzzling out. When
> > running one of our tests in a loop about 1% of the time we hit a case
> > where on the client side we think the session has expired but on the
> > server side it has been renewed. We will then fail to be able to create
> > an ephemeral node because it already exists and does not ever get
> > cleaned up (since the previous session is still valid). I'm trying to
> > figure out if we are misusing the API or if we have encountered a bug.
> > I'm happy to provide more details. One thing I am wondering is if it is
> > inappropriate to create a new session within the event thread of
> > another session which has received the disconnected event.
> >
> > Thanks,
> > Martin Serrano
> > ...
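
For anyone reading this thread later, the distinction discussed above (a Disconnected event versus session expiration, plus looping until an ephemeral node can be created) can be sketched roughly as follows. This is only an illustration, not the code from our application: `SessionEventPolicy`, `State`, `Action`, `onStateChange` and `createWithRetry` are hypothetical names, and the `State` enum is a local stand-in for the ZooKeeper client's connection states so that the sketch compiles without the ZooKeeper jar on the classpath.

```java
import java.util.function.BooleanSupplier;

final class SessionEventPolicy {

    /** Local stand-in for the client's connection states (assumption, not the real ZooKeeper enum). */
    public enum State { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

    /** What the application should do in response to a state change. */
    public enum Action { NONE, WAIT_FOR_RECONNECT, CREATE_NEW_SESSION }

    /**
     * Disconnected: the session may still be alive on the server, so keep the
     * existing handle and wait for the client library to reconnect.
     * Expired: the session is gone, so a new session (and new ephemerals and
     * watches) is required.
     */
    public static Action onStateChange(State state) {
        switch (state) {
            case DISCONNECTED:
                return Action.WAIT_FOR_RECONNECT;
            case EXPIRED:
                return Action.CREATE_NEW_SESSION;
            default:
                return Action.NONE;
        }
    }

    /**
     * Retries an ephemeral-create attempt until it succeeds or maxAttempts is
     * reached. The attempt returns false while the old ephemeral still exists.
     * Returns the number of attempts used, or -1 if it never succeeded.
     */
    public static int createWithRetry(BooleanSupplier attempt, int maxAttempts, long sleepMillis) {
        for (int i = 1; i <= maxAttempts; i++) {
            if (attempt.getAsBoolean()) {
                return i;
            }
            if (i < maxAttempts) {
                try {
                    Thread.sleep(sleepMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                    return -1;
                }
            }
        }
        return -1;
    }
}
```

In a real client the `attempt` passed to `createWithRetry` would wrap the actual create call and report failure when the node already exists. Ted's alternative of deleting the stale ephemeral before creating it would replace the retry loop rather than sit inside it.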
