Okay, I can buy that it's not in keeping with the ZooKeeper design philosophy to use timeouts in the way that I am describing. I'm guessing this is so that it avoids situations where clients preemptively time themselves out and leave sessions hanging?
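To make the idea concrete, here is a rough sketch of the kind of client-side timeout I have in mind. It is not ZooKeeper API code: `LeaderGuard` and its `on_*` methods are hypothetical names, and I am assuming the application calls them from whatever connection-state callback its client library provides. The point is only that the client stops acting as leader once it has been disconnected for the full session timeout, without waiting to be told it has expired.

```python
import time


class LeaderGuard:
    """Hypothetical helper: decides locally whether we may still act as leader."""

    def __init__(self, session_timeout, clock=time.monotonic):
        self.session_timeout = session_timeout  # seconds, assumed equal to the ZK session timeout
        self.clock = clock                      # injectable so the logic can be tested
        self.is_leader = False
        self.disconnected_at = None

    def on_elected(self):
        self.is_leader = True

    def on_disconnected(self):
        # Remember only the first moment we noticed the disconnect.
        if self.disconnected_at is None:
            self.disconnected_at = self.clock()

    def on_reconnected(self):
        # The session survived; clear the timer but do not restore lost leadership.
        self.disconnected_at = None

    def on_expired(self):
        # The server has told us the session is gone: this is a fact, not a guess.
        self.is_leader = False
        self.disconnected_at = None

    def may_act_as_leader(self):
        if not self.is_leader:
            return False
        if self.disconnected_at is None:
            return True
        if self.clock() - self.disconnected_at >= self.session_timeout:
            # A guess: the server has probably expired us by now, so step down.
            self.is_leader = False
            return False
        return True
```

This is still only probabilistic. The server starts counting from the last heartbeat it received, which is earlier than the moment the client notices the disconnect, so a real implementation would presumably want to step down well before the full timeout has elapsed.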
I guess my objection would be that the API is making a promise that it can only deliver part of the time. If the client can't reconnect to ZooKeeper, then the client is never told that its session has expired, which is an unusual state to find oneself in. In leader-election systems like mine, this could result in two processes acting as leader while ZooKeeper insists that there is only one. This kind of split-brain scenario seems unavoidable in the absence of probabilistic failure checking (like timeouts).

The FAQ, I've noticed, does mention this phenomenon. Perhaps it should also explain the why and not just the mechanics. Otherwise, developers such as myself might find themselves unduly confused by it :)

Thanks for all your help,
Scott

On Thu, Apr 21, 2011 at 11:51 PM, Ted Dunning <[email protected]> wrote:
> I like philosophical design points. This is a good one.
>
> On Thu, Apr 21, 2011 at 5:46 PM, Benjamin Reed <[email protected]> wrote:
>
> > i think the perspective to have is that zookeeper tries to deal with
> > facts, and when it doesn't have the facts, it tells you so.
