FYI 

Curator now has a staged connection notification mechanism for dealing with 
issues like this. When the Curator-managed connection receives a Disconnect, it 
notifies listeners that the connection is SUSPENDED. If the connection can be 
re-established (via a background sync() using the current retry policy), the 
listeners receive RECONNECTED; otherwise they receive LOST. Thus, users of the 
Curator LeaderSelector can tell whether they should pause their leader activity 
or stop it altogether.
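
A minimal sketch of how a listener might map the three states to leader 
behavior. It does not use the Curator API: ConnState, LeaderMode and LeaderGate 
are hypothetical stand-ins invented for illustration, and only the three states 
named above are modeled. Treating LOST as final until a new election is an 
assumption of this sketch.

```java
// Hypothetical stand-in for the connection states named in this message.
enum ConnState { SUSPENDED, RECONNECTED, LOST }

// What the leader is currently allowed to do.
enum LeaderMode { ACTIVE, PAUSED, STOPPED }

// Hypothetical helper (not part of Curator): maps connection events to leader behavior.
final class LeaderGate {
    private LeaderMode mode = LeaderMode.ACTIVE;

    synchronized void onStateChange(ConnState state) {
        if (mode == LeaderMode.STOPPED) {
            return; // assumption: once leadership is given up, only a new election restores it
        }
        switch (state) {
            case SUSPENDED:
                mode = LeaderMode.PAUSED;   // hold off on leader work, keep data structures loaded
                break;
            case RECONNECTED:
                mode = LeaderMode.ACTIVE;   // connection came back, resume leader work
                break;
            case LOST:
                mode = LeaderMode.STOPPED;  // retries exhausted, give up leadership
                break;
        }
    }

    synchronized LeaderMode mode() {
        return mode;
    }
}
```

In a real application the onStateChange call would be driven from whatever 
connection listener the client library provides.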

-JZ
________________________________________
From: Ted Dunning [[email protected]]
Sent: Monday, November 14, 2011 6:24 PM
To: [email protected]
Subject: Re: Missing session state handling in most Leader Election 
implementations

On Mon, Nov 14, 2011 at 2:41 PM, Jordan Zimmerman <[email protected]> wrote:

> It turns out that this is tricky to solve. When the server you're
> connected to goes down, you get a Watcher.Event.KeeperState.Disconnected.
> However, it could be that you are able to reconnect to another server so
> the disconnected event should be ignored.


The event should not be ignored.  The master should pause in being a
master, but not unload any major data structures.  If it reconnects
instantly, then it should continue as if nothing had happened.  You can
also have a time limit for how long you wait before you decide to pause
operation as master.  As you increase that time, you increase the
probability of two masters existing at the same time.  If the reconnect
happens before the timeout, you don't need to bother the master.
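
The time limit described above could be sketched roughly as follows. This is 
an illustration only, not code from ZooKeeper or Curator: the GracePeriod class 
and its method names are hypothetical, and timestamps are passed in explicitly 
so the logic stays deterministic.

```java
// Hypothetical helper: decides whether a disconnected master must pause,
// given a grace period chosen by the application.
final class GracePeriod {
    private final long graceMillis;
    private long disconnectedAt = -1; // -1 means currently connected

    GracePeriod(long graceMillis) {
        this.graceMillis = graceMillis;
    }

    // Record the first Disconnected event; repeated events keep the original time.
    void onDisconnected(long nowMillis) {
        if (disconnectedAt < 0) {
            disconnectedAt = nowMillis;
        }
    }

    // A reconnect clears the pending deadline.
    void onReconnected() {
        disconnectedAt = -1;
    }

    // True once the disconnect has lasted at least graceMillis.
    boolean mustPause(long nowMillis) {
        return disconnectedAt >= 0 && nowMillis - disconnectedAt >= graceMillis;
    }
}
```

A larger graceMillis means fewer needless pauses but a longer window in which 
two masters could be active at the same time.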
