On Thu, Feb 4, 2010 at 2:20 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:
> There's no way to "hand over" responsibility for an ephemeral znode, right?

Right.

> We have solr nodes create ephemeral znodes (name based on host and port).
> The ephemeral znode takes some time to remove of course, so what
> happens is that if I bounce a solr server (containing a zk client) the
> ephemeral node will still exist when the server comes back up.

This problem comes up with any system that has hysteresis and needs a single
point of control.

> What's the best way to handle this situation? Delete and re-create?
> Watch it and re-create when it does disappear?

I think you need to handle the problem of multiple search nodes coming up on
the same machine, possibly because the old one may have hung up. So... I
would recommend

a) if the ephemeral still exists, wait for a few more seconds to see if it
   disappears (20?)

b) if it goes away, create a new one and continue as normal

c) if it doesn't go away take additional action to determine if service is
   still running (i.e. panic and run in circles).
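
For what it's worth, here is a rough sketch of steps a) through c) in Python. It is not written against the real ZooKeeper client API: `ZNodeStore`, `create_ephemeral`, `FakeStore`, `register` and the `old_service_alive` probe are all made-up names for illustration, and the 20 second grace period is just the guess from above. The only assumption about the real system is that creating a znode is atomic and fails if the node already exists, so the sketch simply retries the create instead of checking for existence first.

```python
import time
from typing import Callable, Protocol


class ZNodeStore(Protocol):
    """Hypothetical minimal client interface (an assumption, not a real API)."""

    def create_ephemeral(self, path: str, data: bytes) -> bool:
        """Atomically create the node; return False if it already exists."""
        ...


class FakeStore:
    """In-memory stand-in: the old node lingers for `stale_attempts` tries."""

    def __init__(self, stale_attempts: int = 0) -> None:
        self.stale_attempts = stale_attempts
        self.nodes = {}

    def create_ephemeral(self, path: str, data: bytes) -> bool:
        if self.stale_attempts > 0:  # previous session has not expired yet
            self.stale_attempts -= 1
            return False
        if path in self.nodes:  # somebody else holds the node
            return False
        self.nodes[path] = data
        return True


def register(
    store: ZNodeStore,
    path: str,
    data: bytes,
    old_service_alive: Callable[[], bool],
    grace_seconds: float = 20.0,
    poll_seconds: float = 0.5,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Return "registered", "conflict", or "stale"."""
    deadline = clock() + grace_seconds
    while True:
        # (b) as soon as the old node is gone, the create succeeds
        if store.create_ephemeral(path, data):
            return "registered"
        # (a) otherwise keep waiting until the grace period runs out
        if clock() >= deadline:
            break
        sleep(poll_seconds)
    # (c) the node outlived the grace period: ask whether the old
    # service is really still running and let the caller decide
    return "conflict" if old_service_alive() else "stale"
```

Something like `register(FakeStore(stale_attempts=3), "/live_nodes/host:8983", b"", lambda: False)` would come back with "registered" after a few polls (the path is invented for the example). With a real client the polling loop could be replaced by a watch on the node, which is the "watch it and re-create" option from the question; the timeout and the fallback in c) stay the same either way.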