[ https://issues.apache.org/jira/browse/ZOOKEEPER-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13049725#comment-13049725 ]
Hari A V commented on ZOOKEEPER-1080:
-------------------------------------

Hi Sameer,

How about handling the "Disconnected" and "Expired" events from ZooKeeper? In this case no exception is propagated from the ZooKeeper server; instead, the client is notified through its watcher with KeeperState.Disconnected (0).

Please consider the following case:
Say Process1 is the Leader and Process2 is in the Ready state. Now the network of Process1 goes down [Disconnected event] for more than the session timeout period. Process2 then receives the NodeDeleted event and becomes Active. Process1, still cut off, has not stepped down, so in the end both Process1 and Process2 are in the Active state.
[Multiple Active processes will lead to inconsistencies if we use this framework to provide HA for the NameNode.]

> Provide a Leader Election framework based on Zookeeper receipe
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1080
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1080
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: contrib
>    Affects Versions: 3.3.2
>            Reporter: Hari A V
>         Attachments: LeaderElectionService.pdf, ZOOKEEPER-1080.patch,
> zkclient-0.1.0.jar, zookeeper-leader-0.0.1.tar.gz
>
>
> Currently, Hadoop components such as the NameNode and JobTracker are single
> points of failure.
> If the NameNode or JobTracker goes down, its service will not be available
> until it is up and running again. If a Standby NameNode or JobTracker were
> available and ready to serve when the Active node goes down, the service
> downtime could be reduced. Hadoop already provides a Standby NameNode
> implementation, but it is not a fully "hot" Standby.
> The common problems to be addressed in any such Active-Standby cluster are
> Leader Election and Failure Detection. These can be handled using ZooKeeper,
> as described in the ZooKeeper recipes.
> http://zookeeper.apache.org/doc/r3.3.3/recipes.html
> +Leader Election Service (LES)+
> Any node that wants to participate in Leader Election can use this service.
> It should start the service with the required configuration. The service will
> notify each node whether it should start in Active or Standby mode.
> It also notifies the nodes of any mode change at runtime. All other
> complexities can be handled internally by the LES.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
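
The Disconnected/Expired case raised in the comment above can be sketched as a small state machine. This is only an illustration of one conservative policy, not what the ZOOKEEPER-1080 patch does: the class LeaderModeTracker, its methods, and the SUSPENDED mode are all hypothetical, and the SessionEvent enum merely mirrors the names of ZooKeeper's KeeperState values (SyncConnected, Disconnected, Expired) so that the sketch runs without the ZooKeeper client library. It assumes a process should stop acting as leader as soon as it sees Disconnected, because a client only learns that its session has expired after it reconnects.

```java
// Hypothetical sketch: none of these names come from the ZOOKEEPER-1080 patch.
class LeaderModeTracker {

    /** Stand-in for the KeeperState values mentioned in the comment. */
    enum SessionEvent { SYNC_CONNECTED, DISCONNECTED, EXPIRED }

    /** SUSPENDED = was leader, but leadership cannot currently be confirmed. */
    enum Mode { STANDBY, ACTIVE, SUSPENDED }

    private Mode mode = Mode.STANDBY;

    /** Called by the (not shown) election logic when this process wins. */
    synchronized void onElected() {
        mode = Mode.ACTIVE;
    }

    /** Called from the watcher callback for every session state change. */
    synchronized void onSessionEvent(SessionEvent event) {
        switch (event) {
            case DISCONNECTED:
                // Leadership can no longer be confirmed. Stop serving now,
                // before the session can time out on the server side.
                if (mode == Mode.ACTIVE) {
                    mode = Mode.SUSPENDED;
                }
                break;
            case SYNC_CONNECTED:
                // Reconnected without an Expired event: the session, and so
                // the ephemeral leader node, survived. Safe to resume.
                if (mode == Mode.SUSPENDED) {
                    mode = Mode.ACTIVE;
                }
                break;
            case EXPIRED:
                // The session is gone and the ephemeral node was deleted, so
                // another process may already be Active. Fall back to standby;
                // the caller must open a new session and rejoin the election.
                mode = Mode.STANDBY;
                break;
        }
    }

    synchronized Mode mode() {
        return mode;
    }

    /** True only while this process is allowed to act as leader. */
    synchronized boolean mayServe() {
        return mode == Mode.ACTIVE;
    }
}
```

In the Process1/Process2 case, Process1 would move from ACTIVE to SUSPENDED at the Disconnected event and to STANDBY once it reconnects and receives Expired, so it would not be serving when Process2 becomes Active. In a real client the same transitions would be driven from Watcher.process(WatchedEvent) using event.getState().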