[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13048949#comment-13048949
 ] 

Henry Robinson commented on ZOOKEEPER-1080:
-------------------------------------------

Hey Eric - this looks good. The protocol looks solid on a first pass. Some 
comments, based on a quick look:

* I wouldn't try to delete the root node at STOP time. It seems prone to 
problems if you stop one node while others are starting or in a failed state and 
don't yet have their ephemerals registered. Sequence numbers are a fairly abundant 
resource, and if it's possible to run out of them across several runs, it's 
definitely possible to run out of them in a single run. 
* That tuple support class is, imho, kinda gross. It would be clearer to use 
specific struct-type classes whose names correspond to the fields they're 
intended to hold. 
* 'Observers' is already a meaningful noun in ZK land, so it might be clearer 
to call them something else. Paxos uses Learners, but that's also taken inside 
ZK. Listeners?
* Not a big deal, but I think you can break out of the for loop at the end of 
determineElectionStatus once the offer corresponding to the local node has been 
found. 
* I think addObserver / removeObserver probably need to synchronize on 
observers if you think you need to sync in dispatchEvent as well. 
* Is there any way to actually determine who the leader is (if not the local 
process)? Seems like this would be useful.
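
To make a few of the points above concrete, here is a rough sketch. It is not 
taken from the attached patch: every name in it (ElectionSketch, LeaderOffer, 
ElectionListener, currentLeader, indexOfLocalOffer) is hypothetical, and it 
assumes the usual recipe convention that the offer with the lowest sequence 
number is the leader. It shows a struct-type class in place of a tuple, 
add/remove/dispatch all locking the same list, an early exit once the local 
offer is found, and a way to ask who the leader is.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; none of these names come from the attached patch.
final class ElectionSketch {

    /** Struct-type class whose name and fields say what it holds (instead of a tuple). */
    static final class LeaderOffer {
        final int sequence;     // sequence suffix of the ephemeral sequential znode
        final String hostName;  // participant that made the offer

        LeaderOffer(int sequence, String hostName) {
            this.sequence = sequence;
            this.hostName = hostName;
        }
    }

    /** Called "listener" here to avoid clashing with ZK's Observer and Learner. */
    interface ElectionListener {
        void onEvent(String event);
    }

    private final List<ElectionListener> listeners = new ArrayList<>();

    // add, remove and dispatch all take the same lock on the listener list.
    void addListener(ElectionListener listener) {
        synchronized (listeners) {
            listeners.add(listener);
        }
    }

    void removeListener(ElectionListener listener) {
        synchronized (listeners) {
            listeners.remove(listener);
        }
    }

    void dispatchEvent(String event) {
        List<ElectionListener> snapshot;
        synchronized (listeners) {
            // Copy under the lock, then call back without holding it.
            snapshot = new ArrayList<>(listeners);
        }
        for (ElectionListener listener : snapshot) {
            listener.onEvent(event);
        }
    }

    /** Host of the offer with the lowest sequence number, or null if there are no offers. */
    static String currentLeader(List<LeaderOffer> offers) {
        LeaderOffer lowest = null;
        for (LeaderOffer offer : offers) {
            if (lowest == null || offer.sequence < lowest.sequence) {
                lowest = offer;
            }
        }
        return lowest == null ? null : lowest.hostName;
    }

    /** Position of the local offer in a sorted list; returns as soon as it is found. */
    static int indexOfLocalOffer(List<LeaderOffer> sortedOffers, String localHost) {
        for (int i = 0; i < sortedOffers.size(); i++) {
            if (sortedOffers.get(i).hostName.equals(localHost)) {
                return i; // no need to look at the remaining offers
            }
        }
        return -1;
    }
}
```

Dispatching from a snapshot is one possible choice: it keeps callbacks from 
running under the lock, at the cost that a listener removed during dispatch 
may still receive the event that is in flight.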

> Provide a Leader Election framework based on Zookeeper receipe
> --------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1080
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1080
>             Project: ZooKeeper
>          Issue Type: New Feature
>          Components: contrib
>    Affects Versions: 3.3.2
>            Reporter: Hari A V
>         Attachments: LeaderElectionService.pdf, zookeeper-leader-0.0.1.tar.gz
>
>
> Currently Hadoop components such as NameNode and JobTracker are single points 
> of failure.
> If the NameNode or JobTracker goes down, their services will not be available 
> until they are up and running again. If there were a Standby NameNode or 
> JobTracker available and ready to serve when the Active nodes go down, we could 
> reduce the service downtime. Hadoop already provides a Standby 
> NameNode implementation, but it is not a fully "hot" Standby. 
> The common problem to be addressed in any such Active-Standby cluster is 
> Leader Election and Failure detection. This can be done using Zookeeper as 
> mentioned in the Zookeeper recipes.
> http://zookeeper.apache.org/doc/r3.3.3/recipes.html
> +Leader Election Service (LES)+
> Any node that wants to participate in Leader Election can use this service. 
> It should start the service with the required configuration. The service will 
> notify each node whether it should start in Active or Standby mode. 
> It will also notify nodes of any change in mode at runtime. All other 
> complexities can be handled internally by the LES.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
