[ https://issues.apache.org/jira/browse/HADOOP-8163?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13236966#comment-13236966 ]
Bikas Saha commented on HADOOP-8163:
------------------------------------

bq. But since we only have one user of this framework at this point (HDFS) and we currently only support a single standby node, I would prefer to punt these changes to another JIRA as additional improvements.

I would disagree here. The suggestion has little to do with HDFS, a single standby, or the generality of the framework. It is about keeping fencing inside FailoverController instead of sharing it with the elector: a clear separation of responsibilities. I agree that the NN work is more important, and without knowing more about the FailoverController/Automatic NN HA I cannot say how much work it would take to change the control flow as described above. My guess is that it would not be big, but I might be wrong. In my experience, APIs once made are hard to change. It would be hard for someone to change the control flow later, once important services like NN HA depend on the current flow. So punting it to the future would mean quite a distant future indeed :P

bq. Doing blocking calls in the callbacks will not result in lost ZK leases, etc. To quote from the ZK programmer's guide:

I agree. The IO updates will be processed, but callback notifications to the client might be delayed if the client is still blocked in a previous callback. I was more concerned about the latter. That is why I suggested not doing fencing in the client callback, though I agree that in the current patch these calls have to be made synchronously for correctness.
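The hand-off suggested in the comment above can be sketched roughly as follows. This is a minimal illustration in Java, not code from the patch: `ElectorCallback`, `Controller`, and the event strings are hypothetical names, and the ZooKeeper event thread is simulated by whoever calls `becomeActive`. The callback only records the notification and queues the work; fencing and the transition to active then run on a thread owned by the controller.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

class FencingHandoffSketch {

    /** Hypothetical callback interface; NOT the real ActiveStandbyElector API. */
    interface ElectorCallback {
        void becomeActive(String oldActiveInfo);
    }

    /** Hypothetical controller that owns fencing instead of the elector. */
    static class Controller implements ElectorCallback {
        // Single worker thread so fencing always runs before the transition.
        private final ExecutorService worker = Executors.newSingleThreadExecutor();
        private final List<String> events =
                Collections.synchronizedList(new ArrayList<>());

        @Override
        public void becomeActive(String oldActiveInfo) {
            // Called on the (simulated) event thread: record, enqueue, return.
            events.add("callback-received");
            worker.submit(() -> {
                if (oldActiveInfo != null) {
                    // Placeholder for a potentially slow, blocking fencing action.
                    events.add("fenced:" + oldActiveInfo);
                }
                events.add("active");
            });
        }

        /** Stops the worker and waits briefly for queued work to finish. */
        boolean shutdownAndWait() {
            worker.shutdown();
            try {
                return worker.awaitTermination(5, TimeUnit.SECONDS);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }

        /** Returns a snapshot of the recorded events. */
        List<String> events() {
            synchronized (events) {
                return new ArrayList<>(events);
            }
        }
    }

    public static void main(String[] args) {
        Controller controller = new Controller();
        controller.becomeActive("old-nn:8020"); // made-up address of the old active
        controller.shutdownAndWait();
        System.out.println(controller.events());
    }
}
```

Whether such a hand-off is safe depends on the elector not assuming that fencing has finished when the callback returns. As noted above, the current patch makes these calls synchronously for correctness, so this only illustrates the alternative control flow.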
> Improve ActiveStandbyElector to provide hooks for fencing old active
> --------------------------------------------------------------------
>
>                 Key: HADOOP-8163
>                 URL: https://issues.apache.org/jira/browse/HADOOP-8163
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: ha
>    Affects Versions: 0.23.3, 0.24.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>         Attachments: hadoop-8163.txt, hadoop-8163.txt, hadoop-8163.txt,
>                      hadoop-8163.txt
>
>
> When a new node becomes active in an HA setup, it may sometimes have to take
> fencing actions against the node that was formerly active. This JIRA extends
> the ActiveStandbyElector which adds an extra non-ephemeral node into the ZK
> directory, which acts as a second copy of the active node's information.
> Then, if the active loses its ZK session, the next active to be elected may
> easily locate the unfenced node to take the appropriate actions.