[ https://issues.apache.org/jira/browse/HDFS-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168999#comment-13168999 ]
Todd Lipcon commented on HDFS-1972:
-----------------------------------

STONITH is one possible fencing mechanism, but requires special hardware support (eg a remotely controllable PDU or ILOM-like capability on the machine). This addresses the namenode side of fencing: how do we make sure that a previously active NN can no longer write to the shared edits storage (ie ensure exclusive access to the new active).

With many storage types there are less drastic fencing methods available - eg filers often support an operation to fence off a particular IP from a given volume. Software systems like bookkeeper might support a "lease revoke" operation of sorts (just a guess). So we shouldn't design STONITH as the only option if we can use other options with less custom hardware necessary.

However, the above NN fencing methods don't deal with the races described here -- the issue is that the standby necessarily has a stale view of pending deletions in the cluster. We need to essentially "flush" all deletions from the cluster before the new NN can make appropriate deletion decisions. This is because block replication decisions are not persisted to the shared storage.

The issues mentioned here are important even in the case of manual transition from one NN to another.

> HA: Datanode fencing mechanism
> ------------------------------
>
>                 Key: HDFS-1972
>                 URL: https://issues.apache.org/jira/browse/HDFS-1972
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node, ha, name-node
>            Reporter: Suresh Srinivas
>            Assignee: Todd Lipcon
>         Attachments: hdfs-1972-v1.txt, hdfs-1972.txt
>
>
> In high availability setup, with an active and standby namenode, there is a
> possibility of two namenodes sending commands to the datanode. The datanode
> must honor commands from only the active namenode and reject the commands
> from standby, to prevent corruption. This invariant must be complied with
> during fail over and other states such as split brain.
> This jira addresses issues related to this, design of the solution and implementation.
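
The invariant in the issue description (a datanode honors commands only from the active namenode, even during failover or split brain) could be sketched roughly as follows. This is an illustration only, not the implementation attached to this issue: the class name ActiveNamenodeGate, its methods, and the idea of an "epoch" number that grows with every failover are all assumptions made for the example.

```java
import java.util.Objects;

/**
 * Hypothetical datanode-side gate (not actual HDFS code).
 * Assumption: each namenode reports, with every heartbeat response, whether it
 * claims to be active, plus an epoch that strictly increases on each failover.
 */
final class ActiveNamenodeGate {
    // Namenode currently trusted as active; null means no one is trusted.
    private String activeId = null;
    // Highest epoch seen so far in any "I am active" claim.
    private long highestEpoch = -1;

    /** Record the state a namenode claims in its heartbeat response. */
    synchronized void onHeartbeat(String nnId, boolean claimsActive, long epoch) {
        Objects.requireNonNull(nnId, "nnId");
        if (claimsActive && epoch > highestEpoch) {
            // A newer claim wins; an older active is implicitly fenced off.
            highestEpoch = epoch;
            activeId = nnId;
        } else if (!claimsActive && nnId.equals(activeId)) {
            // The trusted active stepped down; trust no one until a new claim arrives.
            activeId = null;
        }
        // Stale active claims (epoch <= highestEpoch) are ignored.
    }

    /** Commands are executed only if they come from the trusted active namenode. */
    synchronized boolean shouldExecute(String nnId) {
        return nnId.equals(activeId);
    }
}
```

In a split brain where both namenodes claim to be active, the datanode under this sketch keeps following whichever claim carried the higher epoch and drops commands from the other. It does not address the stale pending-deletion problem raised in the comment above; that needs separate handling on the namenode side.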