[ 
https://issues.apache.org/jira/browse/HDFS-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13168999#comment-13168999
 ] 

Todd Lipcon commented on HDFS-1972:
-----------------------------------

STONITH is one possible fencing mechanism, but it requires special hardware 
support (e.g. a remotely controllable PDU or ILOM-like capability on the 
machine). It addresses the namenode side of fencing: how do we make sure that 
a previously active NN can no longer write to the shared edits storage (i.e. 
ensure exclusive access for the new active)?

With many storage types there are less drastic fencing methods available. For 
example, filers often support an operation to fence off a particular IP from a 
given volume. Software systems like BookKeeper might support a "lease revoke" 
operation of sorts (just a guess). So we shouldn't design STONITH as the only 
option if other options would work with less custom hardware.
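
To illustrate the point about not hard-wiring STONITH, here is a minimal 
sketch of a pluggable fencing chain that tries the less drastic methods first 
and falls back to STONITH only if they fail. It is written in Python for 
brevity and is not HDFS code: the function name, the method names, and the 
stub callables in the usage note are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# A fencing method takes the address of the previously active NN and returns
# True only if it can confirm that NN has lost write access to the shared
# edits storage. (Hypothetical interface, for illustration only.)
FenceMethod = Callable[[str], bool]


def fence_old_active(
    target: str, methods: List[Tuple[str, FenceMethod]]
) -> Optional[str]:
    """Try each fencing method in order.

    Returns the name of the first method that succeeds, or None if every
    method failed, in which case the failover should not proceed.
    """
    for name, method in methods:
        try:
            if method(target):
                return name
        except Exception:
            # An error means we cannot confirm fencing; try the next method.
            continue
    return None
```

A caller would list the cheap, storage-level methods ahead of the drastic one, 
for example `fence_old_active("nn1.example.com", [("filer-ip-fence", 
filer_fence), ("stonith", power_off)])`, where `filer_fence` and `power_off` 
are placeholders for site-specific implementations.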

However, the above NN fencing methods don't deal with the races described in 
this issue: the standby necessarily has a stale view of pending deletions in 
the cluster, because block replication decisions are not persisted to the 
shared storage. We essentially need to "flush" all deletions from the cluster 
before the new NN can make appropriate deletion decisions. These issues matter 
even in the case of a manual transition from one NN to another.
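
As a rough sketch of what "flushing" could mean in practice, the model below 
has the new active treat every datanode as stale until that datanode has 
reported in after the failover, and postpones any deletion decision for a 
block that still has a replica on a stale datanode. This is an assumption made 
for illustration, not the design in the attached patches; the class and method 
names are hypothetical, and the choice of which replica to delete is 
arbitrary.

```python
class NewActiveNN:
    """Toy model of a newly active NN that defers deletion decisions."""

    def __init__(self, datanodes):
        # Every DN may still hold deletions queued by the old active, so the
        # new active cannot trust its view of any of them yet.
        self.stale = set(datanodes)
        self.postponed = []

    def on_post_failover_report(self, dn):
        """Called once a DN has acknowledged the new active and reported its
        blocks after draining deletions from the old active."""
        self.stale.discard(dn)

    def try_delete_excess(self, block, replica_dns):
        """Return the DN to delete one excess replica from, or None if the
        decision has to wait because some replica holder is still stale."""
        if any(dn in self.stale for dn in replica_dns):
            self.postponed.append(block)
            return None
        # Arbitrary but deterministic choice, purely for the sketch.
        return sorted(replica_dns)[-1]
```

With datanodes `dn1` to `dn3` all stale, `try_delete_excess("blk_1", ["dn1", 
"dn2", "dn3"])` returns `None` and records the block as postponed. Once all 
three have reported, the same call returns a datanode to delete from.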

                
> HA: Datanode fencing mechanism
> ------------------------------
>
>                 Key: HDFS-1972
>                 URL: https://issues.apache.org/jira/browse/HDFS-1972
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node, ha, name-node
>            Reporter: Suresh Srinivas
>            Assignee: Todd Lipcon
>         Attachments: hdfs-1972-v1.txt, hdfs-1972.txt
>
>
> In a high availability setup, with an active and a standby namenode, there 
> is a possibility of two namenodes sending commands to the datanode. The 
> datanode must honor commands from only the active namenode and reject 
> commands from the standby, to prevent corruption. This invariant must hold 
> during failover and in other states such as split brain. This jira covers 
> the related issues, the design of the solution, and its implementation.


        
