[ 
https://issues.apache.org/jira/browse/HDFS-1972?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13160618#comment-13160618
 ] 

Andrew Purtell commented on HDFS-1972:
--------------------------------------

I will go back to lurking on this issue right away but kindly allow me to +1 
this notion:

  bq. Persisting the txid in the DN disks actually has another nice property 
for non-HA clusters -- if you accidentally restart the NN from an old snapshot 
of the filesystem state, the DNs can refuse to connect, or refuse to process 
deletions. Currently, in this situation, the DNs would connect and then delete 
all of the newer blocks.

Encountering this scenario through a series of accidents has been a concern. 
Disallowing block deletion as proposed would be enough to give operators a 
chance to recover from their mistake before any permanent damage is done.
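
To make the quoted idea concrete, here is a minimal sketch of a DN-side check, 
assuming the DN remembers the highest txid it has seen from a NN and compares 
it with the txid a NN reports later. The class and method names are 
hypothetical and are not taken from the HDFS code base. Persistence to disk is 
left out, so the value lives only in memory here.

```python
from dataclasses import dataclass


@dataclass
class DataNodeGuard:
    """Hypothetical DN-side guard; an illustration, not the HDFS implementation."""

    # Highest txid seen so far. A real DN would persist this on its disks;
    # this sketch keeps it in memory only.
    last_seen_txid: int = 0

    def observe(self, nn_txid: int) -> None:
        """Remember the highest transaction id any NN has reported."""
        self.last_seen_txid = max(self.last_seen_txid, nn_txid)

    def may_delete(self, nn_txid: int) -> bool:
        """Allow deletions only from a NN that is not behind what was seen."""
        return nn_txid >= self.last_seen_txid


guard = DataNodeGuard()
guard.observe(1000)              # normal operation with an up-to-date NN
fresh = guard.may_delete(1000)   # same NN asks for a deletion
stale = guard.may_delete(400)    # NN restarted from an old snapshot
```

With a check of this kind, the stale NN's deletion request is refused while 
the up-to-date NN's request still goes through, which is the window the 
operators would need to notice the mistake.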
                
> HA: Datanode fencing mechanism
> ------------------------------
>
>                 Key: HDFS-1972
>                 URL: https://issues.apache.org/jira/browse/HDFS-1972
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: data-node, name-node
>            Reporter: Suresh Srinivas
>            Assignee: Todd Lipcon
>
> In a high availability setup, with an active and a standby namenode, both 
> namenodes may send commands to the datanode. The datanode must honor 
> commands from only the active namenode and reject commands from the 
> standby, to prevent corruption. This invariant must hold during failover 
> and in other states such as split brain. This jira addresses issues related 
> to this, along with the design and implementation of the solution.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
