[ 
https://issues.apache.org/jira/browse/HDFS-2781?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13204924#comment-13204924
 ] 

Todd Lipcon commented on HDFS-2781:
-----------------------------------

There's some interaction with fencing here, though. One likely reason that 
the NN will lose touch with the shared storage is that another node has 
requested that the NAS device fence the host. Then, after the failover, the 
administrator might unfence the host from the NAS, and we don't want the NN to 
automatically "come back to life" at that point.
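
To make the concern concrete, here is a minimal sketch of one way such a 
policy could look. It is not Hadoop code: the class SharedStorageRestorePolicy, 
its Trigger enum, and all method names are hypothetical, and it assumes the NN 
can record that a storage loss might have been caused by fencing. Under those 
assumptions, a background (automatic) restore is refused after a possible 
fence, while an explicit admin request is still honored.

```java
/**
 * Hypothetical model of a restore policy; none of these names exist in HDFS.
 * Assumption: the caller can tell us whether a storage loss may be due to
 * the host having been fenced from the shared storage.
 */
final class SharedStorageRestorePolicy {
    /** Who is asking for the restore attempt. */
    enum Trigger { AUTOMATIC, ADMIN_REQUEST }

    private boolean storageFailed = false;
    private boolean possiblyFenced = false;

    /** Record that the shared storage became unavailable. */
    synchronized void onStorageLost(boolean mayBeFenced) {
        storageFailed = true;
        possiblyFenced = mayBeFenced;
    }

    /**
     * Attempt to bring failed storage back.
     * Returns true if the storage is usable after the call.
     */
    synchronized boolean tryRestore(Trigger trigger, boolean storageReachable) {
        if (!storageFailed) {
            return true; // nothing to restore
        }
        if (!storageReachable) {
            return false; // still cannot reach the storage
        }
        if (possiblyFenced && trigger == Trigger.AUTOMATIC) {
            // The storage is reachable again (e.g. the admin unfenced the
            // host), but we must not silently "come back to life".
            return false;
        }
        storageFailed = false;
        possiblyFenced = false;
        return true;
    }

    synchronized boolean isStorageFailed() {
        return storageFailed;
    }
}
```

With this shape, unfencing the host from the NAS only makes the storage 
reachable again; the failed state is cleared only when an admin explicitly 
asks for the restore.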
                
> Add client protocol and DFSadmin for command to restore failed storage
> ----------------------------------------------------------------------
>
>                 Key: HDFS-2781
>                 URL: https://issues.apache.org/jira/browse/HDFS-2781
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Eli Collins
>            Assignee: Eli Collins
>
> Per HDFS-2769, it's important that an admin be able to ask the NN to try to 
> restore failed storage, since we may drop into safe mode (SM) until the 
> shared edits dir is restored (w/o having to wait for the next checkpoint). 
> There's currently an API (and usage in DFSAdmin) to flip the flag indicating 
> whether the NN should try to restore failed storage, but none to tell it to 
> actually attempt to do so. This jira is to add one. This is useful outside 
> HA, but I'm doing it as an HDFS-1623 sub-task since it's motivated by HA.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
