[ 
https://issues.apache.org/jira/browse/HDFS-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13452697#comment-13452697
 ] 

Suresh Srinivas commented on HDFS-3540:
---------------------------------------

Error toleration is a useful feature when the namenode needs to continue 
functioning without manual intervention. This helps in an HA setup. That said, 
I think recovery mode may be useful if a cluster admin chooses to make it manual.

From what I understand based on previous comments, it allows an operator to 
continue with a corrupt editlog or abort. Not sure if abort is really a choice. 
What would one do after abort? To that end, Nicholas, some of the information 
you print, such as the editlog length, corruption length and padding length, 
should also be printed in recovery mode. This information will be useful when 
one wants to continue, ignoring the corrupt part of the editlog.

Given this, I would leave recovery mode alone and not remove it. Perhaps we 
should consider printing more information during recovery to help an admin 
understand the state of the editlog. Is that possible?
                
> Further improvement on recovery mode and edit log toleration in branch-1
> ------------------------------------------------------------------------
>
>                 Key: HDFS-3540
>                 URL: https://issues.apache.org/jira/browse/HDFS-3540
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 1.2.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>
> *Recovery Mode*: HDFS-3479 backported HDFS-3335 to branch-1.  However, the 
> recovery mode feature in branch-1 is dramatically different from the recovery 
> mode in trunk since the edit log implementations in these two branches are 
> different.  For example, there is UNCHECKED_REGION_LENGTH in branch-1 but not 
> in trunk.
> *Edit Log Toleration*: HDFS-3521 added this feature to branch-1 to remedy 
> UNCHECKED_REGION_LENGTH and to tolerate edit log corruption.
> There are overlaps between these two features.  We study potential further 
> improvement in this issue.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
