[ https://issues.apache.org/jira/browse/HDFS-3540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13453222#comment-13453222 ]
Colin Patrick McCabe commented on HDFS-3540:
--------------------------------------------

bq. From what I understand based on previous comments, it allows an operator to continue with corrupt editlog or abort. Not sure if abort is really a choice. What would one do after abort?

The first thing to try is moving aside the edit log directory that had the problem and seeing if you can reload with another one of the directories. If it's a random I/O corruption, normally only one of the copies of the edit log stored on disk will be bad. Since there's no edit log failover in branch-1, you have to do it yourself.

If all the copies are corrupt, it may be necessary to use a hex editor on the edit log, or a similar technique. The offset of the failure is provided so you can check it out manually.

bq. Perhaps we should consider printing more information during recovery to help an admin understand the state of the editlog. Is that possible?

Nicholas mentioned earlier that it might be helpful to print out how many bytes are left in the log. Even though this can be computed from the information provided, it could be helpful to be more explicit about it. There may be other information that can be printed out too; I'll take a look.

> Further improvement on recovery mode and edit log toleration in branch-1
> ------------------------------------------------------------------------
>
>                 Key: HDFS-3540
>                 URL: https://issues.apache.org/jira/browse/HDFS-3540
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 1.2.0
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Tsz Wo (Nicholas), SZE
>
> *Recovery Mode*: HDFS-3479 backported HDFS-3335 to branch-1. However, the
> recovery mode feature in branch-1 is dramatically different from the recovery
> mode in trunk since the edit log implementations in these two branches are
> different. For example, there is UNCHECKED_REGION_LENGTH in branch-1 but not
> in trunk.
>
> *Edit Log Toleration*: HDFS-3521 added this feature to branch-1 to remedy
> UNCHECKED_REGION_LENGTH and to tolerate edit log corruption.
>
> There are overlaps between these two features. We study potential further
> improvement in this issue.
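
The manual check described in the comment above (looking at the edit log around the reported failure offset, and working out how many bytes are left after it) can be sketched with plain file operations. This is only an illustration for an operator working on a copy of the file: the function names `remaining_bytes` and `hex_window` are hypothetical, nothing here is a Hadoop API, and the sketch assumes the offset reported by recovery mode is a byte offset from the start of the edit log file.

```python
import os


def remaining_bytes(path, failure_offset):
    """Return how many bytes of the file lie at or after failure_offset.

    Assumes failure_offset is a byte offset from the start of the file.
    """
    size = os.path.getsize(path)
    if not 0 <= failure_offset <= size:
        raise ValueError(
            "offset %d is outside the file (size %d)" % (failure_offset, size)
        )
    return size - failure_offset


def hex_window(path, failure_offset, before=16, after=16):
    """Return hex-dump lines covering the bytes around failure_offset.

    Each line starts with the file offset of its first byte, followed by
    up to 16 bytes in hex, similar to what a hex editor would show.
    """
    start = max(0, failure_offset - before)
    with open(path, "rb") as f:
        f.seek(start)
        data = f.read(failure_offset - start + after)

    lines = []
    for i in range(0, len(data), 16):
        chunk = data[i:i + 16]
        hex_part = " ".join("%02x" % b for b in chunk)
        lines.append("%08x  %s" % (start + i, hex_part))
    return lines
```

For example, `remaining_bytes(copy_of_edits, offset)` gives the number of bytes that would be skipped if loading stopped at that offset, and `"\n".join(hex_window(copy_of_edits, offset))` prints the surrounding bytes for inspection. Running this against a copy rather than the live edit log directory keeps the original untouched.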