[ 
https://issues.apache.org/jira/browse/HDFS-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12923567#action_12923567
 ] 

Allen Wittenauer commented on HDFS-903:
---------------------------------------

> I think it's better to put in VERSION file since then you can use a command 
> line "md5sum" utility to check for corruption. 

+1

This is much more operations-friendly.  If an alternative is picked (which is 
fine), just keep in mind that we'll need a tool built to go with this change.
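
For illustration only, here is a rough Java sketch of the kind of check such a
tool (or the NN itself) would have to do: compute the MD5 of a file and compare
it with a recorded hex digest, in the same lowercase hex form that "md5sum"
prints. The class and method names are hypothetical, and where the expected
digest is recorded (VERSION file or elsewhere) is left open here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper, not part of HDFS.
class ImageChecksum {

    /** Returns the lowercase hex MD5 digest of the file's contents. */
    public static String md5Hex(Path file)
            throws IOException, NoSuchAlgorithmException {
        MessageDigest md = MessageDigest.getInstance("MD5");
        // Stream the file so a large image is never held fully in memory.
        try (InputStream in = Files.newInputStream(file)) {
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                md.update(buf, 0, n);
            }
        }
        StringBuilder hex = new StringBuilder();
        for (byte b : md.digest()) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    /** True if the file's digest equals the recorded one (case-insensitive). */
    public static boolean matches(Path file, String expectedHex)
            throws IOException, NoSuchAlgorithmException {
        return md5Hex(file).equalsIgnoreCase(expectedHex.trim());
    }
}
```

With something like this, an operator could cross-check the recorded value
against plain "md5sum" output by hand, which is the point of keeping the
checksum outside the image itself.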

> NN should verify images and edit logs on startup
> ------------------------------------------------
>
>                 Key: HDFS-903
>                 URL: https://issues.apache.org/jira/browse/HDFS-903
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>            Reporter: Eli Collins
>            Assignee: Hairong Kuang
>            Priority: Critical
>             Fix For: 0.22.0
>
>
> I was playing around with corrupting fsimage and edit logs when there are 
> multiple dfs.name.dirs specified. I noticed that:
> * As long as your corruption does not make the image invalid (e.g. by 
> changing an opcode into an invalid opcode), HDFS doesn't notice and happily 
> uses a corrupt image or applies the corrupt edit.
> * If the first image in dfs.name.dir is "valid", it replaces the other copies 
> in the other name.dirs with this first image, even if they are different. 
> I.e., if the first image is actually invalid/old/corrupt metadata, then 
> you've lost your valid metadata, which can result in data loss if the 
> namenode garbage collects blocks that it thinks are no longer used.
>
> How about we maintain a checksum as part of the image and edit log, check 
> those on startup, and refuse to start up if they do not match? Or at least 
> provide a configuration option to do so, if people are worried about the 
> overhead of maintaining checksums of these files. Even if we assume 
> dfs.name.dir is reliable storage, this guards against operator errors.
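
As a rough illustration of the second point in the description above (copies in
different name dirs silently diverging), the following Java sketch compares the
MD5 digests of several copies of the same file and reports whether they all
agree. The class and method names are hypothetical and this is not how the NN
currently behaves; it reads each file fully into memory, which is only
reasonable for a sketch.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.List;

// Hypothetical helper, not part of HDFS.
class NameDirConsistency {

    /**
     * Returns true if every copy has the same MD5 digest as the first one.
     * An empty list is trivially consistent.
     */
    public static boolean allCopiesIdentical(List<Path> copies)
            throws IOException, NoSuchAlgorithmException {
        byte[] first = null;
        for (Path copy : copies) {
            // Sketch only: a real image may be too large for readAllBytes.
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(Files.readAllBytes(copy));
            if (first == null) {
                first = digest;
            } else if (!MessageDigest.isEqual(first, digest)) {
                return false;
            }
        }
        return true;
    }
}
```

A startup check along these lines could refuse to overwrite the other copies
when the result is false, leaving the decision about which copy is good to the
operator.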

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
