[ https://issues.apache.org/jira/browse/HDFS-1800?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13014824#comment-13014824 ]
Todd Lipcon commented on HDFS-1800:
-----------------------------------
I can think of three different ways to do this:
1) Next to each fsimage_N file, we also write a file fsimage_N.md5sum which
contains the checksum.
2) In the image directory, we write out a single file (e.g. "md5sums") with
the format:
{code}
<md5sum> fsimage_A
<md5sum> fsimage_B
...
{code}
This would be usable with {{md5sum -c}}, for example, to verify all the files in
the directory. Whenever a new entry needs to be added, the file would have to be
read and rewritten.
3) We incorporate the md5sum into the fsimage format itself as a tag at the end
of the file. I don't like this option much, because there's no way for ops to
verify it, but I'm including it for completeness.
Any preference between option #1 and #2? Or a 4th option that I didn't consider?
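To make the trade-off between options #1 and #2 concrete, here is a rough sketch in Python rather than the NameNode's Java. It is an illustration only, not the proposed implementation. The function names are hypothetical, and the temp-file-plus-rename write is an assumption added here, not something proposed above. Entries use the "<md5>  <name>" layout (two spaces) that GNU md5sum emits, so either kind of file can be checked with {{md5sum -c}}.

```python
import hashlib
import os
import tempfile


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def atomic_write(path, text):
    """Write to a temp file in the same directory, then rename it over path."""
    fd, tmp = tempfile.mkstemp(dir=os.path.dirname(path) or ".")
    with os.fdopen(fd, "w") as f:
        f.write(text)
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp, path)


def write_sidecar(image_path):
    """Option 1: write fsimage_N.md5sum next to fsimage_N."""
    name = os.path.basename(image_path)
    atomic_write(image_path + ".md5sum",
                 "%s  %s\n" % (md5_of_file(image_path), name))


def add_to_manifest(image_path, manifest_name="md5sums"):
    """Option 2: read the shared file, add or replace one entry, rewrite it."""
    directory = os.path.dirname(image_path) or "."
    manifest = os.path.join(directory, manifest_name)
    entries = {}
    if os.path.exists(manifest):
        with open(manifest) as f:
            for line in f:
                checksum, name = line.rstrip("\n").split("  ", 1)
                entries[name] = checksum
    entries[os.path.basename(image_path)] = md5_of_file(image_path)
    text = "".join("%s  %s\n" % (c, n) for n, c in sorted(entries.items()))
    atomic_write(manifest, text)


def verify(checksum_file):
    """Return the names listed in checksum_file whose MD5 does not match."""
    directory = os.path.dirname(checksum_file) or "."
    mismatched = []
    with open(checksum_file) as f:
        for line in f:
            checksum, name = line.rstrip("\n").split("  ", 1)
            if md5_of_file(os.path.join(directory, name)) != checksum:
                mismatched.append(name)
    return mismatched
```

In this sketch, option #1 only ever creates one small file per image, and removing an old image just means removing its sidecar as well. Option #2 needs a read-modify-write of the shared file on every new image, which is why the rewrite goes through a rename here.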
> Extend image checksumming to function with multiple fsimage files
> -----------------------------------------------------------------
>
> Key: HDFS-1800
> URL: https://issues.apache.org/jira/browse/HDFS-1800
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: name-node
> Affects Versions: Edit log branch (HDFS-1073)
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Fix For: Edit log branch (HDFS-1073)
>
>
> HDFS-903 added the MD5 checksum of the fsimage to the VERSION file in each
> image directory. This allows the NameNode to verify that the FSImage hasn't
> been corrupted or accidentally replaced on disk.
> With HDFS-1073, there may be multiple fsimage_N files in a storage directory
> corresponding to different checkpoints. So having a single MD5 in the VERSION
> file won't suffice. Instead we need to store an MD5 per image file.