[ 
https://issues.apache.org/jira/browse/HDFS-955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12836067#action_12836067
 ] 

Konstantin Shvachko commented on HDFS-955:
------------------------------------------

>> The criteria that IMAGE_NEW was written completely and successfully is the 
>> existence of EDITS_NEW
> I think you misspoke here - EDITS_NEW exists before IMAGE_NEW is saved. 

This is what I meant. During startup the NN decides whether to discard 
IMAGE_NEW or to keep it (and rename it to IMAGE) based on the existence of 
EDITS_NEW. If EDITS_NEW exists, the NN failure occurred before IMAGE_NEW was 
completed, so the NN simply removes IMAGE_NEW. If EDITS_NEW is not present 
but IMAGE_NEW is, the NN failure occurred after IMAGE_NEW was successfully 
written, and therefore the NN just needs to complete the checkpoint by 
renaming IMAGE_NEW to IMAGE and purging edits.
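
The startup decision described above can be sketched roughly as follows. This 
is only an illustration, not the actual FSImage code: the class, enum and 
method names are hypothetical, the literal file names IMAGE, IMAGE_NEW, EDITS 
and EDITS_NEW stand in for whatever the storage directory really uses, and 
"purging edits" is modeled as truncating EDITS to an empty file. Note that the 
decision looks only at which files exist and never reads their contents.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of the recovery rule; not the real NameNode implementation.
final class CheckpointRecoverySketch {

    enum Action { DISCARD_IMAGE_NEW, COMPLETE_CHECKPOINT, NONE }

    // Pure decision: depends only on file existence, never on file contents.
    static Action decide(boolean imageNewExists, boolean editsNewExists) {
        if (!imageNewExists) {
            // Cases without IMAGE_NEW are outside what this sketch covers.
            return Action.NONE;
        }
        // EDITS_NEW still present: the failure happened before IMAGE_NEW was complete.
        // EDITS_NEW absent: IMAGE_NEW was fully written, so finish the checkpoint.
        return editsNewExists ? Action.DISCARD_IMAGE_NEW : Action.COMPLETE_CHECKPOINT;
    }

    // Applies the decision to a storage directory (file names are placeholders).
    static Action recover(Path dir) throws IOException {
        Path image = dir.resolve("IMAGE");
        Path imageNew = dir.resolve("IMAGE_NEW");
        Path edits = dir.resolve("EDITS");
        Path editsNew = dir.resolve("EDITS_NEW");

        Action action = decide(Files.exists(imageNew), Files.exists(editsNew));
        switch (action) {
            case DISCARD_IMAGE_NEW:
                Files.delete(imageNew);
                break;
            case COMPLETE_CHECKPOINT:
                Files.move(imageNew, image, StandardCopyOption.REPLACE_EXISTING);
                // Assumption: purging is modeled as truncating EDITS to zero bytes.
                Files.write(edits, new byte[0]);
                break;
            default:
                break;
        }
        return action;
    }
}
```

Calling recover() on a directory that holds both IMAGE_NEW and EDITS_NEW would 
delete IMAGE_NEW; with only IMAGE_NEW present it would rename it to IMAGE and 
empty EDITS.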

> You may be able to know that info from the state of some other files, but why 
> not be explicit about it to avoid some classes of errors?

We want to be able to detect a failure without reading the contents of the 
image file. The contents may be corrupted during failures, so it is not safe 
to rely on reading the data from the image or edits files.

> FSImage.saveFSImage can lose edits
> ----------------------------------
>
>                 Key: HDFS-955
>                 URL: https://issues.apache.org/jira/browse/HDFS-955
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.20.1, 0.21.0, 0.22.0
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Blocker
>         Attachments: hdfs-955-moretests.txt, hdfs-955-unittest.txt, 
> PurgeEditsBeforeImageSave.patch
>
>
> This is a continuation of a discussion from HDFS-909. The FSImage.saveFSImage 
> function (implementing dfsadmin -saveNamespace) can corrupt the NN storage 
> such that all current edits are lost.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
