[ 
https://issues.apache.org/jira/browse/HDFS-2026?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13052837#comment-13052837
 ] 

Eli Collins commented on HDFS-2026:
-----------------------------------

Looks great.

Some small stuff:
* Can we remove Checkpointer#uploadCheckpoint commented out? (mark TODO if 
addressed in follow-on)
* testReformatNNBetweenCheckpoints method comment is missing a period.
* The new call to sd.read in SecondaryNameNode#recoverCreate could use a 
comment (not clear why we need to read the version file there). As an aside, 
readVersionFile would be a better name for that method. 
* Not you change would be good to add a comment to uploadImageFromStorage 
indicating it doesn't actually post an image but the 2NN posts to the NN asking 
it to get an image.


> 1073: 2NN needs to handle case of reformatted NN better
> -------------------------------------------------------
>
>                 Key: HDFS-2026
>                 URL: https://issues.apache.org/jira/browse/HDFS-2026
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: name-node
>    Affects Versions: Edit log branch (HDFS-1073)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>             Fix For: Edit log branch (HDFS-1073)
>
>         Attachments: hdfs-2026.txt
>
>
> Currently in the 1073 branch, the following steps ends up with a very 
> confused 2NN:
> - format NN, run NN
> - start 2NN, perform some checkpoints
> - reformat NN, start NN on new namespace
> - restart same 2NN
> The 2NN currently saves the new VERSION info into its local storage directory 
> but doesn't clear out the old checkpoint or edits files. This is obviously 
> wrong and might lead to a corrupt checkpoint getting uploaded. 
> If the 2NN has storage directories with VERSION info, and connects to an NN 
> with different VERSION info, there are two alternatives:
> a) refuse to perform any checkpoints until the operator issues a 
> "secondarynamenode -format" command (this is similar to how the 
> backupnode/checkpointnode works)
> b) clear the current contents of the storage directory and save the new NN's 
> VERSION info.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to