[jira] [Commented] (HDFS-2291) HA: Checkpointing in an HA setup

Todd Lipcon (Commented) (JIRA) Tue, 20 Dec 2011 22:19:56 -0800

    [ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13173895#comment-13173895
 ]


Todd Lipcon commented on HDFS-2291:
-----------------------------------

I plan to start working on this tomorrow. My thinking is to have a checkpoint
thread which wakes up on the checkpoint interval, stops the edit log tailer
thread, enters safe mode, creates a checkpoint, and comes back out of safe mode.
If at any point the standby (SB) needs to process a failover, it will cancel the
checkpoint (using the HDFS-2507 feature) and proceed as usual.
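
To make the proposed sequence concrete, here is a rough sketch of one
checkpoint cycle on the standby. Everything in it is hypothetical: the
Standby interface, the class and method names, and the use of an
AtomicBoolean to stand in for the HDFS-2507 cancellation hook are
illustration only, not existing HDFS code. It also assumes that safe mode is
left and the tailer restarted even when the checkpoint is cancelled, which
the comment above does not spell out.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical sketch: none of these names are taken from the HDFS code base.
class StandbyCheckpointSketch {

    /** Assumed view of the standby operations a checkpoint needs. */
    interface Standby {
        void stopEditLogTailer();
        void startEditLogTailer();
        void enterSafeMode();
        void leaveSafeMode();
        /** Saves the namespace, polling the flag; returns false if it gave up. */
        boolean saveCheckpoint(AtomicBoolean cancelled);
    }

    private final Standby standby;
    // Set by the failover path; stands in for the HDFS-2507 cancellation hook.
    private final AtomicBoolean cancelled = new AtomicBoolean(false);

    StandbyCheckpointSketch(Standby standby) {
        this.standby = standby;
    }

    /** Requests cancellation; sticky, since a failover ends standby duty. */
    void cancelForFailover() {
        cancelled.set(true);
    }

    /** Runs one checkpoint cycle; returns true only if a checkpoint was saved. */
    boolean runOnce() {
        if (cancelled.get()) {
            return false; // failover already requested, do not touch the standby
        }
        standby.stopEditLogTailer();
        standby.enterSafeMode();
        try {
            return standby.saveCheckpoint(cancelled);
        } finally {
            // Assumption: state is restored even when the checkpoint is cancelled.
            standby.leaveSafeMode();
            standby.startEditLogTailer();
        }
    }

    /** Wakes up every intervalSeconds; the caller shuts the executor down. */
    ScheduledExecutorService start(long intervalSeconds) {
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleWithFixedDelay(this::runOnce, intervalSeconds, intervalSeconds,
                TimeUnit.SECONDS);
        return timer;
    }
}
```

The failover path would call cancelForFailover() and the saving code would
poll the flag, so an in-progress checkpoint gives up quickly instead of
blocking the transition to active.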

The remaining question I've yet to figure out is whether it should (a) save the
checkpoints into the shared edits directory, or (b) save them in its own
storage directory and then upload them to the primary via HTTP, just like the
2NN (SecondaryNameNode) does today.

Option (b) is probably preferable, since the shared edits directory may in fact
be BookKeeper (BK) or some other journal plugin in the future, in which case
option (a) would break the abstraction.

If anyone has any strong opinions please shout now :)
                
> HA: Checkpointing in an HA setup
> --------------------------------
>
>                 Key: HDFS-2291
>                 URL: https://issues.apache.org/jira/browse/HDFS-2291
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Aaron T. Myers
>            Assignee: Todd Lipcon
>             Fix For: HA branch (HDFS-1623)
>
>
> We obviously need to create checkpoints when HA is enabled. One thought is to 
> use a third, dedicated checkpointing node in addition to the active and 
> standby nodes. Another option would be to make the standby capable of also 
> performing the function of checkpointing.

