[ 
https://issues.apache.org/jira/browse/HDFS-1073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12856636#action_12856636
 ] 

Sanjay Radia commented on HDFS-1073:
------------------------------------

Todd, thanks for the design. Before you move too far forward on the patch, I 
would like to get consensus on the 2 alternate designs. 
Further, I think we need to add some items to the design doc. (Given that we may 
go through 2 or 3 versions of the doc, would it be better to attach it rather 
than post it inline?)

Please add the following items to your design doc:
* BNN restarts - how does it sync up? What if we have multiple BNNs?
* Checkpoint:
** Concurrent checkpoints (saveImage and checkpointer)
** Checkpoint done in the BNN, which is also applying the edits stream to its 
state - does the notion of spooling in the current design change?
** Explore the notion of having checkpoints done offline - this is not targeted 
for the next release, but it is something we may want down the road, so we need 
to evaluate the designs against it. (Of course, we also need to evaluate whether 
offline checkpoints are a good idea in the first place.)
* Managing edits and images in an HA environment. Here the idea is to move the 
image and edits to shared storage and treat the NN as "diskless". This is 
especially useful for federation, when there are multiple NNs. Moving/writing 
the image to shared storage is not difficult, and it avoids the need to send the 
image back to the primary NN. Moving the edits to shared storage is hard because 
of the latency requirements. Here BookKeeper can come to the rescue; I don't see 
any other solutions so far.

I am *not* proposing a very detailed design of the above items, since we don't 
have the resources to do all that. However, as we evaluate the 2 alternate 
designs, let's use the above items to guide us.

> Simpler model for Namenode's fs Image and edit Logs 
> ----------------------------------------------------
>
>                 Key: HDFS-1073
>                 URL: https://issues.apache.org/jira/browse/HDFS-1073
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Sanjay Radia
>            Assignee: Todd Lipcon
>
> The naming and handling of NN's fsImage and edit logs can be significantly 
> improved, resulting in simpler and more robust code.
