[ 
https://issues.apache.org/jira/browse/HDFS-2291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13179188#comment-13179188
 ] 

Aaron T. Myers commented on HDFS-2291:
--------------------------------------

Thanks a lot for providing this patch, Todd. What's below are mostly nits. I 
agree that there could be a few more comments for the new public methods, so I 
didn't include that in my feedback.

# {{dfs.namenode.standby.checkpoints}} - perhaps include ".ha" in there to make 
it clear that this option is only applicable in an HA setup?
# Might as well make the members of {{CheckpointConf}} final.
# {{LOG.info("Counted txns in " + file + ": " + val.getNumTransactions());}} - 
Either should be removed or should not be info level.
# {{prepareStopStandbyServices}} is kind of a weird name. Perhaps 
"prepareToStopStandbyServices" ?
# "// TODO interface audience" in {{TransferFsImage}}
# Does it not seem strange to you that the order of operations when setting a 
state is "prepareExit -> prepareEnter -> exit -> enter," instead of 
"prepareExit -> exit -> prepareEnter -> enter" ? i.e. I don't think there's a 
correctness issue here, but if I were designing a system where this set of 
events is triggered, I'd go with the latter.
# What's the point of the changes in {{EditLogTailer}}?
# "TODO: need to cancel the savenamespace operation if it's in flight" - I 
think this comment is no longer applicable to this patch, right?
# {{LOG.info("Time for a checkpoint !");}} - while strictly accurate, this 
doesn't seem to be the most helpful log message.
# Can we make {{CheckpointerThread}} a static inner class?
# {{e.printStackTrace();}} in {{CheckpointerThread}} should probably be tossed.
# Nit: in {{CheckpointerThread#doWork}}: 
"if(UserGroupInformation.isSecurityEnabled())" - space between "if" and "(", 
and curly braces around body of "if".
# You use "System.currentTimeMillis" in a bunch of places. How about replacing 
with "o.a.h.hdfs.server.common.Util#now" ?
# Does it make sense to explicitly disallow the SBN from allowing checkpoints 
to be uploaded to it? I realize the case when both nodes are in standby is 
already handled by this patch, since you don't allow checkpoints if the node 
already has a checkpoint for a given txid, but I mean from a principled 
perspective. It seems kind of odd to me that two nodes both sitting in standby 
would be doing checkpoint transfers at all.

                
> HA: Checkpointing in an HA setup
> --------------------------------
>
>                 Key: HDFS-2291
>                 URL: https://issues.apache.org/jira/browse/HDFS-2291
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Aaron T. Myers
>            Assignee: Todd Lipcon
>             Fix For: HA branch (HDFS-1623)
>
>         Attachments: hdfs-2291.txt, hdfs-2291.txt
>
>
> We obviously need to create checkpoints when HA is enabled. One thought is to 
> use a third, dedicated checkpointing node in addition to the active and 
> standby nodes. Another option would be to make the standby capable of also 
> performing the function of checkpointing.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to