[ https://issues.apache.org/jira/browse/HDFS-6440?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14531297#comment-14531297 ]
Jesse Yates commented on HDFS-6440:
-----------------------------------

More comments, as I actually get back into the code:

{quote}
In StandbyCheckpointer#doCheckpoint, unless I'm missing something, I don't think the variable "ie" can ever be non-null, and yet we check for whether or not it's null later in the method to determine if we should shut down.
{quote}

It can be either an InterruptedException or an IOException when transferring the checkpoint. The InterruptedException ("ie") is thrown if we are interrupted while waiting for any checkpoint to complete. The IOException ("ioe") is thrown if there is an execution exception when doing the checkpoint.

After we get out of waiting for the uploads, if we got an "ioe" or an "ie", we force the rest of the threads that we started for the image transfer to quit by shutting down the thread pool (and then forcibly shutting it down shortly after that). We then check each exception again to ensure we throw the right one back up.

We could wrap the exceptions into a parent exception and then just throw that back up to the caller (resulting in fewer checks), but I didn't want to change the method signature because an interrupt means something very different from an "ioe". Can do whatever you want there though; it doesn't really matter to me. We just need to make sure either exception is rethrown.

> Support more than 2 NameNodes
> -----------------------------
>
>                 Key: HDFS-6440
>                 URL: https://issues.apache.org/jira/browse/HDFS-6440
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: auto-failover, ha, namenode
>    Affects Versions: 2.4.0
>            Reporter: Jesse Yates
>            Assignee: Jesse Yates
>         Attachments: Multiple-Standby-NameNodes_V1.pdf,
> hdfs-6440-cdh-4.5-full.patch, hdfs-6440-trunk-v1.patch,
> hdfs-6440-trunk-v1.patch, hdfs-multiple-snn-trunk-v0.patch
>
>
> Most of the work is already done to support more than 2 NameNodes (one
> active, one standby). This would be the last bit to support running multiple
> _standby_ NameNodes; one of the standbys should be available for fail-over.
> Mostly, this is a matter of updating how we parse configurations, some
> complexity around managing the checkpointing, and updating a whole lot of
> tests.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
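
For readers following the exception handling discussed in the comment above, here is a minimal Java sketch of that flow: wait on every upload, remember an "ie" or an "ioe", shut the thread pool down (gracefully first, then forcibly), and rethrow whichever exception was captured with its original type. This is not the actual StandbyCheckpointer#doCheckpoint code. The class name `CheckpointUploadSketch`, the method `uploadToAll`, the `GRACE_MILLIS` timeout, and the choice to check "ie" before "ioe" are all assumptions made for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: names and timeout are illustrative, not Hadoop's.
final class CheckpointUploadSketch {

  // Assumed grace period before the pool is shut down forcibly.
  private static final long GRACE_MILLIS = 100;

  /** Runs all uploads in parallel; rethrows the captured failure, if any. */
  static void uploadToAll(List<Callable<Void>> uploads)
      throws IOException, InterruptedException {
    ExecutorService pool =
        Executors.newFixedThreadPool(Math.max(1, uploads.size()));
    List<Future<Void>> pending = new ArrayList<>();
    for (Callable<Void> upload : uploads) {
      pending.add(pool.submit(upload));
    }

    InterruptedException ie = null;
    IOException ioe = null;
    for (Future<Void> f : pending) {
      try {
        f.get();
      } catch (ExecutionException e) {
        // The upload itself failed: surface it as an IOException.
        ioe = new IOException("Exception during image upload", e.getCause());
        break;
      } catch (InterruptedException e) {
        // The waiting thread was interrupted: keep the exception as-is.
        ie = e;
        break;
      }
    }

    // Stop accepting work; on failure, also force remaining uploads to quit.
    pool.shutdown();
    if (ie != null || ioe != null) {
      try {
        if (!pool.awaitTermination(GRACE_MILLIS, TimeUnit.MILLISECONDS)) {
          pool.shutdownNow();
        }
      } catch (InterruptedException e) {
        pool.shutdownNow();
        Thread.currentThread().interrupt(); // preserve the interrupt status
      }
    }

    // Check each exception again so the caller sees the right type.
    if (ie != null) {
      throw ie;
    }
    if (ioe != null) {
      throw ioe;
    }
  }

  public static void main(String[] args) throws Exception {
    List<Callable<Void>> uploads = new ArrayList<>();
    uploads.add(() -> null);
    uploads.add(() -> { throw new IllegalStateException("simulated failure"); });
    try {
      uploadToAll(uploads);
    } catch (IOException e) {
      System.out.println("upload failed: " + e.getCause().getMessage());
    }
  }
}
```

Keeping two separate variables, rather than wrapping both in a parent exception, leaves the method signature unchanged and lets callers keep treating an interrupt differently from an I/O failure, which is the trade-off the comment describes.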