[ https://issues.apache.org/jira/browse/HDFS-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177367#comment-13177367 ]
Todd Lipcon commented on HDFS-2692:
-----------------------------------

bq. In FSEditLogLoader#loadFSEdits, should we really be unconditionally calling FSNamesystem#notifyGenStampUpdate in the finally block? What if an error occurs and maxGenStamp is never updated in FSEditLogLoader#loadEditRecords

This should be OK -- we'll just call it with the argument 0, which won't cause any problem (0 is lower than any possible queued gen stamp)

bq. sp. "Initiatling" in TestHASafeMode#testComplexFailoverIntoSafemode

fixed

bq. In FSNamesystem#notifyGenStampUpdate, could be a better log message, and the log level should probably not be info: LOG.info("=> notified of genstamp update for: " + gs);

Fixed and changed to DEBUG level

bq. Why is SafeModeInfo#doConsistencyCheck costly? It doesn't seem like it should be. If it's not in fact expensive, we might as well make it run regardless of whether or not asserts are enabled

You're right that it's not super expensive, but this code gets called on every block being reported during startup, which is a fair amount.. so I chose to maintain the current behavior, of only running the checks when asserts are enabled.

bq. Is there really no better way to check if assertions are enabled?

Not that I've ever found! :(

bq. seems like they should all be made member methods and moved to MiniDFSCluster... Also seems like TestEditLogTailer#waitForStandbyToCatchUp should be moved to MiniDFSCluster.

I'd like to move a bunch of these methods into a new {{HATestUtil}} class... can I do that in a follow-up JIRA?

Eli said:
bq. Nice change and tests.
Nit, I'd add a comment in TestHASafeMode#restartStandby where the safemode extension is set indicating the rationale, it looked like the asserts at the end were racy because I missed this

Fixed

> HA: Bugs related to failover from/into safe-mode
> ------------------------------------------------
>
> Key: HDFS-2692
> URL: https://issues.apache.org/jira/browse/HDFS-2692
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: ha, name-node
> Affects Versions: HA branch (HDFS-1623)
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Critical
> Attachments: hdfs-2692.txt, hdfs-2692.txt
>
>
> In testing I saw an AssertionError come up several times when I was trying to
> do failover between two NNs where one or the other was in safe-mode. Need to
> write some unit tests to try to trigger this -- hunch is it has something to
> do with the treatment of "safe block count" while tailing edits in safemode.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
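
For readers following the exchange about checking whether assertions are enabled: the sketch below shows the commonly used Java idiom, an assignment hidden inside an assert statement. The thread does not say which form the patch uses, so treat this as an illustration only. The class name, the method names, the counter and the block-count parameters are all hypothetical and are not taken from HDFS.

```java
public class AssertionCheckSketch {
    // Hypothetical counter, present only so the example can show when the check ran.
    public static int checksRun = 0;

    /**
     * Returns true only when assertions are enabled for this class
     * (for example, when the JVM is started with -ea).
     */
    public static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment is evaluated only if assertions are enabled;
        // otherwise the whole assert statement is skipped.
        assert enabled = true;
        return enabled;
    }

    /**
     * Hypothetical stand-in for a consistency check on a hot path:
     * it does nothing unless assertions are enabled.
     */
    public static void doConsistencyCheck(long safeBlocks, long totalBlocks) {
        if (!assertionsEnabled()) {
            return; // skip the check entirely in production runs
        }
        checksRun++;
        if (safeBlocks < 0 || safeBlocks > totalBlocks) {
            throw new AssertionError(
                "safe block count " + safeBlocks + " out of range [0, " + totalBlocks + "]");
        }
    }

    public static void main(String[] args) {
        System.out.println("assertions enabled: " + assertionsEnabled());
        doConsistencyCheck(5, 10);
        System.out.println("checks run: " + checksRun);
    }
}
```

Running this with `java -ea AssertionCheckSketch` performs the check; without `-ea` the guarded method returns immediately, which matches the behavior described above of only running the checks when asserts are enabled.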