[ 
https://issues.apache.org/jira/browse/HDFS-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13177367#comment-13177367
 ] 

Todd Lipcon commented on HDFS-2692:
-----------------------------------

bq. In FSEditLogLoader#loadFSEdits, should we really be unconditionally calling 
FSNamesystem#notifyGenStampUpdate in the finally block? What if an error occurs 
and maxGenStamp is never updated in FSEditLogLoader#loadEditRecords

This should be OK -- if maxGenStamp is never updated, we'll just call it with 
the argument 0, which won't cause any problem (0 is lower than any possible 
queued gen stamp).
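
To illustrate why the unconditional call is harmless, here is a minimal, hypothetical sketch; it is not the actual FSNamesystem or FSEditLogLoader code. It assumes pending messages are queued by generation stamp and that a notification releases only those at or below the notified value. Every name other than notifyGenStampUpdate and maxGenStamp is made up for illustration, and the premise that real gen stamps are always greater than 0 is taken from the comment above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical stand-in for a queue of messages waiting on a gen stamp. */
class GenStampSketch {
    // Messages keyed by the generation stamp they are waiting for.
    private final TreeMap<Long, List<String>> queued = new TreeMap<>();

    // Result of the most recent notification, kept for inspection.
    List<String> lastReleased = new ArrayList<>();

    void queueMessage(long genStamp, String msg) {
        queued.computeIfAbsent(genStamp, k -> new ArrayList<>()).add(msg);
    }

    /** Releases every queued message whose gen stamp is <= gs. */
    List<String> notifyGenStampUpdate(long gs) {
        List<String> released = new ArrayList<>();
        // headMap(gs, true) is a live view of all entries with key <= gs.
        NavigableMap<Long, List<String>> ready = queued.headMap(gs, true);
        for (List<String> msgs : ready.values()) {
            released.addAll(msgs);
        }
        ready.clear(); // also removes the entries from the backing map
        return released;
    }

    /** Mimics the loader: maxGenStamp stays 0 if loading fails early. */
    void loadEdits(long[] genStampsInLog, boolean failEarly) {
        long maxGenStamp = 0;
        try {
            if (failEarly) {
                throw new IllegalStateException("simulated load failure");
            }
            for (long gs : genStampsInLog) {
                maxGenStamp = Math.max(maxGenStamp, gs);
            }
        } finally {
            // Runs on success and on failure; with 0 nothing is released.
            lastReleased = notifyGenStampUpdate(maxGenStamp);
        }
    }
}
```

In this sketch a failed load notifies with 0, so the queue is left untouched and the messages are released by a later successful load.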

bq. sp. "Initiatling" in TestHASafeMode#testComplexFailoverIntoSafemode
Fixed

bq. In FSNamesystem#notifyGenStampUpdate, could be a better log message, and 
the log level should probably not be info: LOG.info("=> notified of genstamp 
update for: " + gs);
Fixed and changed to DEBUG level

bq. Why is SafeModeInfo#doConsistencyCheck costly? It doesn't seem like it 
should be. If it's not in fact expensive, we might as well make it run 
regardless of whether or not asserts are enabled
You're right that it's not super expensive, but this code gets called for every 
block reported during startup, which adds up. So I chose to maintain the 
current behavior of only running the checks when asserts are enabled.

bq. Is there really no better way to check if assertions are enabled?
Not that I've ever found! :(
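
For reference, the usual idiom relies on a side effect inside an assert statement, which only executes when the JVM is started with -ea. The sketch below shows that idiom and how a check can be gated on it. AssertCheck and SafeModeSketch are hypothetical names for illustration, not the classes in the patch, and the fields are simplified stand-ins for the safe-mode counters.

```java
/** Detects whether assertions are enabled for this class. */
final class AssertCheck {
    private AssertCheck() {}

    static boolean assertsEnabled() {
        boolean enabled = false;
        // The assignment below is evaluated only when asserts are on.
        assert enabled = true;
        return enabled;
    }
}

/** Hypothetical, simplified stand-in for the safe-mode bookkeeping. */
final class SafeModeSketch {
    private final int blockTotal;
    private int blockSafe = 0;
    int checksRun = 0; // counts how often the consistency check ran

    SafeModeSketch(int blockTotal) {
        this.blockTotal = blockTotal;
    }

    /** Called once per reported block, so the check is gated on asserts. */
    void incrementSafeBlockCount() {
        blockSafe++;
        if (AssertCheck.assertsEnabled()) {
            doConsistencyCheck();
        }
    }

    private void doConsistencyCheck() {
        checksRun++;
        if (blockSafe < 0 || blockSafe > blockTotal) {
            throw new AssertionError(
                "safe block count " + blockSafe + " outside [0, " + blockTotal + "]");
        }
    }
}
```

Without -ea the gated call is skipped entirely, so the per-block cost is limited to the cheap assertsEnabled() test.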

bq. seems like they should all be made member methods and moved to 
MiniDFSCluster... Also seems like TestEditLogTailer#waitForStandbyToCatchUp 
should be moved to MiniDFSCluster.
I'd like to move a bunch of these methods into a new {{HATestUtil}} class... 
can I do that in a follow-up JIRA?

Eli said:
bq. Nice change and tests. Nit, I'd add a comment in 
TestHASafeMode#restartStandby where the safemode extension is set indicating 
the rationale, it looked like the asserts at the end were racy because I missed 
this
Fixed
                
> HA: Bugs related to failover from/into safe-mode
> ------------------------------------------------
>
>                 Key: HDFS-2692
>                 URL: https://issues.apache.org/jira/browse/HDFS-2692
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>    Affects Versions: HA branch (HDFS-1623)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>         Attachments: hdfs-2692.txt, hdfs-2692.txt
>
>
> In testing I saw an AssertionError come up several times when I was trying to 
> do failover between two NNs where one or the other was in safe-mode. Need to 
> write some unit tests to try to trigger this -- hunch is it has something to 
> do with the treatment of "safe block count" while tailing edits in safemode.


        
