[ https://issues.apache.org/jira/browse/HDFS-3010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13215806#comment-13215806 ]
Todd Lipcon commented on HDFS-3010:
-----------------------------------

There are three possible components to the perf issue, I think:
1) DN now sends RBW replicas to both NNs as soon as a block starts to be created. This adds 3 RPCs to each block creation (though they don't write to the edit logs).
2) When blocks are allocated, we now log the full block list of that file. This creates a much bigger edit log, so of course takes more time.
3) When HA is enabled, these new edit log entries are fsynced, which makes it even slower.

I'm hoping to set up a cluster to test each of these in isolation by commenting out the related code from the HA branch and measuring a write benchmark. Once we identify which is the worst issue we can tackle it.

> Get performance on HA branch to match trunk
> -------------------------------------------
>
>                 Key: HDFS-3010
>                 URL: https://issues.apache.org/jira/browse/HDFS-3010
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: ha, name-node
>   Affects Versions: HA branch (HDFS-1623)
>            Reporter: Todd Lipcon
>            Assignee: Todd Lipcon
>            Priority: Critical
>
> As described in [this comment|https://issues.apache.org/jira/browse/HDFS-1623?focusedCommentId=13215309&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13215309]
> the performance of the HA branch for writes is significantly reduced
> compared to trunk. We need to dig a bit and optimize whatever it is that's
> hurting us in order to get back to the same performance numbers.
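
For anyone who wants a rough feel for components 2 and 3 of the comment (larger edit log entries, and an fsync per entry) before a cluster is available, here is a minimal standalone sketch. It is not HDFS code and does not touch the NameNode or the real edit log classes. The class name `EditLogSyncSketch`, the method `appendRecords`, and the record sizes are all hypothetical, chosen only for illustration. It appends fixed-size records to a temp file through a `FileChannel`, optionally calling `force(false)` after each record, and reports elapsed time. Absolute numbers depend heavily on the disk and filesystem, so treat the output as a relative comparison only.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration only: this is NOT the HDFS edit log implementation.
public class EditLogSyncSketch {

    /** Outcome of one run: total bytes appended and elapsed wall-clock time. */
    public static final class Result {
        public final long bytesWritten;
        public final long elapsedNanos;

        Result(long bytesWritten, long elapsedNanos) {
            this.bytesWritten = bytesWritten;
            this.elapsedNanos = elapsedNanos;
        }
    }

    /**
     * Appends `records` zero-filled records of `recordBytes` each to `log`,
     * truncating any previous content. When `syncEach` is true, the channel
     * is forced to disk after every record, standing in for a per-entry fsync.
     */
    public static Result appendRecords(Path log, int records, int recordBytes,
                                       boolean syncEach) throws IOException {
        ByteBuffer record = ByteBuffer.allocate(recordBytes);
        long written = 0;
        long start = System.nanoTime();
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE,
                StandardOpenOption.WRITE,
                StandardOpenOption.TRUNCATE_EXISTING)) {
            for (int i = 0; i < records; i++) {
                record.clear(); // rewind so the whole buffer is written again
                while (record.hasRemaining()) {
                    written += ch.write(record);
                }
                if (syncEach) {
                    ch.force(false); // flush file data (not metadata) to disk
                }
            }
        }
        return new Result(written, System.nanoTime() - start);
    }

    public static void main(String[] args) throws IOException {
        Path log = Files.createTempFile("editlog-sketch", ".bin");
        try {
            int records = 200;
            // Assumed sizes: a "small" entry vs. one carrying a full block list.
            int[] sizes = {64, 4096};
            for (int size : sizes) {
                for (boolean sync : new boolean[] {false, true}) {
                    Result r = appendRecords(log, records, size, sync);
                    System.out.printf("recordBytes=%d syncEach=%b bytes=%d elapsedMs=%.2f%n",
                            size, sync, r.bytesWritten, r.elapsedNanos / 1e6);
                }
            }
        } finally {
            Files.deleteIfExists(log);
        }
    }
}
```

Component 1 (the extra RBW report RPCs) is not modeled here; isolating it needs a real DN/NN pair, as the comment proposes.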