[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968317#comment-14968317 ]
Uma Maheswara Rao G commented on HDFS-9198:
-------------------------------------------

Thanks, Daryn, for the nice work here. This is interesting to me. I have just reviewed the patch. Following are my comments:
# runBlockOp: how about naming it runBlockReportOp?
# nit:
{code}
while (namesystem.isRunning()) {
+ NameNodeMetrics metrics = NameNode.getNameNodeMetrics();
{code}
Maybe we can take metrics outside the loop and use it there?
# I think we need to handle Throwable in this BR processing thread. In case of any unexpected error, this thread should not die silently, as it is one of the important processing threads. We may have to terminate the system in such cases.
Minor suggestion: method names in BM could be like runBlockReportOpSync and runBlockReportAsync?
# Code formatting was missed for these lines:
{code}
metrics.setBlockOpsQueued(queue.size()+1);
metrics.addBlockOpsBatched(processed-1);
{code}
# Currently the DN sets the flag to trigger sendImmediateIBR when IBR processing fails. But now the NN handles the exceptions itself and cannot pass them back to the DN, because the processing is async. So sendImmediateIBR now happens only for IPC-level exceptions. Have you thought about this? Anything missed this way would have to wait until the next BR, right?
# Tests look great to me. Minor suggestion: could you please add javadoc for the tests?

> Coalesce IBR processing in the NN
> ---------------------------------
>
>                 Key: HDFS-9198
>                 URL: https://issues.apache.org/jira/browse/HDFS-9198
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-9198-branch2.patch, HDFS-9198-trunk.patch,
> HDFS-9198-trunk.patch, HDFS-9198-trunk.patch
>
>
> IBRs from thousands of DNs under load will degrade NN performance due to
> excessive write-lock contention from multiple IPC handler threads. The IBR
> processing is quick, so the lock contention may be reduced by coalescing
> multiple IBRs into a single write-lock transaction. The handlers will also
> be freed up faster for other operations.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
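
For readers following the review, here is a minimal Java sketch of the two ideas discussed above: coalescing several queued ops under one write-lock acquisition, and not letting the processing thread die silently on an unexpected Throwable. This is not the code from the HDFS-9198 patch. The class name CoalescedOpQueue, the maxBatch limit, and the onFatal callback are all hypothetical, and a plain ReentrantReadWriteLock stands in for the namesystem write lock.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Consumer;

/** Hypothetical sketch only; names and structure are assumptions, not the patch. */
final class CoalescedOpQueue {
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  // Stand-in for the namesystem lock.
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final int maxBatch;
  private volatile boolean running = true;

  CoalescedOpQueue(int maxBatch) {
    this.maxBatch = maxBatch;
  }

  /** Called by handler threads: queue the op and return immediately. */
  void enqueue(Runnable op) {
    queue.add(op);
  }

  /** Runs whatever is queued right now (up to maxBatch) under one write lock. */
  int drainBatch() {
    Runnable first = queue.poll();
    return first == null ? 0 : runBatch(first);
  }

  /** Runs 'first' plus further queued ops in a single write-lock section. */
  private int runBatch(Runnable first) {
    int processed = 0;
    Runnable op = first;
    lock.writeLock().lock();
    try {
      do {
        op.run();
        processed++;
        // The size check comes first so no op is polled and then dropped.
      } while (processed < maxBatch && (op = queue.poll()) != null);
    } finally {
      lock.writeLock().unlock();
    }
    return processed;
  }

  /**
   * Processing loop. Any Throwable from an op ends the loop and is handed to
   * onFatal (for example, to terminate the process) instead of being lost.
   */
  void runLoop(Consumer<Throwable> onFatal) {
    try {
      while (running) {
        Runnable first = queue.poll(100, TimeUnit.MILLISECONDS);
        if (first != null) {
          runBatch(first);
        }
      }
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt(); // treated as a normal shutdown signal
    } catch (Throwable t) {
      onFatal.accept(t);
    }
  }

  void stop() {
    running = false;
  }
}
```

In this sketch a handler thread would call enqueue(...) and return, while a single dedicated thread runs runLoop(...). Whether a failure should be reported back to the sender or should terminate the process is exactly the open question raised in the review; the sketch only makes the failure visible.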