[ 
https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14968317#comment-14968317
 ] 

Uma Maheswara Rao G commented on HDFS-9198:
-------------------------------------------

Thanks Daryn for the nice work here. This is interesting to me.
I have just reviewed the patch. Following are my comments:

# runBlockOp: how about naming it runBlockReportOp?
# nit: {code}
while (namesystem.isRunning()) {
+        NameNodeMetrics metrics = NameNode.getNameNodeMetrics();
{code}
Maybe we can get metrics once outside the loop and reuse it?
# I think we need to handle Throwable in this BR processing thread. In case of 
any unexpected error, this thread should not die silently, as it is one of the 
important processing threads. We may have to terminate the system in such 
cases.
# Minor suggestion: the method names in BM could be runBlockReportOpSync and 
runBlockReportAsync?
# Code formatting was missed on these lines:
{code}
metrics.setBlockOpsQueued(queue.size()+1);
metrics.addBlockOpsBatched(processed-1);
{code}
# Currently the DN sets the sendImmediateIBR flag when IBR processing fails. 
With this patch the NN handles those exceptions itself and cannot pass them 
back to the DN, because the processing is async. So sendImmediateIBR is now 
triggered only for IPC-level exceptions. Have you thought about this? Any 
info missed this way would have to wait until the next BR, right?
# The tests look great to me. Minor suggestion: could you please add javadoc 
for the tests?
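
To illustrate the Throwable comment above, here is a minimal sketch of the idea, not the code in the patch. The class name BlockReportWorker, the fatalHandler hook and the queue of Runnables are all hypothetical stand-ins; in the NN the handler would presumably log the error and terminate the process rather than just record it.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Consumer;

/** Hypothetical stand-in for the BR processing thread. */
class BlockReportWorker extends Thread {
  // Unbounded queue of pending operations (assumption: ops are plain Runnables).
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  // Hypothetical hook invoked on any unexpected error, e.g. to terminate the system.
  private final Consumer<Throwable> fatalHandler;
  private volatile boolean running = true;

  BlockReportWorker(Consumer<Throwable> fatalHandler) {
    super("block-report-worker");
    setDaemon(true);
    this.fatalHandler = fatalHandler;
  }

  /** Queues an operation; never blocks because the queue is unbounded. */
  void enqueue(Runnable op) {
    queue.add(op);
  }

  /** Requests a normal shutdown of the worker. */
  void shutdown() {
    running = false;
    interrupt();
  }

  @Override
  public void run() {
    try {
      while (running) {
        queue.take().run();
      }
    } catch (InterruptedException ie) {
      // Normal shutdown path: restore the interrupt flag and exit.
      Thread.currentThread().interrupt();
    } catch (Throwable t) {
      // Unexpected error: report it instead of letting the thread die silently.
      fatalHandler.accept(t);
    }
  }
}
```

The point is only that every exit from the loop other than a requested shutdown goes through the handler, so a RuntimeException or Error in one operation cannot silently stop all further BR processing.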


> Coalesce IBR processing in the NN
> ---------------------------------
>
>                 Key: HDFS-9198
>                 URL: https://issues.apache.org/jira/browse/HDFS-9198
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-9198-branch2.patch, HDFS-9198-trunk.patch, 
> HDFS-9198-trunk.patch, HDFS-9198-trunk.patch
>
>
> IBRs from thousands of DNs under load will degrade NN performance due to 
> excessive write-lock contention from multiple IPC handler threads.  The IBR 
> processing is quick, so the lock contention may be reduced by coalescing 
> multiple IBRs into a single write-lock transaction.  The handlers will also 
> be freed up faster for other operations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
