[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14987330#comment-14987330 ]
Daryn Sharp commented on HDFS-9198:
-----------------------------------

# I named it more generically since it can be used for more, but I'll rename if you feel strongly.
# I originally had it outside the loop but something failed. Probably a test. Since it's a class static it should be cheap.
# The NN is terminated when the queue stops accepting offers and the thread isn't running, but I'll add a try/catch around the run() loop too.
# Will look at the style again.
# Yes, I compensated for async IBR failures by making the next heartbeat trigger a re-registration. The only exception we've seen from an IBR is dead/unregistered node so the new code is a no-op but I added it as a safety net.
# Ok.

> Coalesce IBR processing in the NN
> ---------------------------------
>
>                 Key: HDFS-9198
>                 URL: https://issues.apache.org/jira/browse/HDFS-9198
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: namenode
>    Affects Versions: 2.0.0-alpha
>            Reporter: Daryn Sharp
>            Assignee: Daryn Sharp
>         Attachments: HDFS-9198-branch2.patch, HDFS-9198-trunk.patch, HDFS-9198-trunk.patch, HDFS-9198-trunk.patch
>
>
> IBRs from thousands of DNs under load will degrade NN performance due to
> excessive write-lock contention from multiple IPC handler threads. The IBR
> processing is quick, so the lock contention may be reduced by coalescing
> multiple IBRs into a single write-lock transaction. The handlers will also
> be freed up faster for other operations.
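
For readers skimming the thread, the coalescing idea in the issue summary can be sketched roughly as below. This is only an illustration under stated assumptions, not the code in the attached patches: the class name `CoalescingProcessor`, its methods, and the `maxBatch` limit are all hypothetical, and a plain `ReentrantReadWriteLock` stands in for the namesystem lock. Handler threads enqueue a small action and return; a single consumer drains whatever is queued and runs it under one write-lock acquisition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of coalesced processing; not the HDFS-9198 patch. */
class CoalescingProcessor {
  private final BlockingQueue<Runnable> queue;
  // Stand-in for the namesystem write lock (an assumption for this sketch).
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final int maxBatch;
  // Counts write-lock acquisitions so the effect of coalescing is visible.
  private int lockAcquisitions = 0;

  CoalescingProcessor(int capacity, int maxBatch) {
    if (maxBatch < 1) {
      throw new IllegalArgumentException("maxBatch must be >= 1");
    }
    this.queue = new ArrayBlockingQueue<>(capacity);
    this.maxBatch = maxBatch;
  }

  /** Called by handler threads; returns false if the queue is full. */
  boolean enqueue(Runnable action) {
    return queue.offer(action);
  }

  /**
   * Runs up to maxBatch queued actions under a single write-lock
   * acquisition. Non-blocking: returns 0 if nothing is queued.
   * A real consumer thread would more likely block on take().
   */
  int processBatch() {
    Runnable first = queue.poll();
    if (first == null) {
      return 0;
    }
    List<Runnable> batch = new ArrayList<>();
    batch.add(first);
    queue.drainTo(batch, maxBatch - 1);

    lock.writeLock().lock();
    try {
      lockAcquisitions++;
      for (Runnable action : batch) {
        action.run();
      }
    } finally {
      lock.writeLock().unlock();
    }
    return batch.size();
  }

  int lockAcquisitions() {
    return lockAcquisitions;
  }
}
```

In this sketch, ten queued actions with `maxBatch` set to 8 take two lock acquisitions instead of ten. Capping the batch size is a design choice made here so that readers waiting on the lock are not starved by one very long batch; the summary above does not say whether the patch does the same.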