[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16726723#comment-16726723 ]

Wei-Chiu Chuang commented on HDFS-9198:
---

The existing approach ensures the info message is printed at most once per 4 seconds. Otherwise you'd see the same message overwhelming the NN log.

> Coalesce IBR processing in the NN
> -
>
> Key: HDFS-9198
> URL: https://issues.apache.org/jira/browse/HDFS-9198
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 2.0.0-alpha
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Priority: Major
> Fix For: 2.8.0, 2.7.3, 3.0.0-alpha1
>
> Attachments: HDFS-9198-Branch-2-withamend.diff,
> HDFS-9198-Branch-2.8-withamend.diff, HDFS-9198-branch-2.7.patch,
> HDFS-9198-branch2.patch, HDFS-9198-trunk.patch, HDFS-9198-trunk.patch,
> HDFS-9198-trunk.patch, HDFS-9198-trunk.patch, HDFS-9198-trunk.patch
>
>
> IBRs from thousands of DNs under load will degrade NN performance due to
> excessive write-lock contention from multiple IPC handler threads. The IBR
> processing is quick, so the lock contention may be reduced by coalescing
> multiple IBRs into a single write-lock transaction. The handlers will also
> be freed up faster for other operations.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
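
The throttling described in the comment above (at most one "queue is full" message per 4 seconds) can be sketched in isolation. This is only an illustration, not the Hadoop code: `LogThrottle`, `ThrottleDemo`, and `countFullMessages` are hypothetical names, timestamps are passed in explicitly instead of being read from a clock, and a rejected action is dropped rather than handed to a blocking `put`. Unlike a field that starts at zero, this sketch always lets the first message through.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical helper (not a Hadoop class): allows one message per interval. */
final class LogThrottle {
  private final long intervalMs;
  private long lastLoggedMs;
  private boolean loggedBefore;

  LogThrottle(long intervalMs) {
    this.intervalMs = intervalMs;
  }

  /** Returns true if a message may be logged at nowMs (monotonic milliseconds). */
  synchronized boolean shouldLog(long nowMs) {
    if (!loggedBefore || nowMs - lastLoggedMs > intervalMs) {
      loggedBefore = true;
      lastLoggedMs = nowMs;
      return true;
    }
    return false;
  }
}

/** Hypothetical demo: counts how many "queue is full" messages would appear. */
final class ThrottleDemo {
  static int countFullMessages(int capacity, long[] arrivalTimesMs) {
    BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(capacity);
    LogThrottle throttle = new LogThrottle(4000);
    int logged = 0;
    for (long now : arrivalTimesMs) {
      Runnable action = () -> { };
      // offer() fails exactly when the bounded queue is full, so fullness is
      // already known here; the time check only limits how often it is reported.
      if (!queue.offer(action)) {
        if (throttle.shouldLog(now)) {
          logged++;
        }
        // Assumption: a real producer would block here until space is free;
        // this sketch simply drops the action to stay single-threaded.
      }
    }
    return logged;
  }

  public static void main(String[] args) {
    long[] times = {0, 1, 2, 1000, 3000, 5000, 9500};
    System.out.println("messages logged: " + countFullMessages(2, times));
  }
}
```

The point of the design is that the failed `offer` is the fullness test, while the timestamp comparison is purely a rate limit on the log line.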
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16726664#comment-16726664 ]

He Xiaoqiao commented on HDFS-9198:
---

Thanks [~daryn] for the nice work; it is very useful to me. I am confused by the following code. The log message says the block report queue is full, but the IF statement only checks how much time has passed since lastFull. Why not use the queue size directly to decide whether the queue is full?

{code:java}
void enqueue(Runnable action) throws InterruptedException {
  if (!queue.offer(action)) {
    if (!isAlive() && namesystem.isRunning()) {
      ExitUtil.terminate(1, getName() + " is not running");
    }
    long now = Time.monotonicNow();
    if (now - lastFull > 4000) {
      lastFull = now;
      LOG.info("Block report queue is full");
    }
    queue.put(action);
  }
}
{code}

If I missed something, please correct me. Thanks again.
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174214#comment-15174214 ]

Kihwal Lee commented on HDFS-9198:
--

Relevant tests are all passing with this patch.

{noformat}
---
 T E S T S
---
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
Running org.apache.hadoop.hdfs.server.datanode.TestIncrementalBrVariations
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.301 sec - in org.apache.hadoop.hdfs.server.datanode.TestIncrementalBrVariations
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
Running org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.83 sec - in org.apache.hadoop.hdfs.server.namenode.TestDeadDatanode
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
Running org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager
Tests run: 17, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 27.238 sec - in org.apache.hadoop.hdfs.server.blockmanagement.TestBlockManager
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
Running org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication
Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 33.094 sec - in org.apache.hadoop.hdfs.server.blockmanagement.TestPendingReplication
OpenJDK 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
Running org.apache.hadoop.hdfs.TestDatanodeRegistration
Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 18.187 sec - in org.apache.hadoop.hdfs.TestDatanodeRegistration

Results :

Tests run: 32, Failures: 0, Errors: 0, Skipped: 0
{noformat}

Just committed to branch-2.7. Thanks, Vinay!
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15174190#comment-15174190 ]

Kihwal Lee commented on HDFS-9198:
--

+1 for the branch-2.7 patch. It is code-wise identical to what we are running internally.
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15121147#comment-15121147 ]

Vinayakumar B commented on HDFS-9198:
-

[~daryn], Is it worth merging this to 2.7.3?
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15121517#comment-15121517 ]

Kihwal Lee commented on HDFS-9198:
--

We have applied this internally to 2.7 and have run it for quite some time, so I would say it is safe. HDFS-8999 and HDFS-9710 will make it less important, but since those are more invasive and risky, HDFS-9198 in 2.7 might be a better alternative for now.
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061148#comment-15061148 ]

Hadoop QA commented on HDFS-9198:
-

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | reexec | 0m 0s | Docker mode activated. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
| {color:green}+1{color} | mvninstall | 8m 39s | trunk passed |
| {color:green}+1{color} | compile | 0m 51s | trunk passed with JDK v1.8.0_66 |
| {color:green}+1{color} | compile | 0m 47s | trunk passed with JDK v1.7.0_91 |
| {color:green}+1{color} | checkstyle | 0m 19s | trunk passed |
| {color:green}+1{color} | mvnsite | 1m 0s | trunk passed |
| {color:green}+1{color} | mvneclipse | 0m 14s | trunk passed |
| {color:green}+1{color} | findbugs | 2m 6s | trunk passed |
| {color:green}+1{color} | javadoc | 1m 16s | trunk passed with JDK v1.8.0_66 |
| {color:green}+1{color} | javadoc | 1m 59s | trunk passed with JDK v1.7.0_91 |
| {color:green}+1{color} | mvninstall | 0m 57s | the patch passed |
| {color:green}+1{color} | compile | 0m 53s | the patch passed with JDK v1.8.0_66 |
| {color:red}-1{color} | javac | 7m 13s | hadoop-hdfs-project_hadoop-hdfs-jdk1.8.0_66 with JDK v1.8.0_66 generated 3 new issues (was 32, now 32). |
| {color:green}+1{color} | javac | 0m 53s | the patch passed |
| {color:green}+1{color} | compile | 0m 54s | the patch passed with JDK v1.7.0_91 |
| {color:red}-1{color} | javac | 8m 9s | hadoop-hdfs-project_hadoop-hdfs-jdk1.7.0_91 with JDK v1.7.0_91 generated 3 new issues (was 34, now 34). |
| {color:green}+1{color} | javac | 0m 54s | the patch passed |
| {color:red}-1{color} | checkstyle | 0m 20s | Patch generated 5 new checkstyle issues in hadoop-hdfs-project/hadoop-hdfs (total was 384, now 384). |
| {color:green}+1{color} | mvnsite | 1m 7s | the patch passed |
| {color:green}+1{color} | mvneclipse | 0m 17s | the patch passed |
| {color:green}+1{color} | whitespace | 0m 0s | Patch has no whitespace issues. |
| {color:green}+1{color} | findbugs | 2m 31s | the patch passed |
| {color:green}+1{color} | javadoc | 1m 24s | the patch passed with JDK v1.8.0_66 |
| {color:green}+1{color} | javadoc | 2m 11s | the patch passed with JDK v1.7.0_91 |
| {color:red}-1{color} | unit | 66m 43s | hadoop-hdfs in the patch failed with JDK v1.8.0_66. |
| {color:red}-1{color} | unit | 57m 48s | hadoop-hdfs in the patch failed with JDK v1.7.0_91. |
| {color:red}-1{color} | asflicense | 0m 25s | Patch generated 58 ASF License warnings. |
| | | 155m 53s | |

|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
| | hadoop.hdfs.TestDFSUpgradeFromImage |
| | hadoop.hdfs.server.datanode.TestBlockScanner |
| |
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061331#comment-15061331 ]

Uma Maheswara Rao G commented on HDFS-9198:
---

+1 on the latest patch. Failures seem unrelated. Committing.
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15061377#comment-15061377 ]

Hudson commented on HDFS-9198:
--

FAILURE: Integrated in Hadoop-trunk-Commit #8982 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/8982/])
HDFS-9198. Coalesce IBR processing in the NN. (Daryn Sharp via (umamahesh: rev f741476146574550a1a208d58ef8be76639e5ddc)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/TestDeadDatanode.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/BlockReportTestBase.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/TestDatanodeRegistration.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/CacheManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestIncrementalBrVariations.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeDescriptor.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/NameNodeRpcServer.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeStorageInfo.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestPendingReplication.java
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestBlockManager.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/metrics/NameNodeMetrics.java
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15058467#comment-15058467 ]

Uma Maheswara Rao G commented on HDFS-9198:
---

Hi Daryn, apologies for the delay on this. Let me finish my final pass on this today. Thanks.
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15059188#comment-15059188 ]

Uma Maheswara Rao G commented on HDFS-9198:
---

[~daryn], the latest patch looks good to me. Could you please update the patch against the latest trunk code? It seems to be failing to apply cleanly. Sorry for the late review on this. Rebasing will also give Jenkins a chance to run on the latest code base. Thanks
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15058216#comment-15058216 ]

Daryn Sharp commented on HDFS-9198:
---

Comments? Been running successfully with this feature since early Nov. Will update patch if necessary if it's currently acceptable. Seem to recall test failures were general build instability.
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987804#comment-14987804 ]

Hadoop QA commented on HDFS-9198:
-

| (x) *{color:red}-1 overall{color}* |

|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | reexec | 0m 7s | docker + precommit patch detected. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | test4tests | 0m 0s | The patch appears to include 6 new or modified test files. |
| {color:green}+1{color} | mvninstall | 3m 12s | trunk passed |
| {color:green}+1{color} | compile | 0m 38s | trunk passed with JDK v1.8.0_60 |
| {color:green}+1{color} | compile | 0m 34s | trunk passed with JDK v1.7.0_79 |
| {color:green}+1{color} | checkstyle | 0m 17s | trunk passed |
| {color:green}+1{color} | mvneclipse | 0m 14s | trunk passed |
| {color:red}-1{color} | findbugs | 2m 3s | hadoop-hdfs-project/hadoop-hdfs in trunk cannot run convertXmlToText from findbugs |
| {color:green}+1{color} | javadoc | 1m 12s | trunk passed with JDK v1.8.0_60 |
| {color:green}+1{color} | javadoc | 1m 56s | trunk passed with JDK v1.7.0_79 |
| {color:green}+1{color} | mvninstall | 0m 41s | the patch passed |
| {color:green}+1{color} | compile | 0m 37s | the patch passed with JDK v1.8.0_60 |
| {color:green}+1{color} | javac | 0m 37s | the patch passed |
| {color:green}+1{color} | compile | 0m 34s | the patch passed with JDK v1.7.0_79 |
| {color:green}+1{color} | javac | 0m 34s | the patch passed |
| {color:red}-1{color} | checkstyle | 0m 16s | Patch generated 5 new checkstyle issues in hadoop-hdfs-project/hadoop-hdfs (total was 410, now 410). |
| {color:green}+1{color} | mvneclipse | 0m 14s | the patch passed |
| {color:green}+1{color} | whitespace | 0m 0s | Patch has no whitespace issues. |
| {color:green}+1{color} | findbugs | 2m 9s | the patch passed |
| {color:green}+1{color} | javadoc | 1m 14s | the patch passed with JDK v1.8.0_60 |
| {color:green}+1{color} | javadoc | 1m 59s | the patch passed with JDK v1.7.0_79 |
| {color:red}-1{color} | unit | 62m 1s | hadoop-hdfs in the patch failed with JDK v1.8.0_60. |
| {color:red}-1{color} | unit | 60m 45s | hadoop-hdfs in the patch failed with JDK v1.7.0_79. |
| {color:red}-1{color} | asflicense | 0m 20s | Patch generated 58 ASF License warnings. |
| | | 143m 55s | |

|| Reason || Tests ||
| JDK v1.8.0_60 Failed junit tests | hadoop.hdfs.server.blockmanagement.TestNodeCount |
| | hadoop.hdfs.server.namenode.TestNNThroughputBenchmark |
| | hadoop.hdfs.server.namenode.TestCacheDirectives |
| | hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA |
| | hadoop.hdfs.server.datanode.TestDataNodeMetrics |
| | hadoop.hdfs.TestDFSClientRetries |
| | hadoop.hdfs.server.datanode.TestBlockScanner |
| JDK v1.7.0_79 Failed junit tests | hadoop.hdfs.server.blockmanagement.TestNodeCount |
| | hadoop.hdfs.TestRollingUpgrade |
| | hadoop.hdfs.server.namenode.TestNNThroughputBenchmark |
| | hadoop.hdfs.server.datanode.TestBlockScanner |

|| Subsystem || Report/Notes ||
| Docker | Client=1.7.1 Server=1.7.1 Image:test-patch-base-hadoop-date2015-11-03 |
| JIRA Patch URL |
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14987330#comment-14987330 ]

Daryn Sharp commented on HDFS-9198:
---

# I named it more generically since it can be used for more, but I'll rename if you feel strongly.
# I originally had it outside the loop but something failed. Probably a test. Since it's a class static it should be cheap.
# The NN is terminated when the queue stops accepting offers and the thread isn't running, but I'll add a try/catch around the run() loop too.
# Will look at the style again.
# Yes, I compensated for async IBR failures by making the next heartbeat trigger a re-registration. The only exception we've seen from an IBR is dead/unregistered node so the new code is a no-op but I added it as a safety net.
# Ok.
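
The idea discussed in this thread, running several queued IBR actions inside a single write-lock acquisition while making sure a failing action does not kill the processing thread, can be sketched roughly as follows. This is a simplified illustration and not the committed patch: `CoalescingProcessor`, `drainOnce`, and the count-based `maxBatch` limit are all hypothetical, the lock is a plain `ReentrantReadWriteLock` rather than the namesystem lock, and the sketch polls instead of blocking so it can run on a single thread.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch (not a Hadoop class): batches queued actions under one write lock. */
final class CoalescingProcessor {
  private final BlockingQueue<Runnable> queue = new LinkedBlockingQueue<>();
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final int maxBatch;   // assumed cap so the lock is not held indefinitely
  private int lockAcquisitions;
  private int failures;

  CoalescingProcessor(int maxBatch) {
    if (maxBatch < 1) {
      throw new IllegalArgumentException("maxBatch must be at least 1");
    }
    this.maxBatch = maxBatch;
  }

  void enqueue(Runnable action) {
    queue.add(action);
  }

  /**
   * Runs the next action plus any already-queued followers (up to maxBatch)
   * inside one write-lock acquisition. Returns the number of actions run.
   */
  int drainOnce() {
    Runnable action = queue.poll();
    if (action == null) {
      return 0;
    }
    int processed = 0;
    lock.writeLock().lock();
    try {
      lockAcquisitions++;
      do {
        processed++;
        try {
          action.run();
        } catch (RuntimeException e) {
          // Assumption: count the failure and keep going, so one bad report
          // does not silently end the processing loop.
          failures++;
        }
        action = (processed < maxBatch) ? queue.poll() : null;
      } while (action != null);
    } finally {
      lock.writeLock().unlock();
    }
    return processed;
  }

  int lockAcquisitions() {
    return lockAcquisitions;
  }

  int failures() {
    return failures;
  }

  public static void main(String[] args) {
    CoalescingProcessor p = new CoalescingProcessor(3);
    for (int i = 0; i < 5; i++) {
      p.enqueue(() -> { });
    }
    int first = p.drainOnce();
    int second = p.drainOnce();
    System.out.println(first + " + " + second + " actions in "
        + p.lockAcquisitions() + " lock acquisitions");
  }
}
```

A dedicated thread would presumably block on `take()` rather than `poll()` and loop while the system is running; the batch limit here stands in for whatever bound keeps other lock waiters from starving.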
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14968345#comment-14968345 ]

Uma Maheswara Rao G commented on HDFS-9198:
---

OK, I have just looked at your first comment, which tries to answer #5.
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14968317#comment-14968317 ]

Uma Maheswara Rao G commented on HDFS-9198:

Thanks, Daryn, for the nice work here. This is interesting to me. I have just reviewed the patch. Following are my comments:
# runBlockOp: how about naming it runBlockReportOp?
# nit:
{code}
while (namesystem.isRunning()) {
+NameNodeMetrics metrics = NameNode.getNameNodeMetrics();
{code}
Maybe we can take metrics outside the loop and use it?
# I think we need to handle Throwable in this BR processing thread. In case of any unexpected errors, this thread should not die silently, as it is one of the important processing threads. We may have to terminate the system in such cases. Minor suggestion: method names in BM could be like runBlockReportOpSync and runBlockReportAsync?
# Code formatting was missed for these lines:
{code}
metrics.setBlockOpsQueued(queue.size()+1);
metrics.addBlockOpsBatched(processed-1);
{code}
# Currently the DN sets the flag to trigger sendImmediateIBR on failure of IBR processing. But now we handle exceptions at the NN itself and cannot pass them to the DN, because processing is async. So sendImmediateIBR now happens only for IPC-level exceptions. Have you thought about it? Missing such info would have to wait until the next BR, right?
# Tests look great to me. A minor suggestion: could you please add javadoc for the tests?

> Coalesce IBR processing in the NN
>
> Key: HDFS-9198
> URL: https://issues.apache.org/jira/browse/HDFS-9198
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: namenode
> Affects Versions: 2.0.0-alpha
> Reporter: Daryn Sharp
> Assignee: Daryn Sharp
> Attachments: HDFS-9198-branch2.patch, HDFS-9198-trunk.patch, HDFS-9198-trunk.patch, HDFS-9198-trunk.patch
>
> IBRs from thousands of DNs under load will degrade NN performance due to excessive write-lock contention from multiple IPC handler threads. The IBR processing is quick, so the lock contention may be reduced by coalescing multiple IBRs into a single write-lock transaction. The handlers will also be freed up faster for other operations.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
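Editorial illustration: the issue description and the review comment about handling Throwable can be pictured with a minimal sketch. This is not the code from the attached patches. The class name CoalescingQueueSketch, the MAX_BATCH cap, the queue capacity, and the plain ReentrantReadWriteLock standing in for the namesystem lock are all assumptions made for the example. Handler threads only enqueue work, and a single consumer drains whatever is queued and runs the whole batch under one write-lock acquisition.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;

/** Hypothetical sketch of coalesced block-op processing; not the patch code. */
class CoalescingQueueSketch {
  private static final int MAX_BATCH = 64;  // assumed cap on ops per lock hold

  private final BlockingQueue<Runnable> queue = new ArrayBlockingQueue<>(1024);
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final AtomicInteger lockAcquisitions = new AtomicInteger();
  private volatile boolean running = true;

  /** Called by handler threads: queue the op (blocks only if the queue is full). */
  void enqueue(Runnable op) throws InterruptedException {
    queue.put(op);
  }

  /**
   * Waits briefly for one op, drains whatever else is already queued, and
   * runs the whole batch under a single write-lock acquisition.
   * Returns the number of ops processed (0 if the queue stayed empty).
   */
  int processBatch() throws InterruptedException {
    Runnable first = queue.poll(100, TimeUnit.MILLISECONDS);
    if (first == null) {
      return 0;
    }
    List<Runnable> batch = new ArrayList<>();
    batch.add(first);
    queue.drainTo(batch, MAX_BATCH - 1);
    lock.writeLock().lock();
    lockAcquisitions.incrementAndGet();
    try {
      for (Runnable op : batch) {
        op.run();
      }
    } finally {
      lock.writeLock().unlock();
    }
    return batch.size();
  }

  /** Body of the processing thread. */
  void runLoop() {
    try {
      while (running) {
        processBatch();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    } catch (Throwable t) {
      // Fail loudly instead of dying silently. A real NN would likely
      // terminate the process here; the sketch only stops and logs.
      running = false;
      System.err.println("block op processing failed: " + t);
    }
  }

  void stop() {
    running = false;
  }

  int getLockAcquisitions() {
    return lockAcquisitions.get();
  }

  public static void main(String[] args) throws InterruptedException {
    CoalescingQueueSketch sketch = new CoalescingQueueSketch();
    AtomicInteger applied = new AtomicInteger();
    for (int i = 0; i < 10; i++) {
      sketch.enqueue(applied::incrementAndGet);
    }
    int processed = sketch.processBatch();
    System.out.println(processed + " ops, "
        + sketch.getLockAcquisitions() + " lock acquisition(s)");
  }
}
```

In the main method, ten queued ops are applied with a single lock acquisition, which is the contention saving the description is after. The batch cap is a design choice of the sketch: it bounds how long the write lock is held so that other lock users are not starved.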
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14965531#comment-14965531 ]

Hadoop QA commented on HDFS-9198:

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | pre-patch | 20m 31s | Pre-patch trunk has 1 extant Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 6 new or modified test files. |
| {color:green}+1{color} | javac | 9m 31s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 11m 25s | There were no new javadoc warning messages. |
| {color:green}+1{color} | release audit | 0m 25s | The applied patch does not increase the total number of release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 33s | The applied patch generated 6 new checkstyle issues (total was 414, now 415). |
| {color:green}+1{color} | whitespace | 0m 8s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 36s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:green}+1{color} | findbugs | 2m 41s | The patch does not introduce any new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 34s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 66m 27s | Tests failed in hadoop-hdfs. |
| | | 118m 30s | |
\\
\\
|| Reason || Tests ||
| Failed unit tests | hadoop.fs.loadGenerator.TestLoadGenerator |
| | hadoop.fs.TestFcHdfsCreateMkdir |
| | hadoop.hdfs.server.namenode.ha.TestDNFencing |
| | hadoop.hdfs.TestRecoverStripedFile |
| | hadoop.fs.TestSWebHdfsFileContextMainOperations |
| | hadoop.fs.TestUrlStreamHandler |
| | hadoop.fs.viewfs.TestViewFileSystemAtHdfsRoot |
| | hadoop.fs.viewfs.TestViewFsAtHdfsRoot |
| | hadoop.fs.permission.TestStickyBit |
| | hadoop.hdfs.web.TestWebHDFS |
| Timed out tests | org.apache.hadoop.fs.TestEnhancedByteBufferAccess |
| | org.apache.hadoop.fs.TestUrlStreamHandlerFactory |
| | org.apache.hadoop.fs.TestHDFSFileContextMainOperations |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12767620/HDFS-9198-trunk.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 9cb5d35 |
| Pre-patch Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/13083/artifact/patchprocess/trunkFindbugsWarningshadoop-hdfs.html |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/13083/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/13083/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/13083/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf909.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/13083/console |

This message was automatically generated.
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959752#comment-14959752 ]

Hadoop QA commented on HDFS-9198:

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:red}-1{color} | patch | 0m 0s | The patch command could not apply the patch during dryrun. |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12766881/HDFS-9198-trunk.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / 8d2d3eb |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/13013/console |

This message was automatically generated.
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959448#comment-14959448 ]

Daryn Sharp commented on HDFS-9198:

Most of the test failures are caused by a race condition, because IBRs are no longer processed synchronously. Will update shortly. All you watchers, any comments on the approach?
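Editorial illustration: the race Daryn Sharp mentions arises when a test asserts on NN state right after a DN sends an IBR, while the report is still waiting in the queue. One common way to close such a race, sketched below, is to push a no-op through the same FIFO queue and wait for it to complete, so that everything queued earlier has been applied before the assertion runs. This is an assumption about how such a fix could look, not a description of the updated patch. The class AsyncFlushSketch and the methods enqueueAsync, runSync, and flush are hypothetical names, and a single-thread executor stands in for the block report processing thread.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: waiting for asynchronously queued ops before asserting. */
class AsyncFlushSketch {
  // One daemon worker thread: ops run in FIFO order, like a single queue consumer.
  private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
    Thread t = new Thread(r, "block-op-sketch");
    t.setDaemon(true);
    return t;
  });

  /** Asynchronous path: the caller returns before the op has run. */
  void enqueueAsync(Runnable op) {
    worker.execute(op);
  }

  /** Synchronous path: queue the op behind earlier ones and wait for its result. */
  <T> T runSync(Callable<T> op) throws Exception {
    return worker.submit(op).get();
  }

  /** Test helper: returns once every op queued before this call has run. */
  void flush() throws Exception {
    runSync(() -> null);
  }

  public static void main(String[] args) throws Exception {
    AsyncFlushSketch nn = new AsyncFlushSketch();
    AtomicInteger reportsApplied = new AtomicInteger();
    for (int i = 0; i < 100; i++) {
      nn.enqueueAsync(reportsApplied::incrementAndGet);
    }
    // Without this flush, the read below could observe fewer than 100.
    nn.flush();
    System.out.println("applied=" + reportsApplied.get());
  }
}
```

The sketch relies on the queue being strictly FIFO with a single consumer. With more than one consumer thread, completing the marker op would not imply that earlier ops had finished.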
[jira] [Commented] (HDFS-9198) Coalesce IBR processing in the NN
[ https://issues.apache.org/jira/browse/HDFS-9198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14944117#comment-14944117 ]

Hadoop QA commented on HDFS-9198:

\\
\\
| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | pre-patch | 18m 6s | Pre-patch trunk compilation is healthy. |
| {color:green}+1{color} | @author | 0m 0s | The patch does not contain any @author tags. |
| {color:green}+1{color} | tests included | 0m 0s | The patch appears to include 3 new or modified test files. |
| {color:green}+1{color} | javac | 8m 4s | There were no new javac warning messages. |
| {color:green}+1{color} | javadoc | 10m 22s | There were no new javadoc warning messages. |
| {color:red}-1{color} | release audit | 0m 15s | The applied patch generated 1 release audit warnings. |
| {color:red}-1{color} | checkstyle | 1m 25s | The applied patch generated 6 new checkstyle issues (total was 424, now 425). |
| {color:green}+1{color} | whitespace | 0m 6s | The patch has no lines that end in whitespace. |
| {color:green}+1{color} | install | 1m 31s | mvn install still works. |
| {color:green}+1{color} | eclipse:eclipse | 0m 34s | The patch built with eclipse:eclipse. |
| {color:red}-1{color} | findbugs | 2m 33s | The patch appears to introduce 1 new Findbugs (version 3.0.0) warnings. |
| {color:green}+1{color} | native | 3m 16s | Pre-build of native portion |
| {color:red}-1{color} | hdfs tests | 200m 1s | Tests failed in hadoop-hdfs. |
| | | 246m 16s | |
\\
\\
|| Reason || Tests ||
| FindBugs | module:hadoop-hdfs |
| Failed unit tests | hadoop.hdfs.web.TestWebHdfsTimeouts |
| | hadoop.hdfs.security.TestDelegationTokenForProxyUser |
| | hadoop.hdfs.TestParallelRead |
| | hadoop.hdfs.server.namenode.TestDeadDatanode |
| | hadoop.hdfs.web.TestWebHdfsContentLength |
| | hadoop.hdfs.server.blockmanagement.TestPendingReplication |
| | hadoop.hdfs.server.datanode.TestIncrementalBrVariations |
\\
\\
|| Subsystem || Report/Notes ||
| Patch URL | http://issues.apache.org/jira/secure/attachment/12765032/HDFS-9198-trunk.patch |
| Optional Tests | javadoc javac unit findbugs checkstyle |
| git revision | trunk / b925cf1 |
| Release Audit | https://builds.apache.org/job/PreCommit-HDFS-Build/12794/artifact/patchprocess/patchReleaseAuditProblems.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HDFS-Build/12794/artifact/patchprocess/diffcheckstylehadoop-hdfs.txt |
| Findbugs warnings | https://builds.apache.org/job/PreCommit-HDFS-Build/12794/artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html |
| hadoop-hdfs test log | https://builds.apache.org/job/PreCommit-HDFS-Build/12794/artifact/patchprocess/testrun_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/12794/testReport/ |
| Java | 1.7.0_55 |
| uname | Linux asf904.gq1.ygridcore.net 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/12794/console |

This message was automatically generated.