[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226067#comment-14226067 ] Hudson commented on HDFS-7097: -- FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #17 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/17/]) HDFS-7097. Allow block reports to be processed during checkpointing on standby name node. (kihwal via wang) (wang: rev f43a20c529ac3f104add95b222de6580757b3763) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226087#comment-14226087 ] Hudson commented on HDFS-7097: -- SUCCESS: Integrated in Hadoop-Yarn-trunk #755 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/755/]) HDFS-7097. Allow block reports to be processed during checkpointing on standby name node. (kihwal via wang) (wang: rev f43a20c529ac3f104add95b222de6580757b3763) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226238#comment-14226238 ] Hudson commented on HDFS-7097: -- FAILURE: Integrated in Hadoop-Hdfs-trunk #1945 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1945/]) HDFS-7097. Allow block reports to be processed during checkpointing on standby name node. (kihwal via wang) (wang: rev f43a20c529ac3f104add95b222de6580757b3763) * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226248#comment-14226248 ] Hudson commented on HDFS-7097: -- FAILURE: Integrated in Hadoop-Hdfs-trunk-Java8 #17 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk-Java8/17/]) HDFS-7097. Allow block reports to be processed during checkpointing on standby name node. (kihwal via wang) (wang: rev f43a20c529ac3f104add95b222de6580757b3763) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226302#comment-14226302 ] Hudson commented on HDFS-7097: -- FAILURE: Integrated in Hadoop-Mapreduce-trunk #1969 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1969/]) HDFS-7097. Allow block reports to be processed during checkpointing on standby name node. (kihwal via wang) (wang: rev f43a20c529ac3f104add95b222de6580757b3763) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226319#comment-14226319 ] Hudson commented on HDFS-7097: -- SUCCESS: Integrated in Hadoop-Mapreduce-trunk-Java8 #17 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/17/]) HDFS-7097. Allow block reports to be processed during checkpointing on standby name node. (kihwal via wang) (wang: rev f43a20c529ac3f104add95b222de6580757b3763) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14226450#comment-14226450 ] Akira AJISAKA commented on HDFS-7097: - The patch was committed to branch-2 and trunk. Updating the status. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Fix For: 2.7.0 Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225393#comment-14225393 ] Andrew Wang commented on HDFS-7097: --- +1 LGTM as well, thanks Kihwal for the patch, ATM, Vinay, and Ming for reviewing. I'll commit this shortly. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225424#comment-14225424 ] Hudson commented on HDFS-7097: -- FAILURE: Integrated in Hadoop-trunk-Commit #6605 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/6605/]) HDFS-7097. Allow block reports to be processed during checkpointing on standby name node. (kihwal via wang) (wang: rev f43a20c529ac3f104add95b222de6580757b3763) * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/StandbyCheckpointer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSImageFormat.java * hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/ha/EditLogTailer.java * hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSNamesystem.java * hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/namenode/ha/TestStandbyCheckpoints.java Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14225476#comment-14225476 ] Hadoop QA commented on HDFS-7097: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12683658/HDFS-7097.ultimate.trunk.patch against trunk revision 78f7cdb. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.fs.TestEnhancedByteBufferAccess {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8835//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8835//console This message is automatically generated. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.ultimate.trunk.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14223893#comment-14223893 ] Andrew Wang commented on HDFS-7097: --- This patch has unfortunately gone a bit stale due to the RetryCache refactor. Kihwal, do you mind rebasing? I promise to +1/commit in a timely fashion :) Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14220234#comment-14220234 ] Aaron T. Myers commented on HDFS-7097: -- Sorry for the delay, Kihwal. Yes, the latest patch looks good to me. +1 Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14191884#comment-14191884 ] Kihwal Lee commented on HDFS-7097: -- [~atm], are you okay with the latest patch? Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188409#comment-14188409 ] Kihwal Lee commented on HDFS-7097: -- Thanks for the review, Aaron. The checkpoint lock is only meaningful on standby namenode. Some of the namenode methods that can be called on both active and standby do acquire the lock, but the locking has no correctness implications on active namenode. On active namenode, the regular namesystem lock (fsLock) takes care of everything. 1. {{FSNamesystem#rollEditLog}} : This is only allowed on active namenodes. 2. {{FSNamesystem#(start|end)Checkpoint}} : This is only for backup node. 3. I am renaming it to {{cpLock}}. 4. I believe [~daryn] is eager to do that once this is in. :) Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188700#comment-14188700 ] Hadoop QA commented on HDFS-7097: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12677893/HDFS-7097.patch against trunk revision ec63a3f. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.qjournal.client.TestQuorumJournalManager The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.TestPread {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8580//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8580//console This message is automatically generated. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14188946#comment-14188946 ] Kihwal Lee commented on HDFS-7097: -- The test case failed because : {{java.net.BindException: Port in use: localhost:40123}} We are getting this sort of failures more often nowadays from precommit. Both test cases pass when run on my machine. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165215#comment-14165215 ] Kihwal Lee commented on HDFS-7097: -- I am not sure about calling triggerBlockReports(). The block finalization makes the IBR to be sent out right away, so this extra call will actually do nothing. It's a bug, if NNs are not getting it. But your concern is valid. Depending on the testing environment, things can get slow down and the delay of 1 second may not be enough. I can make it periodically check for a longer period of time. The test will terminate sooner when it succeeds, but will take an extra time when fails. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165342#comment-14165342 ] Ming Ma commented on HDFS-7097: --- Thanks, Kihwal. LGTM. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165527#comment-14165527 ] Hadoop QA commented on HDFS-7097: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12673911/HDFS-7097.patch against trunk revision 8d7c549. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:red}-1 javac{color}. The applied patch generated 1272 javac compiler warnings (more than the trunk's current 1267 warnings). {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:red}-1 release audit{color}. The applied patch generated 1 release audit warnings. {color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.namenode.ha.TestDNFencingWithReplication org.apache.hadoop.hdfs.TestDecommission org.apache.hadoop.hdfs.server.balancer.TestBalancer The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs: org.apache.hadoop.hdfs.server.datanode.TestBlockHasMultipleReplicasOTests org.apache.hadoop.hdfs.Tests org.apache.hadoop.hdfs.server.blockmanagement.TestCorruptReplicaInfo org.apache.hadoop.hTests org.apache.hadoopTests org.apache.hadoop.hdfs.server.Tests org.apache.hadoop.hdfs.sTests org.apache.hadooTests org.apache.hadoop.hdfs.qjournal.client.TestIPCLoggerChanTests org.apache.hadoop.traciTests org.apache.hadoop.hdfs.TestFileCreationClient {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8378//testReport/ Release audit warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8378//artifact/patchprocess/patchReleaseAuditProblems.txt Javac warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8378//artifact/patchprocess/diffJavacWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8378//console This message is automatically generated. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14165623#comment-14165623 ] Aaron T. Myers commented on HDFS-7097: -- The patch looks pretty good, and in thinking about it a fair bit I think it won't regress the issue I was trying to address in HDFS-5064, though Kihwal I would appreciate if you could confirm that as well. A few small comments: # Does {{FSNamesystem#rollEditLog}} need to take the nsLock as well? Seems like it might, given that tailing edits no longer is taking the normal FSNS rw lock. # Similarly for {{FSNamesystem#(start|end)Checkpoint}}, though that's less obvious to me. # Seems a little strange to me to be calling this new lock the nsLock, when that's also what we've been calling the main FSNS rw lock all this time. I'd suggest renaming this to the checkpoint lock or something, to more clearly distinguish its purpose. # I think you can now remove some of the other stuff added as part of HDFS-5064, e.g. the entire {{longReadLock}} I believe was only actually being locked for read during checkpointing. Thanks a lot for working on this, Kihwal. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14162180#comment-14162180 ] Ming Ma commented on HDFS-7097: --- In the unit test, to get rid of possible flaky situation where SBN might not have received incremental BR after the file has been closed, maybe it can call MiniDFSCluster.triggerBlockReports? Otherwise, the patch looks good. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14156884#comment-14156884 ] Kihwal Lee commented on HDFS-7097: -- bq. Does FSNameSystem.saveNamespace still need to take its readLock if it is called in Standby? If we want external invocations of {{saveNamespace()}} to be remain as blocking, then yes. I don't think we want it like this, but then it becomes a bit complicated and something beyond the scope of this jira. - External invocations of saveNamespace should be cancellable. - May be block report processing should be allowed even on active when saveNamespace is going on. Otherwise the rpc handlers will all be tied up up quickly and even if it acquires the read lock, no read ops (e.g. heartbeat) will go through. Incremental BRs are not much of a problem on active NN, since it needs to be in safe mode first. Incremental BR will settle down in a few seconds after going into safe mode. Full block reports can be a problem depending on the number of nodes, time it takes to save the name space and the full block report interval. The rpc does timeout on DataNode, so full block reports can be resent multiple times if saveNamespace takes long. They will all take up a different handler or a call queue slot. - Ongoing external invocation of saveNamespace should be cancelled during HA transition. Queued (in call queue or waiting on the lock) saveNamespace requests should also be rejected for a while just like checkpointing is not allowed for some time after a cancellation. - Locking will be a bit tricky. I agree that we need to address the saveNamespace issue, but prefer doing it in a separate jira. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14157172#comment-14157172 ] Hadoop QA commented on HDFS-7097: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12672615/HDFS-7097.patch against trunk revision a56f3ec. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:red}-1 findbugs{color}. The patch appears to introduce 1 new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The test build failed in hadoop-hdfs-project/hadoop-hdfs {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8303//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HDFS-Build/8303//artifact/patchprocess/newPatchFindbugsWarningshadoop-hdfs.html Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8303//console This message is automatically generated. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch, HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14144584#comment-14144584 ] Vinayakumar B commented on HDFS-7097: - Thanks Kihwal, Changes looks good. I too have the same question as [~mingma]. 1. though its not frequent, saveNamespace RPC on standby will create the same problem as mentioned in description. can we have different locks for saveNamespace based on HA state? Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143585#comment-14143585 ] Ming Ma commented on HDFS-7097: --- Thanks, Kihwal. This is quite useful. Good idea to use another lock for coordination. 1. Does FSNameSystem.saveNamespace still need to take its readLock if it is called in Standby? 2. It will be useful to add some unit test to verify BR can still be processed when checkpoint is in progress. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140717#comment-14140717 ] Kihwal Lee commented on HDFS-7097: -- On a standby name node, the persistent state (i.e. things saved to fsimage) is updated only by replaying edits once it is started. - During checkpointing do not allow any state changes that will be persisted. (I.e. edit log replaying) - During checkpointing do not allow any RPC calls that may affect checkpointing itself. (saveNamespace, restoreFailedStorage, etc.) A new {{ReentrantLock}},{{nsLock}}, is introduced to coordinate checkpointing and other activities. Any thing that requires both {{nsLock}} and {{fsLock}}, {{nsLock}} is to be locked first. Otherwise it can block other RPC calls that do not require {{nsLock}}, not to mention deadlock. These locks are all locked interruptibly, so that the threads can be stopped during HA transition. Also improved {{FSImageFormat.Saver}} on cancellation checking, by making the check interval counter a class-level variable and following the PB format's threshold variable name. This only matters if the legacy oiv image saving is enabled. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140735#comment-14140735 ] Hadoop QA commented on HDFS-7097: - {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12670009/HDFS-7097.patch against trunk revision bf27b9c. {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files. {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 javadoc{color}. There were no new javadoc warning messages. {color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse. {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 core tests{color}. The following test timeouts occurred in hadoop-hdfs-project/hadoop-hdfs-nfs: org.apache.hadoop.hdfs.nfs.nfs3.TTests org.apache.hadoop.hdfs.nfs.nfs3.TestOpenFilTests {color:green}+1 contrib tests{color}. The patch passed contrib unit tests. Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/8106//testReport/ Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/8106//console This message is automatically generated. Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HDFS-7097) Allow block reports to be processed during checkpointing on standby name node
[ https://issues.apache.org/jira/browse/HDFS-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14140861#comment-14140861 ] Kihwal Lee commented on HDFS-7097: -- The pre-commit didn't seem to do the right thing. It did apply the patch, but somehow thought the change is in hadoop-hdfs-nfs. {panel} Running tests in hadoop-hdfs-project/hadoop-hdfs-nfs /home/jenkins/tools/maven/latest/bin/mvn clean install -fn -Pnative -Drequire.test.libhadoop -DHadoopPatchProcess {panel} Allow block reports to be processed during checkpointing on standby name node - Key: HDFS-7097 URL: https://issues.apache.org/jira/browse/HDFS-7097 Project: Hadoop HDFS Issue Type: Bug Reporter: Kihwal Lee Assignee: Kihwal Lee Priority: Critical Attachments: HDFS-7097.patch On a reasonably busy HDFS cluster, there are stream of creates, causing data nodes to generate incremental block reports. When a standby name node is checkpointing, RPC handler threads trying to process a full or incremental block report is blocked on the name system's {{fsLock}}, because the checkpointer acquires the read lock on it. This can create a serious problem if the size of name space is big and checkpointing takes a long time. All available RPC handlers can be tied up very quickly. If you have 100 handlers, it only takes 34 file creates. If a separate service RPC port is not used, HA transition will have to wait in the call queue for minutes. Even if a separate service RPC port is configured, hearbeats from datanodes will be blocked. A standby NN with a big name space can lose all data nodes after checkpointing. The rpc calls will also be retransmitted by data nodes many times, filling up the call queue and potentially causing listen queue overflow. Since block reports are not modifying any state that is being saved to fsimage, I propose letting them through during checkpointing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)