[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13907020#comment-13907020 ]

Hudson commented on HDFS-5483:
------------------------------

SUCCESS: Integrated in Hadoop-Mapreduce-trunk #1704 (See [https://builds.apache.org/job/Hadoop-Mapreduce-trunk/1704/])
HDFS-5483. NN should gracefully handle multiple block replicas on same DN. (Arpit Agarwal) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570040)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java

> NN should gracefully handle multiple block replicas on same DN
> --------------------------------------------------------------
>
> Key: HDFS-5483
> URL: https://issues.apache.org/jira/browse/HDFS-5483
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: namenode
> Affects Versions: Heterogeneous Storage (HDFS-2832)
> Reporter: Arpit Agarwal
> Assignee: Arpit Agarwal
> Fix For: 3.0.0, 2.4.0
>
> Attachments: h5483.02.patch, h5483.03.patch, h5483.04.patch
>
>
> {{BlockManager#reportDiff}} can cause an assertion failure in
> {{BlockInfo#moveBlockToHead}} if the block report shows the same block as
> belonging to more than one storage.
> The issue is that {{moveBlockToHead}} assumes it will find the
> DatanodeStorageInfo for the given block.
> Exception details:
> {code}
> java.lang.AssertionError: Index is out of bound
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.setNext(BlockInfo.java:152)
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockInfo.moveBlockToHead(BlockInfo.java:351)
>   at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeStorageInfo.moveBlockToHead(DatanodeStorageInfo.java:243)
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.reportDiff(BlockManager.java:1841)
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1709)
>   at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.processReport(BlockManager.java:1637)
>   at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.blockReport(NameNodeRpcServer.java:984)
>   at org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure.testVolumeFailure(TestDataNodeVolumeFailure.java:165)
> {code}

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
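The failure mode in the issue description can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: the classes `Storage` and `Block` and the method `moveBlockToHeadIfPresent` are hypothetical stand-ins, not the HDFS classes, and the guard shown is one plausible way to skip a block that is reported on a storage it is not linked to. It is not taken from the attached patches.

```java
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;

// Simplified, hypothetical model of the failure mode; these are NOT the HDFS classes.
class ReportDiffSketch {
    /** One storage (volume) on a datanode, with its list of blocks. */
    static final class Storage {
        final String id;
        final LinkedList<Block> blocks = new LinkedList<>();
        Storage(String id) { this.id = id; }
    }

    /** A block that records which storages are known to hold a replica. */
    static final class Block {
        final List<Storage> storages = new ArrayList<>();

        /** Index of the storage in this block's record, or -1 if absent. */
        int findStorage(Storage s) { return storages.indexOf(s); }
    }

    /** Unguarded move: trusts curIndex and fails loudly when it is invalid. */
    static void moveBlockToHead(Storage s, Block b, int curIndex) {
        if (curIndex < 0 || curIndex >= b.storages.size()) {
            throw new AssertionError("Index is out of bound");
        }
        s.blocks.remove(b);
        s.blocks.addFirst(b);
    }

    /** Guarded move (assumption): skip blocks not recorded on the reporting storage. */
    static boolean moveBlockToHeadIfPresent(Storage s, Block b) {
        int curIndex = b.findStorage(s);
        if (curIndex < 0) {
            return false; // reported on a storage this block is not linked to
        }
        moveBlockToHead(s, b, curIndex);
        return true;
    }

    public static void main(String[] args) {
        Storage s1 = new Storage("s1");
        Storage s2 = new Storage("s2");
        Block a = new Block();
        Block b = new Block();
        // Both blocks are recorded only on s1.
        a.storages.add(s1); s1.blocks.addLast(a);
        b.storages.add(s1); s1.blocks.addLast(b);

        System.out.println(moveBlockToHeadIfPresent(s1, b)); // true
        System.out.println(moveBlockToHeadIfPresent(s2, b)); // false: s2 is not recorded
    }
}
```

Calling the unguarded `moveBlockToHead(s2, b, b.findStorage(s2))` in this model throws the same kind of `AssertionError` as in the stack trace, because the lookup returns -1.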
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906969#comment-13906969 ]

Hudson commented on HDFS-5483:
------------------------------

SUCCESS: Integrated in Hadoop-Hdfs-trunk #1679 (See [https://builds.apache.org/job/Hadoop-Hdfs-trunk/1679/])
HDFS-5483. NN should gracefully handle multiple block replicas on same DN. (Arpit Agarwal) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570040)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906864#comment-13906864 ]

Hudson commented on HDFS-5483:
------------------------------

SUCCESS: Integrated in Hadoop-Yarn-trunk #487 (See [https://builds.apache.org/job/Hadoop-Yarn-trunk/487/])
HDFS-5483. NN should gracefully handle multiple block replicas on same DN. (Arpit Agarwal) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570040)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906584#comment-13906584 ]

Hadoop QA commented on HDFS-5483:
---------------------------------

{color:red}-1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12629929/h5483.04.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:red}-1 core tests{color}. The patch failed these unit tests in hadoop-hdfs-project/hadoop-hdfs:
    org.apache.hadoop.hdfs.server.namenode.TestCacheDirectives
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6186//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6186//console

This message is automatically generated.
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906527#comment-13906527 ]

Hudson commented on HDFS-5483:
------------------------------

SUCCESS: Integrated in Hadoop-trunk-Commit #5193 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/5193/])
HDFS-5483. NN should gracefully handle multiple block replicas on same DN. (Arpit Agarwal) (arp: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1570040)
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/BlockManager.java
* /hadoop/common/trunk/hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockHasMultipleReplicasOnSameDN.java
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906442#comment-13906442 ]

Hadoop QA commented on HDFS-5483:
---------------------------------

{color:green}+1 overall{color}. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12629883/h5483.03.patch
against trunk revision .

{color:green}+1 @author{color}. The patch does not contain any @author tags.
{color:green}+1 tests included{color}. The patch appears to include 1 new or modified test files.
{color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings.
{color:green}+1 javadoc{color}. There were no new javadoc warning messages.
{color:green}+1 eclipse:eclipse{color}. The patch built with eclipse:eclipse.
{color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 1.3.9) warnings.
{color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings.
{color:green}+1 core tests{color}. The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.
{color:green}+1 contrib tests{color}. The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/6182//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/6182//console

This message is automatically generated.
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906351#comment-13906351 ]

Arpit Agarwal commented on HDFS-5483:
-------------------------------------

Thanks Chris, will commit on Jenkins +1!
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906079#comment-13906079 ]

Chris Nauroth commented on HDFS-5483:
-------------------------------------

Hi Arpit,

This patch looks good. Just one minor comment on {{TestBlockHasMultipleReplicasOnSameDN#startUpCluster}}. There is a visible-for-testing {{DistributedFileSystem#getClient}} method that returns the underlying {{DFSClient}} instance. I'm wondering if the test initialization code can be reduced to {{client = fs.getClient()}}.
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867939#comment-13867939 ]

Eric Sirianni commented on HDFS-5483:
-------------------------------------

Arpit - I applied the patch in our environment and it is indeed fixing the assertion failure. Is it possible for another committer to "+1" this patch so that you can commit it to trunk and branch-2?

In parallel, we will also work towards fixing our {{FsDatasetSpi}} plugin to avoid sending such block reports.
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867102#comment-13867102 ]

Eric Sirianni commented on HDFS-5483:
-------------------------------------

OK - thanks, missed that guard.
{code}
boolean addBlock(BlockInfo b) {
  if(!b.addStorage(this))
    return false;
{code}
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867075#comment-13867075 ]

Arpit Agarwal commented on HDFS-5483:
-------------------------------------

{{BlockInfo#addStorage}} checks for it.
{code}
boolean addStorage(DatanodeStorageInfo storage) {
  int idx = findDatanode(storage.getDatanodeDescriptor());
  ...
  // The block is on the DN but belongs to a different storage.
  // Update our state.
{code}
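The guard discussed in the two comments above can be sketched with a toy model. The excerpt from {{BlockInfo#addStorage}} elides the interesting part, so the behaviour here is an assumption: when the block is already recorded on the same datanode, the sketch updates the recorded storage and returns false, which makes the caller's `addBlock` return before it touches the block list. `Block`, `Storage`, and the string-keyed map are hypothetical and do not mirror the real HDFS data structures; the sketch also does not model moving the block between per-storage lists.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

// Toy model of the addStorage/addBlock guard; NOT the HDFS implementation.
class AddStorageGuardSketch {
    static final class Block {
        // datanode name -> id of the storage currently recorded for this block
        final Map<String, String> storageByDatanode = new HashMap<>();

        /**
         * Records the storage for the datanode. Returns true only when the
         * datanode had no replica recorded before (assumed semantics).
         */
        boolean addStorage(String datanode, String storageId) {
            String previous = storageByDatanode.put(datanode, storageId);
            return previous == null;
        }
    }

    static final class Storage {
        final String datanode;
        final String id;
        final Deque<Block> blocks = new ArrayDeque<>();

        Storage(String datanode, String id) {
            this.datanode = datanode;
            this.id = id;
        }

        /** Links the block into this storage's list only if addStorage accepted it. */
        boolean addBlock(Block b) {
            if (!b.addStorage(datanode, id)) {
                return false; // guard: the list is left untouched
            }
            blocks.addFirst(b);
            return true;
        }
    }

    public static void main(String[] args) {
        Storage s1 = new Storage("dn1", "s1");
        Storage s2 = new Storage("dn1", "s2"); // second storage on the same datanode
        Block b = new Block();

        System.out.println(s1.addBlock(b));   // true
        System.out.println(s2.addBlock(b));   // false: dn1 already holds a replica
        System.out.println(s2.blocks.size()); // 0
    }
}
```

In this model a second {{BLOCK_RECEIVED}} for a different storage on the same datanode changes the recorded storage but never inserts the block into a list twice.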
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13867031#comment-13867031 ]

Eric Sirianni commented on HDFS-5483:
-------------------------------------

This {{BLOCK_RECEIVED}} code path appears to modify the {{BlockInfo}} list directly:
{noformat}
BlockInfo.listInsert(BlockInfo, DatanodeStorageInfo) line: 308
DatanodeStorageInfo.addBlock(BlockInfo) line: 208
DatanodeDescriptor.addBlock(String, BlockInfo) line: 168
BlockManager.addStoredBlock(BlockInfo, DatanodeDescriptor, String, DatanodeDescriptor, boolean) line: 2215
BlockManager.processAndHandleReportedBlock(DatanodeDescriptor, String, Block, HdfsServerConstants$ReplicaState, DatanodeDescriptor) line: 2720
BlockManager.addBlock(DatanodeDescriptor, String, Block, String) line: 2695
BlockManager.processIncrementalBlockReport(DatanodeID, String, StorageReceivedDeletedBlocks) line: 2769
FSNamesystem.processIncrementalBlockReport(DatanodeID, String, StorageReceivedDeletedBlocks) line: 5285
NameNodeRpcServer.blockReceivedAndDeleted(DatanodeRegistration, String, StorageReceivedDeletedBlocks[]) line: 993
{noformat}

Couldn't this corrupt the {{BlockInfo}} list if a datanode sent two {{BLOCK_RECEIVED}}s for two different storages?
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865965#comment-13865965 ]

Arpit Agarwal commented on HDFS-5483:
-------------------------------------

Eric, the blockreceived path won't assert since it doesn't try to manipulate the BlockInfo list directly.

However looking at it some more I think we can eliminate the findDatanode routine, or at least make it 'private'. I'll file a separate Jira for it.
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13865782#comment-13865782 ]

Eric Sirianni commented on HDFS-5483:
-------------------------------------

Arpit - I noticed that the supplied patch only ignores the extra replica in the full Block Report code path ({{processReport()}}). Doesn't this leave the assertion still exposed on the {{BLOCK_RECEIVED}} ({{processIncrementalReportedBlock()}}) path?

It seems like this code might need to be changed to search based on storage ID also:
{code}
if (reportedState == ReplicaState.FINALIZED
    && (storedBlock.findDatanode(dn) < 0
        || corruptReplicas.isReplicaCorrupt(storedBlock, dn))) {
  toAdd.add(storedBlock);
}
{code}

> NN should gracefully handle multiple block replicas on same DN
> --------------------------------------------------------------
>
>                 Key: HDFS-5483
>                 URL: https://issues.apache.org/jira/browse/HDFS-5483
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: Heterogeneous Storage (HDFS-2832)
>            Reporter: Arpit Agarwal
>             Fix For: 3.0.0
>
>         Attachments: h5483.02.patch
>
>
> {{BlockManager#reportDiff}} can cause an assertion failure in {{BlockInfo#moveBlockToHead}} if the block report shows the same block as belonging to more than one storage.
> The issue is that {{moveBlockToHead}} assumes it will find the DatanodeStorageInfo for the given block.
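The suggestion above to search by storage ID as well as by datanode can be illustrated with a small model. The sketch below is not the real {{BlockInfo}}: {{ReplicaLookupSketch}}, {{Replica}}, and {{findStorage}} are hypothetical names, and plain strings stand in for the datanode and storage descriptors. It shows a lookup keyed on the datanode alone reporting a hit for a replica that actually lives on a different storage of that node.

```java
import java.util.ArrayList;
import java.util.List;

/** Simplified stand-in for a block's replica list; not the real BlockInfo. */
class ReplicaLookupSketch {
    /** One replica location: a datanode plus one storage (volume) on it. */
    static final class Replica {
        final String datanodeId;
        final String storageId;

        Replica(String datanodeId, String storageId) {
            this.datanodeId = datanodeId;
            this.storageId = storageId;
        }
    }

    private final List<Replica> replicas = new ArrayList<>();

    void addReplica(String datanodeId, String storageId) {
        replicas.add(new Replica(datanodeId, storageId));
    }

    /** Index of the first replica on the given datanode, or -1. */
    int findDatanode(String datanodeId) {
        for (int i = 0; i < replicas.size(); i++) {
            if (replicas.get(i).datanodeId.equals(datanodeId)) {
                return i;
            }
        }
        return -1;
    }

    /** Index of the replica on exactly this datanode and storage, or -1. */
    int findStorage(String datanodeId, String storageId) {
        for (int i = 0; i < replicas.size(); i++) {
            Replica r = replicas.get(i);
            if (r.datanodeId.equals(datanodeId) && r.storageId.equals(storageId)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        ReplicaLookupSketch block = new ReplicaLookupSketch();
        block.addReplica("dn1", "storageA");

        // The same block is later reported by dn1 from storageB.
        System.out.println(block.findDatanode("dn1"));            // 0
        System.out.println(block.findStorage("dn1", "storageB")); // -1
    }
}
```

In this model the datanode-only lookup treats the storageB replica as already known, while the storage-aware lookup reports it as new. Which of the two answers a given code path wants is a design decision for that path.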
[jira] [Commented] (HDFS-5483) NN should gracefully handle multiple block replicas on same DN
[ https://issues.apache.org/jira/browse/HDFS-5483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13852552#comment-13852552 ]

Hadoop QA commented on HDFS-5483:
---------------------------------

{color:green}+1 overall{color}.  Here are the results of testing the latest attachment
  http://issues.apache.org/jira/secure/attachment/12619448/h5483.02.patch
  against trunk revision .

    {color:green}+1 @author{color}.  The patch does not contain any @author tags.

    {color:green}+1 tests included{color}.  The patch appears to include 1 new or modified test files.

    {color:green}+1 javac{color}.  The applied patch does not increase the total number of javac compiler warnings.

    {color:green}+1 javadoc{color}.  The javadoc tool did not generate any warning messages.

    {color:green}+1 eclipse:eclipse{color}.  The patch built with eclipse:eclipse.

    {color:green}+1 findbugs{color}.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    {color:green}+1 release audit{color}.  The applied patch does not increase the total number of release audit warnings.

    {color:green}+1 core tests{color}.  The patch passed unit tests in hadoop-hdfs-project/hadoop-hdfs.

    {color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/5763//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/5763//console

This message is automatically generated.

> NN should gracefully handle multiple block replicas on same DN
> --------------------------------------------------------------
>
>                 Key: HDFS-5483
>                 URL: https://issues.apache.org/jira/browse/HDFS-5483
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: namenode
>    Affects Versions: Heterogeneous Storage (HDFS-2832)
>            Reporter: Arpit Agarwal
>             Fix For: 3.0.0
>
>         Attachments: h5483.02.patch
>
>
> {{BlockManager#reportDiff}} can cause an assertion failure in {{BlockInfo#moveBlockToHead}} if the block report shows the same block as belonging to more than one storage.