[jira] [Commented] (HDFS-9865) TestBlockReplacement fails intermittently in trunk
[ https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15184170#comment-15184170 ]

Lin Yiqun commented on HDFS-9865:
---------------------------------

Thanks [~iwasakims] for the commit!

> TestBlockReplacement fails intermittently in trunk
> --------------------------------------------------
>
>                 Key: HDFS-9865
>                 URL: https://issues.apache.org/jira/browse/HDFS-9865
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: test
>    Affects Versions: 2.7.1
>            Reporter: Lin Yiqun
>            Assignee: Lin Yiqun
>             Fix For: 2.8.0, 2.7.3
>
>         Attachments: HDFS-9865.001.patch, HDFS-9865.002.patch
>
>
> I found that the test case {{TestBlockReplacement}} sometimes fails.
> Looking at the unit test log, I always find this output:
> {code}
> org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement
> testDeletedBlockWhenAddBlockIsInEdit(org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement)
> Time elapsed: 8.764 sec  <<< FAILURE!
> java.lang.AssertionError: The block should be only on 1 datanode
> expected:<1> but was:<2>
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.failNotEquals(Assert.java:743)
>       at org.junit.Assert.assertEquals(Assert.java:118)
>       at org.junit.Assert.assertEquals(Assert.java:555)
>       at
> org.apache.hadoop.hdfs.server.datanode.TestBlockReplacement.testDeletedBlockWhenAddBlockIsInEdit(TestBlockReplacement.java:436)
> {code}
> I eventually found the cause: the block is not always completely deleted in
> testDeletedBlockWhenAddBlockIsInEdit by the time of the assertion, so the
> number of datanodes holding the block is wrong. The time the test waits for
> FsDatasetAsyncDiskService to delete the block is not a reliable value.
> {code}
>       LOG.info("replaceBlock:  " + replaceBlock(block,
>           (DatanodeInfo)sourceDnDesc, (DatanodeInfo)sourceDnDesc,
>           (DatanodeInfo)destDnDesc));
>       // Waiting for the FsDatasetAsyncDsikService to delete the block
>       Thread.sleep(3000);
> {code}
> When I reduce this wait to 1 second, the test always fails. The 3 seconds
> used in the test is not a reliable value either. We should change this logic
> to something better, such as waiting for the block to be replicated as in
> testDecommision.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
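The idea in the description, polling for the expected state instead of sleeping for a fixed 3 seconds, can be sketched as follows. This is a minimal, self-contained illustration and not the code from the attached patches: the class `PollUntil`, its `waitFor` method, and the simulated "deleter" thread are hypothetical names introduced here, and the `AtomicInteger` merely stands in for the number of datanodes that still hold the block.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical helper (not part of Hadoop): poll a condition instead of a fixed sleep. */
final class PollUntil {
  private PollUntil() {}

  /**
   * Re-checks the condition every intervalMs until it holds or timeoutMs elapses.
   * Returns true if the condition was observed, false on timeout or interruption.
   */
  static boolean waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs) {
    final long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false; // give up: the caller decides how to report the failure
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        return false;
      }
    }
  }

  public static void main(String[] args) {
    // Stand-in for "number of datanodes that still hold the block".
    AtomicInteger holders = new AtomicInteger(2);

    // Simulates an asynchronous deletion that finishes after an unknown delay.
    Thread deleter = new Thread(() -> {
      try {
        Thread.sleep(200);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      holders.set(1);
    });
    deleter.start();

    // Instead of Thread.sleep(3000), wait only as long as needed, with an upper bound.
    boolean ok = waitFor(() -> holders.get() == 1, 50, 5000);
    System.out.println("block on exactly 1 holder: " + ok);
  }
}
```

With this shape the test returns as soon as the deletion is visible and fails only when the upper bound is exceeded, so the result no longer depends on guessing how long the asynchronous deletion takes.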
[jira] [Commented] (HDFS-9865) TestBlockReplacement fails intermittently in trunk
[ https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1518#comment-1518 ]

Hudson commented on HDFS-9865:
------------------------------

FAILURE: Integrated in Hadoop-trunk-Commit #9436 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/9436/])
HDFS-9865. TestBlockReplacement fails intermittently in trunk (Lin Yiqun (iwasakims: rev d718fc1ee5aee3628e105339ee3ea183b6242409)
* hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/datanode/TestBlockReplacement.java
[jira] [Commented] (HDFS-9865) TestBlockReplacement fails intermittently in trunk
[ https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15183061#comment-15183061 ]

Hadoop QA commented on HDFS-9865:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 11m 55s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 40s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 59s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 4s {color} | {color:green} trunk passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 39s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 10s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 6s {color} | {color:green} the patch passed with JDK v1.8.0_74 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 64m 56s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_74. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 57m 27s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 26s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 159m 32s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_74 Failed junit tests | hadoop.hdfs.TestFileAppend |
| | hadoop.hdfs.server.namenode.TestDecommissioningStatus |
| JDK v1.8.0_74 Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
| JDK v1.7.0_95 Failed junit tests | hadoop.hdfs.shortcircuit.TestShortCircuitCache |
| | hadoop.hdfs.server.datanode.TestDataNodeLifeline |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12791747/HDFS-9865.002.patch |
| JIRA Issue | HDFS-9865 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname
[jira] [Commented] (HDFS-9865) TestBlockReplacement fails intermittently in trunk
[ https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182887#comment-15182887 ]

Lin Yiqun commented on HDFS-9865:
---------------------------------

Thanks [~iwasakims] for the review. I have updated the latest patch to address the comments.
[jira] [Commented] (HDFS-9865) TestBlockReplacement fails intermittently in trunk
[ https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15182627#comment-15182627 ]

Masatake Iwasaki commented on HDFS-9865:
----------------------------------------

Thanks for working on this, [~linyiqun].

{code}
      int tries = 0;
      while (tries++ < 20) {
{code}

{{for (int tries = 0; tries < 20; tries++)}} would be better, since it limits the scope of {{tries}}.

{code}
        // Triggering the incremental block report to report the deleted block
        // to namnemode
        cluster.getDataNodes().get(0).triggerBlockReport(
            new BlockReportOptions.Factory().setIncremental(true).build());
{code}

Can you replace {{triggerBlockReport}} with {{DataNodeTestUtils#triggerDeletionReport}}? Although {{triggerBlockReport}} was in the original code, sending just an IBR (incremental block report) is appropriate and faster.
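The loop shape suggested in the review, a {{for}} loop whose counter is scoped to the loop and which triggers a report before each check, might look roughly like the sketch below. It is an illustration under stated assumptions, not the patch itself: `BoundedRetry` and `retry` are hypothetical names, the `trigger` argument stands in for a call such as {{DataNodeTestUtils#triggerDeletionReport}}, and the `done` argument stands in for the check on the number of block locations.

```java
import java.util.function.BooleanSupplier;

/** Hypothetical sketch (not from the patch): bounded retries with a loop-scoped counter. */
final class BoundedRetry {
  private BoundedRetry() {}

  /**
   * Runs trigger, then checks done, at most maxTries times, sleeping sleepMs between
   * attempts. Returns true as soon as done holds, false if every attempt fails.
   */
  static boolean retry(int maxTries, long sleepMs, Runnable trigger, BooleanSupplier done) {
    // The counter lives only inside the loop, as the review comment recommends.
    for (int tries = 0; tries < maxTries; tries++) {
      trigger.run(); // stand-in for triggering the deletion report
      if (done.getAsBoolean()) {
        return true;
      }
      try {
        Thread.sleep(sleepMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt status
        return false;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    int[] reports = {0};
    // Pretend the namenode sees the deletion after the third report.
    boolean ok = retry(20, 10, () -> reports[0]++, () -> reports[0] >= 3);
    System.out.println("condition met: " + ok + ", reports sent: " + reports[0]);
  }
}
```

Compared with `int tries = 0; while (tries++ < 20)`, the counter cannot be read or modified by accident after the loop, and the retry bound is visible in a single line.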
[jira] [Commented] (HDFS-9865) TestBlockReplacement fails intermittently in trunk
[ https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170268#comment-15170268 ]

Lin Yiqun commented on HDFS-9865:
---------------------------------

The failed test is unrelated to this patch. Thanks for the review.
[jira] [Commented] (HDFS-9865) TestBlockReplacement fails intermittently in trunk
[ https://issues.apache.org/jira/browse/HDFS-9865?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169072#comment-15169072 ]

Hadoop QA commented on HDFS-9865:
---------------------------------

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 10s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 6m 51s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 20s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 50s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 13s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 53s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s {color} | {color:green} trunk passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 45s {color} | {color:green} trunk passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 35s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 35s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 38s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 38s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 18s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 48s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s {color} | {color:green} the patch passed with JDK v1.8.0_72 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 43s {color} | {color:green} the patch passed with JDK v1.7.0_95 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 52m 32s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_72. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 51m 12s {color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_95. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 21s {color} | {color:green} Patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 128m 20s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_72 Failed junit tests | hadoop.hdfs.server.blockmanagement.TestReconstructStripedBlocksWithRackAwareness |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12790114/HDFS-9865.001.patch |
| JIRA Issue | HDFS-9865 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux dc563ed9f8a2 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality |