[ https://issues.apache.org/jira/browse/HDFS-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15407242#comment-15407242 ]
Rakesh R commented on HDFS-10434:
---------------------------------

I think I have found the cause of the failure. The test picks the wrong datanode to corrupt. Instead of choosing a datanode that appears in the block locations, it simply takes a datanode from the cluster, which may not be one of the block's locations.
{code}
    byte[] indices = lastBlock.getBlockIndices();
    //corrupt the first block
    DataNode toCorruptDn = cluster.getDataNodes().get(indices[0]);
{code}
For example, suppose the datanodes in the {{cluster.getDataNodes()}} array are indexed as follows:
0->Dn1, 1->Dn2, 2->Dn3, 3->Dn4, 4->Dn5, 5->Dn6, 6->Dn7, 7->Dn8, 8->Dn9, 9->Dn10

Assume the datanodes that are part of the block locations are: Dn2, Dn3, Dn4, Dn5, Dn6, Dn7, Dn8, Dn9, Dn10.
In the failing scenario, the test picks {{cluster.getDataNodes().get(0)}}, which is Dn1, as the datanode to corrupt. Because Dn1 is not among the block locations, corrupting it does not result in any ECWork, and the test fails. Ideally, the test should pick a datanode from the block locations.

Basically, there are two problems in this test case. The first one was fixed as part of this jira. For the second one, I think I will raise another jira and fix it there, as the two problems are unrelated.

> Fix intermittent test failure of TestDataNodeErasureCodingMetrics
> -----------------------------------------------------------------
>
>                 Key: HDFS-10434
>                 URL: https://issues.apache.org/jira/browse/HDFS-10434
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>             Fix For: 3.0.0-alpha1
>
>         Attachments: HDFS-10434-00.patch, HDFS-10434-01.patch
>
>
> This jira is to fix the test case failure.
> Reference :
> [Build15485_TestDataNodeErasureCodingMetrics_testEcTasks|https://builds.apache.org/job/PreCommit-HDFS-Build/15485/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testEcTasks/]
> {code}
> Error Message
> Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> Stacktrace
> java.lang.AssertionError: Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:555)
> 	at org.apache.hadoop.test.MetricsAsserts.assertCounter(MetricsAsserts.java:228)
> 	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testEcTasks(TestDataNodeErasureCodingMetrics.java:92)
> {code}
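
The index mix-up described in the comment can be illustrated with a small, self-contained Java sketch. It deliberately does not use the Hadoop test API: datanodes are modelled as plain strings, and the class name {{CorruptTargetSketch}} and the methods {{pickByClusterIndex}} and {{pickFromBlockLocations}} are hypothetical names invented for this illustration. It only shows why indexing the cluster-wide list with a block index can select a datanode that is not among the block locations, and how picking from the block locations avoids that. It is not the patch attached to this jira.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: strings stand in for datanodes; no Hadoop classes are used.
class CorruptTargetSketch {

    // Mirrors the problematic pattern: the first block index is used as an
    // index into the cluster-wide datanode list.
    static String pickByClusterIndex(List<String> clusterDataNodes, byte[] blockIndices) {
        return clusterDataNodes.get(blockIndices[0]);
    }

    // Assumed remedy: choose the datanode directly from the block locations,
    // so the corrupted node is guaranteed to hold part of the block group.
    static String pickFromBlockLocations(List<String> blockLocations) {
        return blockLocations.get(0);
    }

    public static void main(String[] args) {
        // Cluster list indexed 0->Dn1 ... 9->Dn10, as in the comment.
        List<String> cluster = Arrays.asList(
            "Dn1", "Dn2", "Dn3", "Dn4", "Dn5", "Dn6", "Dn7", "Dn8", "Dn9", "Dn10");
        // Block locations from the comment: Dn2 .. Dn10 (Dn1 is not used).
        List<String> locations = cluster.subList(1, 10);
        // Block indices within the block group (assumed to be 0..8 here).
        byte[] indices = {0, 1, 2, 3, 4, 5, 6, 7, 8};

        String buggy = pickByClusterIndex(cluster, indices);
        String fixed = pickFromBlockLocations(locations);

        System.out.println("picked by cluster index: " + buggy
            + " (in block locations: " + locations.contains(buggy) + ")");
        System.out.println("picked from block locations: " + fixed
            + " (in block locations: " + locations.contains(fixed) + ")");
    }
}
```

With the layout from the comment, the first pick is Dn1, which is not a block location, so corrupting it would not trigger any reconstruction work. The second pick is Dn2, which is. Whether the first pattern happens to work in a real run depends on which datanodes the block group was placed on, which would explain why the failure is intermittent.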