[ https://issues.apache.org/jira/browse/HDFS-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296157#comment-15296157 ]
Rakesh R commented on HDFS-10434:
---------------------------------

Thanks [~libo-intel]. We update the DN metrics in the finally block of the {{StripedReconstructor}} thread, as shown below. The failure occurs because {{StripedFileTestUtil.waitForReconstructionFinished()}} waits for the block recovery but does not wait for the {{StripedReconstructor#run()}} finally block to finish executing. You can probably confirm this by debugging the failed test case {{TestDataNodeErasureCodingMetrics#testEcTasks}} with a breakpoint in the finally block: you will see {{StripedFileTestUtil.waitForReconstructionFinished(file, fs, GROUPSIZE);}} return and the test case fail. To fix this, I added a grace period so that the thread gets a chance to execute the finally block and update the metrics data.

StripedReconstructor.java
{code}
} finally {
  datanode.decrementXmitsInProgress();
  datanode.getMetrics().incrECReconstructionTasks();
  stripedReader.close();
  stripedWriter.close();
{code}

> Fix intermittent test failure of TestDataNodeErasureCodingMetrics
> -----------------------------------------------------------------
>
>                 Key: HDFS-10434
>                 URL: https://issues.apache.org/jira/browse/HDFS-10434
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-10434-00.patch
>
>
> This jira is to fix the test case failure.
> Reference :
> [Build15485_TestDataNodeErasureCodingMetrics_testEcTasks|https://builds.apache.org/job/PreCommit-HDFS-Build/15485/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testEcTasks/]
> {code}
> Error Message
> Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> Stacktrace
> java.lang.AssertionError: Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:555)
> 	at org.apache.hadoop.test.MetricsAsserts.assertCounter(MetricsAsserts.java:228)
> 	at org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testEcTasks(TestDataNodeErasureCodingMetrics.java:92)
> {code}
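
The race described in the comment can be illustrated with a small standalone sketch. It is not the attached patch, which adds a grace period. Instead it shows a related technique, polling the counter with a timeout, which is roughly what a helper such as Hadoop's {{GenericTestUtils.waitFor}} does. Everything here is hypothetical and for illustration only: the class {{MetricWaitSketch}}, the method names, the 50 ms delay, and the {{AtomicLong}} standing in for the {{EcReconstructionTasks}} metric.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

// Hypothetical sketch: none of these names exist in Hadoop.
class MetricWaitSketch {

  /**
   * Polls the condition every intervalMs until it holds or timeoutMs elapses.
   * Returns true if the condition was met, false on timeout or interruption.
   */
  static boolean waitFor(BooleanSupplier check, long intervalMs, long timeoutMs) {
    long start = System.nanoTime();
    long timeoutNanos = timeoutMs * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - start >= timeoutNanos) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return false;
      }
    }
    return true;
  }

  /** Simulates the reconstruction thread and returns the counter the test observes. */
  static long runDemo() {
    AtomicBoolean blockRecovered = new AtomicBoolean(false);
    AtomicLong ecReconstructionTasks = new AtomicLong(0); // stand-in for the DN metric

    Thread worker = new Thread(() -> {
      try {
        // The point the test helper waits for: the block is recovered.
        blockRecovered.set(true);
      } finally {
        // Assumed delay so the finally block visibly lags behind the recovery.
        try {
          Thread.sleep(50);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
        ecReconstructionTasks.incrementAndGet();
      }
    });
    worker.start();

    // Step 1: wait for the recovery, as the existing test helper does.
    boolean recovered = waitFor(blockRecovered::get, 5, 5000);
    // Asserting on the counter at this point could still see 0.
    // Step 2: poll the counter until the finally block has run.
    boolean counted = waitFor(() -> ecReconstructionTasks.get() == 1, 10, 5000);
    if (!recovered || !counted) {
      throw new IllegalStateException("timed out waiting for the worker");
    }
    return ecReconstructionTasks.get();
  }

  public static void main(String[] args) {
    System.out.println("EcReconstructionTasks = " + runDemo());
  }
}
```

Compared with a fixed grace period, polling returns as soon as the counter is updated and only fails once the full timeout has passed. Whether that trade-off suits {{TestDataNodeErasureCodingMetrics}} is a judgment call for the patch author.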