[ 
https://issues.apache.org/jira/browse/HDFS-10434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15296157#comment-15296157
 ] 

Rakesh R commented on HDFS-10434:
---------------------------------

Thanks [~libo-intel]. We update the DN metrics in the finally block of the 
{{StripedReconstructor}} thread, as shown below. The failure occurs because 
{{StripedFileTestUtil.waitForReconstructionFinished()}} waits only for the 
block recovery to complete, not for the {{StripedReconstructor#run()}} finally 
block to finish executing. You can confirm this by debugging the failed test 
case {{TestDataNodeErasureCodingMetrics#testEcTasks}} with a breakpoint at the 
finally block: 
{{StripedFileTestUtil.waitForReconstructionFinished(file, fs, GROUPSIZE);}} 
returns and the test fails before the metrics are updated. To fix this, I added 
a grace period so that the thread gets a chance to execute the finally block 
and update the metrics data.

StripedReconstructor.java
{code}
    } finally {
      datanode.decrementXmitsInProgress();
      datanode.getMetrics().incrECReconstructionTasks();
      stripedReader.close();
      stripedWriter.close();
{code}
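
The race can be reproduced outside Hadoop with a small stand-in. The sketch below is only an illustration, not the attached patch: {{MetricWaitSketch}}, {{waitFor}} and {{startWorker}} are hypothetical names, an {{AtomicLong}} stands in for the DN metric, and it polls the counter with a timeout rather than sleeping for a fixed grace period.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.BooleanSupplier;

public class MetricWaitSketch {

  /** Polls the condition until it holds, or throws once timeoutMillis has elapsed. */
  public static void waitFor(BooleanSupplier check, long intervalMillis, long timeoutMillis)
      throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("condition not met within " + timeoutMillis + " ms");
      }
      Thread.sleep(intervalMillis);
    }
  }

  /**
   * Hypothetical stand-in for the reconstruction thread: the counter is
   * incremented only in the finally block, some time after the work is done.
   */
  public static Thread startWorker(AtomicLong tasksCounter, long finallyDelayMillis) {
    Thread t = new Thread(() -> {
      try {
        // The reconstruction work would happen here.
      } finally {
        try {
          // Simulates the gap between "recovery visible" and "metric updated".
          Thread.sleep(finallyDelayMillis);
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
        }
        tasksCounter.incrementAndGet();
      }
    });
    t.start();
    return t;
  }

  public static void main(String[] args) throws Exception {
    AtomicLong counter = new AtomicLong();
    startWorker(counter, 200);
    // Asserting on the counter right here could still observe 0.
    waitFor(() -> counter.get() == 1, 50, 5000);
    System.out.println("counter = " + counter.get());
  }
}
```

Polling with an upper bound returns as soon as the counter changes and still fails with a timeout if the update never happens. A fixed grace period is simpler, but it has to be chosen long enough for the slowest build machine.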


> Fix intermittent test failure of TestDataNodeErasureCodingMetrics
> -----------------------------------------------------------------
>
>                 Key: HDFS-10434
>                 URL: https://issues.apache.org/jira/browse/HDFS-10434
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Rakesh R
>            Assignee: Rakesh R
>         Attachments: HDFS-10434-00.patch
>
>
> This jira is to fix the test case failure.
> Reference : 
> [Build15485_TestDataNodeErasureCodingMetrics_testEcTasks|https://builds.apache.org/job/PreCommit-HDFS-Build/15485/testReport/org.apache.hadoop.hdfs.server.datanode/TestDataNodeErasureCodingMetrics/testEcTasks/]
> {code}
> Error Message
> Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
> Stacktrace
> java.lang.AssertionError: Bad value for metric EcReconstructionTasks expected:<1> but was:<0>
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.failNotEquals(Assert.java:743)
>       at org.junit.Assert.assertEquals(Assert.java:118)
>       at org.junit.Assert.assertEquals(Assert.java:555)
>       at org.apache.hadoop.test.MetricsAsserts.assertCounter(MetricsAsserts.java:228)
>       at org.apache.hadoop.hdfs.server.datanode.TestDataNodeErasureCodingMetrics.testEcTasks(TestDataNodeErasureCodingMetrics.java:92)
> {code}



