[ https://issues.apache.org/jira/browse/HDFS-15308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17221711#comment-17221711 ]
Ahmed Hussein commented on HDFS-15308:
--------------------------------------

Hi [~hemanthboyina],

I have a question about the configuration key {{DFSConfigKeys.DFS_NAMENODE_RECONSTRUCTION_PENDING_TIMEOUT_SEC_KEY}}. Another unit test was failing, and I wonder whether it is caused by the same flag:
https://github.com/apache/hadoop/pull/2408#issuecomment-717278256

{quote}though in my system the test was never failing now, can you check once if in your local the test was failing ?{quote}

Have you tried running the unit test inside a loop? I recommend running it in a loop from the command line (not the IDE).

{code:bash}
# go to the module directory
cd hadoop-hdfs-project/hadoop-hdfs
# run the test at least once
mvn test -Dtest=TestReconstructStripedFile
# run the test class inside a loop. This will break once the unit test fails.
while :; do mvn surefire:test -Dtest=TestReconstructStripedFile || break; done
{code}

Let's assume that a single test execution takes 2 minutes. If you manage to run it without failure for 120 minutes (60 times), then I think we should go ahead with the fix.

Please check [~inigoiri]'s review comments and address them in your patch.

> TestReconstructStripedFile#testNNSendsErasureCodingTasks fails intermittently
> -----------------------------------------------------------------------------
>
>                 Key: HDFS-15308
>                 URL: https://issues.apache.org/jira/browse/HDFS-15308
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>   Affects Versions: 3.3.0
>            Reporter: Toshihiko Uchida
>            Assignee: Hemanth Boyina
>            Priority: Minor
>              Labels: flaky-test
>         Attachments: HDFS-15308.001.patch
>
>
> In HDFS-14353, TestReconstructStripedFile.testNNSendsErasureCodingTasks failed once due to pending reconstruction timeout as follows.
> {code}
> java.lang.AssertionError: Found 4 timeout pending reconstruction tasks
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.assertTrue(Assert.java:41)
> 	at org.apache.hadoop.hdfs.TestReconstructStripedFile.testNNSendsErasureCodingTasks(TestReconstructStripedFile.java:502)
> 	at org.apache.hadoop.hdfs.TestReconstructStripedFile.testNNSendsErasureCodingTasks(TestReconstructStripedFile.java:458)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> 	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> 	at java.lang.reflect.Method.invoke(Method.java:498)
> 	at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> 	at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
> 	at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:47)
> 	at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:298)
> 	at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:292)
> 	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> 	at java.lang.Thread.run(Thread.java:748)
> {code}
> The error occurred on the following assertion.
> {code}
> // Make sure that all pending reconstruction tasks can be processed.
> while (ns.getPendingReconstructionBlocks() > 0) {
>   long timeoutPending = ns.getNumTimedOutPendingReconstructions();
>   assertTrue(String.format("Found %d timeout pending reconstruction tasks",
>       timeoutPending), timeoutPending == 0);
>   Thread.sleep(1000);
> }
> {code}
> The failure could not be reproduced in the reporter's docker environment (start-build-environment.sh).
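
The loop suggested in the comment above runs until the first failure and has to be stopped by hand once enough runs have passed. A bounded variant that stops on its own after a fixed number of passing runs (for example the 60 runs mentioned above) could be sketched as follows. The function name {{run_until_failure}} is hypothetical and not part of Hadoop or Maven; it is only a minimal sketch of the idea.

```shell
# Hypothetical helper (not part of Hadoop or Maven):
# run the given command up to max_runs times and stop at the first failure.
run_until_failure() {
  max_runs=$1
  shift
  run=1
  while [ "$run" -le "$max_runs" ]; do
    # "$@" is the command to repeat, passed through unchanged
    if ! "$@"; then
      echo "run $run of $max_runs failed"
      return 1
    fi
    run=$((run + 1))
  done
  echo "all $max_runs runs passed"
}
```

Assuming the helper is defined in the current shell, it could be invoked from the module directory as {{run_until_failure 60 mvn surefire:test -Dtest=TestReconstructStripedFile}}. It returns a non-zero status on the first failing run, so the failing attempt number is visible in the last line of output.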