[ https://issues.apache.org/jira/browse/HDFS-16213?focusedWorklogId=648242&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-648242 ]
ASF GitHub Bot logged work on HDFS-16213:
-----------------------------------------

                Author: ASF GitHub Bot
            Created on: 08/Sep/21 21:52
            Start Date: 08/Sep/21 21:52
    Worklog Time Spent: 10m
      Work Description: LeonGao91 commented on pull request #3386:
URL: https://github.com/apache/hadoop/pull/3386#issuecomment-915598330

   Thanks @virajjasani for reporting this issue!

   This seems to happen consistently when the same test is run multiple times, but not when it is run for the first time (the happy case). As you mentioned, the following reproduces it consistently:

       @Test
       public void t1() throws Exception {
         testDnRestartWithHardLink();
         testDnRestartWithHardLink();
       }

   Based on this, I am not sure that the root cause is the rare race condition you mentioned. I suspect it is due to cache behavior: when the same code is rerun, the replica loading changes slightly.

   I can spend some more time investigating in the next few days. Let's try to fix this in the unit test itself.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Issue Time Tracking
-------------------

    Worklog Id:     (was: 648242)
    Time Spent: 3.5h  (was: 3h 20m)

> Flaky test TestFsDatasetImpl#testDnRestartWithHardLink
> ------------------------------------------------------
>
>                 Key: HDFS-16213
>                 URL: https://issues.apache.org/jira/browse/HDFS-16213
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Viraj Jasani
>            Assignee: Viraj Jasani
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 3.5h
>  Remaining Estimate: 0h
>
> Failure case:
> [here|https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-3359/4/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt]
> {code:java}
> [ERROR] testDnRestartWithHardLink(org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl)  Time elapsed: 7.768 s  <<< FAILURE!
> java.lang.AssertionError
>   at org.junit.Assert.fail(Assert.java:87)
>   at org.junit.Assert.assertTrue(Assert.java:42)
>   at org.junit.Assert.assertTrue(Assert.java:53)
>   at org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestFsDatasetImpl.testDnRestartWithHardLink(TestFsDatasetImpl.java:1344)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:59)
>   at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:12)
>   at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:56)
>   at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:17)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:299)
>   at org.junit.internal.runners.statements.FailOnTimeout$CallableStatement.call(FailOnTimeout.java:293)
>   at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>   at java.lang.Thread.run(Thread.java:748)
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
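
Editorial sketch: the reproduction in the pull request comment (calling the same test body twice from one test method) can be generalized into a small helper that reruns a body in the same JVM and reports which iteration fails first. This is only an illustration and is not code from Hadoop or from the pull request. `RepeatRunner`, `ThrowingBody`, and `firstFailingIteration` are hypothetical names, and the counter in `main` is a made-up stand-in for state that survives between runs; it makes no claim about the actual root cause in `TestFsDatasetImpl`.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class RepeatRunner {

    /** A test body that may throw, mirroring a JUnit test method signature. */
    @FunctionalInterface
    public interface ThrowingBody {
        void run() throws Exception;
    }

    /**
     * Runs the body up to {@code times} times in the same JVM.
     * Returns the 1-based index of the first failing iteration,
     * or 0 if every iteration passed.
     */
    public static int firstFailingIteration(ThrowingBody body, int times) {
        for (int i = 1; i <= times; i++) {
            try {
                body.run();
            } catch (Exception | AssertionError e) {
                // Both checked failures and assertion failures count as a failed run.
                return i;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for state left behind by an earlier run.
        AtomicInteger leftoverState = new AtomicInteger();

        int failedAt = firstFailingIteration(() -> {
            // Passes on a clean start, fails once an earlier run has left state behind.
            if (leftoverState.getAndIncrement() > 0) {
                throw new AssertionError("state left over from a previous run");
            }
        }, 2);

        System.out.println("first failing iteration: " + failedAt);
    }
}
```

A result of 1 would suggest the body fails even on a clean start, while a result of 2 or more points at something carried over from an earlier iteration, which matches the "passes the first time, fails on rerun" pattern described in the comment.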