[ https://issues.apache.org/jira/browse/HDFS-11511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15906369#comment-15906369 ]
Arpit Agarwal commented on HDFS-11511: -------------------------------------- Comments on the unit tests: # Can you please add timeouts on TestDatasetVolumeCheckerTimeout? You can either add {{@timeout}} annotations on each test method or add an rule at the class level. e.g. {code} @Rule public final Timeout testTimeout = new Timeout(300_000); {code} # Instead of calling {{Thread.sleep(DISK_CHECK_TIME)}}, the check callback of the slow volume can wait on a mutex. testDiskCheckTimeout should unblock the waiting thread by signalling the mutex before exiting. # Looks like TestDatasetVolumeCheckerTimeout#IGNORED_CONTEXT can be removed. # Nitpick: need space after {{//}} at {//Assert}}. Also at {{//Delay}} and {{//Wait}}. Two tests failed for me: {code} testDiskCheckTimeout(org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout) Time elapsed: 0.474 sec <<< FAILURE! java.lang.AssertionError: Expected: is <1L> but: was <0L> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at org.junit.Assert.assertThat(Assert.java:865) at org.junit.Assert.assertThat(Assert.java:832) at org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout.testDiskCheckTimeout(TestDatasetVolumeCheckerTimeout.java:121) testTimeoutExceptionIsThrown(org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout) Time elapsed: 0.04 sec <<< FAILURE! java.lang.AssertionError: Expected: is <1> but: was <0> at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20) at org.junit.Assert.assertThat(Assert.java:865) at org.junit.Assert.assertThat(Assert.java:832) at org.apache.hadoop.hdfs.server.datanode.checker.TestDatasetVolumeCheckerTimeout.testTimeoutExceptionIsThrown(TestDatasetVolumeCheckerTimeout.java:162) {code} The test coverage looks good. One suggestion - we can move the new unit tests to a TestThrottledAsyncCheckerWithTimeout class, and retain one unit test in TestDatasetVolumeCheckerTimeout to verify that the DFS_DATANODE_DISK_CHECK_TIMEOUT_KEY setting is respected. > Support Timeout when checking single disk > ----------------------------------------- > > Key: HDFS-11511 > URL: https://issues.apache.org/jira/browse/HDFS-11511 > Project: Hadoop HDFS > Issue Type: Improvement > Components: hdfs > Reporter: Hanisha Koneru > Assignee: Hanisha Koneru > Attachments: HDFS-11511.000.patch, HDFS-11511.001.patch > > > HDFS-11149 introduces parallel checking of FsVolumes by Datanode. Disk checks > should have a configurable timeout so that bad disks do not hang and cause > long running checks. -- This message was sent by Atlassian JIRA (v6.3.15#6346) --------------------------------------------------------------------- To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org