[ https://issues.apache.org/jira/browse/HDFS-10960?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15546654#comment-15546654 ]
Manoj Govindassamy commented on HDFS-10960:
-------------------------------------------

Looking at the code, removing volumes on the DataNode can potentially interrupt a BlockReceiver. If the BlockReceiver happens to be in the middle of an I/O operation, such as flushing or setting the channel position for the new checksum, it can throw an IOException. On getting an IOException, {{BlockReceiver}} starts a thread to check for disk errors.

The verification in TestDataNodeHotSwapVolumes#testRemoveVolumeBeingWritten fails if the DataNode has ever started a disk error check thread. This verification doesn't seem useful, as we already have another verification that checks the block replication factor.

So the proposal is to replace it with a verification that checks whether the volume removal succeeded and whether the block's replication factor caught up after the removal.

> TestDataNodeHotSwapVolumes#testRemoveVolumeBeingWritten fails at disk error
> verification after volume remove
> ------------------------------------------------------------------------------------------------------------
>
> Key: HDFS-10960
> URL: https://issues.apache.org/jira/browse/HDFS-10960
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs
> Affects Versions: 3.0.0-alpha2
> Reporter: Manoj Govindassamy
> Assignee: Manoj Govindassamy
> Priority: Minor
>
> TestDataNodeHotSwapVolumes#testRemoveVolumeBeingWritten fails occasionally in
> the following verification.
> {code}
> 700 // If an IOException thrown from BlockReceiver#run, it triggers
> 701 // DataNode#checkDiskError(). So we can test whether checkDiskError() is called,
> 702 // to see whether there is IOException in BlockReceiver#run().
> 703 assertEquals(lastTimeDiskErrorCheck, dn.getLastDiskErrorCheck());
> 704
> {code}
> {noformat}
> Error Message
> expected:<0> but was:<6498109>
> Stacktrace
> java.lang.AssertionError: expected:<0> but was:<6498109>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWrittenForDatanode(TestDataNodeHotSwapVolumes.java:703)
> at org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes.testRemoveVolumeBeingWritten(TestDataNodeHotSwapVolumes.java:620)
> {noformat}
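
For illustration only, the replacement verification proposed in the comment could take roughly the following shape: instead of asserting that no disk error check ever ran, poll until the removed volume is gone and the block's replica count has caught up. This is a self-contained sketch, not the actual patch. `VolumeRemovalCheck`, `waitFor`, `removalSucceeded`, the volume paths, and the simulated replica counter are all hypothetical stand-ins for the cluster state the real test would query.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

/** Hypothetical stand-in for the test's verification step; not Hadoop code. */
class VolumeRemovalCheck {

  /** Polls the check every intervalMs until it holds or timeoutMs has elapsed. */
  static boolean waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws InterruptedException {
    long start = System.nanoTime();
    long timeoutNanos = timeoutMs * 1_000_000L;
    while (true) {
      if (check.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - start >= timeoutNanos) {
        return false;
      }
      Thread.sleep(intervalMs);
    }
  }

  /** True once the removed volume is no longer active and replication has caught up. */
  static boolean removalSucceeded(Set<String> activeVolumes, String removedVolume,
      AtomicInteger liveReplicas, int expectedReplication) {
    return !activeVolumes.contains(removedVolume)
        && liveReplicas.get() >= expectedReplication;
  }

  public static void main(String[] args) throws Exception {
    // Simulated state: /data1 was just removed and one replica is still missing.
    Set<String> volumes = ConcurrentHashMap.newKeySet();
    volumes.add("/data2");
    AtomicInteger replicas = new AtomicInteger(2);

    // Simulated re-replication that completes shortly after the removal.
    Thread rereplication = new Thread(() -> {
      try {
        Thread.sleep(50);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      replicas.set(3);
    });
    rereplication.start();

    boolean ok = waitFor(
        () -> removalSucceeded(volumes, "/data1", replicas, 3), 10, 2000);
    rereplication.join();
    System.out.println(ok ? "volume removed and replication caught up" : "timed out");
  }
}
```

Unlike the equality assertion on the last disk error check time, a check of this kind tolerates an IOException in BlockReceiver during the removal, as long as the end state is the expected one.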