[ https://issues.apache.org/jira/browse/HDFS-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981701#comment-13981701 ]
Chen He commented on HDFS-6250:
-------------------------------

I think this undeleted data block is caused by a race condition: testBalancerWithRackLocality uses getDfsUsed() to count the HDFS usage, and when the test code checks the datanode usage, getDfsUsed() has not been updated yet. Here is the whole process:

1) ReplicationMonitor runs every 5 seconds to check for data blocks that need to be updated and sends them to the NN. We need 10 seconds to guarantee that any information reaches the NN;
2) Once the NN gets the operations, it sends them to the corresponding DNs. It takes 6 seconds (2 heartbeat intervals, assuming a heartbeat interval of 3 seconds) for the DNs to finish these operations and report the updates back to the NN;
3) If we allow for other processing time, it should be safe to read the latest information after 20 seconds.

Based on the analysis above, I attached my patch.

> TestBalancerWithNodeGroup.testBalancerWithRackLocality fails
> ------------------------------------------------------------
>
>                 Key: HDFS-6250
>                 URL: https://issues.apache.org/jira/browse/HDFS-6250
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Kihwal Lee
>            Assignee: Chen He
>         Attachments: test_log.txt
>
>
> It was seen in https://builds.apache.org/job/PreCommit-HDFS-Build/6669/
> {panel}
> java.lang.AssertionError: expected:<1800> but was:<1810>
> 	at org.junit.Assert.fail(Assert.java:93)
> 	at org.junit.Assert.failNotEquals(Assert.java:647)
> 	at org.junit.Assert.assertEquals(Assert.java:128)
> 	at org.junit.Assert.assertEquals(Assert.java:147)
> 	at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:253)
> {panel}

--
This message was sent by Atlassian JIRA
(v6.2#6252)
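
For reference, the timing budget described in the comment can be written out as a small sketch. This is not the attached patch, whose contents are not shown here. The class `UsageWait`, both method names, and the 4-second slack (chosen only so that the total matches the 20 seconds mentioned above) are hypothetical. The polling helper is one possible way to use such a budget as an upper bound rather than a fixed sleep.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper, not part of Hadoop or of the attached patch.
public class UsageWait {

    // Wait budget following the comment's reasoning:
    // two ReplicationMonitor cycles + two heartbeat intervals + slack.
    public static long worstCaseWaitMillis(long replicationIntervalMs,
                                           long heartbeatIntervalMs,
                                           long slackMs) {
        return 2 * replicationIntervalMs + 2 * heartbeatIntervalMs + slackMs;
    }

    // Polls the condition every pollMs until it holds or timeoutMs elapses.
    // Returns true if the condition held, false on timeout or interruption.
    public static boolean waitFor(BooleanSupplier condition, long pollMs, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (condition.getAsBoolean()) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt flag
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // 5 s replication check, 3 s heartbeat, 4 s slack (assumed values).
        long budget = worstCaseWaitMillis(5_000, 3_000, 4_000);
        System.out.println("wait budget (ms): " + budget);

        // Stand-in condition; a real test would compare getDfsUsed() with the expected value.
        long start = System.currentTimeMillis();
        boolean ok = waitFor(() -> System.currentTimeMillis() - start >= 50, 10, budget);
        System.out.println("condition met: " + ok);
    }
}
```

Polling up to the budget lets a test return as soon as the usage figure settles, while still tolerating the worst-case delay estimated above.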