[ 
https://issues.apache.org/jira/browse/HDFS-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981701#comment-13981701
 ] 

Chen He commented on HDFS-6250:
-------------------------------

I think this undeleted data block is caused by a race condition:
testBalancerWithRackLocality uses getDfsUsed() to count HDFS usage. When the
test code checks the datanode usage, the value returned by getDfsUsed() has not
been updated yet.

Here is the whole process:
1) ReplicationMonitor runs every 5 seconds to check for data blocks that need
to be updated and sends them to the NN. We need 10 seconds to guarantee that
any information reaches the NN;
2) Once the NN gets the operations, it sends them to the corresponding DNs. It
takes 6 seconds (2 heartbeat intervals, assuming a heartbeat interval of 3
seconds) for the DNs to finish these operations and report the updates back to
the NN;
3) If we allow for other processing time, it should be safe to expect the
latest information within 20 seconds.
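
As a rough sketch of the timing budget above (this is not the attached patch),
the wait could be expressed as a bounded poll instead of a single immediate
check. The class and method names below are hypothetical, the usage reading is
a plain LongSupplier standing in for a getDfsUsed() call, and the 4 seconds of
slack is only inferred as 20 - (10 + 6). Counting two ReplicationMonitor cycles
for the 10 seconds is also an assumption.

```java
import java.util.function.LongSupplier;

// Hypothetical helper, for illustration only; not part of Hadoop.
class UsageWait {

    // Worst-case delay from the analysis: two ReplicationMonitor cycles
    // (assumed), two heartbeat intervals, plus slack for other processing.
    static long worstCaseWaitMillis(long replicationIntervalMs,
                                    long heartbeatIntervalMs,
                                    long slackMs) {
        return 2 * replicationIntervalMs + 2 * heartbeatIntervalMs + slackMs;
    }

    // Polls the usage reading until it equals the expected value or the
    // timeout elapses. Returns true if the expected value was observed.
    static boolean waitForUsage(LongSupplier usage, long expected,
                                long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            if (usage.getAsLong() == expected) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt status and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    public static void main(String[] args) {
        // 5 s monitor interval, 3 s heartbeat, 4 s slack (assumed values).
        long timeoutMs = worstCaseWaitMillis(5_000, 3_000, 4_000);
        System.out.println("timeout ms: " + timeoutMs);

        // Stand-in reading that is stale at first, then settles.
        long start = System.nanoTime();
        LongSupplier reading = () ->
            (System.nanoTime() - start) < 200_000_000L ? 1810L : 1800L;
        System.out.println("settled: "
            + waitForUsage(reading, 1800L, timeoutMs, 50));
    }
}
```

Polling returns as soon as the reading settles, so the 20-second figure acts
only as an upper bound and the test does not pay the full delay on every run.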

Based on the analysis above, I have attached my patch.

> TestBalancerWithNodeGroup.testBalancerWithRackLocality fails
> ------------------------------------------------------------
>
>                 Key: HDFS-6250
>                 URL: https://issues.apache.org/jira/browse/HDFS-6250
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Kihwal Lee
>            Assignee: Chen He
>         Attachments: test_log.txt
>
>
> It was seen in https://builds.apache.org/job/PreCommit-HDFS-Build/6669/
> {panel}
> java.lang.AssertionError: expected:<1800> but was:<1810>
>       at org.junit.Assert.fail(Assert.java:93)
>       at org.junit.Assert.failNotEquals(Assert.java:647)
>       at org.junit.Assert.assertEquals(Assert.java:128)
>       at org.junit.Assert.assertEquals(Assert.java:147)
>       at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup
>  .testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:253)
> {panel}



--
This message was sent by Atlassian JIRA
(v6.2#6252)
