[ 
https://issues.apache.org/jira/browse/HDFS-4261?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13526806#comment-13526806
 ] 

Chris Nauroth commented on HDFS-4261:
-------------------------------------

I reviewed the Windows failure more closely and found this:

{code}
java.io.IOException: THIS IS NOT SUPPOSED TO HAPPEN: replica.getBytesOnDisk() !=
 block.getNumBytes(), block=BP-TEST:blk_1000_2000, replica=ReplicaUnderRecovery,
 blk_1000_2000, RUR
{code}

That came from this check in {{FsDatasetImpl#updateReplicaUnderRecovery}}:

{code}
    //check replica's byte on disk
    if (replica.getBytesOnDisk() != oldBlock.getNumBytes()) {
      throw new IOException("THIS IS NOT SUPPOSED TO HAPPEN:"
          + " replica.getBytesOnDisk() != block.getNumBytes(), block="
          + oldBlock + ", replica=" + replica);
    }
{code}

This causes the current balancer iteration to move 0 bytes.  The new logic 
then returns {{NO_MOVE_PROGRESS}} once the maximum number of iterations is 
exceeded.
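
As a rough illustration of how a run of zero-byte iterations can end in 
{{NO_MOVE_PROGRESS}}, here is a minimal sketch.  Only the status name comes 
from the balancer; the class, the method, the parameters, and the exact 
threshold comparison are hypothetical stand-ins, not the actual Balancer code.

```java
// Hypothetical sketch: not the real org.apache.hadoop.hdfs.server.balancer code.
final class BalancerProgressSketch {

    // NO_MOVE_PROGRESS is the status named in the comment above; the other
    // value is included only so the sketch has something else to return.
    enum ReturnStatus { IN_PROGRESS, NO_MOVE_PROGRESS }

    /**
     * Walks through the bytes moved in each balancer iteration and gives up
     * once too many consecutive iterations have moved nothing.
     *
     * @param bytesMovedPerIteration bytes moved by each iteration (assumed input)
     * @param maxNoProgressIterations assumed limit on consecutive idle iterations
     */
    static ReturnStatus run(long[] bytesMovedPerIteration,
                            int maxNoProgressIterations) {
        int noProgressIterations = 0;
        for (long bytesMoved : bytesMovedPerIteration) {
            if (bytesMoved > 0) {
                // Any progress resets the counter.
                noProgressIterations = 0;
                continue;
            }
            // An iteration that moved 0 bytes, e.g. because every block move
            // failed with an IOException on the datanode side.
            noProgressIterations++;
            if (noProgressIterations >= maxNoProgressIterations) {
                return ReturnStatus.NO_MOVE_PROGRESS;
            }
        }
        return ReturnStatus.IN_PROGRESS;
    }
}
```

Under these assumptions, {{BalancerProgressSketch.run(new long[] {0, 0, 0}, 3)}} 
yields {{NO_MOVE_PROGRESS}}, which matches the pattern seen in the Windows run: 
every iteration moves 0 bytes, so the counter is never reset.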

This looks to be an unrelated Windows-specific issue, so I have filed a 
separate jira to track it: HDFS-4289.

                
> TestBalancerWithNodeGroup times out
> -----------------------------------
>
>                 Key: HDFS-4261
>                 URL: https://issues.apache.org/jira/browse/HDFS-4261
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer
>    Affects Versions: 1.0.4, 1.1.1, 2.0.2-alpha
>            Reporter: Tsz Wo (Nicholas), SZE
>            Assignee: Junping Du
>         Attachments: HDFS-4261.patch, HDFS-4261-v2.patch, HDFS-4261-v3.patch, 
> HDFS-4261-v4.patch
>
>
> When I manually ran TestBalancerWithNodeGroup, it always timed out on my 
> machine.  Looking at the Jenkins report [build 
> #3573|https://builds.apache.org/job/PreCommit-HDFS-Build/3573//testReport/org.apache.hadoop.hdfs.server.balancer/],
>  TestBalancerWithNodeGroup somehow was skipped so that the problem was not 
> detected.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
