[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted in TestBalancer

2014-09-09 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14127971#comment-14127971
 ] 

Binglin Chang commented on HDFS-6506:
-

Thanks for the review, Chris and Junping.

> Newly moved block replica been invalidated and deleted in TestBalancer
> --
>
> Key: HDFS-6506
> URL: https://issues.apache.org/jira/browse/HDFS-6506
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer, test
> Reporter: Binglin Chang
> Assignee: Binglin Chang
> Fix For: 2.6.0
>
> Attachments: HDFS-6506.v1.patch, HDFS-6506.v2.patch, 
> HDFS-6506.v3.patch
>
>
> TestBalancerWithNodeGroup#testBalancerWithNodeGroup has been failing recently:
> https://builds.apache.org/job/PreCommit-HDFS-Build/7045//testReport/
> From the error log, the reason seems to be that newly moved block replicas 
> are being invalidated and deleted, so some of the balancer's work is reversed.
> {noformat}
> 2014-06-06 18:15:51,681 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741834_1010 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,683 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741833_1009 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,683 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741830_1006 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,683 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741831_1007 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,682 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741832_1008 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:54,702 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741827_1003 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:54,702 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741828_1004 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:54,701 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741829_1005 with size=100 fr
> 2014-06-06 18:15:54,706 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741833_1009) is added to 
> invalidated blocks set
> 2014-06-06 18:15:54,709 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741834_1010) is added to 
> invalidated blocks set
> 2014-06-06 18:15:56,421 INFO  BlockStateChange 
> (BlockManager.java:invalidateWorkForOneNode(3242)) - BLOCK* BlockManager: ask 
> 127.0.0.1:55468 to delete [blk_1073741833_1009, blk_1073741834_1010]
> 2014-06-06 18:15:57,717 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741832_1008) is added to 
> invalidated blocks set
> 2014-06-06 18:15:57,720 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741827_1003) is added to 
> invalidated blocks set
> 2014-06-06 18:15:57,721 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741830_1006) is added to 
> invalidated blocks set
> 2014-06-06 18:15:57,722 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741831_1007) is added to 
> invalidated blocks set
> 2014-06-06 18:15:57,723 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741829_1005) is added to 
> invalidated blocks set
> 2014-06-06 18:15:59,422 INFO  BlockStateChange 
> (BlockManager.java:invalidateWorkForOneNode(3242)) - BLOCK* BlockManager: ask 
> 127.0.0.1:55468 to delete [blk_1073741827_1003, blk_1073741829_1005, 
> blk_1073741830_1006, blk_1073741831_1007, blk_1073741832_1008]
> 2014-06-06 18:16:02,423 INFO  BlockStateChange 
> (BlockManager.java:invalidateWorkForOneNode(3242)) - BLOCK* BlockManager: ask 
> 127.0.0.1:55468 to delete [blk_1073741845_1021]
> {noformat}
> Normally this should not happen: when moving a block from src to dest, the 
> replica on src should be invalidated, not the one on dest, so there must be 
> a bug in the related logic.

[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted in TestBalancer

2014-09-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14124995#comment-14124995
 ] 

Hadoop QA commented on HDFS-6506:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12667086/HDFS-6506.v3.patch
  against trunk revision a23144f.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.web.TestWebHdfsFileSystemContract
  org.apache.hadoop.hdfs.server.namenode.ha.TestPipelinesFailover

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7940//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7940//console

This message is automatically generated.


[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted in TestBalancer

2014-09-05 Thread Chris Nauroth (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14123694#comment-14123694
 ] 

Chris Nauroth commented on HDFS-6506:
-

Unfortunately, it appears this patch has gone stale.  [~decster], would you 
mind updating the patch?  [~djp], would you mind +1'ing a new patch quickly if 
you don't have any other feedback?  I'm happy to take care of the commit if 
you're busy.  It would be nice to get this in and hopefully put an end to the 
spurious failures in the balancer tests.  Thanks!


[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted in TestBalancer

2014-08-07 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088966#comment-14088966
 ] 

Hadoop QA commented on HDFS-6506:
-

{color:green}+1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12651956/HDFS-6506.v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in 
hadoop-hdfs-project/hadoop-hdfs.

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7577//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7577//console

This message is automatically generated.


[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted in TestBalancer

2014-08-06 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14088807#comment-14088807
 ] 

Junping Du commented on HDFS-6506:
--

Sorry for the late response. The patch looks good to me overall. I am kicking 
off the Jenkins test again, as the patch has not been synced for a long time.


[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted in TestBalancer

2014-07-14 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061614#comment-14061614
 ] 

Junping Du commented on HDFS-6506:
--

Sure. I will be around to review soon. Thanks, Binglin!

> Newly moved block replica been invalidated and deleted in TestBalancer
> --
>
> Key: HDFS-6506
> URL: https://issues.apache.org/jira/browse/HDFS-6506
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Binglin Chang
> Assignee: Binglin Chang
> Attachments: HDFS-6506.v1.patch, HDFS-6506.v2.patch
>
>
> TestBalancerWithNodeGroup#testBalancerWithNodeGroup has been failing recently:
> https://builds.apache.org/job/PreCommit-HDFS-Build/7045//testReport/
> From the error log, the reason seems to be that newly moved block replicas 
> are being invalidated and deleted, so some of the balancer's work is reversed.
> {noformat}
> 2014-06-06 18:15:51,681 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741834_1010 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,683 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741833_1009 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,683 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741830_1006 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,683 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741831_1007 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:51,682 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741832_1008 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:54,702 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741827_1003 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:54,702 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741828_1004 with size=100 from 127.0.0.1:49159 
> to 127.0.0.1:55468 through 127.0.0.1:49159
> 2014-06-06 18:15:54,701 INFO  balancer.Balancer (Balancer.java:dispatch(370)) 
> - Successfully moved blk_1073741829_1005 with size=100 fr
> 2014-06-06 18:15:54,706 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741833_1009) is added to 
> invalidated blocks set
> 2014-06-06 18:15:54,709 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741834_1010) is added to 
> invalidated blocks set
> 2014-06-06 18:15:56,421 INFO  BlockStateChange 
> (BlockManager.java:invalidateWorkForOneNode(3242)) - BLOCK* BlockManager: ask 
> 127.0.0.1:55468 to delete [blk_1073741833_1009, blk_1073741834_1010]
> 2014-06-06 18:15:57,717 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741832_1008) is added to 
> invalidated blocks set
> 2014-06-06 18:15:57,720 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741827_1003) is added to 
> invalidated blocks set
> 2014-06-06 18:15:57,721 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741830_1006) is added to 
> invalidated blocks set
> 2014-06-06 18:15:57,722 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741831_1007) is added to 
> invalidated blocks set
> 2014-06-06 18:15:57,723 INFO  BlockStateChange 
> (BlockManager.java:chooseExcessReplicates(2711)) - BLOCK* 
> chooseExcessReplicates: (127.0.0.1:55468, blk_1073741829_1005) is added to 
> invalidated blocks set
> 2014-06-06 18:15:59,422 INFO  BlockStateChange 
> (BlockManager.java:invalidateWorkForOneNode(3242)) - BLOCK* BlockManager: ask 
> 127.0.0.1:55468 to delete [blk_1073741827_1003, blk_1073741829_1005, 
> blk_1073741830_1006, blk_1073741831_1007, blk_1073741832_1008]
> 2014-06-06 18:16:02,423 INFO  BlockStateChange 
> (BlockManager.java:invalidateWorkForOneNode(3242)) - BLOCK* BlockManager: ask 
> 127.0.0.1:55468 to delete [blk_1073741845_1021]
> {noformat}
> Normally this should not happen: when moving a block from src to dest, the 
> replica on src should be invalidated, not the one on dest, so there must be 
> a bug in the related logic. 
> I don't think TestBalancer
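
The expectation stated in the description (after a move, the replica on the source should be the one invalidated, not the one on the destination) can be illustrated with a minimal sketch. This is not the BlockManager code from HDFS: the class `ExcessReplicaSketch`, the method `chooseReplicaToDelete`, and the `moveSource` hint are hypothetical names introduced only for illustration, and the fallback rule is an arbitrary assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; it does not reproduce any real HDFS class or API.
final class ExcessReplicaSketch {

    /**
     * Picks which replica to drop when a block has one replica too many.
     * If the excess was caused by a move and the source of that move is
     * known, the source is chosen so the newly written destination survives.
     */
    static String chooseReplicaToDelete(List<String> replicaNodes, String moveSource) {
        if (replicaNodes.isEmpty()) {
            throw new IllegalArgumentException("no replicas to choose from");
        }
        if (moveSource != null && replicaNodes.contains(moveSource)) {
            return moveSource;
        }
        // Assumption: with no usable hint, fall back to the first candidate.
        return replicaNodes.get(0);
    }

    public static void main(String[] args) {
        List<String> replicas = new ArrayList<>();
        replicas.add("127.0.0.1:55468"); // destination of the move (from the log above)
        replicas.add("127.0.0.1:49159"); // source of the move (from the log above)
        System.out.println(chooseReplicaToDelete(replicas, "127.0.0.1:49159"));
    }
}
```

In this sketch, losing the source hint makes the fallback pick the destination, which has the same visible effect as the log above: the balancer's move is undone.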

[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted in TestBalancer

2014-07-14 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061612#comment-14061612
 ] 

Binglin Chang commented on HDFS-6506:
-

Hi [~djp], this bug is related to TestBalancerWithNodeGroup. Could you help 
review it? Thanks :)

> 127.0.0.1:55468 to delete [blk_1073741827_1003, blk_1073741829_1005, 
> blk_1073741830_1006, blk_1073741831_1007, blk_1073741832_1008]
> 2014-06-06 18:16:02,423 INFO  BlockStateChange 
> (BlockManager.java:invalidateWorkForOneNode(3242)) - BLOCK* BlockManager: ask 
> 127.0.0.1:55468 to delete [blk_1073741845_1021]
> {noformat}
> Normally this should not happen: when moving a block from src to dest, the 
> replica on src should be invalidated, not the one on dest, there should be bug insid

[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted

2014-06-23 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14040758#comment-14040758
 ] 

Hadoop QA commented on HDFS-6506:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12651956/HDFS-6506.v2.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.TestRefreshCallQueue

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7210//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7210//console

This message is automatically generated.


[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted

2014-06-10 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026512#comment-14026512
 ] 

Binglin Chang commented on HDFS-6506:
-

The failed test is unrelated and is tracked in HDFS-3930; a recent trunk 
build also failed because of it.
https://builds.apache.org/job/Hadoop-Hdfs-trunk/1770/consoleText


[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted

2014-06-10 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026314#comment-14026314
 ] 

Hadoop QA commented on HDFS-6506:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12649548/HDFS-6506.v1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  org.apache.hadoop.hdfs.server.datanode.TestBPOfferService

{color:green}+1 contrib tests{color}.  The patch passed contrib unit tests.

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/7072//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/7072//console

This message is automatically generated.


[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted

2014-06-10 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026215#comment-14026215
 ] 

Binglin Chang commented on HDFS-6506:
-

Balancer already sleeps 2*DFS_HEARTBEAT_INTERVAL seconds between rounds, but 
TestBalancer.java sets:
{code}
conf.setLong(DFSConfigKeys.DFS_HEARTBEAT_INTERVAL_KEY, 1L);
{code}
How quickly replica state is updated also depends on 
DFS_NAMENODE_REPLICATION_INTERVAL, which is 3 by default.
TestBalancer only changes the heartbeat interval (which affects both the 
heartbeat and the balancer's sleep time between iterations), but doesn't change 
the ReplicationMonitor check interval, so the sleep time is too short to wait 
for movements to be committed.
The other thing is that 2*DFS_HEARTBEAT_INTERVAL still seems a little dangerous. 
Maybe change it to 2*DFS_HEARTBEAT_INTERVAL + DFS_NAMENODE_REPLICATION_INTERVAL.
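
To make the arithmetic concrete, here is a minimal sketch (plain Java, not Balancer code). The class and method names are hypothetical; the only inputs are the values quoted above: a heartbeat interval of 1 second as set by TestBalancer, and a replication interval of 3 seconds. Comparing the sleep against one ReplicationMonitor interval is only a rough indicator, not a guarantee that movements are committed.

```java
/** Back-of-the-envelope check of the balancer's sleep between rounds; hypothetical, not HDFS code. */
final class BalancerSleepSketch {
    // Heartbeat interval set by TestBalancer via DFS_HEARTBEAT_INTERVAL_KEY (seconds).
    static final long TEST_HEARTBEAT_INTERVAL_SEC = 1L;
    // ReplicationMonitor interval quoted in this comment as the default (seconds).
    static final long REPLICATION_INTERVAL_SEC = 3L;

    /** Sleep between rounds as described above: 2 * heartbeat interval. */
    static long currentSleepSec(long heartbeatSec) {
        return 2 * heartbeatSec;
    }

    /** Suggested sleep: 2 * heartbeat interval + replication interval. */
    static long proposedSleepSec(long heartbeatSec, long replicationSec) {
        return 2 * heartbeatSec + replicationSec;
    }

    /** True when the sleep spans at least one full ReplicationMonitor interval. */
    static boolean coversReplicationCheck(long sleepSec, long replicationSec) {
        return sleepSec >= replicationSec;
    }

    public static void main(String[] args) {
        long current = currentSleepSec(TEST_HEARTBEAT_INTERVAL_SEC);
        long proposed = proposedSleepSec(TEST_HEARTBEAT_INTERVAL_SEC, REPLICATION_INTERVAL_SEC);
        System.out.println("current sleep (s): " + current
                + ", covers a replication check: "
                + coversReplicationCheck(current, REPLICATION_INTERVAL_SEC));
        System.out.println("proposed sleep (s): " + proposed
                + ", covers a replication check: "
                + coversReplicationCheck(proposed, REPLICATION_INTERVAL_SEC));
    }
}
```

With the test's settings the current sleep is shorter than one ReplicationMonitor interval, while the suggested value is longer than it.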



[jira] [Commented] (HDFS-6506) Newly moved block replica been invalidated and deleted

2014-06-10 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-6506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14026188#comment-14026188
 ] 

Binglin Chang commented on HDFS-6506:
-

I looked at the log and code more thoroughly. The reason some block replicas 
are invalidated is:
1. balancer round 1: move blk0 from dn0 to dn1; at this time the block map 
hasn't been updated yet (so dn0 still has blk0)
2. balancer round 2 starts and tries to move blk0 from dn0 to dn2
3. dn2 copies data from dn0 
4. dn0 heartbeats and gets the command to delete blk0
5. when moving blk0 from dn0 to dn2, it cannot find dn0, but it has to delete a 
replica, so it deletes the one on dn1

To prevent this, the balancer needs to wait some time to make sure the block 
movements from the last round are fully committed; otherwise the movements from 
the last round may be invalidated.
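
The sequence above can be sketched with a toy model of the replica bookkeeping for a single block. This is not HDFS code: the class, its methods, the replication factor of 1, and the fallback rule (drop the oldest replica other than the new one when the hinted node is no longer listed) are all assumptions made for illustration.

```java
import java.util.LinkedHashSet;
import java.util.Set;

/** Toy model of replica locations for one block; hypothetical, not HDFS code. */
final class ReplicaRaceSketch {
    private final int replication;
    private final Set<String> locations = new LinkedHashSet<>();

    ReplicaRaceSketch(int replication, String... initial) {
        this.replication = replication;
        for (String dn : initial) {
            locations.add(dn);
        }
    }

    /**
     * A move finished: target now holds a replica and deleteHint names the source.
     * Returns the node whose replica is invalidated, or null if none is excess.
     */
    String commitMove(String target, String deleteHint) {
        locations.add(target);
        if (locations.size() <= replication) {
            return null;
        }
        String victim = null;
        if (locations.contains(deleteHint)) {
            // Normal case: the source replica is the excess one.
            victim = deleteHint;
        } else {
            // Assumed fallback: the hinted node is gone, so some other replica must go.
            for (String dn : locations) {
                if (!dn.equals(target)) {
                    victim = dn;
                    break;
                }
            }
        }
        if (victim != null) {
            locations.remove(victim);
        }
        return victim;
    }

    Set<String> locations() {
        return new LinkedHashSet<>(locations);
    }

    public static void main(String[] args) {
        ReplicaRaceSketch blk0 = new ReplicaRaceSketch(1, "dn0");
        // Round 1: dn0 -> dn1, the source replica on dn0 is invalidated as expected.
        System.out.println("round 1 invalidates " + blk0.commitMove("dn1", "dn0"));
        // Round 2 was planned from a stale view in which dn0 still held blk0.
        System.out.println("round 2 invalidates " + blk0.commitMove("dn2", "dn0"));
        System.out.println("final locations " + blk0.locations());
    }
}
```

In this model the second move removes the replica on dn1, which is exactly the replica that round 1 had just created, so the work of round 1 is reversed.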


