[jira] [Commented] (HDFS-3070) hdfs balancer doesn't balance blocks between datanodes

Hadoop QA (Commented) (JIRA) Fri, 30 Mar 2012 20:25:48 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13242987#comment-13242987
 ]


Hadoop QA commented on HDFS-3070:
---------------------------------

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12520710/HDFS-3070.patch
  against trunk revision .

    +1 @author.  The patch does not contain any @author tags.

    -1 tests included.  The patch doesn't appear to include any new or modified tests.
                        Please justify why no new tests are needed for this patch.
                        Also please list what manual steps were performed to verify this patch.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac compiler warnings.

    +1 eclipse:eclipse.  The patch built with eclipse:eclipse.

    +1 findbugs.  The patch does not introduce any new Findbugs (version 1.3.9) warnings.

    +1 release audit.  The applied patch does not increase the total number of release audit warnings.

    +1 core tests.  The patch passed unit tests in .

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: https://builds.apache.org/job/PreCommit-HDFS-Build/2136//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/2136//console

This message is automatically generated.
                
> hdfs balancer doesn't balance blocks between datanodes
> ------------------------------------------------------
>
>                 Key: HDFS-3070
>                 URL: https://issues.apache.org/jira/browse/HDFS-3070
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer
>    Affects Versions: 2.0.0
>            Reporter: Stephen Chu
>            Assignee: Aaron T. Myers
>         Attachments: HDFS-3070.patch, unbalanced_nodes.png, 
> unbalanced_nodes_inservice.png
>
>
> I TeraGenerated data into DataNodes styx01 and styx02. Looking at the web UI, 
> both have over 3% disk usage.
> Attached is a screenshot of the Live Nodes web UI.
> On styx01, I run the _hdfs balancer_ command with threshold 1% and don't see 
> the blocks being balanced across all 4 datanodes (all blocks on styx01 and 
> styx02 stay put).
> HA is currently enabled.
> [schu@styx01 ~]$ hdfs haadmin -getServiceState nn1
> active
> [schu@styx01 ~]$ hdfs balancer -threshold 1
> 12/03/08 10:10:32 INFO balancer.Balancer: Using a threshold of 1.0
> 12/03/08 10:10:32 INFO balancer.Balancer: namenodes = []
> 12/03/08 10:10:32 INFO balancer.Balancer: p         = Balancer.Parameters[BalancingPolicy.Node, threshold=1.0]
> Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
> Balancing took 95.0 milliseconds
> [schu@styx01 ~]$ 
> I believe with a threshold of 1% the balancer should trigger blocks being 
> moved across DataNodes, right? I am curious about the "namenodes = []" in 
> the above output.
> [schu@styx01 ~]$ hadoop version
> Hadoop 0.24.0-SNAPSHOT
> Subversion git://styx01.sf.cloudera.com/home/schu/hadoop-common/hadoop-common-project/hadoop-common -r f6a577d697bbcd04ffbc568167c97b79479ff319
> Compiled by schu on Thu Mar  8 15:32:50 PST 2012
> From source with checksum ec971a6e7316f7fbf471b617905856b8
> From 
> http://hadoop.apache.org/hdfs/docs/r0.21.0/api/org/apache/hadoop/hdfs/server/balancer/Balancer.html:
> The threshold parameter is a fraction in the range of (0%, 100%) with a 
> default value of 10%. The threshold sets a target for whether the cluster is 
> balanced. A cluster is balanced if for each datanode, the utilization of the 
> node (ratio of used space at the node to total capacity of the node) differs 
> from the utilization of the cluster (ratio of used space in the cluster to total 
> capacity of the cluster) by no more than the threshold value. The smaller the 
> threshold, the more balanced a cluster will become. It takes more time to run 
> the balancer for small threshold values. Also for a very small threshold the 
> cluster may not be able to reach the balanced state when applications write 
> and delete files concurrently.
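
The balance criterion quoted above can be sketched in a few lines of Python. This is only an illustration of the documented definition, not the Balancer's actual code. The function names, the node names styx03 and styx04, and the capacity and usage figures are all hypothetical; the report only says that styx01 and styx02 are each a little over 3% full and that there are 4 datanodes.

```python
def utilization(used, capacity):
    """Utilization as a percentage: used space over total capacity."""
    return 100.0 * used / capacity


def cluster_utilization(nodes):
    """Ratio of used space in the cluster to total cluster capacity, in percent."""
    total_used = sum(used for used, _ in nodes.values())
    total_capacity = sum(capacity for _, capacity in nodes.values())
    return utilization(total_used, total_capacity)


def is_balanced(nodes, threshold_pct):
    """True if every node's utilization is within threshold_pct of the cluster's."""
    cluster = cluster_utilization(nodes)
    return all(
        abs(utilization(used, capacity) - cluster) <= threshold_pct
        for used, capacity in nodes.values()
    )


# Hypothetical figures: (used, capacity) in arbitrary units.
# styx01/styx02 hold all the data; styx03/styx04 are assumed empty.
example_nodes = {
    "styx01": (3.5, 100.0),
    "styx02": (3.5, 100.0),
    "styx03": (0.0, 100.0),
    "styx04": (0.0, 100.0),
}

print(cluster_utilization(example_nodes))  # 1.75
print(is_balanced(example_nodes, 1.0))     # False
print(is_balanced(example_nodes, 10.0))    # True
```

Under these assumed numbers every node deviates from the cluster average by 1.75 percentage points, so by the documented definition the cluster counts as unbalanced at a 1% threshold and balanced at the default 10%.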

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
