[ https://issues.apache.org/jira/browse/HDFS-3070?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eli Collins updated HDFS-3070:
------------------------------
Target Version/s: 0.23.3
> hdfs balancer doesn't balance blocks between datanodes
> ------------------------------------------------------
>
> Key: HDFS-3070
> URL: https://issues.apache.org/jira/browse/HDFS-3070
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer
> Affects Versions: 0.24.0
> Reporter: Stephen Chu
> Attachments: unbalanced_nodes.png, unbalanced_nodes_inservice.png
>
>
> I used TeraGen to write data to DataNodes styx01 and styx02. According to the
> web UI, both are at over 3% disk usage.
> Attached is a screenshot of the Live Nodes web UI.
> On styx01, I ran the _hdfs balancer_ command with a threshold of 1%, but the
> blocks were not balanced across all 4 datanodes (all blocks on styx01 and
> styx02 stayed put).
> HA is currently enabled.
> [schu@styx01 ~]$ hdfs haadmin -getServiceState nn1
> active
> [schu@styx01 ~]$ hdfs balancer -threshold 1
> 12/03/08 10:10:32 INFO balancer.Balancer: Using a threshold of 1.0
> 12/03/08 10:10:32 INFO balancer.Balancer: namenodes = []
> 12/03/08 10:10:32 INFO balancer.Balancer: p = Balancer.Parameters[BalancingPolicy.Node, threshold=1.0]
> Time Stamp    Iteration#    Bytes Already Moved    Bytes Left To Move    Bytes Being Moved
> Balancing took 95.0 milliseconds
> [schu@styx01 ~]$
> I believe that with a threshold of 1% the balancer should trigger blocks being
> moved across DataNodes, right? I am curious about the "namenodes = []" line in
> the above output.
> [schu@styx01 ~]$ hadoop version
> Hadoop 0.24.0-SNAPSHOT
> Subversion git://styx01.sf.cloudera.com/home/schu/hadoop-common/hadoop-common-project/hadoop-common -r f6a577d697bbcd04ffbc568167c97b79479ff319
> Compiled by schu on Thu Mar 8 15:32:50 PST 2012
> From source with checksum ec971a6e7316f7fbf471b617905856b8
> From http://hadoop.apache.org/hdfs/docs/r0.21.0/api/org/apache/hadoop/hdfs/server/balancer/Balancer.html:
> The threshold parameter is a fraction in the range of (0%, 100%) with a
> default value of 10%. The threshold sets a target for whether the cluster is
> balanced. A cluster is balanced if for each datanode, the utilization of the
> node (ratio of used space at the node to total capacity of the node) differs
> from the utilization of the cluster (ratio of used space in the cluster to total
> capacity of the cluster) by no more than the threshold value. The smaller the
> threshold, the more balanced a cluster will become. It takes more time to run
> the balancer for small threshold values. Also for a very small threshold the
> cluster may not be able to reach the balanced state when applications write
> and delete files concurrently.
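
The balance criterion quoted above can be sketched in a few lines of Python. This is only an illustration of the documented definition, not the Balancer's actual code. The function names and the node figures are hypothetical: the report says only that styx01 and styx02 are above 3% usage, so the equal capacities and the two empty nodes below are assumptions.

```python
def utilization_pct(used_bytes, capacity_bytes):
    """Utilization as a percentage: used space divided by total capacity."""
    return 100.0 * used_bytes / capacity_bytes


def is_balanced(nodes, threshold_pct):
    """Apply the documented criterion to a list of (used, capacity) pairs.

    The cluster counts as balanced when every node's utilization differs
    from the cluster-wide utilization by no more than threshold_pct.
    """
    total_used = sum(used for used, _ in nodes)
    total_capacity = sum(capacity for _, capacity in nodes)
    cluster_pct = utilization_pct(total_used, total_capacity)
    return all(
        abs(utilization_pct(used, capacity) - cluster_pct) <= threshold_pct
        for used, capacity in nodes
    )


# Hypothetical cluster: four equally sized nodes, two at 3.5% and two empty.
# These numbers are assumed for illustration; they are not from the report.
nodes = [(35, 1000), (35, 1000), (0, 1000), (0, 1000)]

print(is_balanced(nodes, 1.0))   # threshold used in the report
print(is_balanced(nodes, 10.0))  # documented default threshold
```

Under these assumed figures the cluster utilization is 1.75%, so every node deviates by 1.75 percentage points. By the documented definition such a cluster is unbalanced at a 1% threshold and balanced at the default 10%.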