This feature is frequently requested by users and would be a useful addition to 
HDFS.

I’ve code reviewed many of the sub-tasks and have tried out Disk Balancer in a
test cluster. I suggested a couple of usability improvements; these are tracked
by open Jiras, but they need not hold up the merge. The documentation looks
great.

+1 for merging with HDFS-10557 fixed.


On 6/15/16, 5:38 PM, "Anu Engineer" <aengin...@hortonworks.com> wrote:

 Hi All,

I would like to propose a merge vote for the HDFS-1312 (Disk Balancer) branch
to trunk. This branch creates a new tool that balances data across the disks
of a datanode.

The voting commences now and will run for 7 days till Jun/22/2016 5:00 PM PST.

This tool distributes data evenly between disks of the same type on a datanode.
This is useful if a disk has been replaced or if some disks are short of space
compared to the rest of the disks.
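
A minimal sketch of what "evenly" could mean here, assuming the goal is for
every disk of one storage type to end up with the same used fraction. This is
an illustration only, not the tool's actual planner; the function name and the
input layout are hypothetical, and it assumes the total capacity is non-zero.

```python
def bytes_to_move(disks):
    """Hypothetical helper: disks maps name -> (used_bytes, capacity_bytes)
    for disks of the same storage type on one datanode.

    Returns name -> bytes to shed (positive) or to receive (negative) so that
    every disk reaches the same used fraction.
    """
    total_used = sum(used for used, _ in disks.values())
    total_capacity = sum(cap for _, cap in disks.values())
    # Assumed definition of "even": one shared used/capacity ratio.
    target_fraction = total_used / total_capacity
    return {
        name: used - target_fraction * cap
        for name, (used, cap) in disks.items()
    }


# A freshly replaced disk (disk1) next to a nearly full one (disk0).
print(bytes_to_move({"disk0": (900, 1000), "disk1": (100, 1000)}))
# prints {'disk0': 400.0, 'disk1': -400.0}
```

The amounts to shed and to receive always sum to zero, so a plan built from
these numbers only moves data between disks of the same node.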

The commands currently supported are:

1. Plan - Allows the user to create a plan and review it. The plan describes
how the data will be moved within the datanode.

2. Execute - Allows execution of a plan against a datanode.

3. Query - Queries the status of disk balancer execution.

4. Cancel - Cancels a running disk balancer plan.

5. Report - Reports the current state of data distribution on a node.


* The original proposal, which captures the rationale and a possible solution,
  is here:
  https://issues.apache.org/jira/secure/attachment/12755226/disk-balancer-proposal.pdf

* The updated architecture and test plan document is here:
  https://issues.apache.org/jira/secure/attachment/12810720/Architecture_and_test_update.pdf

* The merge patch, which is a diff against trunk, is posted here:
  https://issues.apache.org/jira/secure/attachment/12810943/HDFS-1312.001.patch

* The user documentation, which will be part of Apache, is posted here:
  https://issues.apache.org/jira/secure/attachment/12805976/HDFS-9547-HDFS-1312.002.patch


HDFS-1312 has a set of sub-tasks and they are ordered in the same sequence as 
they were committed to HDFS-1312. Hopefully this will make it easy to code 
review this branch.

There is a set of commands that we would like to add later, including one to
discover which datanodes in the cluster would benefit from running the disk
balancer.
Appropriate JIRAs for these future work items are filed under HDFS-1312.

Disk Balancer was made possible by the work of many community members,
including Arpit Agarwal, Vinayakumar B, Mingliang Liu, Tsz Wo Nicholas Sze,
Lei (Eddy) Xu and Xiaobing Zhou. I would like to thank them all for their
effort and support.

Thanks
Anu


