[ 
https://issues.apache.org/jira/browse/HDFS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13755855#comment-13755855
 ] 

Kevin Lyda commented on HDFS-1312:
----------------------------------

I wrote the following in Java, which... is not really my favourite language. 
However, Hadoop is written in Java, and if the code is ever to make it into 
Hadoop proper, that seems like an important requirement.

The cluster I need this working on runs 1.0.3, so my code is written for 
that; hence ant, etc. It assumes a built 1.0.3 tree lives in 
../hadoop-common. Again, I'm not a Java person, so I know this isn't the 
greatest setup.

The code is here:

https://bitbucket.org/lyda/intranode-balance

I've run it on a test cluster but haven't tried it on "real" data yet. In the 
meantime pull requests are accepted / desired / encouraged / etc.
                
> Re-balance disks within a Datanode
> ----------------------------------
>
>                 Key: HDFS-1312
>                 URL: https://issues.apache.org/jira/browse/HDFS-1312
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode
>            Reporter: Travis Crawford
>
> Filing this issue in response to ``full disk woes`` on hdfs-user.
> Datanodes fill their storage directories unevenly, leading to situations 
> where certain disks are full while others are significantly less used. Users 
> at many different sites have experienced this issue, and HDFS administrators 
> are taking steps like:
> - Manually rebalancing blocks in storage directories
> - Decommissioning nodes & later re-adding them
> There's a tradeoff between making use of all available spindles, and filling 
> disks at the sameish rate. Possible solutions include:
> - Weighting less-used disks more heavily when placing new blocks on the datanode. 
> In write-heavy environments this will still make use of all spindles, 
> equalizing disk use over time.
> - Rebalancing blocks locally. This would help equalize disk use as disks are 
> added/replaced in older cluster nodes.
> Datanodes should actively manage their local disk so operator intervention is 
> not needed.
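
For illustration, the first option in the description above (weighting 
less-used disks when placing new blocks) could be sketched roughly as follows. 
This is only a minimal sketch, not code from Hadoop or from the 
intranode-balance repository: the class and method names 
(WeightedVolumeChooser, choose) and the directory paths are hypothetical, and 
it assumes the free bytes per storage directory are already known. Each 
directory is picked with probability proportional to its free space, so every 
spindle still receives writes while emptier disks fill faster.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Random;

// Hypothetical sketch: not part of Hadoop's datanode code.
public class WeightedVolumeChooser {

    /** Picks a storage directory with probability proportional to its free bytes. */
    static String choose(Map<String, Long> freeBytesByDir, Random rng) {
        long total = 0;
        for (long free : freeBytesByDir.values()) {
            if (free < 0) {
                throw new IllegalArgumentException("negative free space");
            }
            total += free;
        }
        if (total == 0) {
            throw new IllegalStateException("no free space in any directory");
        }

        // Draw a point in [0, total) and walk the directories until the
        // running sum of free bytes passes it.
        long target = (long) (rng.nextDouble() * total);
        long seen = 0;
        String lastWithSpace = null;
        for (Map.Entry<String, Long> e : freeBytesByDir.entrySet()) {
            if (e.getValue() == 0) {
                continue; // a full directory is never selected
            }
            seen += e.getValue();
            lastWithSpace = e.getKey();
            if (target < seen) {
                return e.getKey();
            }
        }
        // Only reachable if floating-point rounding pushed target up to total.
        return lastWithSpace;
    }

    public static void main(String[] args) {
        // Made-up numbers: one nearly full disk, one mostly empty disk.
        Map<String, Long> free = new LinkedHashMap<>();
        free.put("/data/1", 50L * 1024 * 1024 * 1024);
        free.put("/data/2", 900L * 1024 * 1024 * 1024);
        System.out.println("next block goes to " + choose(free, new Random()));
    }
}
```

A sketch like this only covers new writes. Disks that are already uneven 
(for example after a disk is replaced) would still need the second option, 
moving existing blocks locally.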

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
