[ https://issues.apache.org/jira/browse/HDFS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13001477#comment-13001477 ]
Wang Xu commented on HDFS-1312:
-------------------------------

Hi folks,

Here is the basic design of the process. Are there any other considerations?

The basic flow is:
# Re-balancing should only run while the datanode is not under heavy load (should this be guaranteed by the administrator?).
# Calculate the total and average available and used space of the dirs.
# Find the disks with the most and the least space, and decide the move direction. We need to define an imbalance threshold here to decide whether re-balancing is worthwhile.
# Lock the origin disks: stop writing to them and wait for finalization on them.
# Find the deepest dirs on every selected disk and move blocks from those dirs. If a dir becomes empty, it should also be removed.
# Check the balance status while the blocks are being migrated, and break out of the loop once it reaches the threshold.
# Release the lock.

Cases that should be taken into account:
* If a disk has much less space than the other disks, it might have the least available space, yet blocks could not be migrated out of it.
* If two or more dirs are located on the same disk, they might confuse the space calculation. This is exactly the case in a MiniDFSCluster deployment.

> Re-balance disks within a Datanode
> ----------------------------------
>
>                 Key: HDFS-1312
>                 URL: https://issues.apache.org/jira/browse/HDFS-1312
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: data-node
>            Reporter: Travis Crawford
>
> Filing this issue in response to ``full disk woes`` on hdfs-user.
> Datanodes fill their storage directories unevenly, leading to situations
> where certain disks are full while others are significantly less used. Users
> at many different sites have experienced this issue, and HDFS administrators
> are taking steps like:
> - Manually rebalancing blocks in storage directories
> - Decommissioning nodes & later re-adding them
> There's a tradeoff between making use of all available spindles, and filling
> disks at the sameish rate.
> Possible solutions include:
> - Weighting less-used disks heavier when placing new blocks on the datanode.
> In write-heavy environments this will still make use of all spindles,
> equalizing disk use over time.
> - Rebalancing blocks locally. This would help equalize disk use as disks are
> added/replaced in older cluster nodes.
> Datanodes should actively manage their local disk so operator intervention is
> not needed.
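
To make the planning part of the flow in the comment concrete (calculate usage, pick the fullest and emptiest disks, stop at a threshold), here is a minimal sketch in Python. It is not Hadoop code: `Volume`, `plan_moves`, the fixed `block_size` and the 10% default threshold are all hypothetical names and values chosen for illustration. It compares used/capacity ratios rather than raw free bytes, which is one possible way to handle the small-disk case mentioned above, and it assumes each path sits on its own physical disk, so the shared-disk case is not handled. Locking, finalization and the actual block moves are left out.

```python
from dataclasses import dataclass


@dataclass
class Volume:
    """One storage directory (hypothetical type, not a Hadoop class)."""
    path: str
    capacity: int  # total bytes of the underlying disk
    used: int      # bytes currently used


def plan_moves(volumes, block_size, threshold=0.10):
    """Return a list of (source_path, target_path, nbytes) moves.

    Assumes unique paths and one directory per physical disk.
    The input volumes are not modified; the plan is computed on copies.
    """
    if len(volumes) < 2:
        return []
    used = {v.path: v.used for v in volumes}
    cap = {v.path: v.capacity for v in volumes}

    def util(path, delta=0):
        # Utilization ratio of a volume after adding `delta` bytes.
        return (used[path] + delta) / cap[path]

    moves = []
    while True:
        src = max(used, key=util)  # most utilized volume
        dst = min(used, key=util)  # least utilized volume
        if util(src) - util(dst) <= threshold:
            break  # within the imbalance threshold, nothing more to do
        # Stop if the source has less than one block, or if moving one
        # more block would make the target fuller than the source.
        if used[src] < block_size:
            break
        if util(src, -block_size) < util(dst, block_size):
            break
        used[src] -= block_size
        used[dst] += block_size
        moves.append((src, dst, block_size))
    return moves
```

A call such as `plan_moves([Volume("/data1", 1000, 900), Volume("/data2", 1000, 100)], block_size=100)` yields a list of moves from `/data1` to `/data2`. Because a move is only planned when the source stays at least as utilized as the target, the loop cannot bounce blocks back and forth between two disks.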
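
The first option in the issue description, weighting less-used disks more heavily when placing new blocks, could look roughly like the sketch below. This is only an illustration in Python under stated assumptions: `choose_volume` and the `free_bytes` mapping are hypothetical, and the weight is simply the free space of each volume, which is one of several possible weighting schemes.

```python
import random


def choose_volume(free_bytes, rng=random):
    """Pick a volume path with probability proportional to its free space.

    `free_bytes` maps a volume path to its free bytes (hypothetical input).
    Volumes with no free space are never chosen.
    """
    paths = [p for p, free in free_bytes.items() if free > 0]
    if not paths:
        raise ValueError("no volume has free space")
    weights = [free_bytes[p] for p in paths]
    return rng.choices(paths, weights=weights, k=1)[0]
```

With this scheme every disk that has room still receives some writes, so all spindles stay in use, while emptier disks fill faster and usage evens out over time.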