[ https://issues.apache.org/jira/browse/HDFS-1312?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15012472#comment-15012472 ]

Anu Engineer commented on HDFS-1312:
------------------------------------

bq. Similar to Balancer, we need to define a threshold so that the storage is 
considered as balanced if its dfsUsedRatio is within nodeWeightedMean +/- 
threshold.
Sorry, this detail is not in the original design document; it escaped me then. 
It is defined in code and also in the architecture document, on page 5 of 
Architecture_and_testplan.pdf:
{code}
dfs.disk.balancer.block.tolerance.percent |   5  | Since data nodes are 
operational we stop copying data if we have reached a good enough threshold
{code}
It is not based on nodeWeightedMean, since the data node is operational and we 
compute the plan for data moves from a static snapshot of disk utilization. 
Currently we support a simple knob in the config that lets us say what a good 
enough target is: if we get within 5% of the desired value in terms of data 
movement, we can consider the volume balanced. Please do let me know if this 
addresses the concern.
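
As an illustration, here is a minimal sketch of such a good-enough check. The 
class and method names below are hypothetical; only the tolerance-percent knob 
comes from the table above.
{code}
// Hypothetical sketch of the "good enough" stop condition driven by
// dfs.disk.balancer.block.tolerance.percent. Names are illustrative,
// not the actual DiskBalancer implementation.
public class ToleranceCheck {
  private final long tolerancePercent; // e.g. 5, read from configuration

  public ToleranceCheck(long tolerancePercent) {
    this.tolerancePercent = tolerancePercent;
  }

  /**
   * @param bytesPlanned bytes the plan (a static snapshot) asked us to move
   * @param bytesMoved   bytes actually copied so far
   * @return true once the remaining work is within tolerance of the plan
   */
  public boolean isGoodEnough(long bytesPlanned, long bytesMoved) {
    if (bytesPlanned <= 0) {
      return true; // nothing to move
    }
    long remaining = bytesPlanned - bytesMoved;
    // With a 5% tolerance, stop once <= 5% of the planned bytes remain.
    return remaining * 100L <= tolerancePercent * bytesPlanned;
  }
}
{code}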

bq. DataTransferProtocol.replaceBlock does support move blocks across storage 
types within the same node. We only need to slightly modify it for disk 
balancing (i.e. moving block within the same storage type in the same node.)
Something that should have been part of the design document; thanks for 
bringing it up. The actual move is performed by calling into 
{{FsDataSetImpl.java#moveBlockAcrossStorage}}, which is part of the mover's logic.
Here is the high-level logic in DiskBalancer.java: it finds blocks on the 
source volume and moves them to the destination volume by making calls to 
moveBlockAcrossStorage (a slightly modified version of this function).
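
To make that flow concrete, here is a minimal sketch of the copy loop. All the 
types below (Block, Volume, Dataset) are hypothetical stand-ins for the 
DataNode internals; only moveBlockAcrossStorage is named above.
{code}
import java.util.Iterator;

// Hypothetical stand-ins for DataNode internals, for illustration only.
interface Block {
  long getNumBytes();
}

interface Volume {
  Iterator<Block> blocks(); // blocks currently stored on this volume
}

interface Dataset {
  // stand-in for the (slightly modified) FsDataSetImpl#moveBlockAcrossStorage
  void moveBlockAcrossStorage(Block block, Volume source, Volume dest);
}

class CopyStep {
  private final Dataset dataset;

  CopyStep(Dataset dataset) {
    this.dataset = dataset;
  }

  /** Move up to bytesToMove worth of blocks from source to dest. */
  void copyBlocks(Volume source, Volume dest, long bytesToMove) {
    long bytesMoved = 0;
    Iterator<Block> it = source.blocks();
    while (bytesMoved < bytesToMove && it.hasNext()) {
      Block block = it.next();
      if (bytesMoved + block.getNumBytes() > bytesToMove) {
        continue; // this block would overshoot the plan; try the next one
      }
      // Delegate the actual data movement to the dataset layer.
      dataset.moveBlockAcrossStorage(block, source, dest);
      bytesMoved += block.getNumBytes();
    }
  }
}
{code}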




> Re-balance disks within a Datanode
> ----------------------------------
>
>                 Key: HDFS-1312
>                 URL: https://issues.apache.org/jira/browse/HDFS-1312
>             Project: Hadoop HDFS
>          Issue Type: New Feature
>          Components: datanode
>            Reporter: Travis Crawford
>            Assignee: Anu Engineer
>         Attachments: Architecture_and_testplan.pdf, disk-balancer-proposal.pdf
>
>
> Filing this issue in response to ``full disk woes`` on hdfs-user.
> Datanodes fill their storage directories unevenly, leading to situations 
> where certain disks are full while others are significantly less used. Users 
> at many different sites have experienced this issue, and HDFS administrators 
> are taking steps like:
> - Manually rebalancing blocks in storage directories
> - Decommissioning nodes & later re-adding them
> There's a tradeoff between making use of all available spindles and filling 
> disks at roughly the same rate. Possible solutions include:
> - Weighting less-used disks heavier when placing new blocks on the datanode. 
> In write-heavy environments this will still make use of all spindles, 
> equalizing disk use over time (see the sketch after this quoted description).
> - Rebalancing blocks locally. This would help equalize disk use as disks are 
> added/replaced in older cluster nodes.
> Datanodes should actively manage their local disk so operator intervention is 
> not needed.
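
For the first option quoted above (weighting less-used disks when placing new 
blocks), here is a minimal sketch of a chooser that picks a volume with 
probability proportional to its free space. Every name in it is hypothetical; 
it is not the DataNode's actual volume-choosing policy.
{code}
import java.util.List;
import java.util.Random;

// Hypothetical volume abstraction for illustration only.
interface DiskVolume {
  long getAvailable(); // free bytes on this volume
}

class AvailableSpaceWeightedChooser {
  private final Random random = new Random();

  /** Pick a volume with probability proportional to its free space. */
  DiskVolume choose(List<? extends DiskVolume> volumes) {
    long totalFree = 0;
    for (DiskVolume v : volumes) {
      totalFree += v.getAvailable();
    }
    if (totalFree <= 0) {
      throw new IllegalStateException("no free space on any volume");
    }
    // Draw a point in [0, totalFree) and walk the volumes until we pass it.
    long pick = (long) (random.nextDouble() * totalFree);
    for (DiskVolume v : volumes) {
      pick -= v.getAvailable();
      if (pick < 0) {
        return v;
      }
    }
    return volumes.get(volumes.size() - 1); // guard against rounding
  }
}
{code}
Picking proportionally to free space means emptier disks receive more new 
blocks while fuller disks still stay in rotation, matching the "use all 
spindles while equalizing over time" goal described above.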



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
