[ 
https://issues.apache.org/jira/browse/HDFS-1105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Dmytro Molkov updated HDFS-1105:
--------------------------------

    Attachment: HDFS-1105.4.patch

This patch addresses all of Hairong's comments. I will also file a separate
JIRA for the documentation changes in Common.

> Balancer improvement
> --------------------
>
>                 Key: HDFS-1105
>                 URL: https://issues.apache.org/jira/browse/HDFS-1105
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Dmytro Molkov
>            Assignee: Dmytro Molkov
>         Attachments: HDFS-1105.2.patch, HDFS-1105.3.patch, HDFS-1105.4.patch, 
> HDFS-1105.patch
>
>
> We were seeing some weird issues with the balancer in our cluster:
> 1) It can get stuck during an iteration, and only restarting it helps.
> 2) The iterations are highly inefficient. With a 20-minute iteration it
> moves 7K blocks a minute for the first 6 minutes and hundreds of blocks in
> the next 14 minutes.
> 3) It can hit the namenode and the network pretty hard.
> A few improvements we came up with as a result:
> Making the balancer more deterministic in terms of the running time of an
> iteration, improving its efficiency, and making the load configurable:
> Make many of the constants configurable command-line parameters: iteration
> length, and the number of blocks to move in parallel to a given node and in
> the cluster overall.
> Terminate transfers that are still in progress after the iteration is over.
> Previously the iteration time was only the window in which the balancer
> scheduled moves; it would then wait indefinitely for the moves to finish.
> Each scheduling task can run for up to the iteration time or even longer.
> This means that if there are too many of them and they are long, the actual
> iterations take longer than 20 minutes. Now each scheduling task is given
> the start time of the iteration and schedules moves only if it has not run
> out of time. So tasks that start after the iteration is over will not
> schedule any moves.
> The number of move threads and dispatch threads is configurable, so you can
> run the balancer more slowly depending on the load on the cluster.
> I will attach a patch; please let me know what you think and what can be
> done better.
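
As a rough illustration of the time-bounded iteration described above, here
is a minimal Java sketch. It is not taken from the patch: the class
TimeBoundedIteration, its fields, and runIteration are hypothetical names,
and a plain fixed-size thread pool stands in for the balancer's mover
threads. It only shows the idea of scheduling moves while time remains and
cancelling whatever is unfinished once the iteration is over.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; these names do not come from the patch.
final class TimeBoundedIteration {
    private final long iterationMillis; // assumed "iteration length" setting
    private final int moverThreads;     // assumed cap on parallel moves
    private int scheduled;              // moves handed to the pool
    private int skipped;                // moves refused because time ran out

    TimeBoundedIteration(long iterationMillis, int moverThreads) {
        this.iterationMillis = iterationMillis;
        this.moverThreads = moverThreads;
    }

    /** Runs one iteration and returns how many unfinished moves were cancelled. */
    int runIteration(List<Runnable> moves) {
        long deadline = System.nanoTime()
                + TimeUnit.MILLISECONDS.toNanos(iterationMillis);
        ExecutorService movers = Executors.newFixedThreadPool(moverThreads);
        List<Future<?>> submitted = new ArrayList<>();
        for (Runnable move : moves) {
            // Schedule a move only while the iteration still has time left.
            if (System.nanoTime() - deadline >= 0) {
                skipped++;
                continue;
            }
            submitted.add(movers.submit(move));
            scheduled++;
        }
        movers.shutdown(); // accept no further moves
        try {
            long remaining = deadline - System.nanoTime();
            if (remaining > 0) {
                // Wait until the deadline at most, never indefinitely.
                movers.awaitTermination(remaining, TimeUnit.NANOSECONDS);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        // Terminate whatever is still queued or running after the deadline.
        int cancelled = 0;
        for (Future<?> f : submitted) {
            if (!f.isDone() && f.cancel(true)) {
                cancelled++;
            }
        }
        movers.shutdownNow();
        return cancelled;
    }

    int scheduledCount() { return scheduled; }
    int skippedCount() { return skipped; }
}
```

In this sketch, lowering moverThreads or iterationMillis is the knob for
running more gently on a loaded cluster; a move that is still in flight at
the deadline is interrupted instead of being waited on.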

