Improve the Balancer to move data from over utilized nodes to under utilized 
nodes using balanced nodes
-------------------------------------------------------------------------------------------------------

                 Key: HDFS-2821
                 URL: https://issues.apache.org/jira/browse/HDFS-2821
             Project: Hadoop HDFS
          Issue Type: Improvement
    Affects Versions: 0.20.205.0, 0.24.0, 0.23.1
            Reporter: Devaraj K


h5. Cluster State Before Balancer Run:

||Node||Last Contact||Admin State||Configured Capacity(TB)||Used(TB)||Non DFS Used(TB)||Remaining(TB)||Used(%)||Remaining(%)||Blocks||
|xxx-x-xx-n1|0|In Service|4.25|1.76|0.84|1.65|41.34|38.86|8465|
|xxx-x-xx-n2|1|In Service|6.03|1.76|0.94|3.33|29.1|55.24|8465|
|xxx-x-xx-n3|2|In Service|6.93|1.76|0.99|4.18|25.35|60.31|8465|
|xxx-x-xx-n4|2|In Service|10.5|0|0.54|9.97|0|94.9|0|


\\
\\
h5. Cluster State After Balancer Run:

||Node||Last Contact||Admin State||Configured Capacity(TB)||Used(TB)||Non DFS Used(TB)||Remaining(TB)||Used(%)||Remaining(%)||Blocks||
|xxx-x-xx-n1|2|In Service|4.25|0.95|0.84|2.46|22.36|57.84|4830|
|xxx-x-xx-n2|1|In Service|6.03|1.2|0.94|3.88|19.95|64.4|5858|
|xxx-x-xx-n3|0|In Service|6.93|1.38|0.99|4.56|19.9|65.76|6327|
|xxx-x-xx-n4|2|In Service|10.5|1.74|0.54|8.23|16.53|78.37|8383|

\\


Currently the balancer moves data from over-utilized nodes to under-utilized 
nodes, and this process continues until the cluster is balanced or there is no 
more data to move from source to destination. If a node's usage reaches 
avgUtilization during this process, that node no longer participates in the 
balancing.
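
As a rough illustration of the classification described above, the sketch 
below computes the average utilization from the Configured Capacity and Used 
columns of the two tables and groups the nodes with a threshold of 1. It 
assumes that avgUtilization is total used space divided by total capacity, and 
it collapses every node within the threshold into a single "balanced" group. 
The function name and data layout are hypothetical and are not taken from the 
Balancer source.

```python
def classify(nodes, threshold):
    """Group nodes by utilization relative to the cluster average.

    nodes maps a node name to (configured_capacity_tb, dfs_used_tb).
    threshold is in percentage points. Illustrative only: the name and
    layout are assumptions, not the Balancer's own data structures.
    """
    total_capacity = sum(cap for cap, _ in nodes.values())
    total_used = sum(used for _, used in nodes.values())
    # Assumption: average utilization = total used / total capacity.
    avg = 100.0 * total_used / total_capacity

    groups = {}
    for name, (cap, used) in nodes.items():
        util = 100.0 * used / cap
        if util > avg + threshold:
            groups[name] = "over-utilized"
        elif util < avg - threshold:
            groups[name] = "under-utilized"
        else:
            groups[name] = "balanced"
    return avg, groups


# (Configured Capacity, Used) in TB, copied from the two tables above.
before_run = {"n1": (4.25, 1.76), "n2": (6.03, 1.76),
              "n3": (6.93, 1.76), "n4": (10.5, 0.0)}
after_run = {"n1": (4.25, 0.95), "n2": (6.03, 1.2),
             "n3": (6.93, 1.38), "n4": (10.5, 1.74)}

avg_before, groups_before = classify(before_run, threshold=1)
avg_after, groups_after = classify(after_run, threshold=1)

for name in after_run:
    print(name, groups_before[name], "->", groups_after[name])
```

Under these assumptions n2 and n3 end up within the threshold after the run, 
while n1 stays above it and n4 stays below it.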


The tables above show the cluster usage before and after a balancer run with a 
threshold of 1. After the balancer completes, n1 is still over-utilized and n4 
is still under-utilized. This may be because n4 already contains all the 
blocks that are present on n1. I feel this can be improved further by moving 
data from over-utilized nodes to balanced nodes, and then from balanced nodes 
to under-utilized nodes.
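
The proposed two-step move can be sketched with a toy model. This is only an 
illustration under stated assumptions: the block placement is invented, the 
helper names are hypothetical, and the only rule modelled is that a block 
cannot be moved to a node that already holds a replica of it. Rack placement 
and the other checks a real balancer performs are ignored.

```python
def movable(blocks, src, dst):
    """Blocks on src that dst does not already hold a replica of."""
    return blocks[src] - blocks[dst]


def move(blocks, block, src, dst):
    """Move one block replica from src to dst, enforcing the replica rule."""
    if block not in movable(blocks, src, dst):
        raise ValueError(f"{block} cannot move from {src} to {dst}")
    blocks[src].remove(block)
    blocks[dst].add(block)


# Hypothetical placement: n1 is over-utilized, n2 is balanced, n4 is
# under-utilized, and n4 already holds a replica of everything on n1.
blocks = {
    "n1": {"b1", "b2"},
    "n2": {"b3", "b4"},
    "n4": {"b1", "b2"},
}

# Direct move n1 -> n4: nothing qualifies, so the balancer would stall here.
direct = movable(blocks, "n1", "n4")
print("directly movable n1 -> n4:", direct)

# Relay through the balanced node n2:
move(blocks, "b1", "n1", "n2")   # over-utilized -> balanced
move(blocks, "b3", "n2", "n4")   # balanced -> under-utilized

counts = {name: len(held) for name, held in blocks.items()}
print(counts)
```

In this toy case n2 holds the same number of blocks before and after, while n1 
gives up one block and n4 gains one, which is the net effect the proposal is 
after.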


