Improve the Balancer to move data from over utilized nodes to under utilized nodes using balanced nodes
-------------------------------------------------------------------------------------------------------

                 Key: HDFS-2821
                 URL: https://issues.apache.org/jira/browse/HDFS-2821
             Project: Hadoop HDFS
          Issue Type: Improvement
    Affects Versions: 0.20.205.0, 0.24.0, 0.23.1
            Reporter: Devaraj K

h5. Cluster State Before Balancer Run:

||Node||Last Contact||Admin State||Configured Capacity(TB)||Used(TB)||Non DFS Used(TB)||Remaining(TB)||Used(%)||Remaining(%)||Blocks||
|xxx-x-xx-n1|0|In Service|4.25|1.76|0.84|1.65|41.34|38.86|8465|
|xxx-x-xx-n2|1|In Service|6.03|1.76|0.94|3.33|29.1|55.24|8465|
|xxx-x-xx-n3|2|In Service|6.93|1.76|0.99|4.18|25.35|60.31|8465|
|xxx-x-xx-n4|2|In Service|10.5|0|0.54|9.97|0|94.9|0|

h5. Cluster State After Balancer Run:

||Node||Last Contact||Admin State||Configured Capacity(TB)||Used(TB)||Non DFS Used(TB)||Remaining(TB)||Used(%)||Remaining(%)||Blocks||
|xxx-x-xx-n1|2|In Service|4.25|0.95|0.84|2.46|22.36|57.84|4830|
|xxx-x-xx-n2|1|In Service|6.03|1.2|0.94|3.88|19.95|64.4|5858|
|xxx-x-xx-n3|0|In Service|6.93|1.38|0.99|4.56|19.9|65.76|6327|
|xxx-x-xx-n4|2|In Service|10.5|1.74|0.54|8.23|16.53|78.37|8383|

Currently, the balancer moves data from over-utilized nodes to under-utilized nodes, and this continues until the cluster is balanced or there is no more data that can be moved from source to destination. During this process, once a node's usage reaches avgUtilization, it no longer participates in balancing.

The tables above show the cluster usage before and after a balancer run with a threshold of 1. After the balancer completes, n1 is still over-utilized and n4 is still under-utilized. This may be because n4 already contains all the blocks that are present on n1.

I feel this can be improved further by moving data from over-utilized nodes to balanced nodes, and then from balanced nodes to under-utilized nodes.
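
To make the "after" state concrete, here is a rough sketch (not the Balancer's actual code) that classifies the four nodes against the cluster average with a threshold of 1. The function name `classify`, the group labels, and the choice to compute the average as total used over total configured capacity are assumptions made for illustration; the figures are copied from the table of the cluster state after the balancer run.

```python
# Rows from the "after balancer run" table:
# (node, configured capacity in TB, DFS used in TB, reported Used(%))
AFTER = [
    ("n1", 4.25, 0.95, 22.36),
    ("n2", 6.03, 1.20, 19.95),
    ("n3", 6.93, 1.38, 19.90),
    ("n4", 10.5, 1.74, 16.53),
]


def classify(nodes, threshold):
    """Group nodes by how far their Used(%) is from the cluster average.

    Assumption: the average is total used / total capacity, in percent.
    """
    total_capacity = sum(cap for _, cap, _, _ in nodes)
    total_used = sum(used for _, _, used, _ in nodes)
    avg = 100.0 * total_used / total_capacity

    groups = {"over": [], "above_avg": [], "below_avg": [], "under": []}
    for name, _, _, used_pct in nodes:
        if used_pct > avg + threshold:
            groups["over"].append(name)        # must shed data
        elif used_pct > avg:
            groups["above_avg"].append(name)   # balanced, slightly high
        elif used_pct >= avg - threshold:
            groups["below_avg"].append(name)   # balanced, slightly low
        else:
            groups["under"].append(name)       # must receive data
    return avg, groups


if __name__ == "__main__":
    avg, groups = classify(AFTER, threshold=1.0)
    print(f"average utilization: {avg:.2f}%")
    for label, names in groups.items():
        print(f"{label:>9}: {names}")
```

Under these assumptions, n1 lands in the over-utilized group, n4 in the under-utilized group, and n2 and n3 sit within the threshold. n2 and n3 are the "balanced" nodes that the proposal would use as intermediaries: they could accept blocks from n1 and hand other blocks on to n4.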