[ 
https://issues.apache.org/jira/browse/HBASE-22349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Manning updated HBASE-22349:
----------------------------------
    Component/s: Balancer

> Stochastic Load Balancer skips balancing when node is replaced in cluster
> -------------------------------------------------------------------------
>
>                 Key: HBASE-22349
>                 URL: https://issues.apache.org/jira/browse/HBASE-22349
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>    Affects Versions: 3.0.0-alpha-1, 1.3.0, 1.4.4, 2.0.0
>            Reporter: Suthan Phillips
>            Assignee: David Manning
>            Priority: Major
>         Attachments: Hbase-22349.pdf
>
>
> HBASE-24139 allows the load balancer to run when one server has 0 regions and 
> another server has more than 1 region. This is a special case of a more 
> generic problem, where one server has far too few or far too many regions. 
> The StochasticLoadBalancer defaults may decide the cluster is "balanced 
> enough" according to {{hbase.master.balancer.stochastic.minCostNeedBalance}}, 
> even though one server may have a far higher or lower number of regions 
> compared to the rest of the cluster.
> One specific example we have seen is {{RegionMover}} moving regions back to 
> a restarted RegionServer while the {{StochasticLoadBalancer}} happens to be 
> running. The load balancer sees a newly restarted RegionServer with 0 
> regions and, after HBASE-24139, balances regions onto this server. 
> Simultaneously, {{RegionMover}} moves regions back. The end result is that 
> the newly restarted RegionServer has twice the load of any other server in 
> the cluster. Future iterations of the load balancer do nothing, as the 
> cluster cost does not exceed {{minCostNeedBalance}}.
> Another example: if the load balancer makes very slow progress on a 
> cluster, it may not move the average cluster load to a newly restarted 
> RegionServer in one iteration. After that first iteration, the balancer may 
> again not run because the cluster cost does not exceed 
> {{minCostNeedBalance}}.
> We propose a solution that reuses the {{slop}} concept from 
> {{SimpleLoadBalancer}}, extending the HBASE-24139 logic so that the 
> balancer runs as long as there is a "sloppy" server in the cluster.
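> A minimal sketch of the slop idea (names here are illustrative, not the 
> actual HBase API): a server is "sloppy" when its region count falls outside 
> average * (1 +/- slop), and the presence of any such server would trigger a 
> balancer run.

```java
// Hypothetical sketch of a slop-based trigger, reusing the SimpleLoadBalancer
// "slop" idea. Class and method names are illustrative, not HBase API.
public class SloppyServerCheck {
  /** Returns true if any server's region count is outside avg*(1 +/- slop). */
  public static boolean sloppyServerExists(int[] regionsPerServer, float slop) {
    double total = 0;
    for (int c : regionsPerServer) total += c;
    double avg = total / regionsPerServer.length;
    double floor = Math.floor(avg * (1 - slop));
    double ceiling = Math.ceil(avg * (1 + slop));
    for (int c : regionsPerServer) {
      if (c < floor || c > ceiling) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    // Region counts from the reproduction below: one server drained to 0.
    int[] afterReplace = {23, 0, 23, 22, 22, 22, 22, 23, 23, 23};
    System.out.println(sloppyServerExists(afterReplace, 0.2f)); // true
  }
}
```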
> +*Previous Description Notes Below, which are relevant, but as stated, were 
> already fixed by HBASE-24139*+
> In an EMR cluster, whenever I replace one of the nodes, the regions never 
> get rebalanced.
> The default {{minCostNeedBalance}} of 0.05 is too high.
> The region counts on the servers were: 21, 21, 20, 20, 20, 20, 21, 20, 20, 
> 20 = 203.
> Once a node (region server) was replaced with a new node (terminated, and 
> EMR recreated it), the region counts became: 23, 0, 23, 22, 22, 22, 22, 23, 
> 23, 23 = 203.
> From the hbase-master logs, I can see the WARN below, which indicates that 
> the default {{minCostNeedBalance}} does not hold for these scenarios.
> ##
> 2019-04-29 09:31:37,027 WARN  [ip-172-31-35-122.ec2.internal,16000,1556524892897_ChoreService_1] cleaner.CleanerChore: WALs outstanding under hdfs://ip-172-31-35-122.ec2.internal:8020/user/hbase/oldWALs
> 2019-04-29 09:31:42,920 INFO  [ip-172-31-35-122.ec2.internal,16000,1556524892897_ChoreService_1] balancer.StochasticLoadBalancer: Skipping load balancing because balanced cluster; total cost is 52.041826194833405, sum multiplier is 1102.0 min cost which need balance is 0.05
> ##
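> The numbers in the log above show why balancing is skipped: the balancer 
> only runs when totalCost / sumMultiplier exceeds {{minCostNeedBalance}} 
> (a minimal sketch; {{needsBalance}} here is illustrative, not the actual 
> HBase method signature).

```java
// Worked arithmetic from the log line above: 52.04 / 1102.0 ~= 0.0472,
// just under the 0.05 default, so balancing is skipped; lowering the
// threshold to 0.01 makes it run.
public class BalanceThreshold {
  public static boolean needsBalance(double totalCost, double sumMultiplier,
      double minCostNeedBalance) {
    return (totalCost / sumMultiplier) > minCostNeedBalance;
  }

  public static void main(String[] args) {
    double totalCost = 52.041826194833405;  // from the WARN above
    double sumMultiplier = 1102.0;          // from the WARN above
    System.out.println(needsBalance(totalCost, sumMultiplier, 0.05)); // false
    System.out.println(needsBalance(totalCost, sumMultiplier, 0.01)); // true
  }
}
```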
> To mitigate this, I had to lower {{minCostNeedBalance}} to 0.01f and 
> restart the RegionServers and HBase Master. After changing the value to 
> 0.01f, I could see the regions getting rebalanced.
> This leads me to the following questions, which I would like the HBase 
> experts to answer:
> 1) What are the factors that affect the total cost and sum multiplier? How 
> can we determine the right {{minCostNeedBalance}} value for a given 
> cluster?
> 2) How did HBase arrive at the default value of 0.05f? Is it an optimal 
> value? If yes, what is the recommended way to mitigate this scenario?
> Attached: Steps to reproduce
>  
> Note: the HBASE-17565 patch is already applied.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
