[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083350#comment-17083350
 ] 

Beata Sudi commented on HBASE-24139:
------------------------------------

You can find the PR here: [https://github.com/apache/hbase/pull/1511]

> Balancer should avoid leaving idle region servers
> -------------------------------------------------
>
>                 Key: HBASE-24139
>                 URL: https://issues.apache.org/jira/browse/HBASE-24139
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer, Operability
>            Reporter: Sean Busbey
>            Assignee: Beata Sudi
>            Priority: Critical
>              Labels: beginner
>
> After HBASE-15529 the StochasticLoadBalancer makes the decision to run based 
> on its internal cost functions rather than the simple region count skew of 
> BaseLoadBalancer.
> Given the default weights for those cost functions, the default minimum cost 
> to indicate a need to rebalance, and a density of ~90 regions per region 
> server, we are not very responsive to newly added region servers for 
> non-trivial cluster sizes:
> * For clusters ~10 nodes, the defaults think a single RS at 0 regions means 
> we need to balance
> * For clusters >20 nodes, the defaults will not consider a single RS at 0 
> regions to mean we need to balance. 2 RS at 0 will cause it to balance.
> * For clusters ~100 nodes, having 6 RS with no regions will still not meet 
> the threshold to cause a balance.
> Note that this is the decision to look at balancer plans at all. The 
> calculation is severely dominated by the region count skew (it has weight 500 
> and all other weights are ~105), so barring a very significant change in all 
> other cost functions this condition will persist indefinitely.
> Two possible approaches:
> * add a new cost function that's essentially "don't have RS with 0 regions" 
> that an operator can tune
> * add a short circuit condition for the {{needsBalance}} method that checks 
> for empty RS similar to the check we do for colocated region replicas
> For those currently hitting this, an easy workaround is to set 
> {{hbase.master.balancer.stochastic.minCostNeedBalance}} to {{0.01}}. This 
> will mean that a single RS having 0 regions will cause the balancer to run 
> for clusters of up to ~90 region servers. It's essentially the same as the 
> default slop of 0.01 used by the BaseLoadBalancer.
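
The threshold arithmetic in the description can be reproduced with a small, self-contained Java sketch. This is a simplified model, not the StochasticLoadBalancer code: the class and method names are hypothetical, the "~105" figure is assumed to be the combined weight of all other cost functions (each assumed to contribute zero cost), the skew normalization is my own reading of "region count skew", and 0.05 is assumed to be the default minimum cost at the time (the description does not state it). The last method illustrates the second proposed approach, a short circuit for empty region servers, again only as a guess at the shape such a check could take.

```java
/** Simplified, hypothetical model of the "do we need to balance at all?" decision. */
class BalanceThresholdSketch {
    // Weights quoted in the issue description.
    static final double REGION_COUNT_SKEW_WEIGHT = 500.0;
    // Assumption: ~105 is the combined weight of all other cost functions,
    // and each of them currently reports a cost of zero.
    static final double OTHER_WEIGHTS = 105.0;

    /** Builds a cluster with `idle` empty servers and the rest at `density` regions. */
    static int[] cluster(int servers, int idle, int density) {
        int[] regions = new int[servers];
        for (int i = idle; i < servers; i++) {
            regions[i] = density;
        }
        return regions;
    }

    /**
     * Sum of absolute deviations from the mean, scaled to [0, 1] between the
     * most even integer distribution and "all regions on one server".
     */
    static double regionCountSkew(int[] regionsPerServer) {
        int n = regionsPerServer.length;
        long total = 0;
        for (int r : regionsPerServer) {
            total += r;
        }
        if (n < 2 || total == 0) {
            return 0.0;
        }
        double mean = (double) total / n;
        double deviation = 0.0;
        for (int r : regionsPerServer) {
            deviation += Math.abs(r - mean);
        }
        // Worst case: one server holds every region.
        double max = (n - 1) * mean + (total - mean);
        // Best case: numHigh servers at ceil(mean), the rest at floor(mean).
        long numHigh = total % n;
        long numLow = n - numHigh;
        double min = numHigh * (Math.ceil(mean) - mean) + numLow * (mean - Math.floor(mean));
        return max > min ? (deviation - min) / (max - min) : 0.0;
    }

    /** Weighted total cost when only region count skew is non-zero. */
    static double weightedCost(int[] regionsPerServer) {
        return REGION_COUNT_SKEW_WEIGHT * regionCountSkew(regionsPerServer)
                / (REGION_COUNT_SKEW_WEIGHT + OTHER_WEIGHTS);
    }

    /** Cost-only decision: balance when the weighted cost reaches the threshold. */
    static boolean needsBalance(int[] regionsPerServer, double minCostNeedBalance) {
        return weightedCost(regionsPerServer) >= minCostNeedBalance;
    }

    /** Hypothetical short circuit: an empty server exists while another holds 2+ regions. */
    static boolean needsBalanceWithIdleCheck(int[] regionsPerServer, double minCostNeedBalance) {
        boolean hasIdle = false;
        boolean hasDonor = false;
        for (int r : regionsPerServer) {
            hasIdle |= (r == 0);
            hasDonor |= (r >= 2);
        }
        return (hasIdle && hasDonor) || needsBalance(regionsPerServer, minCostNeedBalance);
    }

    public static void main(String[] args) {
        double assumedDefault = 0.05; // assumed default minimum cost
        System.out.println(needsBalance(cluster(10, 1, 90), assumedDefault));
        System.out.println(needsBalance(cluster(21, 1, 90), assumedDefault));
        System.out.println(needsBalance(cluster(100, 6, 90), assumedDefault));
        System.out.println(needsBalance(cluster(50, 1, 90), 0.01));
        System.out.println(needsBalanceWithIdleCheck(cluster(100, 6, 90), assumedDefault));
    }
}
```

Under these assumptions the sketch agrees with the qualitative picture in the description: one empty server triggers a balance at 10 nodes but not at 21, two empty servers do at 21, and six empty servers at 100 nodes stay just under the threshold. The exact cutoffs in the real balancer depend on its actual cost functions and weights, so the numbers here should be read as illustrative only.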



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
