[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-08 Thread Sean Busbey (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17078260#comment-17078260
 ] 

Sean Busbey commented on HBASE-24139:
-

I am inclined to take the approach of adding a short circuit to 
{{needsBalance}} that says to look at plans if there are any region servers 
reporting 0 regions.

The downside is that if someone sets the "balancer per table" config to true, 
we'll run balancer plans every possible time (every ~5 minutes by default) for 
every table that doesn't have as many regions as there are region servers.

Given the other caveats against using "balancer per table", maybe we could 
expressly do the short circuit only when we are not balancing per table?
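
A minimal sketch of the short circuit described above, assuming region counts arrive as a plain array and the existing cost check has already produced a weighted cost. The class name, method names, and parameters here are hypothetical and are not taken from the actual StochasticLoadBalancer code.

```java
import java.util.Arrays;

/** Hypothetical sketch of the proposed short circuit; not the real HBase code. */
final class IdleServerShortCircuit {

  /**
   * Decide whether balancer plans should be looked at.
   *
   * @param regionsPerServer region count for each region server (assumed input shape)
   * @param perTable true when the "balancer per table" config is enabled
   * @param weightedCost cost the existing cost functions would report (assumed given)
   * @param minCostNeedBalance threshold the existing check compares against
   */
  static boolean needsBalance(int[] regionsPerServer, boolean perTable,
      double weightedCost, double minCostNeedBalance) {
    // Short circuit: any server reporting 0 regions means we look at plans,
    // but only when we are not balancing per table.
    if (!perTable && hasIdleServer(regionsPerServer)) {
      return true;
    }
    // Otherwise fall through to the cost-based decision.
    return weightedCost >= minCostNeedBalance;
  }

  /** True when at least one server hosts no regions. */
  static boolean hasIdleServer(int[] regionsPerServer) {
    return Arrays.stream(regionsPerServer).anyMatch(count -> count == 0);
  }
}
```

With per-table balancing enabled the idle-server check is skipped entirely, so a small table spread over a large cluster does not trigger planning on every balancer run.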

> Balancer should avoid leaving idle region servers
> -
>
> Key: HBASE-24139
> URL: https://issues.apache.org/jira/browse/HBASE-24139
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, Operability
>Reporter: Sean Busbey
>Assignee: Sean Busbey
>Priority: Critical
>
> After HBASE-15529 the StochasticLoadBalancer makes the decision to run based 
> on its internal cost functions rather than the simple region count skew of 
> BaseLoadBalancer.
> Given the default weights for those cost functions, the default minimum cost 
> to indicate a need to rebalance, and a regions per region server density of 
> ~90 we are not very responsive to adding additional region servers for 
> non-trivial cluster sizes:
> * For clusters ~10 nodes, the defaults think a single RS at 0 regions means 
> we need to balance
> * For clusters >20 nodes, the defaults will not consider a single RS at 0 
> regions to mean we need to balance. 2 RS at 0 will cause it to balance.
> * For clusters ~100 nodes, having 6 RS with no regions will still not meet 
> the threshold to cause a balance.
> Note that this is the decision to look at balancer plans at all. The 
> calculation is severely dominated by the region count skew (it has weight 500 
> and all other weights are ~105), so barring a very significant change in all 
> other cost functions this condition will persist indefinitely.
> Two possible approaches:
> * add a new cost function that's essentially "don't have RS with 0 regions" 
> that an operator can tune
> * add a short circuit condition for the {{needsBalance}} method that checks 
> for empty RS similar to the check we do for colocated region replicas
> For those currently hitting this, an easy workaround is to set 
> {{hbase.master.balancer.stochastic.minCostNeedBalance}} to {{0.01}}. This 
> will mean that a single RS having 0 regions will cause the balancer to run 
> for clusters of up to ~90 region servers. It's essentially the same as the 
> default slop of 0.01 used by the BaseLoadBalancer.
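
The thresholds in the description can be approximated with a back-of-envelope model. This is a sketch under several assumptions and is not HBase code: only the region count skew term is non-zero, the "~105" is read as the combined weight of all other cost functions, the non-idle servers are evenly loaded, and the normalized skew for k idle servers out of n is taken to be k / (n - 1). The default threshold of 0.05 used in the usage note is also an assumption, since the description does not state it.

```java
/** Back-of-envelope model of the numbers in the description; not HBase code. */
final class SkewThresholdModel {
  // Weight of the region count skew cost function, as stated in the description.
  static final double REGION_COUNT_SKEW_WEIGHT = 500.0;
  // Assumption: "~105" is the combined weight of all other cost functions.
  static final double OTHER_WEIGHTS_TOTAL = 105.0;

  /** Assumed normalized skew when idleServers of totalServers hold 0 regions. */
  static double normalizedSkew(int totalServers, int idleServers) {
    return (double) idleServers / (totalServers - 1);
  }

  /** Weighted cost when every other cost function reports 0. */
  static double weightedCost(int totalServers, int idleServers) {
    double skewShare =
        REGION_COUNT_SKEW_WEIGHT / (REGION_COUNT_SKEW_WEIGHT + OTHER_WEIGHTS_TOTAL);
    return skewShare * normalizedSkew(totalServers, idleServers);
  }

  /** Whether the modeled cost reaches the given minimum cost threshold. */
  static boolean wouldBalance(int totalServers, int idleServers, double minCostNeedBalance) {
    return weightedCost(totalServers, idleServers) >= minCostNeedBalance;
  }
}
```

Under these assumptions, `wouldBalance(10, 1, 0.05)` is true, `wouldBalance(21, 1, 0.05)` is false, and `wouldBalance(21, 2, 0.05)` is true, which matches the first two bullets. Results near the boundary (for example the ~100 node case) are sensitive to the approximate weights and should not be read as exact.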



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-08 Thread Beata Sudi (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17078427#comment-17078427
 ] 

Beata Sudi commented on HBASE-24139:


Hi! 

I'd like to try solving this.  

> Balancer should avoid leaving idle region servers
> -
>
> Key: HBASE-24139
> URL: https://issues.apache.org/jira/browse/HBASE-24139
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, Operability
>Reporter: Sean Busbey
>Priority: Critical
>  Labels: beginner


[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-08 Thread Sean Busbey (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17078495#comment-17078495
 ] 

Sean Busbey commented on HBASE-24139:
-

That is great to hear!

It looks like you are in the "contributors" group in our jira instance already, 
so you ought to be able to click the "Assign to me" link near where this jira 
shows who currently owns it. (usually in the upper right corner in a desktop 
view)

Please reach out if I should add more details about what needs to change.

> Balancer should avoid leaving idle region servers
> -
>
> Key: HBASE-24139
> URL: https://issues.apache.org/jira/browse/HBASE-24139
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, Operability
>Reporter: Sean Busbey
>Priority: Critical
>  Labels: beginner


[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-09 Thread Mate Szalay-Beko (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17079283#comment-17079283
 ] 

Mate Szalay-Beko commented on HBASE-24139:
--

> Given the other caveats against using "balancer per table" maybe we could 
> expressly only do the short circuit when we are not doing per-table?

What about changing the new rule to: "don't have RS with 0 regions if there 
is any RS with more than 1 region"?
This would also be compatible with "balancer per table".
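
The refined rule can be sketched as follows. The class and method names are hypothetical, and the array-of-counts input is an assumption made for illustration rather than the balancer's real data structure.

```java
/** Hypothetical sketch of the refined idle-server rule; not the real HBase code. */
final class IdleServerRule {

  /**
   * True only when some server has 0 regions while another server has more
   * than 1 region, i.e. when moving a region could actually fill an idle server.
   */
  static boolean idleServerCouldTakeWork(int[] regionsPerServer) {
    boolean anyIdle = false;
    boolean anyWithMoreThanOne = false;
    for (int count : regionsPerServer) {
      if (count == 0) {
        anyIdle = true;
      } else if (count > 1) {
        anyWithMoreThanOne = true;
      }
    }
    return anyIdle && anyWithMoreThanOne;
  }
}
```

For a per-table view of a 3-region table on 5 servers, such as `{1, 1, 1, 0, 0}`, the rule does not fire, because no server has a region to spare. For `{2, 1, 0}` it does.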



> Balancer should avoid leaving idle region servers
> -
>
> Key: HBASE-24139
> URL: https://issues.apache.org/jira/browse/HBASE-24139
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, Operability
>Reporter: Sean Busbey
>Assignee: Beata Sudi
>Priority: Critical
>  Labels: beginner


[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-09 Thread Mate Szalay-Beko (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17079290#comment-17079290
 ] 

Mate Szalay-Beko commented on HBASE-24139:
--

> I am inclined to take the approach of adding a short circuit to needsBalance

Is there any benefit for the user in fine-tuning the weight of this rule? If not, 
then I think the simpler approach (adding the short circuit) is preferable.

Maybe there is a case where the user wants to add 5 new RS to the cluster 
but doesn't want to rebalance 5 times. But I think in that case they should 
disable the load balancer before adding the new nodes, then re-enable it 
at the end.

> Balancer should avoid leaving idle region servers
> -
>
> Key: HBASE-24139
> URL: https://issues.apache.org/jira/browse/HBASE-24139
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, Operability
>Reporter: Sean Busbey
>Assignee: Beata Sudi
>Priority: Critical
>  Labels: beginner


[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-14 Thread Beata Sudi (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083340#comment-17083340
 ] 

Beata Sudi commented on HBASE-24139:


Hi!

[~busbey] I made an attempt at this, can you have a look at it? I'm still 
trying to figure out how to test it, but it would be nice to hear your opinion.

 

> Balancer should avoid leaving idle region servers
> -
>
> Key: HBASE-24139
> URL: https://issues.apache.org/jira/browse/HBASE-24139
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, Operability
>Reporter: Sean Busbey
>Assignee: Beata Sudi
>Priority: Critical
>  Labels: beginner


[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-14 Thread Sean Busbey (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083345#comment-17083345
 ] 

Sean Busbey commented on HBASE-24139:
-

{quote}
bq. Given the other caveats against using "balancer per table" maybe we could 
expressly only do the short circuit when we are not doing per-table?

What about changing the new rule to: "don't have RS with 0 regions if there 
is any RS with more than 1 region"?
This would also be compatible with "balancer per table".
{quote}

yeah that would work and isn't terribly complicated.

{quote}
bq. I am inclined to take the approach of adding a short circuit to needsBalance

is there any benefit for the user to fine-tune the weight of this rule? If not, 
then I think the simpler approach (adding short circuit) is preferable.

Maybe there is a case where the user wants to add 5 new RS to the cluster 
but doesn't want to rebalance 5 times. But I think in that case they should 
disable the load balancer before adding the new nodes, then re-enable it 
at the end.
{quote}

that's essentially the same line of reasoning I found myself following.

{quote}
Sean Busbey I made an attempt at this, can you have a look at it? I'm still 
trying to figure out how to test it, but it would be nice to hear your opinion.
{quote}

That's great! I'd be happy to take a look, can you post a link to the PR here?

> Balancer should avoid leaving idle region servers
> -
>
> Key: HBASE-24139
> URL: https://issues.apache.org/jira/browse/HBASE-24139
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, Operability
>Reporter: Sean Busbey
>Assignee: Beata Sudi
>Priority: Critical
>  Labels: beginner


[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-14 Thread Beata Sudi (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17083350#comment-17083350
 ] 

Beata Sudi commented on HBASE-24139:


You can find the PR here: [https://github.com/apache/hbase/pull/1511]

> Balancer should avoid leaving idle region servers
> -
>
> Key: HBASE-24139
> URL: https://issues.apache.org/jira/browse/HBASE-24139
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, Operability
>Reporter: Sean Busbey
>Assignee: Beata Sudi
>Priority: Critical
>  Labels: beginner


[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-22 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089550#comment-17089550
 ] 

Hudson commented on HBASE-24139:


Results for branch branch-1
[build #1287 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1287/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1287//General_Nightly_Build_Report/]


(x) {color:red}-1 jdk7 checks{color}
-- For more information [see jdk7 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1287//JDK7_Nightly_Build_Report/]


(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-1/1287//JDK8_Nightly_Build_Report_(Hadoop2)/]




(x) {color:red}-1 source release artifact{color}
-- See build output for details.


> Balancer should avoid leaving idle region servers
> -
>
> Key: HBASE-24139
> URL: https://issues.apache.org/jira/browse/HBASE-24139
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, Operability
>Reporter: Sean Busbey
>Assignee: Beata Sudi
>Priority: Critical
>  Labels: beginner
> Fix For: 3.0.0, 2.3.0, 1.7.0


[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-22 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089721#comment-17089721
 ] 

Hudson commented on HBASE-24139:


Results for branch master
[build #1707 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/1707/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1707/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1698/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1707/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/1707/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Balancer should avoid leaving idle region servers
> -
>
> Key: HBASE-24139
> URL: https://issues.apache.org/jira/browse/HBASE-24139
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, Operability
>Reporter: Sean Busbey
>Assignee: Beata Sudi
>Priority: Critical
>  Labels: beginner
> Fix For: 3.0.0, 2.3.0, 1.7.0


[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-22 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089817#comment-17089817
 ] 

Hudson commented on HBASE-24139:


Results for branch branch-2.3
[build #52 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/52/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/52/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/52/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/52/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- For more information [see jdk11 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.3/52/JDK11_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Balancer should avoid leaving idle region servers
> -
>
> Key: HBASE-24139
> URL: https://issues.apache.org/jira/browse/HBASE-24139
> Project: HBase
>  Issue Type: Improvement
>  Components: Balancer, Operability
>Reporter: Sean Busbey
>Assignee: Beata Sudi
>Priority: Critical
>  Labels: beginner
> Fix For: 3.0.0, 2.3.0, 1.7.0
>
>
> After HBASE-15529 the StochasticLoadBalancer makes the decision to run based 
> on its internal cost functions rather than the simple region count skew of 
> BaseLoadBalancer.
> Given the default weights for those cost functions, the default minimum cost 
> to indicate a need to rebalance, and a regions per region server density of 
> ~90 we are not very responsive to adding additional region servers for 
> non-trivial cluster sizes:
> * For clusters ~10 nodes, the defaults think a single RS at 0 regions means 
> we need to balance
> * For clusters >20 nodes, the defaults will not consider a single RS at 0 
> regions to mean we need to balance. 2 RS at 0 will cause it to balance.
> * For clusters ~100 nodes, having 6 RS with no regions will still not meet 
> the threshold to cause a balance.
> Note that this is the decision to look at balancer plans at all. The 
> calculation is severely dominated by the region count skew (it has weight 500 
> and all other weights are ~105), so barring a very significant change in all 
> other cost functions this condition will persist indefinitely.
> Two possible approaches:
> * add a new cost function that's essentially "don't have RS with 0 regions" 
> that an operator can tune
> * add a short circuit condition for the {{needsBalance}} method that checks 
> for empty RS similar to the check we do for colocated region replicas
> For those currently hitting this, an easy workaround is to set 
> {{hbase.master.balancer.stochastic.minCostNeedBalance}} to {{0.01}}. With 
> that setting, a single RS having 0 regions will cause the balancer to run 
> for clusters of up to ~90 region servers. It's essentially the same as the 
> default slop of 0.01 used by the BaseLoadBalancer.
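
The second approach in the description (a short circuit in {{needsBalance}} that checks for empty region servers) can be illustrated with a minimal, self-contained sketch. This is not the actual StochasticLoadBalancer code: the class name {{IdleServerCheck}}, the method signatures, and the map-based cluster representation are all hypothetical, and the cost comparison is a simplified assumption. The {{perTable}} guard reflects the suggestion made earlier in this thread to skip the short circuit when balancing per table.

```java
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical sketch of the "idle region server" short circuit.
 * None of these names come from HBase itself.
 */
final class IdleServerCheck {
  private IdleServerCheck() {}

  /** Returns true if at least one region server hosts zero regions. */
  static boolean hasIdleServer(Map<String, Integer> regionsPerServer) {
    for (int regionCount : regionsPerServer.values()) {
      if (regionCount == 0) {
        return true;
      }
    }
    return false;
  }

  /**
   * Decides whether balancer plans should be looked at at all.
   * Assumption: currentCost is an already-normalized total cost, and a run
   * is warranted once it reaches minCostNeedBalance.
   */
  static boolean needsBalance(Map<String, Integer> regionsPerServer,
      double currentCost, double minCostNeedBalance, boolean perTable) {
    // Short circuit: an idle server forces a look at plans, but only when
    // we are not balancing per table (small tables would otherwise trigger
    // a run on every balancer period).
    if (!perTable && hasIdleServer(regionsPerServer)) {
      return true;
    }
    return currentCost >= minCostNeedBalance;
  }

  public static void main(String[] args) {
    Map<String, Integer> cluster = new HashMap<>();
    cluster.put("rs1", 90);
    cluster.put("rs2", 90);
    cluster.put("rs3", 0); // newly added, still empty

    // Illustrative values only: the cost is below the threshold, so the
    // result depends entirely on the short circuit.
    System.out.println(needsBalance(cluster, 0.01, 0.05, false));
    System.out.println(needsBalance(cluster, 0.01, 0.05, true));
  }
}
```

With the illustrative values in {{main}}, the cluster-wide call reports that a balance is needed because of the empty server, while the per-table call falls through to the cost threshold and does not.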



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-22 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17089953#comment-17089953
 ] 

Viraj Jasani commented on HBASE-24139:
--

{quote}I am currently assuming we are only putting this in upcoming minor 
releases because it causes a change in operator expectations and we have a work 
around.
{quote}
Sure [~busbey], I agree. Initially I was thinking of adding this to 
branch-2.2, but then realized the nature of the change might not fit well in a 
maintenance release, so it is better to keep this to 2.3 and 1.7 only, apart 
from master.

[~bea0113] Could you please update the Release Note describing the change? You 
can go to Edit -> Release Note.

Thanks.



[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-22 Thread Hudson (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090236#comment-17090236
 ] 

Hudson commented on HBASE-24139:


Results for branch branch-2
[build #2627 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2627/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2627/General_20Nightly_20Build_20Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2627/JDK8_20Nightly_20Build_20Report_20_28Hadoop2_29/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2627/JDK8_20Nightly_20Build_20Report_20_28Hadoop3_29/]


(x) {color:red}-1 jdk11 hadoop3 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/2627//console].


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}




[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-23 Thread Beata Sudi (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090415#comment-17090415
 ] 

Beata Sudi commented on HBASE-24139:


[~vjasani] I've added a release note; can you please check it? It's my first, 
and I'm not sure whether I should add more information or whether this is good 
enough.

Thanks!



[jira] [Commented] (HBASE-24139) Balancer should avoid leaving idle region servers

2020-04-23 Thread Viraj Jasani (Jira)


[ 
https://issues.apache.org/jira/browse/HBASE-24139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17090820#comment-17090820
 ] 

Viraj Jasani commented on HBASE-24139:
--

Thanks [~bea0113]
