[ 
https://issues.apache.org/jira/browse/HBASE-26297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clara Xiong updated HBASE-26297:
--------------------------------
    Description: 
{code:java}
protected synchronized boolean areSomeRegionReplicasColocated(Cluster c) {
  regionReplicaHostCostFunction.init(c);
  if (regionReplicaHostCostFunction.cost() > 0) {
    return true;
  }
  regionReplicaRackCostFunction.init(c);
  if (regionReplicaRackCostFunction.cost() > 0) {
    return true;
  }
  return false;
}
{code}
The cost values are doubles, so comparing them against exactly 0 is subject to 
floating-point accuracy errors. The balancer could get stuck in constant runs 
and unnecessary moves.
{code:java}
2021-09-24 12:02:41,943 INFO 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Running 
balancer because at least one server hosts replicas of the same region.
2021-09-24 12:01:42,878 INFO 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Finished 
computing new moving plan. Computation took 2400001 ms to try 3048341 different 
iterations.  Found a solution that moves 81 regions; Going from a computed 
imbalance of 1.7429830473781883E-4 to a new imbalance of 1.6169961756947032E-4. 
{code}
We should use COST_EPSILON instead of 0 for the double comparison.
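
As a rough illustration of the proposed comparison (not the actual patch), the sketch below shows how a floating-point residue can be greater than 0 while still falling under a tolerance. The class name, the isPositive helper, and the 0.0001 value used for COST_EPSILON here are assumptions made for this example only; the real constant is whatever the balancer code defines.

```java
class EpsilonCompare {
  // Assumed tolerance for illustration; the real COST_EPSILON lives in the HBase balancer code.
  static final double COST_EPSILON = 0.0001;

  // Hypothetical helper: a cost counts as nonzero only when it exceeds the tolerance.
  static boolean isPositive(double cost) {
    return cost > COST_EPSILON;
  }

  public static void main(String[] args) {
    // Adding 0.1 ten times does not give exactly 1.0 in double arithmetic,
    // so the difference from 1.0 is a tiny nonzero residue.
    double sum = 0.0;
    for (int i = 0; i < 10; i++) {
      sum += 0.1;
    }
    double residue = Math.abs(sum - 1.0);

    System.out.println(residue > 0);          // naive check treats the residue as a real cost
    System.out.println(isPositive(residue));  // tolerance check ignores the residue
  }
}
```

With a check of this shape, a replica cost that is only rounding noise would no longer report colocated replicas, while a cost clearly above the tolerance still would.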

 

 

  was:
{code:java}
protected synchronized boolean areSomeRegionReplicasColocated(Cluster c) {
  regionReplicaHostCostFunction.init(c);
  if (regionReplicaHostCostFunction.cost() > 0) {
    return true;
  }
  regionReplicaRackCostFunction.init(c);
  if (regionReplicaRackCostFunction.cost() > 0) {
    return true;
  }

{code}
The values are in double data type. we often run into unnecessary runs.
{code:java}
2021-09-24 12:02:41,943 INFO 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Running 
balancer because at least one server hosts replicas of the same region.
2021-09-24 12:01:42,878 INFO 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Finished 
computing new moving plan. Computation took 2400001 ms to try 3048341 different 
iterations.  Found a solution that moves 81 regions; Going from a computed 
imbalance of 1.7429830473781883E-4 to a new imbalance of 1.6169961756947032E-4. 
{code}
 we should use COST_EPSILON instead of 0 for double comparison.

 

 


> Balancer run is improperly triggered by accuracy error of double comparison
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-26297
>                 URL: https://issues.apache.org/jira/browse/HBASE-26297
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>         Environment: {code:java}
>  {code}
>            Reporter: Clara Xiong
>            Assignee: Clara Xiong
>            Priority: Major
>
> {code:java}
> protected synchronized boolean areSomeRegionReplicasColocated(Cluster c) {
>   regionReplicaHostCostFunction.init(c);
>   if (regionReplicaHostCostFunction.cost() > 0) {
>     return true;
>   }
>   regionReplicaRackCostFunction.init(c);
>   if (regionReplicaRackCostFunction.cost() > 0) {
>     return true;
>   }
>   return false;
> }
> {code}
> The cost values are doubles, so comparing them against exactly 0 is subject to 
> floating-point accuracy errors. The balancer could get stuck in constant runs 
> and unnecessary moves.
> {code:java}
> 2021-09-24 12:02:41,943 INFO 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Running 
> balancer because at least one server hosts replicas of the same region.
> 2021-09-24 12:01:42,878 INFO 
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Finished 
> computing new moving plan. Computation took 2400001 ms to try 3048341 
> different iterations.  Found a solution that moves 81 regions; Going from a 
> computed imbalance of 1.7429830473781883E-4 to a new imbalance of 
> 1.6169961756947032E-4. 
> {code}
> We should use COST_EPSILON instead of 0 for the double comparison.
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
