[ https://issues.apache.org/jira/browse/HBASE-26297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Clara Xiong updated HBASE-26297:
--------------------------------
    Description:
{code:java}
protected synchronized boolean areSomeRegionReplicasColocated(Cluster c) {
  regionReplicaHostCostFunction.init(c);
  if (regionReplicaHostCostFunction.cost() > 0) {
    return true;
  }
  regionReplicaRackCostFunction.init(c);
  if (regionReplicaRackCostFunction.cost() > 0) {
    return true;
  }
{code}
The values are in double data type. Balancer could get stuck in constant runs and unnecessary moves.
{code:java}
2021-09-24 12:02:41,943 INFO org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Running balancer because at least one server hosts replicas of the same region.
2021-09-24 12:01:42,878 INFO org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Finished computing new moving plan. Computation took 2400001 ms to try 3048341 different iterations. Found a solution that moves 81 regions; Going from a computed imbalance of 1.7429830473781883E-4 to a new imbalance of 1.6169961756947032E-4.
{code}
we should use COST_EPSILON instead of 0 for double comparison.

  was:
{code:java}
protected synchronized boolean areSomeRegionReplicasColocated(Cluster c) {
  regionReplicaHostCostFunction.init(c);
  if (regionReplicaHostCostFunction.cost() > 0) {
    return true;
  }
  regionReplicaRackCostFunction.init(c);
  if (regionReplicaRackCostFunction.cost() > 0) {
    return true;
  }
{code}
The values are in double data type. we often run into unnecessary runs.
{code:java}
2021-09-24 12:02:41,943 INFO org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Running balancer because at least one server hosts replicas of the same region.
2021-09-24 12:01:42,878 INFO org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Finished computing new moving plan. Computation took 2400001 ms to try 3048341 different iterations. Found a solution that moves 81 regions; Going from a computed imbalance of 1.7429830473781883E-4 to a new imbalance of 1.6169961756947032E-4.
{code}
we should use COST_EPSILON instead of 0 for double comparison.


> Balancer run is improperly triggered by accuracy error of double comparison
> ---------------------------------------------------------------------------
>
>                 Key: HBASE-26297
>                 URL: https://issues.apache.org/jira/browse/HBASE-26297
>             Project: HBase
>          Issue Type: Bug
>          Components: Balancer
>            Reporter: Clara Xiong
>            Assignee: Clara Xiong
>            Priority: Major
>
> {code:java}
> protected synchronized boolean areSomeRegionReplicasColocated(Cluster c) {
>   regionReplicaHostCostFunction.init(c);
>   if (regionReplicaHostCostFunction.cost() > 0) {
>     return true;
>   }
>   regionReplicaRackCostFunction.init(c);
>   if (regionReplicaRackCostFunction.cost() > 0) {
>     return true;
>   }
> {code}
> The values are in double data type. Balancer could get stuck in constant runs
> and unnecessary moves.
> {code:java}
> 2021-09-24 12:02:41,943 INFO
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Running
> balancer because at least one server hosts replicas of the same region.
> 2021-09-24 12:01:42,878 INFO
> org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer: Finished
> computing new moving plan. Computation took 2400001 ms to try 3048341
> different iterations. Found a solution that moves 81 regions; Going from a
> computed imbalance of 1.7429830473781883E-4 to a new imbalance of
> 1.6169961756947032E-4.
> {code}
> we should use COST_EPSILON instead of 0 for double comparison.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
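
To illustrate the proposed change, here is a minimal standalone sketch of why comparing a double cost against 0 can fire spuriously, and how a tolerance avoids it. It is not the HBase patch: the class name, the helper methods, and the value assigned to COST_EPSILON here are assumptions for illustration only, and the real constant in the balancer may differ.

```java
public class EpsilonCompareSketch {
    // Assumed tolerance for illustration; not necessarily the value HBase uses.
    static final double COST_EPSILON = 1e-4;

    // Mirrors the check quoted in the description: any positive residue counts.
    static boolean isPositiveStrict(double cost) {
        return cost > 0;
    }

    // Proposed style of check: ignore values within the tolerance of zero.
    static boolean isPositiveWithTolerance(double cost) {
        return cost > COST_EPSILON;
    }

    public static void main(String[] args) {
        // Mathematically zero, but binary floating point leaves a tiny positive residue.
        double residue = 0.1 + 0.2 - 0.3;

        System.out.println("residue = " + residue);
        System.out.println("strict check:    " + isPositiveStrict(residue));
        System.out.println("tolerance check: " + isPositiveWithTolerance(residue));
    }
}
```

With the strict check, a rounding residue alone is enough to report a nonzero cost; with the tolerance check it is treated as zero, while any cost above the tolerance is still reported.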