[ 
https://issues.apache.org/jira/browse/HBASE-26023?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Clara Xiong updated HBASE-26023:
--------------------------------
    Environment: 
{code:java}
 {code}

  was:
There is another bug in the original tableSkew cost function, in how it 
aggregates the cost per table:

If we have 10 regions, one per table, evenly distributed across 10 nodes, the 
cost is scaled to 1.0, because the summed per-table maxima equal the total 
region count.

The more tables we have, the closer the value gets to 1.0, so the cost function 
becomes useless.

All the balancer tests were set up with large numbers of tables and minimal 
regions per table. This artificially inflates the total cost and triggers 
balancer runs. With this fix on TableSkewFunction, we need to overhaul the 
tests too. We also need to add tests that reflect more diverse table 
distribution scenarios, such as large tables with large numbers of regions.
{code:java}
protected double cost() {
  // Upper bound: every region counted once.
  double max = cluster.numRegions;
  // Lower bound: regions spread evenly over all servers.
  double min = ((double) cluster.numRegions) / cluster.numServers;
  double value = 0;

  // Sum the per-table maxima; this is the aggregation described above.
  for (int i = 0; i < cluster.numMaxRegionsPerTable.length; i++) {
    value += cluster.numMaxRegionsPerTable[i];
  }
  LOG.info("min = {}, max = {}, cost= {}", min, max, value);
  return scale(min, max, value);
}
{code}
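
To make the arithmetic in the description concrete, here is a minimal standalone sketch of the same aggregation. It is not the HBase implementation: the class name TableSkewSketch is hypothetical, scale() is assumed to be a linear normalisation of value from [min, max] onto [0, 1] with clamping, and numMaxRegionsPerTable is assumed to hold, for each table, the largest number of that table's regions hosted on any single server.

```java
import java.util.Arrays;

class TableSkewSketch {
    // Assumption: linear map of value from [min, max] onto [0, 1], clamped.
    static double scale(double min, double max, double value) {
        if (max <= min || value <= min) {
            return 0.0;
        }
        return Math.max(0.0, Math.min(1.0, (value - min) / (max - min)));
    }

    // Same aggregation as the cost() quoted above, with the cluster
    // fields passed in as plain parameters.
    static double cost(int numRegions, int numServers, int[] numMaxRegionsPerTable) {
        double max = numRegions;
        double min = ((double) numRegions) / numServers;
        double value = 0;
        for (int perTableMax : numMaxRegionsPerTable) {
            value += perTableMax;
        }
        return scale(min, max, value);
    }

    public static void main(String[] args) {
        // 10 tables, one region each, one region per server:
        // min = 1, max = 10, value = 10, so the cost is 1.0.
        int[] tenTables = new int[10];
        Arrays.fill(tenTables, 1);
        System.out.println(cost(10, 10, tenTables));

        // 5 tables, two regions each, still one region per server:
        // value = 5, so the cost is (5 - 1) / (10 - 1) = 4/9.
        System.out.println(cost(10, 10, new int[] {1, 1, 1, 1, 1}));

        // 1 table with 10 regions, one region per server:
        // value = 1 = min, so the cost is 0.0.
        System.out.println(cost(10, 10, new int[] {1}));
    }
}
```

All three layouts place exactly one region on each server, yet under these assumptions the reported cost grows from 0.0 to 1.0 as the table count rises. This is the inflation that many-table test clusters would see.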


> Overhaul of test cluster set up for table skew
> ----------------------------------------------
>
>                 Key: HBASE-26023
>                 URL: https://issues.apache.org/jira/browse/HBASE-26023
>             Project: HBase
>          Issue Type: Sub-task
>          Components: Balancer, test
>         Environment: {code:java}
>  {code}
>            Reporter: Clara Xiong
>            Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)
