[ https://issues.apache.org/jira/browse/HBASE-29203?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ray Mattingly resolved HBASE-29203.
-----------------------------------
    Fix Version/s: 4.0.0-alpha-1
                   2.7.0
                   3.0.0-beta-2
     Release Note: This adds a StoreFileTableSkewCostFunction which will let you balance every table by data size, which can be a better metric than region count. This will work well even if you are balancing by cluster (the default balancer setup). It comes with a default cost of 35, equivalent to StorefileSizeCost.
       Resolution: Fixed

> There should be a StorefileSize equivalent to the TableSkewCost
> ---------------------------------------------------------------
>
>                 Key: HBASE-29203
>                 URL: https://issues.apache.org/jira/browse/HBASE-29203
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>    Affects Versions: 2.7.0
>            Reporter: Ray Mattingly
>            Assignee: Ray Mattingly
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
>
>
> We don't want to rely on per-table balancing, but we do want tables to be
> balanced effectively based on their data size. This is because regions are
> variably sized, but 1 GB is always 1 GB, so, in a well-distributed schema, data
> size is a better metric for a region's load than just counting every region
> identically.
> Right now the TableSkewCostFunction is the balancer's mechanism for balancing
> each table somewhat independently within a cluster-wide plan, but it only
> looks at region counts. We could balance more effectively if we had a
> StoreFileTableSkewCost.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
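
The reasoning in the description (a table can look balanced by region count while being badly skewed by data size) can be illustrated with a small, self-contained sketch. This is not the HBase implementation: the class name, method names, and the normalization used here are all assumptions made for the example, and the sizes are arbitrary numbers in GB.

```java
// Illustrative sketch only: this is NOT the HBase StoreFileTableSkewCostFunction.
// All names and the normalization below are assumptions made for this example.
class TableSizeSkewSketch {

  /** Scaled skew in [0, 1]: 0 = evenly spread, 1 = everything on one server. */
  static double skew(double[] perServerLoad) {
    int n = perServerLoad.length;
    double total = 0.0;
    for (double v : perServerLoad) {
      total += v;
    }
    if (n < 2 || total == 0.0) {
      return 0.0;
    }
    double mean = total / n;
    double deviation = 0.0;
    for (double v : perServerLoad) {
      deviation += Math.abs(v - mean);
    }
    // Worst case: one server holds the whole table, the others hold nothing.
    double worst = 2.0 * (total - mean);
    return deviation / worst;
  }

  /** Number of regions of one table hosted on each server. */
  static double[] regionCounts(double[][] regionSizesByServer) {
    double[] counts = new double[regionSizesByServer.length];
    for (int s = 0; s < regionSizesByServer.length; s++) {
      counts[s] = regionSizesByServer[s].length;
    }
    return counts;
  }

  /** Total store file size of one table hosted on each server. */
  static double[] storeFileSizes(double[][] regionSizesByServer) {
    double[] sizes = new double[regionSizesByServer.length];
    for (int s = 0; s < regionSizesByServer.length; s++) {
      for (double regionSize : regionSizesByServer[s]) {
        sizes[s] += regionSize;
      }
    }
    return sizes;
  }

  public static void main(String[] args) {
    // One table, two servers, two regions each; region sizes in GB (made-up values).
    double[][] table = { {10.0, 8.0}, {1.0, 1.0} };
    System.out.printf("count skew = %.2f%n", skew(regionCounts(table)));   // 0.00
    System.out.printf("size skew  = %.2f%n", skew(storeFileSizes(table))); // 0.80
  }
}
```

In this example both servers host two regions of the table, so a count-based skew measure sees nothing to fix, while the size-based measure reports that one server carries 18 of the table's 20 GB.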