Ray Mattingly created HBASE-28513:
-------------------------------------
Summary: Secondary replica balancing squashes all other cost
considerations
Key: HBASE-28513
URL: https://issues.apache.org/jira/browse/HBASE-28513
Project: HBase
Issue Type: Improvement
Reporter: Ray Mattingly
I have a longer write-up available
[here|https://git.hubteam.com/gist/rmattingly/8bc9cbe7c422db12ffc9cd1825069bd7].
A few cost functions have very large default multipliers. For example,
`PrimaryRegionCountSkewCostFunction` has a default multiplier of 100,000, while
functions like `StoreFileCostFunction` have a multiplier of 5. When one
multiplier is 100k and the others are single digits, the latter category
becomes effectively irrelevant to the balancer's decisions.
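To make the scale difference concrete, here is a minimal sketch in Python (not HBase code). It assumes each cost function reports a value in [0, 1] and that the balancer combines them as a multiplier-weighted, normalized sum. The names `weighted_cost`, `share_of_total`, and the two-entry `MULTIPLIERS` map are hypothetical and only mirror the 100,000 and 5 figures quoted above.

```python
# Toy model of a multiplier-weighted cost, NOT the HBase implementation.
# Assumption: every cost function returns a value in [0, 1].

MULTIPLIERS = {
    "replica": 100_000.0,  # the 100k multiplier quoted in this issue
    "storefile": 5.0,      # the single-digit multiplier quoted in this issue
}


def weighted_cost(costs, multipliers):
    """Combine per-function costs into one normalized, weighted total."""
    total_weight = sum(multipliers.values())
    return sum(multipliers[name] * costs[name] for name in costs) / total_weight


def share_of_total(name, multipliers):
    """Largest fraction of the total cost that one function can contribute."""
    return multipliers[name] / sum(multipliers.values())


# Worst possible storefile balance, perfect replica placement.
worst_storefile = weighted_cost({"replica": 0.0, "storefile": 1.0}, MULTIPLIERS)

# Perfect storefile balance, a 0.01% replica cost.
tiny_replica = weighted_cost({"replica": 0.0001, "storefile": 0.0}, MULTIPLIERS)
```

Under these assumptions, the storefile term can never contribute more than 5 / 100,005 of the total, so even its worst case is outweighed by a 0.01% change in the replica term.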
I understand that it's critical to distribute a region's replicas across
multiple hosts/racks, but I don't think we should do this at the expense of all
other balancer considerations.
For example, maybe we could have two types of balancer considerations: costs
(as we do now), and conditionals (for the more discrete considerations, like
">1 replica of the same region should not exist on a single host"). This would
allow us to prioritize replica distribution _and_ maintain consideration for
things like storefile balance.
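One way to picture the cost/conditional split is the sketch below, again in Python and purely illustrative. The conditional acts as a hard filter on candidate assignments, and only the survivors are ranked by cost. Every name here (`violates_replica_rule`, `storefile_cost`, `pick_best`) and the tuple layout `(region, replica, host)` are assumptions made for the example, not existing HBase APIs.

```python
from collections import Counter


def violates_replica_rule(assignment):
    """Conditional: True if any host holds more than one replica of a region.

    `assignment` is a list of (region, replica, host) tuples.
    """
    per_region_host = Counter((region, host) for region, _replica, host in assignment)
    return any(count > 1 for count in per_region_host.values())


def storefile_cost(assignment, sizes, hosts):
    """Toy cost in [0, 1]: spread between the most and least loaded host."""
    load = {host: 0 for host in hosts}
    for region, replica, host in assignment:
        load[host] += sizes[(region, replica)]
    total = sum(load.values())
    if total == 0:
        return 0.0
    return (max(load.values()) - min(load.values())) / total


def pick_best(candidates, sizes, hosts):
    """Drop candidates that fail the conditional, then take the cheapest."""
    allowed = [c for c in candidates if not violates_replica_rule(c)]
    if not allowed:
        return None
    return min(allowed, key=lambda c: storefile_cost(c, sizes, hosts))
```

With this shape, replica placement no longer needs a dominant multiplier: a plan that co-locates replicas is rejected outright, even if its storefile cost is the lowest, and the remaining plans compete on the ordinary costs.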