[ https://issues.apache.org/jira/browse/HBASE-28513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ray Mattingly resolved HBASE-28513.
-----------------------------------
    Fix Version/s: 4.0.0-alpha-1
                   2.7.0
                   3.0.0-beta-2
       Resolution: Fixed

> The StochasticLoadBalancer should support discrete evaluations for replica
> distribution
> ---------------------------------------------------------------------------------------
>
>                 Key: HBASE-28513
>                 URL: https://issues.apache.org/jira/browse/HBASE-28513
>             Project: HBase
>          Issue Type: Improvement
>          Components: Balancer
>            Reporter: Ray Mattingly
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
>
>
> I have a larger write up available
> [here|https://docs.google.com/document/d/1jA8Ghs86v7b-53j5DcsdbPnOXxbHjewkIBFi1E4S1pY/edit?usp=sharing].
> Secondary replica balancing squashes all other cost considerations.
> Basically there are a few cost functions with relatively huge default
> multipliers. For example `PrimaryRegionCountSkewCostFunction` has a default
> multiplier of 100,000. Meanwhile things like StoreFileCostFunction have a
> multiplier of 5. Having any multiplier of 100k, while others are single
> digit, basically makes the latter category totally irrelevant from balancer
> considerations.
> I understand that it's critical to distribute a region's replicas across
> multiple hosts/racks, but I don't think we should do this at the expense of
> all other balancer considerations.
> For example, maybe we could have two types of balancer considerations: costs
> (as we do now), and conditionals (for the more discrete considerations, like
> ">1 replica of the same region should not exist on a single host"). This
> would allow us to prioritize replica distribution _and_ maintain
> consideration for things like storefile balance.
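
To make the arithmetic in the description concrete, here is a minimal sketch in Java. It is not the HBase implementation: the class BalancerSketch and the methods weightedCost and replicasIsolated are hypothetical names, the plain weighted sum is an assumption about how per-function costs are combined, and the multipliers 100,000 and 5 are simply the figures quoted in the issue. The first method shows how a large multiplier swamps a small one; the second shows what a discrete "conditional" check for replica placement might look like.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical illustration only; none of these names exist in HBase.
public class BalancerSketch {

    // Assumed combination rule: sum of multiplier * normalized cost,
    // where each cost is expected to lie in [0, 1].
    static double weightedCost(double[] multipliers, double[] costs) {
        double total = 0.0;
        for (int i = 0; i < multipliers.length; i++) {
            total += multipliers[i] * costs[i];
        }
        return total;
    }

    // A discrete "conditional": true only if no host carries more than
    // one replica of the same region. Each inner list holds the region
    // ids hosted on one server; replicas of a region share its id.
    static boolean replicasIsolated(List<List<String>> regionsByHost) {
        for (List<String> regions : regionsByHost) {
            Set<String> seen = new HashSet<>();
            for (String region : regions) {
                if (!seen.add(region)) {
                    return false; // second replica of this region on this host
                }
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Index 0: replica-related cost, index 1: storefile cost
        // (multipliers as quoted in the issue description).
        double[] multipliers = {100_000, 5};

        // Worst possible storefile skew, perfect replica placement.
        double worstStorefile = weightedCost(multipliers, new double[] {0.0, 1.0});
        // A 0.1% replica cost, perfect storefile balance.
        double slightReplica = weightedCost(multipliers, new double[] {0.001, 0.0});

        // The tiny replica cost still outweighs the worst storefile cost.
        System.out.println(worstStorefile < slightReplica);

        // The conditional is a yes/no answer and needs no multiplier.
        System.out.println(replicasIsolated(
            List.of(List.of("r1", "r2"), List.of("r1", "r3"))));
    }
}
```

Under these assumptions, the worst possible storefile imbalance still scores lower than a 0.1% replica cost, which is the squashing effect the description complains about. Treating replica placement as a pass/fail check would take it out of the weighted sum, so the remaining cost functions could be compared on similar scales.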