[ https://issues.apache.org/jira/browse/HBASE-28513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ray Mattingly resolved HBASE-28513.
-----------------------------------
Fix Version/s: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
Resolution: Fixed
> The StochasticLoadBalancer should support discrete evaluations for replica
> distribution
> ---------------------------------------------------------------------------------------
>
> Key: HBASE-28513
> URL: https://issues.apache.org/jira/browse/HBASE-28513
> Project: HBase
> Issue Type: Improvement
> Components: Balancer
> Reporter: Ray Mattingly
> Priority: Major
> Labels: pull-request-available
> Fix For: 4.0.0-alpha-1, 2.7.0, 3.0.0-beta-2
>
>
> I have a longer write-up available
> [here|https://docs.google.com/document/d/1jA8Ghs86v7b-53j5DcsdbPnOXxbHjewkIBFi1E4S1pY/edit?usp=sharing].
> Secondary replica balancing squashes all other cost considerations.
> A few cost functions have very large default multipliers. For example,
> `PrimaryRegionCountSkewCostFunction` has a default multiplier of 100,000,
> while `StoreFileCostFunction` has a multiplier of 5. When one multiplier is
> 100k and the others are single digit, the single-digit functions become
> effectively irrelevant to the balancer's decisions.
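
To make the dominance effect concrete, here is a minimal sketch of a weighted-sum cost, using the two multipliers quoted above. It is a simplified model, not the StochasticLoadBalancer code: the class and method names are hypothetical, and each per-function cost is assumed to be scaled to [0, 1].

```java
// Hypothetical, simplified model of a weighted-sum balancer cost.
// Not the HBase implementation; names and cost values are illustrative.
final class MultiplierDominance {

    // Sum of multiplier * cost, where each cost is assumed to lie in [0, 1].
    static double weightedCost(double[] multipliers, double[] costs) {
        double total = 0.0;
        for (int i = 0; i < multipliers.length; i++) {
            total += multipliers[i] * costs[i];
        }
        return total;
    }

    public static void main(String[] args) {
        // Index 0: the 100,000-multiplier function; index 1: the multiplier-5 function.
        double[] multipliers = {100_000.0, 5.0};

        // Current state: tiny skew on the heavy function, worst-case storefile cost.
        double[] before = {0.001, 1.0};
        // Candidate move: storefile cost fully fixed, heavy cost up by only 0.001.
        double[] after = {0.002, 0.0};

        boolean accepted = weightedCost(multipliers, after) < weightedCost(multipliers, before);
        System.out.println("move accepted: " + accepted); // prints "move accepted: false"
    }
}
```

In this toy model a move that completely fixes the low-multiplier cost is still rejected, because a 0.001 increase in the high-multiplier cost outweighs it roughly twenty to one.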
> I understand that it's critical to distribute a region's replicas across
> multiple hosts/racks, but I don't think we should do this at the expense of
> all other balancer considerations.
> For example, maybe we could have two types of balancer considerations: costs
> (as we do now), and conditionals (for the more discrete considerations, like
> ">1 replica of the same region should not exist on a single host"). This
> would allow us to prioritize replica distribution _and_ maintain
> consideration for things like storefile balance.
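
The costs-plus-conditionals idea above could be sketched roughly as follows. This only illustrates the proposal as described, not the change that was merged for this issue: every name here is hypothetical, and a cluster is modelled as a plain map from host name to the region ids it holds, with replicas of one region sharing an id.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: a discrete conditional is checked first,
// and the weighted cost is only consulted for moves that pass it.
final class ConditionalSketch {

    // True if any host holds more than one replica of the same region.
    static boolean violatesReplicaConditional(Map<String, List<String>> regionsByHost) {
        for (List<String> regions : regionsByHost.values()) {
            // A duplicate region id on one host shrinks the set below the list size.
            if (new HashSet<>(regions).size() != regions.size()) {
                return true;
            }
        }
        return false;
    }

    // Accept a candidate layout only if it satisfies the conditional
    // and also lowers the (separately computed) cost.
    static boolean acceptMove(Map<String, List<String>> candidate,
                              double currentCost, double candidateCost) {
        if (violatesReplicaConditional(candidate)) {
            return false;
        }
        return candidateCost < currentCost;
    }
}
```

Because replica placement is a pass/fail check here, it no longer needs an outsized multiplier, and the remaining cost functions can be weighted against one another on a comparable scale.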
--
This message was sent by Atlassian Jira
(v8.20.10#820010)