[ https://issues.apache.org/jira/browse/HBASE-23073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986759#comment-16986759 ]
Pierre Zemb commented on HBASE-23073:
-------------------------------------

Directly on GitHub? Doing :)

> Add an optional costFunction to balance regions according to a capacity rule
> ----------------------------------------------------------------------------
>
> Key: HBASE-23073
> URL: https://issues.apache.org/jira/browse/HBASE-23073
> Project: HBase
> Issue Type: New Feature
> Components: master
> Affects Versions: 3.0.0
> Reporter: Pierre Zemb
> Assignee: Pierre Zemb
> Priority: Minor
> Fix For: 3.0.0, 2.3.0
>
> Attachments: HBASE-23073.branch-1.0002.patch,
> HBASE-23073.branch-1.001.patch
>
>
> Based on the work in
> [HBASE-22618|https://issues.apache.org/jira/browse/HBASE-22618], users can
> now load custom costFunctions inside the main balancer used by HBase. As an
> example, we would like to add upstream an optional cost function called
> HeterogeneousRegionCountCostFunction that addresses our issue: how to
> balance regions according to the capacity of a RS, instead of using the
> RegionCountSkewCostFunction, which only tries to avoid skew.
> A rule file is loaded from HDFS before balancing. It contains one rule per
> line. A rule is composed of a regexp for the hostname and a limit. For
> example, we could have:
> * rs[0-9] 200
> * rs1[0-9] 50
> RegionServers whose hostname matches the first rule have a limit of 200;
> those matching the second rule have a limit of 50. If there's no match, a
> default limit is used.
> From the rules we get two pieces of information: the maximum number of
> regions for this cluster, and the limit for each server.
> HeterogeneousBalancer will try to balance regions according to each
> server's capacity.
> Let's take an example. Let's say that we have 20 RS:
> * 10 RS, named rs0 through rs9, loaded with 60 regions each, and each can
> handle 200 regions.
> * 10 RS, named rs10 through rs19, loaded with 60 regions each, and each
> can support 50 regions.
> Based on the following rules:
> rs[0-9] 200
> rs1[0-9] 50
> The second group is overloaded, whereas the first group has plenty of space.
> Moving a region from the second group to the first should provide a lower
> cost.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
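
For readers following the thread, here is a rough sketch of the rule matching and the 20-server example from the description. It is written in Python for brevity and is not the code in the patch. The names `limit_for`, `overload_cost` and `DEFAULT_LIMIT`, the default value of 100, and the cost formula itself (count of regions above each server's limit) are all assumptions made for illustration. It also assumes a rule must match the whole hostname, since otherwise rs10 would already match rs[0-9].

```python
import re

# Rules from the issue description: (hostname regexp, max regions per server).
RULES = [(re.compile(r"rs[0-9]"), 200), (re.compile(r"rs1[0-9]"), 50)]
DEFAULT_LIMIT = 100  # hypothetical default; the issue does not give a value


def limit_for(hostname, rules=RULES, default=DEFAULT_LIMIT):
    """Return the limit of the first rule whose regexp matches the whole hostname."""
    for pattern, limit in rules:
        if pattern.fullmatch(hostname):
            return limit
    return default


def overload_cost(load):
    """Hypothetical cost: total number of regions above each server's limit."""
    return sum(max(0, regions - limit_for(host)) for host, regions in load.items())


# The 20-server example: rs0..rs19, each holding 60 regions.
load = {f"rs{i}": 60 for i in range(20)}
before = overload_cost(load)

# Move one region out of the overloaded group (rs10) into the group with spare room (rs0).
load["rs10"] -= 1
load["rs0"] += 1
after = overload_cost(load)

print(before, after)
```

Under these assumptions the cost drops after the move, because rs10 is over its limit of 50 while rs0 is far below its limit of 200.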