[ https://issues.apache.org/jira/browse/HBASE-17969?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15990255#comment-15990255 ]
Ted Yu commented on HBASE-17969:
--------------------------------

The test is in hbase-server/src/test/java/org/apache/hadoop/hbase/master/balancer/TestDefaultLoadBalancer.java, which doesn't apply to the master branch.
Can you modify the test to suit the master branch (make sure SimpleLoadBalancer is used)?

> Balance by table using SimpleLoadBalancer could end up imbalance
> ----------------------------------------------------------------
>
> Key: HBASE-17969
> URL: https://issues.apache.org/jira/browse/HBASE-17969
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 1.1.10
> Reporter: Allan Yang
> Assignee: Allan Yang
> Attachments: HBASE-17969-branch-1.patch, HBASE-17969-branch-1.v2.patch
>
>
> This really happens in our production env.
> Here is an example:
> Say we have three RS named r1, r2, r3. A table named table1 with 3 regions
> is distributed across these rs like this:
> r1 1
> r2 1
> r3 1
> Each rs has one region, which means table1 is balanced, so the balancer will
> not run.
> If the region on r3 splits, the distribution becomes:
> r1 1
> r2 1
> r3 2
> For table1, given the average, each rs should hold between min=1 and max=2
> regions. So it is still balanced, and the balancer will not run.
> Then a region on r3 splits again, and the distribution becomes:
> r1 1
> r2 1
> r3 3
> Given the average, each rs should hold between min=1 and max=2 regions, so
> the balancer will run.
> r1 and r2 already have min=1 regions, so the balancer won't do any
> operation on them.
> But r3 has 3 regions, which exceeds max=2, so the balancer will remove one
> region from r3 and choose one rs from r1, r2 to move it to.
> But r1 and r2 have the same load, so the balancer will always choose r1, since
> servername r1 < r2 (alphabetical order, sorted by ServerAndLoad's compareTo
> method). That is OK for table1 itself. But if every table in the cluster is in
> a similar situation to table1, then the load in the cluster will always be
> like r1 > r2 > r3.
> So, the solution here is: when every rs has reached min regions (min = total
> regions / servers) but some regions still need to move, shuffle the
> regionservers before moving them.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
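
For readers following the thread, the tie-breaking bias described in the issue, and the effect of the proposed shuffle, can be illustrated with a small standalone sketch. This is not HBase code and not the attached patch: the class TieBreakSketch and its methods pickTarget and simulate are hypothetical names, and plain string ordering stands in for what ServerAndLoad's compareTo does when two servers carry the same load. Each simulated table has one surplus region to place on either r1 or r2.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

// Hypothetical illustration only; none of these names exist in HBase.
public class TieBreakSketch {

    /** Picks the target for one surplus region among equally loaded candidates. */
    static String pickTarget(List<String> candidates, boolean shuffle, Random rng) {
        List<String> ordered = new ArrayList<>(candidates);
        // Assumption: with equal load, ordering falls back to the server name.
        Collections.sort(ordered);
        if (shuffle) {
            // Proposed idea: randomize the order of servers that are already at min.
            Collections.shuffle(ordered, rng);
        }
        return ordered.get(0);
    }

    /** Returns {regions received by r1, regions received by r2} over the given number of tables. */
    static int[] simulate(int tables, boolean shuffle, long seed) {
        Random rng = new Random(seed);
        List<String> candidates = Arrays.asList("r1", "r2");
        int[] received = new int[2];
        for (int t = 0; t < tables; t++) {
            String target = pickTarget(candidates, shuffle, rng);
            received[target.equals("r1") ? 0 : 1]++;
        }
        return received;
    }

    public static void main(String[] args) {
        int[] sortedOnly = simulate(1000, false, 42L);
        int[] shuffled = simulate(1000, true, 42L);
        System.out.println("sorted only: r1=" + sortedOnly[0] + " r2=" + sortedOnly[1]);
        System.out.println("shuffled:    r1=" + shuffled[0] + " r2=" + shuffled[1]);
    }
}
```

Without the shuffle, every surplus region lands on r1, which matches the skew reported above. With the shuffle, the surplus regions are spread across r1 and r2. How the real balancer should pick among tied servers is for the patch to decide; the sketch only shows why a fixed order accumulates load on one server.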