[ 
https://issues.apache.org/jira/browse/HBASE-3586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003681#comment-13003681
 ] 

Ted Yu commented on HBASE-3586:
-------------------------------

Overall +1

For TestLoadBalancer.java, I would use a simpler form, because 
LoadBalancer.randomize() is called only once for each server in 
balanceCluster():
{code}
        List<HRegionInfo> copy = new ArrayList<HRegionInfo>(original);
        List<HRegionInfo> randomized = LoadBalancer.randomize(copy);
        // Fail if randomize() returned the regions in their original order.
        assertFalse(e.getKey().toString() + " has identical region list",
                original.equals(randomized));
{code}
I removed some logging which should have been done using LOG.info().
I also ran TestAdmin which passed.
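
As a self-contained illustration of that check, here is a minimal sketch. It does not use the HBase classes: randomize() below is a hypothetical stand-in built on Collections.shuffle(), and I am assuming it shuffles in place and returns the same list, which may not match what LoadBalancer.randomize() actually does.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class RandomizeCheckSketch {
    // Hypothetical stand-in for LoadBalancer.randomize(): shuffles the
    // given list in place and returns it (an assumption, not the HBase code).
    static <T> List<T> randomize(List<T> regions, Random rng) {
        Collections.shuffle(regions, rng);
        return regions;
    }

    // True if randomizing a copy of 'original' produced a different order.
    // The copy keeps 'original' untouched so the two can be compared.
    static <T> boolean orderChanged(List<T> original, Random rng) {
        List<T> copy = new ArrayList<T>(original);
        List<T> randomized = randomize(copy, rng);
        return !original.equals(randomized);
    }

    public static void main(String[] args) {
        List<String> regions = new ArrayList<String>();
        for (int i = 0; i < 20; i++) {
            regions.add("region-" + i);
        }
        System.out.println("order changed: " + orderChanged(regions, new Random()));
    }
}
```

One caveat with this kind of assertion: a uniform shuffle of n distinct regions returns the original order with probability 1/n!, so the check can never pass for a server with a single region and may fail occasionally for very short lists.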

This approach is smart: randomizing the region order should, statistically, 
avoid pitfalls such as always moving regions of the same table.

Let's observe its efficacy in production. 
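
To make the idea concrete, here is a rough sketch of picking regions from a shuffled copy instead of from the head of a sorted list. It uses plain strings in place of HRegionInfo, and pickRegionsToMove() and the ".META." prefix test are hypothetical names and simplifications of mine, not code from the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

class RandomizedSelectionSketch {
    // Picks up to numToOffload regions from a shuffled copy of 'regions'.
    // The input list is left unmodified.
    static List<String> pickRegionsToMove(List<String> regions,
                                          int numToOffload, Random rng) {
        List<String> shuffled = new ArrayList<String>(regions);
        Collections.shuffle(shuffled, rng);
        List<String> picked = new ArrayList<String>();
        for (String region : shuffled) {
            if (picked.size() >= numToOffload) break;
            // Simplified stand-in for the hri.isMetaRegion() check:
            // don't rebalance meta regions.
            if (region.startsWith(".META.")) continue;
            picked.add(region);
        }
        return picked;
    }

    public static void main(String[] args) {
        // A sorted list, as a freshly booted RS would have it:
        // table "a" regions first, then table "p" regions.
        List<String> regions = new ArrayList<String>();
        for (int i = 0; i < 5; i++) regions.add("a," + i);
        for (int i = 0; i < 5; i++) regions.add("p," + i);
        System.out.println(pickRegionsToMove(regions, 4, new Random()));
    }
}
```

With the sorted input, taking the first four entries would always yield regions of table "a"; after the shuffle, each non-meta region is equally likely to be picked.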


> Improve the selection of regions to balance
> -------------------------------------------
>
>                 Key: HBASE-3586
>                 URL: https://issues.apache.org/jira/browse/HBASE-3586
>             Project: HBase
>          Issue Type: Improvement
>    Affects Versions: 0.90.1
>            Reporter: Jean-Daniel Cryans
>            Assignee: Ted Yu
>            Priority: Critical
>             Fix For: 0.90.2
>
>         Attachments: 3586-randomize-v2.txt, 3586-randomize.txt, 
> HBASE-3586-by-region-age.patch, HBASE-3586-by-region-age.patch, 
> hbase-3586-table-creation.txt, hbase-3586-with-sort.txt
>
>
> Currently LoadBalancer goes through the list of regions per RS and grabs the 
> first few to balance. This is not bad, but that list is often naturally 
> sorted, since an RS that boots opens its regions sequentially in sorted 
> order (as they come from .META.), which means that we start balancing from 
> an almost sorted list of regions.
> We discovered this because one of our internal users created a new table 
> starting with letter "p" which has now grown to 100 regions in the last few 
> hours and they are all served by 1 region server. Looking at the master's 
> log, the balancer has moved as many regions from that region server but they 
> are all from the same table that starts with letter "a" (and the regions that 
> were moved all come one after the other).
> The part of the code that should be modified is:
> {code}
> for (HRegionInfo hri: regions) {
>   // Don't rebalance meta regions.
>   if (hri.isMetaRegion()) continue; 
>   regionsToMove.add(new RegionPlan(hri, serverInfo, null));
>   numTaken++;
>   if (numTaken >= numToOffload) break;
> }
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira
