[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

2011-04-15 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020211#comment-13020211
 ] 

stack commented on HBASE-3609:
--

I am +1 on committing.  The doc is good.  Will wait to commit for a day or so 
in case others have an opinion on this patch.  If you get feedback from others on 
how this balance change works for them, include it in the issue, Ted.  Good 
stuff.

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, 
 hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

2011-04-14 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13020184#comment-13020184
 ] 

stack commented on HBASE-3609:
--

@Ted I'm game for committing this to branch if you add javadoc explaining your 
new balance algorithm; the rules you've implemented.  Boil down your blog and 
use that.  I'm uneasy about the lack of a unit test, but if it's too hard to do, I 
think it's OK getting the balancer changes into TRUNK now, before we start 
trying to harden 0.92.

Good work.

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, 
 hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

2011-04-13 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019560#comment-13019560
 ] 

Ted Yu commented on HBASE-3609:
---

From Stan Barton, who helps me experiment with my changes:

There is no easy
way to check how many regions are assigned to a particular RS, so I will
probably need to write some small parser to prove that.

I think we should backport HBASE-3704 (at least Regions by Region Server) to 
0.90.3 so that people can easily tell how (un)evenly the load is distributed.

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, 
 hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

2011-04-12 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13019204#comment-13019204
 ] 

stack commented on HBASE-3609:
--

bq. Empty server can be detected within balanceCluster(). But this detection 
has been performed by Master, hence the flag.

That's fair.  It's nice having the balance invocation method as simple as possible, 
though.

bq. The static regionId helps make each region Id unique. I actually utilized 
this fact to debug my code.

I missed that it was being used.

bq. Preliminary response from Stan Barton showed improvement over random 
selector.
I am waiting for further feedback from gaojinc...@huawei.com and Stan.

I think that if you get good feedback from others, that'll help getting this 
patch committed.

Good stuff Ted.

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-double-alternation.txt, 3609-empty-RS.txt, 
 hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

2011-04-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018438#comment-13018438
 ] 

Ted Yu commented on HBASE-3609:
---

From Stan Barton:
Apr 8th:
I can see that the regionserver that gets all the inserts pauses from
time to time - I checked the log and it is because (I assume) the
regions of other tables from this RS are re-assigned elsewhere. When I
started inserting, the table was empty; now it has ~20 regions, all
assigned to one RS, which kicks the scale-out property out of the
system. The insertion process is still running, if you are interested
in some other info.

Apr 11th:
I have downloaded the 0.90.2 candidate, but I regret to inform you
that it did not help to solve the issue. I am re-doing the tests and
again, all the newly created regions are being assigned to a single
RS. It is a real performance killer.

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-alternate.txt, 3609-empty-RS.txt, 
 hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

2011-04-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018725#comment-13018725
 ] 

stack commented on HBASE-3609:
--

I wonder why all regions of a table are assigned to one regionserver when we 
added randomization to the balancer in 0.90.2?  Didn't we? (i.e. HBASE-3586)

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-alternate.txt, 3609-empty-RS.txt, 
 hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

2011-04-11 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018728#comment-13018728
 ] 

Ted Yu commented on HBASE-3609:
---

More background on Stan's cluster:
There are at least 600 regions on each region server. When 30 new regions were 
created on the same region server, the random selector chose only 3 of the 30 
new regions for reassignment. The rest of the selected regions were inactive 
(old) regions. This is expected behavior because new and old regions are 
selected with equal probability.
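
To illustrate why a uniform random selector picks so few of the new regions, here is a minimal sketch of the expected count. The class and method names are hypothetical, and both the total of 630 regions (600 old plus 30 new) and the figure of 63 moved regions are assumptions chosen for illustration; neither comes from the patch or from Stan's logs.

```java
// Hypothetical illustration; not part of the HBASE-3609 patch.
class RandomSelectionEstimate {
  /**
   * Expected number of new regions picked when `moved` regions are drawn
   * uniformly at random, without replacement, from `total` regions of which
   * `fresh` are new (the mean of a hypergeometric distribution).
   */
  static double expectedNewRegions(int total, int fresh, int moved) {
    return (double) moved * fresh / total;
  }

  public static void main(String[] args) {
    // Assumed numbers: 600 old + 30 new regions on one server, 63 regions moved.
    System.out.println(expectedNewRegions(630, 30, 63));
  }
}
```

Under these assumptions only about one in every 21 moved regions is expected to be new, which is consistent with the new regions staying concentrated on one server.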

Basically we traded some optimization for the safety of not overloading a newly 
discovered region server.

My latest patch avoids the above behavior. At the same time, it also can deal 
with a region server which just joined the cluster and should be assigned both 
old and new regions.

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-alternate.txt, 3609-empty-RS.txt, 
 hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] [Commented] (HBASE-3609) Improve the selection of regions to balance; part 2

2011-04-11 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13018729#comment-13018729
 ] 

stack commented on HBASE-3609:
--

Looking at the patch, I see the removal of testRandomizer.  It would be nice if a 
new test of the new functionality took its place.

I like how you use regionid to figure region age.  I would suggest you add a 
little documentation that you depend on regionid being a timestamp (it would also 
clarify what you are up to here).

Do you have to do this:

{code}+  boolean emptyRegionServerPresent) {code}

So this algorithm balances exclusively by age?   No randomness?

Would be nice if you described the algo in a comment included in the patch.

How do we know this algo is better than what we have?

Patch looks good otherwise Ted.

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-alternate.txt, 3609-empty-RS.txt, 
 hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

2011-03-09 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004591#comment-13004591
 ] 

Ted Yu commented on HBASE-3609:
---

I was looking for a class which can store a moving average of requests/second.
Shall we add that policy in another JIRA?

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, 
 hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

2011-03-09 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004627#comment-13004627
 ] 

Ted Yu commented on HBASE-3609:
---

I am thinking about adding this enum:
{code}
  public static enum BalancerPolicy {
    BALANCE_REGION_COUNT,
    BALANCE_REQUESTS_PER_MINUTE,
  }
{code}
Shall we introduce hbase.balancer.policy in hbase-default.xml which reflects 
the above enum ?
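
A minimal sketch of how such a setting might be mapped onto the enum. This is an assumption-laden illustration: `java.util.Properties` stands in for the HBase configuration object, and the helper name `fromProperties`, the default of BALANCE_REGION_COUNT, and the case-insensitive parsing are all hypothetical choices, not something the patch defines.

```java
import java.util.Locale;
import java.util.Properties;

// Hypothetical sketch; the parsing helper and its default are assumptions.
class BalancerPolicyConfig {
  enum BalancerPolicy { BALANCE_REGION_COUNT, BALANCE_REQUESTS_PER_MINUTE }

  static final String POLICY_KEY = "hbase.balancer.policy";

  /** Reads the policy name from the given properties, defaulting to region count. */
  static BalancerPolicy fromProperties(Properties props) {
    String name = props.getProperty(POLICY_KEY, BalancerPolicy.BALANCE_REGION_COUNT.name());
    // valueOf throws IllegalArgumentException for an unknown policy name.
    return BalancerPolicy.valueOf(name.trim().toUpperCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    Properties props = new Properties();
    props.setProperty(POLICY_KEY, "balance_requests_per_minute");
    System.out.println(fromProperties(props));
  }
}
```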


 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, 
 hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

2011-03-09 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004911#comment-13004911
 ] 

Ted Yu commented on HBASE-3609:
---

Depending on the chosen policy, I plan to introduce an interface, 
RegionPlanHolder, which abstracts the concrete data structures holding 
RegionPlans:
{code}
  interface RegionPlanHolder {
    RegionPlanHolder create();
    int size();
    void add(RegionPlan rp);
    // removes the head element
    RegionPlan remove();
    // removes the tail element
    RegionPlan removeLast();
    // retrieves element at index
    RegionPlan get(int index);
  }
{code}
Then regionsToMove would be declared as RegionPlanHolder so that implementation 
details such as the following would be hidden in balanceCluster().
{code}
MinMaxPriorityQueue<RegionPlan> regionsToMove = 
    MinMaxPriorityQueue.orderedBy(rpComparator).create();
{code}
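
For illustration only, one way the proposed interface could be satisfied without Guava is a list kept sorted by a comparator. This is a sketch under stated assumptions: the `RegionPlan` class here is a hypothetical stand-in that models only a region id, and `SortedListRegionPlanHolder` is an invented name. It is not the data structure used in the patch, which relies on MinMaxPriorityQueue.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for HBase's RegionPlan; only the region id is modeled.
class RegionPlan {
  final long regionId;
  RegionPlan(long regionId) { this.regionId = regionId; }
}

// The interface as proposed in the comment above.
interface RegionPlanHolder {
  RegionPlanHolder create();
  int size();
  void add(RegionPlan rp);
  RegionPlan remove();      // removes the head element
  RegionPlan removeLast();  // removes the tail element
  RegionPlan get(int index);
}

// Assumed implementation: a list kept sorted by the supplied comparator.
class SortedListRegionPlanHolder implements RegionPlanHolder {
  private final Comparator<RegionPlan> order;
  private final List<RegionPlan> plans = new ArrayList<>();

  SortedListRegionPlanHolder(Comparator<RegionPlan> order) { this.order = order; }

  // Returns a new, empty holder that uses the same ordering.
  public RegionPlanHolder create() { return new SortedListRegionPlanHolder(order); }

  public int size() { return plans.size(); }

  public void add(RegionPlan rp) {
    int pos = Collections.binarySearch(plans, rp, order);
    // A negative result encodes the insertion point as -(point) - 1.
    plans.add(pos < 0 ? -pos - 1 : pos, rp);
  }

  public RegionPlan remove() { return plans.remove(0); }
  public RegionPlan removeLast() { return plans.remove(plans.size() - 1); }
  public RegionPlan get(int index) { return plans.get(index); }

  public static void main(String[] args) {
    RegionPlanHolder holder =
        new SortedListRegionPlanHolder((a, b) -> Long.compare(a.regionId, b.regionId));
    holder.add(new RegionPlan(30L));
    holder.add(new RegionPlan(10L));
    holder.add(new RegionPlan(20L));
    System.out.println(holder.remove().regionId + " " + holder.removeLast().regionId);
  }
}
```

Removing from the head of an ArrayList is linear in its size, so this sketch favors clarity over the logarithmic removal a min-max heap offers.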


 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, 
 hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

2011-03-09 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004966#comment-13004966
 ] 

Ted Yu commented on HBASE-3609:
---

Since request count (HBASE-3507) is contained in HRegion, not HRegionInfo, 
there is more work to be done so that a decaying moving average of request rate 
(in Ryan's terms) is available to LoadBalancer.
This can be done in other JIRAs.

I suggest 3609-alternate.txt be checked in first.


 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, 
 hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

2011-03-09 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004967#comment-13004967
 ] 

Ted Yu commented on HBASE-3609:
---

The goal of 3609-alternate.txt is to balance the 'ages' (reflected by the sum of 
region IDs) of regions across region servers.
Currently HServerLoad doesn't contain HRegionInfo information. This makes 
validating the above goal less straightforward.
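
As a rough sketch of what such a validation could compute, the helper below sums region ids per server. It assumes the region id is a creation timestamp, so that among servers with the same region count a larger sum means a younger set of regions. The class name, the map-based input, and the sample ids are hypothetical; the sketch does not touch HServerLoad or any HBase API.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; server names and region ids below are made up.
class RegionAgeSums {
  /** Sums the region ids hosted by each server, preserving the input order of servers. */
  static Map<String, Long> sumRegionIds(Map<String, List<Long>> regionIdsByServer) {
    Map<String, Long> sums = new LinkedHashMap<>();
    for (Map.Entry<String, List<Long>> e : regionIdsByServer.entrySet()) {
      long sum = 0L;
      for (long id : e.getValue()) {
        sum += id;
      }
      sums.put(e.getKey(), sum);
    }
    return sums;
  }

  public static void main(String[] args) {
    Map<String, List<Long>> byServer = new LinkedHashMap<>();
    byServer.put("rs1", Arrays.asList(100L, 400L)); // one old and one young region
    byServer.put("rs2", Arrays.asList(200L, 300L)); // two middle-aged regions
    System.out.println(sumRegionIds(byServer));
  }
}
```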

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, 
 hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

2011-03-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13003842#comment-13003842
 ] 

Ted Yu commented on HBASE-3609:
---

hbase-3609.txt implements the first part of the idea expressed in HBASE-3586 @ 
06/Mar/11 07:10.
The alternation between head and tail of the regions list happens after each 
addition to regionsToMove, so this should provide a good distribution of young 
and old regions.
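
The head/tail alternation can be sketched roughly as follows. This is an illustration, not the code in hbase-3609.txt: the class and method names are hypothetical, regions are reduced to their ids, the input list is assumed to be sorted oldest first, and starting from the head is an arbitrary choice.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical sketch of alternating selection; not taken from the patch.
class HeadTailAlternation {
  /**
   * Picks up to `count` ids from a list sorted oldest-first, switching
   * between the head (oldest) and the tail (youngest) after each pick.
   */
  static List<Long> pickAlternating(List<Long> sortedRegionIds, int count) {
    Deque<Long> remaining = new ArrayDeque<>(sortedRegionIds);
    List<Long> picked = new ArrayList<>();
    boolean fromHead = true; // assumed starting end
    while (picked.size() < count && !remaining.isEmpty()) {
      picked.add(fromHead ? remaining.pollFirst() : remaining.pollLast());
      fromHead = !fromHead;
    }
    return picked;
  }

  public static void main(String[] args) {
    System.out.println(pickAlternating(Arrays.asList(1L, 2L, 3L, 4L, 5L), 3));
  }
}
```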

The implementation of the second part needs more refinement because regionsToMove 
is manipulated in three loops in balanceCluster().

TestLoadBalancer passes.

Please review.

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

2011-03-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004082#comment-13004082
 ] 

Ted Yu commented on HBASE-3609:
---

hbase-3609-by-region-age.txt is the refined version.
This accounts for the out-of-band regions which were assigned to the server 
after some other region server crashed.
In that scenario, random selection may not work very well because the regions 
on an individual region server are not sorted.
Both TestLoadBalancer and TestAdmin pass.

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

2011-03-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004140#comment-13004140
 ] 

Ted Yu commented on HBASE-3609:
---

If someone can point me to an implementation of a double-ended priority queue in 
Java, I can use it in the three loops involving regionsToMove.

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

2011-03-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004141#comment-13004141
 ] 

Ted Yu commented on HBASE-3609:
---

I will try http://download.oracle.com/javase/6/docs/api/java/util/Deque.html

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

2011-03-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004153#comment-13004153
 ] 

Ted Yu commented on HBASE-3609:
---

Can we upgrade Guava so that I can use this?
http://guava-libraries.googlecode.com/svn/trunk/javadoc/com/google/common/collect/MinMaxPriorityQueue.html

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: hbase-3609-by-region-age.txt, hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.



[jira] Commented: (HBASE-3609) Improve the selection of regions to balance; part 2

2011-03-08 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13004254#comment-13004254
 ] 

Ted Yu commented on HBASE-3609:
---

This is close to what I described in HBASE-3586.
I used MinMaxPriorityQueue to alternately choose regions from head and tail of 
regionsToMove.

Both TestLoadBalancer and TestAdmin pass.

 Improve the selection of regions to balance; part 2
 ---

 Key: HBASE-3609
 URL: https://issues.apache.org/jira/browse/HBASE-3609
 Project: HBase
  Issue Type: Improvement
Reporter: stack
Assignee: Ted Yu
 Attachments: 3609-alternate.txt, hbase-3609-by-region-age.txt, 
 hbase-3609.txt


 See 'HBASE-3586  Improve the selection of regions to balance' for discussion 
 of algorithms that improve on current random assignment.
