[jira] Updated: (HBASE-3488) Allow RowCounter to retrieve multiple versions of rows

2011-01-28 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3488:
--

Description: 
Currently RowCounter only retrieves latest version for each row.
Some applications would store multiple versions for the same row.

RowCounter should accept a new parameter for the number of versions to return.
Scan object would be configured with version parameter (for scan.maxVersions).
Then the following API should be called:
{code}
  public KeyValue[] raw() {
{code}


  was:
Currently RowCounter only retrieves latest version for each row.
Some applications would store multiple versions for the same row.

RowCounter should accept a new parameter for the number of versions to return.
Scan object would be configured with version parameter.
Then the following API should be called:
{code}
  public NavigableMap<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> getMap() {
{code}



> Allow RowCounter to retrieve multiple versions of rows
> --
>
> Key: HBASE-3488
> URL: https://issues.apache.org/jira/browse/HBASE-3488
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Affects Versions: 0.90.0
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> Currently RowCounter only retrieves latest version for each row.
> Some applications would store multiple versions for the same row.
> RowCounter should accept a new parameter for the number of versions to return.
> Scan object would be configured with version parameter (for scan.maxVersions).
> Then the following API should be called:
> {code}
>   public KeyValue[] raw() {
> {code}

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.



[jira] Commented: (HBASE-3488) Allow RowCounter to retrieve multiple versions of rows

2011-01-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988413#action_12988413
 ] 

Ted Yu commented on HBASE-3488:
---

This JIRA was filed as improvement. We expect different values to co-exist 
(through versions) for the same row key in our table.

In RowCounter, I see:
{code}
Scan scan = new Scan();
{code}
This constructor sets maxVersions to 1.

It is desirable to assign another value to maxVersions so that we know the 
correct number of rows in the table.

The description of FirstKeyOnlyFilter usage above holds.
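
To illustrate the difference a version limit makes, here is a minimal sketch in plain Java. It is a toy model, not HBase code: the table is an assumed in-memory map of row key to (timestamp to value), and `scan`, `countVersions` and `sampleTable` are hypothetical helpers invented for the illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model, not HBase code: row key -> (timestamp -> value).
class VersionedScanSketch {

    // Returns up to maxVersions newest values per row, mimicking a version-limited scan.
    static Map<String, List<String>> scan(
            Map<String, TreeMap<Long, String>> table, int maxVersions) {
        Map<String, List<String>> out = new TreeMap<>();
        for (Map.Entry<String, TreeMap<Long, String>> row : table.entrySet()) {
            List<String> kept = new ArrayList<>();
            // descendingMap() walks timestamps from newest to oldest.
            for (String value : row.getValue().descendingMap().values()) {
                if (kept.size() == maxVersions) {
                    break;
                }
                kept.add(value);
            }
            out.put(row.getKey(), kept);
        }
        return out;
    }

    // Counts every returned version, not just distinct row keys.
    static int countVersions(Map<String, List<String>> scanned) {
        int total = 0;
        for (List<String> versions : scanned.values()) {
            total += versions.size();
        }
        return total;
    }

    // Hypothetical sample data: row1 has three versions, row2 has one.
    static Map<String, TreeMap<Long, String>> sampleTable() {
        Map<String, TreeMap<Long, String>> table = new TreeMap<>();
        TreeMap<Long, String> r1 = new TreeMap<>();
        r1.put(1L, "a1");
        r1.put(2L, "a2");
        r1.put(3L, "a3");
        TreeMap<Long, String> r2 = new TreeMap<>();
        r2.put(5L, "b1");
        table.put("row1", r1);
        table.put("row2", r2);
        return table;
    }
}
```

With a limit of 1 the sample yields one value per row; raising the limit exposes the older versions of row1, which is the count this issue is after.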

> Allow RowCounter to retrieve multiple versions of rows
> --
>
> Key: HBASE-3488
> URL: https://issues.apache.org/jira/browse/HBASE-3488
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Affects Versions: 0.90.0
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> Currently RowCounter only retrieves latest version for each row.
> Some applications would store multiple versions for the same row.
> RowCounter should accept a new parameter for the number of versions to return.
> Scan object would be configured with version parameter.
> Then the following API should be called:
> {code}
>   public NavigableMap<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> getMap() {
> {code}




[jira] Commented: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988396#action_12988396
 ] 

Ted Yu commented on HBASE-3373:
---

@Matt:
The following code could be improved by randomizing the choice when the tail is 
empty, to avoid clustering at consistentHashRing.firstKey():
{code}
regionHash = tail.isEmpty() ? consistentHashRing.firstKey() : tail.firstKey();
{code}
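
A rough sketch of that idea, under stated assumptions: the ring is modelled as a `TreeMap<Long, String>` from hash position to server name, and `locate` is a hypothetical helper, not code from the attached test. Note that a random fallback makes wrapped lookups non-deterministic, so the caller would have to record the chosen server (or pass a `Random` seeded from the hash).

```java
import java.util.Random;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical ring: hash position -> server name. Not the attached test's code.
class RingLookupSketch {

    // Finds the server owning `hash`. When no ring position is >= hash, the
    // tail is empty; instead of always wrapping to firstKey(), pick a random
    // ring position so wrapped lookups do not pile onto one server.
    static String locate(TreeMap<Long, String> ring, long hash, Random rnd) {
        if (ring.isEmpty()) {
            throw new IllegalArgumentException("ring has no servers");
        }
        SortedMap<Long, String> tail = ring.tailMap(hash);
        if (!tail.isEmpty()) {
            return tail.get(tail.firstKey());
        }
        // Empty tail: choose uniformly among all ring positions.
        int skip = rnd.nextInt(ring.size());
        for (String server : ring.values()) {
            if (skip-- == 0) {
                return server;
            }
        }
        throw new IllegalStateException("unreachable: skip < ring.size()");
    }
}
```

Lookups that fall inside the ring behave exactly as before; only the wrap-around case changes.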


> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
> Attachments: HbaseBalancerTest2.java
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Commented: (HBASE-3488) Allow RowCounter to retrieve multiple versions of rows

2011-01-28 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988390#action_12988390
 ] 

Jonathan Gray commented on HBASE-3488:
--

So would the idea be to not actually count rows but to count either columns or 
versions of columns?  As I recall, most of the row counting stuff uses 
FirstKeyOnlyFilter and is optimized to count unique rows regardless of whether 
they have one version of one column or a million versions of a million columns.

Also, I don't recommend the {{Result.getMap()}} API.  It's a convenience method 
but it's not especially performant (it iterates all the keys, parses stuff, 
allocates new byte[]s, and builds up the map).  Instead you should just use 
{{Result.raw()}} and operate on the array of KeyValues returned.

> Allow RowCounter to retrieve multiple versions of rows
> --
>
> Key: HBASE-3488
> URL: https://issues.apache.org/jira/browse/HBASE-3488
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Affects Versions: 0.90.0
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> Currently RowCounter only retrieves latest version for each row.
> Some applications would store multiple versions for the same row.
> RowCounter should accept a new parameter for the number of versions to return.
> Scan object would be configured with version parameter.
> Then the following API should be called:
> {code}
>   public NavigableMap<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> getMap() {
> {code}




[jira] Commented: (HBASE-3488) Allow RowCounter to retrieve multiple versions of rows

2011-01-28 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988391#action_12988391
 ] 

Jonathan Gray commented on HBASE-3488:
--

And is this a bug as filed?

> Allow RowCounter to retrieve multiple versions of rows
> --
>
> Key: HBASE-3488
> URL: https://issues.apache.org/jira/browse/HBASE-3488
> Project: HBase
>  Issue Type: Bug
>  Components: util
>Affects Versions: 0.90.0
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> Currently RowCounter only retrieves latest version for each row.
> Some applications would store multiple versions for the same row.
> RowCounter should accept a new parameter for the number of versions to return.
> Scan object would be configured with version parameter.
> Then the following API should be called:
> {code}
>   public NavigableMap<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> getMap() {
> {code}




[jira] Commented: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988340#action_12988340
 ] 

Ted Yu commented on HBASE-3373:
---

We should sort regionsToMove by the creation time of regions. The rationale is 
that new regions tend to be the hot ones and should be round-robin assigned to 
underloaded servers.
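
As a sketch of the ordering step only, assuming each region exposes a creation timestamp: the `Region` class and its `createdAtMillis` field below are hypothetical stand-ins, not the balancer's real types.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for a region plan; field names are assumptions.
class RegionSortSketch {

    static final class Region {
        final String name;
        final long createdAtMillis;

        Region(String name, long createdAtMillis) {
            this.name = name;
            this.createdAtMillis = createdAtMillis;
        }
    }

    // Returns a copy ordered newest-first, so the regions most likely to be
    // hot are the first ones handed to underloaded servers.
    static List<Region> newestFirst(List<Region> regionsToMove) {
        List<Region> sorted = new ArrayList<>(regionsToMove);
        sorted.sort(Comparator.comparingLong((Region r) -> r.createdAtMillis).reversed());
        return sorted;
    }
}
```

The input list is left untouched, so the caller can still log or compare the original order.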

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
> Attachments: HbaseBalancerTest2.java
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Updated: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Matt Corgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Corgan updated HBASE-3373:
---

Attachment: HbaseBalancerTest2.java

removed dependency

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
> Attachments: HbaseBalancerTest2.java
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Updated: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Matt Corgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Corgan updated HBASE-3373:
---

Attachment: (was: HbaseBalancerTest.java)

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
> Attachments: HbaseBalancerTest2.java
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Created: (HBASE-3488) Allow RowCounter to retrieve multiple versions of rows

2011-01-28 Thread Ted Yu (JIRA)
Allow RowCounter to retrieve multiple versions of rows
--

 Key: HBASE-3488
 URL: https://issues.apache.org/jira/browse/HBASE-3488
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.90.0
Reporter: Ted Yu
 Fix For: 0.92.0


Currently RowCounter only retrieves latest version for each row.
Some applications would store multiple versions for the same row.

RowCounter should accept a new parameter for the number of versions to return.
Scan object would be configured with version parameter.
Then the following API should be called:
{code}
  public NavigableMap<byte[], NavigableMap<byte[], NavigableMap<Long, byte[]>>> getMap() {
{code}





[jira] Updated: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Matt Corgan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Corgan updated HBASE-3373:
---

Attachment: HbaseBalancerTest.java

Sample consistent hashing balancing

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
> Attachments: HbaseBalancerTest.java
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Resolved: (HBASE-3485) implement a Compressor based on LZF

2011-01-28 Thread Jeff Hammerbacher (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jeff Hammerbacher resolved HBASE-3485.
--

Resolution: Duplicate

Dupe of HBASE-2404

> implement a Compressor based on LZF
> ---
>
> Key: HBASE-3485
> URL: https://issues.apache.org/jira/browse/HBASE-3485
> Project: HBase
>  Issue Type: Bug
>Reporter: ryan rawson
>
> this library:
> https://github.com/ning/compress
> implements LZF in pure-java and has an appropriate license.  
> We could consider shipping with this support enabled and provide a ready to 
> use alternative to LZO.




[jira] Commented: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988329#action_12988329
 ] 

Matt Corgan commented on HBASE-3373:


I can't really post my client code since it's intertwined with a bunch of other 
stuff, but I extracted the important parts into a JUnit test that I attached to 
this issue.  We run Java (Tomcat), so it's fairly easy to talk directly to HBase 
and integrate a few features into our admin console: printing friendly record 
names rather than escaped bytes, triggering backups, moving regions, etc.  
I don't think it requires knowing the keyspace ahead of time, just that you hash 
into a known output range, a 63-bit long in my example.
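
One way to hash an arbitrary key into a 63-bit range, as a hedged sketch: MD5 is an assumed hash choice and `hash63` is a hypothetical helper, neither taken from the attachment. Any well-mixed hash would serve.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Sketch only: MD5 is an assumed hash choice, not taken from the attachment.
class Hash63Sketch {

    // Maps any key to a non-negative long, i.e. the range [0, 2^63 - 1].
    static long hash63(String key) {
        try {
            MessageDigest md5 = MessageDigest.getInstance("MD5");
            byte[] digest = md5.digest(key.getBytes(StandardCharsets.UTF_8));
            // Take the first 8 digest bytes and clear the sign bit.
            return ByteBuffer.wrap(digest).getLong() & Long.MAX_VALUE;
        } catch (NoSuchAlgorithmException e) {
            // Every Java platform is required to provide MD5.
            throw new IllegalStateException(e);
        }
    }
}
```

Because the output range is fixed regardless of the keys, servers and regions can be placed on the same ring without knowing the keyspace in advance.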

I think the consistent hashing scheme may be a good out-of-the-box methodology. 
 Even with something smarter, I'd worry about the underlying algorithms getting 
off course and starting a death spiral as bad outputs are fed back in creating 
even worse outputs.  Something like consistent hashing could be a good beacon 
to always be steering towards so things don't get too far off course.

I have about 20 tables with many different access patterns and I can't envision 
an algorithm that balances them truly well.  Everything could be going fine 
until I kick off a MR job that randomly digs up 100 very cold regions and find 
that they're all on the same server.

I'm thinking of a system where each region is either at home (its consistent 
hash destination) or visiting another server because the balancer decided its 
home was too hot.  Each regionserver could identify its hotter regions, and 
the balancer could move these around in an effort to smooth out the load.  In 
the meantime, colder regions would stay well distributed based on how good the 
hashing mechanism is.  If a regionserver cools down, the master brings home 
its vacationing regions first, and if it's still cool, then it borrows someone 
else's hotter home regions.  Without an underlying scheme, I can envision 
things getting extremely chaotic, especially with regard to cold regions of a 
single table getting bundled up since they're being overlooked.  With this 
method, you're never too far from safely hitting the reset button.

...

Regarding your comment about moving the top or bottom child off the parent 
server after a split, I tend to prefer moving the bottom one.  With time series 
data it will keep writing to the bottom child, so if you don't move the bottom 
child then a single server will end up doing the appending forever.  I prefer 
to rotate the server that's doing the work: even though it's not quite as 
efficient and may cause a longer split pause, it makes for a more balanced 
cluster.

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Resolved: (HBASE-3305) Allow round-robin distribution for table created with multiple regions

2011-01-28 Thread Jonathan Gray (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Gray resolved HBASE-3305.
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]

Committed to branch and trunk.  Thanks Ted, sorry about the long delay!

> Allow round-robin distribution for table created with multiple regions
> --
>
> Key: HBASE-3305
> URL: https://issues.apache.org/jira/browse/HBASE-3305
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.90.1, 0.92.0
>
> Attachments: hbase-3305-array.patch, 
> hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, 
> hbase-3305.patch
>
>
> We can distribute the initial regions created for a new table in round-robin 
> fashion.




[jira] Updated: (HBASE-3446) ProcessServerShutdown fails if META moves, orphaning lots of regions

2011-01-28 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-3446:
-

Attachment: 3446-v9.txt

Improved retries exception reporting.

> ProcessServerShutdown fails if META moves, orphaning lots of regions
> 
>
> Key: HBASE-3446
> URL: https://issues.apache.org/jira/browse/HBASE-3446
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.0
>Reporter: Todd Lipcon
>Assignee: stack
>Priority: Blocker
> Fix For: 0.90.1
>
> Attachments: 3446-v2.txt, 3446-v3.txt, 3446-v4.txt, 3446-v7.txt, 
> 3446-v9.txt, 3446.txt
>
>
> I ran a rolling restart on a 5 node cluster with lots of regions, and 
> afterwards had LOTS of regions left orphaned. The issue appears to be that 
> ProcessServerShutdown failed because the server hosting META was restarted 
> around the same time as another server was being processed




[jira] Commented: (HBASE-3455) Heap fragmentation in region server

2011-01-28 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988295#action_12988295
 ] 

Jonathan Gray commented on HBASE-3455:
--

+1

> Heap fragmentation in region server
> ---
>
> Key: HBASE-3455
> URL: https://issues.apache.org/jira/browse/HBASE-3455
> Project: HBase
>  Issue Type: Brainstorming
>  Components: performance, regionserver
>Affects Versions: 0.90.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.90.1
>
> Attachments: collapse-arrays.patch, HBasefragmentation.pdf, 
> icv-frag.png, mslab-1.txt, mslab-2.txt, mslab-3.txt, parse-fls-statistics.py, 
> with-kvallocs.png
>
>
> Stop-the-world GC pauses have long been a problem in HBase. "Concurrent mode 
> failures" can usually be tuned around by setting the initiating occupancy 
> fraction low, but eventually the heap becomes fragmented and a promotion 
> failure occurs.
> This JIRA is to do research/experiments about the heap fragmentation issue 
> and possible solutions.




[jira] Commented: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988294#action_12988294
 ] 

Jonathan Gray commented on HBASE-3373:
--

Round-robin assignment at table creation is fine.  Bypassing the load balancer 
and doing your own thing is fine.  Adding intelligence into the balancer to get 
good balance of load is great.

I'm -1 on adding these kinds of specialized hooks into HBase proper.  They 
should either be an external component (seems that they can be) or we should 
make the balancer pluggable and you could provide alternative/configurable 
balancer implementations.

Assigning regions from overloaded servers to underloaded servers in a 
round-robin way does make sense.  I would be happy to take a contribution to do 
that.  I'm not sure there's a very strong correlation between that and the 
splitting up of daughter regions.  It certainly could be the case, but the 
selection of which regions to move off an overloaded server is rather dumb, so 
there are no guarantees that recently split regions get reassigned.

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Commented: (HBASE-3455) Heap fragmentation in region server

2011-01-28 Thread Todd Lipcon (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988292#action_12988292
 ] 

Todd Lipcon commented on HBASE-3455:


Yea.. how about we put in the on/off switch in hbase-default, but leave the 
size switches undocumented except for in the source? (I have no idea what the 
best values are for those... could be very dependent on your JVM settings even)

Ted also reminded me that some unit tests are failing (heapsize tests for Store 
for example), so we should run those before committing and fix up - I don't 
think I got them all right yet.

> Heap fragmentation in region server
> ---
>
> Key: HBASE-3455
> URL: https://issues.apache.org/jira/browse/HBASE-3455
> Project: HBase
>  Issue Type: Brainstorming
>  Components: performance, regionserver
>Affects Versions: 0.90.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.90.1
>
> Attachments: collapse-arrays.patch, HBasefragmentation.pdf, 
> icv-frag.png, mslab-1.txt, mslab-2.txt, mslab-3.txt, parse-fls-statistics.py, 
> with-kvallocs.png
>
>
> Stop-the-world GC pauses have long been a problem in HBase. "Concurrent mode 
> failures" can usually be tuned around by setting the initiating occupancy 
> fraction low, but eventually the heap becomes fragmented and a promotion 
> failure occurs.
> This JIRA is to do research/experiments about the heap fragmentation issue 
> and possible solutions.




[jira] Commented: (HBASE-3455) Heap fragmentation in region server

2011-01-28 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988291#action_12988291
 ] 

Jonathan Gray commented on HBASE-3455:
--

+1 on keeping exotic config out of hbase-default, though I'm not sure just the 
turning on/off of this is that exotic.  Definitely need some 
documentation/explanation somewhere, though.

> Heap fragmentation in region server
> ---
>
> Key: HBASE-3455
> URL: https://issues.apache.org/jira/browse/HBASE-3455
> Project: HBase
>  Issue Type: Brainstorming
>  Components: performance, regionserver
>Affects Versions: 0.90.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.90.1
>
> Attachments: collapse-arrays.patch, HBasefragmentation.pdf, 
> icv-frag.png, mslab-1.txt, mslab-2.txt, mslab-3.txt, parse-fls-statistics.py, 
> with-kvallocs.png
>
>
> Stop-the-world GC pauses have long been a problem in HBase. "Concurrent mode 
> failures" can usually be tuned around by setting the initiating occupancy 
> fraction low, but eventually the heap becomes fragmented and a promotion 
> failure occurs.
> This JIRA is to do research/experiments about the heap fragmentation issue 
> and possible solutions.




[jira] Commented: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988290#action_12988290
 ] 

Ted Yu commented on HBASE-3373:
---

Hive, for instance, may create new (intermediate) tables for its map/reduce 
jobs.
The add-on I propose would be beneficial for that scenario.

Also, in LoadBalancer.balanceCluster(), line 210:
{code}
  while (numTaken < numToTake && regionidx < regionsToMove.size()) {
    regionsToMove.get(regionidx).setDestination(server.getKey());
    numTaken++;
    regionidx++;
  }
{code}
The above code would offload regions from the most loaded server to the most 
underloaded server.
It is more desirable to round-robin the regions from loaded server(s) to 
underloaded server(s) so that the new daughter regions don't stay on the same 
server.
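
A simplified model of the round-robin alternative, offered as a sketch rather than a patch: regions and servers are plain strings, and `assign` with its per-server quota map is a hypothetical helper, not part of LoadBalancer.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model; server names and quotas are hypothetical inputs.
class RoundRobinOffloadSketch {

    // Deals regions out one at a time across underloaded servers, skipping any
    // server that has reached its quota, so neighbouring regions in the list
    // (for example two daughters of one split) land on different servers.
    static Map<String, List<String>> assign(
            List<String> regionsToMove, Map<String, Integer> numToTakeByServer) {
        Map<String, List<String>> plan = new LinkedHashMap<>();
        for (String server : numToTakeByServer.keySet()) {
            plan.put(server, new ArrayList<>());
        }
        List<String> servers = new ArrayList<>(numToTakeByServer.keySet());
        int regionIdx = 0;
        boolean progressed = true;
        // Stop when regions run out or when a full pass places nothing.
        while (regionIdx < regionsToMove.size() && progressed) {
            progressed = false;
            for (String server : servers) {
                if (regionIdx >= regionsToMove.size()) {
                    break;
                }
                List<String> taken = plan.get(server);
                if (taken.size() < numToTakeByServer.get(server)) {
                    taken.add(regionsToMove.get(regionIdx++));
                    progressed = true;
                }
            }
        }
        return plan;
    }
}
```

Compared with the loop above, which fills one server before moving to the next, this interleaves destinations; regions left over once every quota is full simply stay unassigned in the returned plan.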

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Commented: (HBASE-3455) Heap fragmentation in region server

2011-01-28 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988289#action_12988289
 ] 

Jonathan Gray commented on HBASE-3455:
--

I'm cool with 0.90.1 because it would be good to get this out there for people 
to experiment with sooner than later (and it's off by default, which I think it 
needs to be until we do more experimenting with more workloads in more 
environments).  Seems like there's a lot of configuration params that could go 
into this too that may need tweaking in general and in different setups?

> Heap fragmentation in region server
> ---
>
> Key: HBASE-3455
> URL: https://issues.apache.org/jira/browse/HBASE-3455
> Project: HBase
>  Issue Type: Brainstorming
>  Components: performance, regionserver
>Affects Versions: 0.90.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.90.1
>
> Attachments: collapse-arrays.patch, HBasefragmentation.pdf, 
> icv-frag.png, mslab-1.txt, mslab-2.txt, mslab-3.txt, parse-fls-statistics.py, 
> with-kvallocs.png
>
>
> Stop-the-world GC pauses have long been a problem in HBase. "Concurrent mode 
> failures" can usually be tuned around by setting the initiating occupancy 
> fraction low, but eventually the heap becomes fragmented and a promotion 
> failure occurs.
> This JIRA is to do research/experiments about the heap fragmentation issue 
> and possible solutions.




[jira] Commented: (HBASE-3455) Heap fragmentation in region server

2011-01-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988288#action_12988288
 ] 

stack commented on HBASE-3455:
--

+1 on commit and on 0.90.1 (fix the license so it's 2011 on commit; I don't know 
about exposing such exotic configs in hbase-default.xml but will not object to 
their presence).  This is great, Todd.  Want to open a new issue to figure out 
the ICV issue?

> Heap fragmentation in region server
> ---
>
> Key: HBASE-3455
> URL: https://issues.apache.org/jira/browse/HBASE-3455
> Project: HBase
>  Issue Type: Brainstorming
>  Components: performance, regionserver
>Affects Versions: 0.90.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.90.1
>
> Attachments: collapse-arrays.patch, HBasefragmentation.pdf, 
> icv-frag.png, mslab-1.txt, mslab-2.txt, mslab-3.txt, parse-fls-statistics.py, 
> with-kvallocs.png
>
>
> Stop-the-world GC pauses have long been a problem in HBase. "Concurrent mode 
> failures" can usually be tuned around by setting the initiating occupancy 
> fraction low, but eventually the heap becomes fragmented and a promotion 
> failure occurs.
> This JIRA is to do research/experiments about the heap fragmentation issue 
> and possible solutions.




[jira] Updated: (HBASE-3455) Heap fragmentation in region server

2011-01-28 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HBASE-3455:
---

Fix Version/s: 0.90.1
 Assignee: Todd Lipcon
Affects Version/s: 0.90.1
   Status: Patch Available  (was: Open)

Marking for 0.90.1 since the new feature is off-by-default. Happy to punt to 
0.92 if others disagree.

> Heap fragmentation in region server
> ---
>
> Key: HBASE-3455
> URL: https://issues.apache.org/jira/browse/HBASE-3455
> Project: HBase
>  Issue Type: Brainstorming
>  Components: performance, regionserver
>Affects Versions: 0.90.1
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
>Priority: Critical
> Fix For: 0.90.1
>
> Attachments: collapse-arrays.patch, HBasefragmentation.pdf, 
> icv-frag.png, mslab-1.txt, mslab-2.txt, mslab-3.txt, parse-fls-statistics.py, 
> with-kvallocs.png
>
>
> Stop-the-world GC pauses have long been a problem in HBase. "Concurrent mode 
> failures" can usually be tuned around by setting the initiating occupancy 
> fraction low, but eventually the heap becomes fragmented and a promotion 
> failure occurs.
> This JIRA is to do research/experiments about the heap fragmentation issue 
> and possible solutions.




[jira] Updated: (HBASE-3455) Heap fragmentation in region server

2011-01-28 Thread Todd Lipcon (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3455?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Todd Lipcon updated HBASE-3455:
---

Attachment: mslab-3.txt

Oops, missed one test change before.

This patch should be good to go. I tested under heavy load (95% write, 5% 
increment, large working set to trigger block cache churn) on a cluster for 
about 16 hours, no full GCs at all in 8G heap with CMS.

But it's still off by default because of the worrisome theoretical 
possibilities with upsert.

> Heap fragmentation in region server
> ---
>
> Key: HBASE-3455
> URL: https://issues.apache.org/jira/browse/HBASE-3455
> Project: HBase
>  Issue Type: Brainstorming
>  Components: performance, regionserver
>Reporter: Todd Lipcon
>Priority: Critical
> Attachments: collapse-arrays.patch, HBasefragmentation.pdf, 
> icv-frag.png, mslab-1.txt, mslab-2.txt, mslab-3.txt, parse-fls-statistics.py, 
> with-kvallocs.png
>
>
> Stop-the-world GC pauses have long been a problem in HBase. "Concurrent mode 
> failures" can usually be tuned around by setting the initiating occupancy 
> fraction low, but eventually the heap becomes fragmented and a promotion 
> failure occurs.
> This JIRA is to do research/experiments about the heap fragmentation issue 
> and possible solutions.




[jira] Commented: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988269#action_12988269
 ] 

Jonathan Gray commented on HBASE-3373:
--

Both of your solutions are rather specialized and I'm not sure they are 
generally applicable.  I would much prefer spending effort on improving our 
current load balancer, and it seems to me that it would be possible to 
implement similar behaviors in a more generalized way.

Also, the addition of an HBaseAdmin region move API makes it so you don't need 
to muck with HBase server code to do specialized balancing logic.  With the 
current APIs, it's possible to basically push the balancer out into your own 
client.

@Matt, I don't think I'm really understanding how you upgrade our load balancer 
w/ consistent hashing?

The fact that split regions open back up on the same server is actually an 
optimization in many cases because it reduces the amount of time the regions 
are offline and when they come back online and do a compaction to drop 
references, all the files are more likely to be on the local DataNode rather 
than remote.  In some cases, like time-series, you may want the splits to move 
to different servers.  I could imagine some configurable logic in there to 
ensure the bottom half goes to a different server (or maybe the top half would 
actually be more efficient to move away, since most of the time you'll write 
more to the bottom half and thus want the data locality / quick turnaround).  
There's likely going to be a bit of split rework in 0.92 to make it more like 
the ZK-based regions-in-transition.

As far as binding regions to servers between cluster restarts, this is already 
implemented and on by default in 0.90.

Consistent hashing also requires a fixed keyspace (right?) and that's a 
mismatch for HBase's flexibility in this regard.

Do you have any code for this client-side consistent hashing balancer?  I'm 
confused about how that could be implemented without knowing a lot about your 
data, the regions, the servers available, etc.

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Commented: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988249#action_12988249
 ] 

Ted Yu commented on HBASE-3373:
---

This is what I added in HMaster:
{code}
  /**
   * Evenly distributes the regions of the tables (assuming the number of
   * regions is much bigger than the number of region servers)
   * @param tableNames tables to load balance
   * @param ttl Time-to-live for load balance request. If negative, request is
   *            withdrawn
   * @throws IOException e
   */
  public void loadBalanceTable(final byte [][] tableNames, long ttl)
      throws IOException {
{code}

Our production environment has 150 to 300 tables. We run flows sequentially; 
each flow creates about 10 new tables.
The above API would allow the load balancer to distribute hot (recently split) 
regions off certain region server(s).
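
For illustration only, here is a minimal sketch of how a per-table plan could 
be computed from a snapshot of one table's region placement. It is not the 
HMaster code from the patch: the class name TableBalancePlanner, the Move type 
and the plan() method are hypothetical, and the greedy strategy (keep moving 
one region from the most loaded server to the least loaded one until the 
counts differ by at most one) is an assumption made for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: plans region moves so one table is spread evenly. */
final class TableBalancePlanner {

    /** One proposed move of a region between two servers. */
    static final class Move {
        final String region;
        final String from;
        final String to;

        Move(String region, String from, String to) {
            this.region = region;
            this.from = from;
            this.to = to;
        }
    }

    /**
     * @param regionsByServer for a single table: server name -> regions hosted
     * @return moves after which server region counts differ by at most one
     */
    static List<Move> plan(Map<String, List<String>> regionsByServer) {
        // Work on a sorted copy so the input is untouched and ties are stable.
        Map<String, Deque<String>> work = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : regionsByServer.entrySet()) {
            work.put(e.getKey(), new ArrayDeque<>(e.getValue()));
        }
        List<Move> moves = new ArrayList<>();
        if (work.size() < 2) {
            return moves; // nothing to balance against
        }
        while (true) {
            String max = null;
            String min = null;
            for (String server : work.keySet()) {
                int size = work.get(server).size();
                if (max == null || size > work.get(max).size()) {
                    max = server;
                }
                if (min == null || size < work.get(min).size()) {
                    min = server;
                }
            }
            if (work.get(max).size() - work.get(min).size() <= 1) {
                break; // already as even as possible
            }
            // Shift one region from the most loaded to the least loaded server.
            String region = work.get(max).removeLast();
            work.get(min).addLast(region);
            moves.add(new Move(region, max, min));
        }
        return moves;
    }
}
```

Each planned Move could then be handed to whatever mechanism actually 
relocates a region, for example the HBaseAdmin region move API mentioned 
elsewhere in this thread. The sketch ignores load, locality and the ttl 
parameter entirely.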

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Commented: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988229#action_12988229
 ] 

Matt Corgan commented on HBASE-3373:


Gotcha.  I guess I was thinking of it more as a quick upgrade to the current 
load balancer, which only looks at region count.  We store a lot of time series 
data, and regions that split were left on the same server while the balancer 
moved cold regions off.  I wrote a little client-side consistent hashing 
balancer that solved the problem in our case, but there are definitely better 
ways.  Consistent hashing also binds regions to servers across cluster 
restarts, which helps keep regions near their last major-compacted HDFS file.

Whatever balancing scheme you do use, don't you need some starting point for 
randomly distributing the regions?  If no other data is available or you need a 
tie breaker, maybe consistent hashing is better than round robin or purely 
random placement.

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Commented: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Jonathan Gray (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988221#action_12988221
 ] 

Jonathan Gray commented on HBASE-3373:
--

I think consistent hashing would be a major step backwards for us and 
unnecessary because there is no cost of moving bits around in HBase.  The 
primary benefit of consistent hashing is that it reduces the amount of data you 
have to physically move around.  Because of our use of HDFS, we never have to 
move physical data around.

In your benefit list, we are already implementing almost all of these features, 
or if not, it is possible in the current architecture.  In addition, our 
architecture is extremely flexible and we can do all kinds of interesting load 
balancing techniques related to actual load profiles not just #s of 
shards/buckets as we do today or as would be done with consistent hashing.

> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Commented: (HBASE-3373) Allow regions of specific table to be load-balanced

2011-01-28 Thread Matt Corgan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988214#action_12988214
 ] 

Matt Corgan commented on HBASE-3373:


Have you guys considered using a consistent hashing method to choose which 
server a region belongs to?  You would create ~50 buckets for each server by 
hashing serverName_port_bucketNum, and then hash the start key of each region 
into the buckets.

There are a few benefits:

* when you add a server it takes an equal load from all existing servers
* if you remove a server it distributes its regions equally to the remaining 
servers
* adding a server does not cause all regions to shuffle like round robin 
assignment would
* assignment is nearly random, but repeatable, so no hot spots
* when a region splits the front half will stay on the same server, but the 
back half will usually be sent to another server

And a few drawbacks:

* each server wouldn't end up with exactly the same number of regions, but they 
would be close
* if a hot spot does end up developing, you can't do anything about it, at 
least not unless it supported a list of manual overrides
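
As a rough sketch of the scheme described above (not code from any HBase 
patch), the ring can be modelled with a sorted map. The class name 
ConsistentHashAssigner, its method names, and the choice of MD5 truncated to 
64 bits as the hash are all assumptions made for this example; only the ~50 
buckets per server, the serverName_port_bucketNum bucket key, and hashing the 
region start key come from the description.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: maps region start keys to servers on a hash ring. */
final class ConsistentHashAssigner {
    // ~50 buckets per server, as suggested in the description.
    private static final int BUCKETS_PER_SERVER = 50;

    // Ring position -> server owning the bucket at that position.
    private final TreeMap<Long, String> ring = new TreeMap<>();

    /** Adds a server identified by a "serverName_port" string. */
    void addServer(String serverNameAndPort) {
        for (int bucket = 0; bucket < BUCKETS_PER_SERVER; bucket++) {
            ring.put(hash(bucketKey(serverNameAndPort, bucket)), serverNameAndPort);
        }
    }

    /** Removes a server; its regions fall through to the next buckets. */
    void removeServer(String serverNameAndPort) {
        for (int bucket = 0; bucket < BUCKETS_PER_SERVER; bucket++) {
            // Two-argument remove only deletes buckets this server still owns.
            ring.remove(hash(bucketKey(serverNameAndPort, bucket)), serverNameAndPort);
        }
    }

    /** Returns the server owning the first bucket at or after the key's hash. */
    String serverFor(byte[] regionStartKey) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no servers on the ring");
        }
        Map.Entry<Long, String> owner = ring.ceilingEntry(hash(regionStartKey));
        // Wrap around to the first bucket when the hash is past the last one.
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }

    private static byte[] bucketKey(String server, int bucket) {
        return (server + "_" + bucket).getBytes(StandardCharsets.UTF_8);
    }

    // Assumed hash: first 8 bytes of the MD5 digest, read as a long.
    private static long hash(byte[] data) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5").digest(data);
            return ByteBuffer.wrap(digest).getLong();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 unavailable", e);
        }
    }
}
```

With this layout, adding a server only reassigns the regions whose hashes 
land in the new server's buckets, and removing it hands those regions back to 
their previous owners; every other region stays where it was.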



> Allow regions of specific table to be load-balanced
> ---
>
> Key: HBASE-3373
> URL: https://issues.apache.org/jira/browse/HBASE-3373
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
> Fix For: 0.92.0
>
>
> From our experience, cluster can be well balanced and yet, one table's 
> regions may be badly concentrated on few region servers.
> For example, one table has 839 regions (380 regions at time of table 
> creation) out of which 202 are on one server.
> It would be desirable for load balancer to distribute regions for specified 
> tables evenly across the cluster. Each of such tables has number of regions 
> many times the cluster size.




[jira] Updated: (HBASE-3305) Allow round-robin distribution for table created with multiple regions

2011-01-28 Thread Jonathan Gray (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3305?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Gray updated HBASE-3305:
-

Fix Version/s: 0.92.0
   0.90.1

> Allow round-robin distribution for table created with multiple regions
> --
>
> Key: HBASE-3305
> URL: https://issues.apache.org/jira/browse/HBASE-3305
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.20.6
>Reporter: Ted Yu
>Assignee: Ted Yu
> Fix For: 0.90.1, 0.92.0
>
> Attachments: hbase-3305-array.patch, 
> hbase-3305-default-round-robin.patch, hbase-3305-round-robin-unit-test.patch, 
> hbase-3305.patch
>
>
> We can distribute the initial regions created for a new table in round-robin 
> fashion.




[jira] Commented: (HBASE-3330) Add publish of a snapshot to apache repo to our pom

2011-01-28 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3330?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12988131#action_12988131
 ] 

stack commented on HBASE-3330:
--

I opened https://issues.apache.org/jira/browse/INFRA-3398.  It's as though the 
'staging repository' filter is not working and catching the release SNAPSHOT, 
moving it aside to staging.

> Add publish of a snapshot to apache repo to our pom
> ---
>
> Key: HBASE-3330
> URL: https://issues.apache.org/jira/browse/HBASE-3330
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
> Fix For: 0.90.1
>
>
> See here 
> http://www.apache.org/dev/publishing-maven-artifacts.html#publish-snapshot
