[jira] [Updated] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5268:
-

Attachment: 5268.txt

Here's a patch. The bulk is testing.

During testing with delete marker types I found one strange scenario:
Say you
# put columns 123, 1234, 12345
# then delete with prefix 123
# then put column 123 again
# now delete 123 with a normal column marker

Now what happens is that the ScanDeleteTracker sees the normal column delete 
marker first, then it will see the new put for column 123. Now it will conclude 
that it is done with all versions of column 123, and thus seeks ahead to the 
next column. During that process the prefix marker with prefix 123 is also 
skipped. And hence 1234 and 12345 are no longer marked as deleted.

This only happens in exactly this scenario.

I cannot fix this without de-optimizing column delete markers or adding 
complicated logic to always sort prefix delete markers before all columns they 
affect, regardless of the timestamp.

I added this scenario as a unit test.
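
For illustration, this is roughly what the problematic sequence looks like from 
the client side. This is only a sketch: deleteColumnsByPrefix is an assumed name 
for the new API in the attached patch; the rest is the regular client API.

{code}
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class PrefixDeleteScenario {
  public static void run(HTable table) throws Exception {
    byte[] row = Bytes.toBytes("r1");
    byte[] fam = Bytes.toBytes("f");

    // 1. put columns 123, 1234, 12345
    Put p = new Put(row);
    p.add(fam, Bytes.toBytes("123"), Bytes.toBytes("v1"));
    p.add(fam, Bytes.toBytes("1234"), Bytes.toBytes("v2"));
    p.add(fam, Bytes.toBytes("12345"), Bytes.toBytes("v3"));
    table.put(p);

    // 2. delete with prefix 123 (hypothetical method name for the new marker)
    Delete prefixDelete = new Delete(row);
    prefixDelete.deleteColumnsByPrefix(fam, Bytes.toBytes("123"));
    table.delete(prefixDelete);

    // 3. put column 123 again
    Put p2 = new Put(row);
    p2.add(fam, Bytes.toBytes("123"), Bytes.toBytes("v4"));
    table.put(p2);

    // 4. delete 123 with a normal column delete marker
    Delete columnDelete = new Delete(row);
    columnDelete.deleteColumns(fam, Bytes.toBytes("123"));
    table.delete(columnDelete);

    // A subsequent scan may now see 1234 and 12345 again: the ScanDeleteTracker
    // handles the newer column marker for 123 first and then seeks past the
    // prefix marker when it moves on to the next column.
  }
}
{code}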

> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191997#comment-13191997
 ] 

Lars Hofhansl edited comment on HBASE-5268 at 1/24/12 8:14 AM:
---

Here's a patch. The bulk is testing.

During testing with delete marker types I found one strange scenario:
Say you
# put columns 123, 1234, 12345
# then delete with prefix 123
# then put column 123 again
# now delete 123 with a normal column marker

Now what happens is that the ScanDeleteTracker sees the normal column delete 
marker first, then it will see the new put for column 123. Now it will conclude 
that it is done with all versions of column 123, and thus seeks ahead to the 
next column. During that process the prefix marker with prefix 123 is also 
skipped. And hence 1234 and 12345 are no longer marked as deleted.

This only happens in exactly this scenario.

I cannot fix this without de-optimizing column delete markers or adding 
complicated logic to always sort prefix delete markers before all columns they 
affect, regardless of the timestamp.

I added this scenario as a unit test.

  was (Author: lhofhansl):
Here's a patch. The bulk is testing.

During testing with deleted marker types I found one strange scenario:
Say you
# put columns 123, 1234, 12345
# then delete with prefix 123
# then put column 123 again
# now delete 123 with a normal column marker

Now what happens is that the ScanDeleteTracker sees the normal column delete 
marker first, then it will see the new put for column 123. Now it will conclude 
that it is done with all versions of column 123, and thus seek ahead to the 
next column. During that process the prefix marker with prefix 123 is also 
skipped. And hence 1234 and 12345 are no longer marked as deleted.

This only happens in exactly this scenario.

I cannot fix this without de-optimizing column delete markers or adding 
complicated logic to sort prefix delete marker always before all prefixes they 
affect regardless of the timestamp.

I added this scenario as a unit test.
  
> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.





[jira] [Commented] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192016#comment-13192016
 ] 

Lars Hofhansl commented on HBASE-5268:
--

When I muck with KeyValue.KeyComparator to always sort prefix delete markers 
before all columns with the same prefix regardless of TS, then I see the 
opposite scenario: in the setup above the column delete marker is ignored, 
and hence the newly put version of column 123 is visible again. This is 
because the column marker is considered less specific and hence won't override 
the column prefix marker. Arghh :)

So ScanDeleteTracker would need to be completely refactored in order to keep 
prefix markers separate from column and version markers.
The question is: Is that worth the risk, added code entropy, and effort, or can 
we document that prefix delete markers should not be used together with column 
delete markers (where the column marker's column is identical to the prefix 
marker's column prefix, which does not make much sense anyway)?
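
To make the ordering experiment above concrete, here is a simplified, standalone 
comparator over (qualifier, timestamp) pairs. It is not the real 
KeyValue.KeyComparator, just an assumed illustration of the rule "prefix markers 
sort before every column they cover, regardless of timestamp"; under such a rule 
the newer column delete marker for 123 no longer sorts ahead of the prefix marker.

{code}
import java.util.Comparator;

// Simplified illustration only; not the actual KeyValue.KeyComparator.
class SimpleCell {
  final String qualifier;
  final long ts;
  final boolean prefixMarker;
  SimpleCell(String qualifier, long ts, boolean prefixMarker) {
    this.qualifier = qualifier;
    this.ts = ts;
    this.prefixMarker = prefixMarker;
  }
}

class PrefixFirstComparator implements Comparator<SimpleCell> {
  public int compare(SimpleCell a, SimpleCell b) {
    // A prefix delete marker sorts before every cell whose qualifier starts
    // with its prefix, regardless of timestamp.
    if (a.prefixMarker && !b.prefixMarker && b.qualifier.startsWith(a.qualifier)) {
      return -1;
    }
    if (b.prefixMarker && !a.prefixMarker && a.qualifier.startsWith(b.qualifier)) {
      return 1;
    }
    // Otherwise the usual ordering: qualifier ascending, newest timestamp first.
    int cmp = a.qualifier.compareTo(b.qualifier);
    if (cmp != 0) {
      return cmp;
    }
    if (a.ts != b.ts) {
      return a.ts > b.ts ? -1 : 1;
    }
    return 0;
  }
}
{code}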


> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.





[jira] [Issue Comment Edited] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191958#comment-13191958
 ] 

Lars Hofhansl edited comment on HBASE-5268 at 1/24/12 8:42 AM:
---

@Stack... Here's an example.
Say you have columns X100, X101, X102, ..., X199, and X200, X201, X202, ..., 
X299, X300, ..., in a row R, all in the same column family.
(I'd have this if I modeled rows inside HBase rows by using a prefix to 
identify the inner rows.)

Now I want to be able to delete all columns that start with X1. Without a 
prefix delete marker I'd have to place 100 column delete markers. With a prefix 
marker I can just place one marker for the X1 prefix.
If the prefix is empty, this would degenerate into the same behavior as a 
family marker (albeit less efficient).
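
A rough sketch of the difference, again using a hypothetical deleteColumnsByPrefix 
method for the new marker; the loop uses the existing client API.

{code}
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.util.Bytes;

public class PrefixVsColumnDeletes {
  // Without a prefix marker: one column delete marker per inner column.
  static void deleteWithColumnMarkers(HTable table, byte[] row, byte[] fam)
      throws Exception {
    Delete d = new Delete(row);
    for (int i = 100; i <= 199; i++) {
      d.deleteColumns(fam, Bytes.toBytes("X" + i)); // 100 markers
    }
    table.delete(d);
  }

  // With a prefix marker: a single marker covering the X1 prefix
  // (hypothetical API name for the new marker).
  static void deleteWithPrefixMarker(HTable table, byte[] row, byte[] fam)
      throws Exception {
    Delete d = new Delete(row);
    d.deleteColumnsByPrefix(fam, Bytes.toBytes("X1"));
    table.delete(d);
  }
}
{code}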


  was (Author: lhofhansl):
@Stack... Here's an example.
Say you can columns X100, X101, X102, ..., X199, and X200, X201, X202, ..., 
X299, X300, ..., in a row R, all in the same column family.
(I'd have this if I modeled rows inside HBase by using a prefix to identify the 
inner rows.)

Now I want to be able to delete all columns that start with X1. Without a 
prefix delete marker I'd have to place 100 column delete markers. With a prefix 
marker I can just place one marker for the X1 prefix.
If the prefix is empty, this would degenerate into the same behavior as a 
family marker (albeit less efficient).

  
> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.





[jira] [Issue Comment Edited] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13191958#comment-13191958
 ] 

Lars Hofhansl edited comment on HBASE-5268 at 1/24/12 8:44 AM:
---

@Stack... Here's an example.
Say you have columns X100, X101, X102, ..., X199, and X200, X201, X202, ..., 
X299, X300, ..., in a row R, all in the same column family.
(I'd have this if I modeled rows inside HBase rows by using a prefix to 
identify the inner rows.)

Now I want to be able to delete all columns that start with X1. Without a 
prefix delete marker I'd have to place 100 column delete markers. With a prefix 
marker I can just place one marker for the X1 prefix.
If the prefix is empty, this would degenerate into the same behavior as a 
family marker (albeit less efficient).


  was (Author: lhofhansl):
@Stack... Here's an example.
Say you can columns X100, X101, X102, ..., X199, and X200, X201, X202, ..., 
X299, X300, ..., in a row R, all in the same column family.
(I'd have this if I modeled rows inside HBase rows by using a prefix to 
identify the inner rows.)

Now I want to be able to delete all columns that start with X1. Without a 
prefix delete marker I'd have to place 100 column delete markers. With a prefix 
marker I can just place one marker for the X1 prefix.
If the prefix is empty, this would degenerate into the same behavior as a 
family marker (albeit less efficient).

  
> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.





[jira] [Issue Comment Edited] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192016#comment-13192016
 ] 

Lars Hofhansl edited comment on HBASE-5268 at 1/24/12 8:43 AM:
---

When I muck with KeyValue.KeyComparator to always sort prefix delete markers 
before all columns with the same prefix regardless of TS, then I see the 
opposite scenario: in the setup above the column delete marker is ignored, 
and hence the newly put version of column 123 is visible again. This is 
because the column marker is considered less specific and hence won't override 
the column prefix marker, even though it is newer. Arghh :)

So ScanDeleteTracker would need to be completely refactored in order to keep 
prefix markers separate from column and version markers.
The question is: Is that worth the risk, added code entropy, and effort, or can 
we document that prefix delete markers should not be used together with column 
delete markers (where the column marker's column is identical to the prefix 
marker's column prefix, which does not make much sense anyway)?


  was (Author: lhofhansl):
When I muck with KeyValue.KeyComparator to always sort prefix delete marker 
before all columns with the same prefix regardless of TS, then I see the 
opposite scenario that in the setup above the column delete marker is ignored 
and hence the newly put version of column 123 is visible again. This is the 
because the column marker is considered less specific and hence won't override 
the column prefix marker. Arghh :)

So ScanDeleteTracker would need to be completely refactored in order to keep 
prefix marker separate from column and version markers.
The question is: Is that worth the risk, added code entropy, and effort, or can 
we document that prefix delete markers should not be used together with column 
delete markers (where the column markers column is identical to the prefix 
marker's column prefix, which does not make much sense anyway)?

  
> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.





[jira] [Commented] (HBASE-4608) HLog Compression

2012-01-24 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192024#comment-13192024
 ] 

jirapos...@reviews.apache.org commented on HBASE-4608:
--



bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java, 
line 34
bq.  > 
bq.  >
bq.  > '/less' should be removed.

fixed.


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java, 
line 42
bq.  > 
bq.  >
bq.  > javadoc needs update.

fixed.


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java, 
line 43
bq.  > 
bq.  >
bq.  > Either remove the word 'a' or change it into 'an'

fixed.


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java, 
line 78
bq.  > 
bq.  >
bq.  > Please change ourKV to keyval or something similar.

fixed.


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java, 
line 82
bq.  > 
bq.  >
bq.  > Update javadoc to match the context parameter.

fixed.


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/CompressedKeyValue.java, 
line 94
bq.  > 
bq.  >
bq.  > I think adding 'the effect of compression would be good' at the end 
would make the sentence more easily understandable.

fixed


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java,
 line 60
bq.  > 
bq.  >
bq.  > Remove whitespace.

fixed.


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java,
 line 154
bq.  > 
bq.  >
bq.  > This javadoc is more suitable for the init() method.

fixed.


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogReader.java,
 line 186
bq.  > 
bq.  >
bq.  > Please include e in new IOE.

fixed. I assume you mean store it as the cause.


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/SequenceFileLogWriter.java,
 line 93
bq.  > 
bq.  >
bq.  > Please include e in the new IOE.

fixed above.


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java, line 
2
bq.  > 
bq.  >
bq.  > Please remove year.

fixed above.


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java, line 
35
bq.  > 
bq.  >
bq.  > Please put this line at the end of line 34.

fixed


bq.  On 2012-01-20 22:56:07, Ted Yu wrote:
bq.  > 
src/main/java/org/apache/hadoop/hbase/regionserver/wal/WALDictionary.java, line 
53
bq.  > 
bq.  >
bq.  > 'ad' should be 'add'

fixed.


- Li


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2740/#review4508
---


On 2012-01-13 01:37:35, Li Pi wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/2740/
bq.  ---
bq.  
bq.  (Updated 2012-01-13 01:37:35)
bq.  
bq.  
bq.  Review request for hbase, Eli Collins and Todd Lipcon.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  HLog compression. Has unit tests and a command line tool for 
compressing/decompressing.
bq.  
bq.  
bq.  This addresses bug HBase-4608.
bq.  https://issues.apache.org/jira/browse/HBase-4608
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq

[jira] [Commented] (HBASE-4608) HLog Compression

2012-01-24 Thread jirapos...@reviews.apache.org (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192025#comment-13192025
 ] 

jirapos...@reviews.apache.org commented on HBASE-4608:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/2740/
---

(Updated 2012-01-24 09:00:37.768707)


Review request for hbase, Eli Collins and Todd Lipcon.


Summary
---

HLog compression. Has unit tests and a command line tool for 
compressing/decompressing.


This addresses bug HBase-4608.
https://issues.apache.org/jira/browse/HBase-4608


Diffs (updated)
-

  CHANGES.txt 1d7238e 
  bin/hbase 350abef 
  bin/hbase-daemon.sh 5c42ac1 
  dev-support/findHangingTest.sh PRE-CREATION 
  pom.xml 6566a1c 
  src/docbkx/book.xml c67ca06 
  src/docbkx/configuration.xml 7fd90e7 
  src/docbkx/ops_mgt.xml f93c9f2 
  src/docbkx/performance.xml e61248f 
  src/docbkx/preface.xml 10fa755 
  src/docbkx/troubleshooting.xml 0b7c93a 
  src/docbkx/upgrading.xml c0642f5 
  src/main/jamon/org/apache/hbase/tmpl/regionserver/RSStatusTmpl.jamon 24caabd 
  src/main/java/org/apache/hadoop/hbase/HBaseConfiguration.java 0477be8 
  src/main/java/org/apache/hadoop/hbase/HConstants.java 5120a3c 
  src/main/java/org/apache/hadoop/hbase/catalog/CatalogTracker.java 8ec5042 
  src/main/java/org/apache/hadoop/hbase/client/ClientScanner.java 6cdeec1 
  src/main/java/org/apache/hadoop/hbase/client/ConnectionUtils.java 
PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/client/Delete.java 51bbc63 
  src/main/java/org/apache/hadoop/hbase/client/HBaseAdmin.java 8cd9bd0 
  src/main/java/org/apache/hadoop/hbase/client/HConnection.java 0e78d96 
  src/main/java/org/apache/hadoop/hbase/client/HConnectionManager.java 852a810 
  src/main/java/org/apache/hadoop/hbase/client/HTable.java 839d79b 
  src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java 0bc9577 
  src/main/java/org/apache/hadoop/hbase/client/MetaScanner.java 4135e55 
  src/main/java/org/apache/hadoop/hbase/client/RowMutation.java PRE-CREATION 
  src/main/java/org/apache/hadoop/hbase/client/ServerCallable.java 9b568e3 
  
src/main/java/org/apache/hadoop/hbase/client/coprocessor/AggregationClient.java 
0d4a9e4 
  
src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateImplementation.java 
ba3414d 
  src/main/java/org/apache/hadoop/hbase/coprocessor/AggregateProtocol.java 
f25ba11 
  src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java 
b47423c 
  src/main/java/org/apache/hadoop/hbase/io/HbaseObjectWritable.java 9002a0f 
  src/main/java/org/apache/hadoop/hbase/ipc/ExecRPCInvoker.java 3ad6cd5 
  src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 07ddbca 
  src/main/java/org/apache/hadoop/hbase/ipc/HRegionInterface.java 4327a44 
  src/main/java/org/apache/hadoop/hbase/ipc/Invocation.java 39c73f5 
  src/main/java/org/apache/hadoop/hbase/mapreduce/LoadIncrementalHFiles.java 
bd574b2 
  src/main/java/org/apache/hadoop/hbase/mapreduce/TableInputFormat.java 3dcbf74 
  src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java e6f8a6e 
  src/main/java/org/apache/hadoop/hbase/master/HMaster.java cb2f084 
  src/main/java/org/apache/hadoop/hbase/master/LoadBalancerFactory.java 89685bb 
  src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java 3938fa7 
  src/main/java/org/apache/hadoop/hbase/master/ServerManager.java 9de1784 
  src/main/java/org/apache/hadoop/hbase/master/SplitLogManager.java 667a8b1 
  src/main/java/org/apache/hadoop/hbase/master/handler/ClosedRegionHandler.java 
2dfc3e7 
  
src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java 
2dd497b 
  src/main/java/org/apache/hadoop/hbase/monitoring/MonitoredRPCHandlerImpl.java 
493dcdb 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java fb4ec05 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java 3917d40 
  src/main/java/org/apache/hadoop/hbase/regionserver/HRegionThriftServer.java 
18b6c13 
  src/main/java/org/apache/hadoop/hbase/regionserver/KeyValueHeap.java c840e7c 
  src/main/java/org/apache/hadoop/hbase/regionserver/OperationStatus.java 
b6f7456 
  src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java 
7cee17c 
  src/main/java/org/apache/hadoop/hbase/regionserver/SplitRequest.java 41f5dff 
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java b928731 
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreScanner.java bd6f70d 
  
src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseMetaHandler.java
 e8e95ed 
  
src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRegionHandler.java
 a25ca32 
  
src/main/java/org/apache/hadoop/hbase/regionserver/handler/CloseRootHandler.java
 fa38ad6 
  
src/main/java/org/apache/hadoop/hb

[jira] [Updated] (HBASE-5267) Add a configuration to disable the slab cache by default

2012-01-24 Thread Li Pi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Pi updated HBASE-5267:
-

Attachment: 5267.txt

> Add a configuration to disable the slab cache by default
> 
>
> Key: HBASE-5267
> URL: https://issues.apache.org/jira/browse/HBASE-5267
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Li Pi
>Priority: Blocker
> Fix For: 0.94.0, 0.92.1
>
> Attachments: 5267.txt
>
>
> From what I commented at the tail of HBASE-4027:
> {quote}
> I changed the release note, the patch doesn't have a "hbase.offheapcachesize" 
> configuration and it's enabled as soon as you set -XX:MaxDirectMemorySize 
> (which is actually a big problem when you consider this: 
> http://hbase.apache.org/book.html#trouble.client.oome.directmemory.leak). 
> {quote}
> We need to add hbase.offheapcachesize and set it to false by default.
> Marking as a blocker for 0.92.1 and assigning to Li Pi at Todd's request.





[jira] [Updated] (HBASE-5267) Add a configuration to disable the slab cache by default

2012-01-24 Thread Li Pi (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Li Pi updated HBASE-5267:
-

Status: Patch Available  (was: Open)

> Add a configuration to disable the slab cache by default
> 
>
> Key: HBASE-5267
> URL: https://issues.apache.org/jira/browse/HBASE-5267
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Li Pi
>Priority: Blocker
> Fix For: 0.94.0, 0.92.1
>
> Attachments: 5267.txt
>
>
> From what I commented at the tail of HBASE-4027:
> {quote}
> I changed the release note, the patch doesn't have a "hbase.offheapcachesize" 
> configuration and it's enabled as soon as you set -XX:MaxDirectMemorySize 
> (which is actually a big problem when you consider this: 
> http://hbase.apache.org/book.html#trouble.client.oome.directmemory.leak). 
> {quote}
> We need to add hbase.offheapcachesize and set it to false by default.
> Marking as a blocker for 0.92.1 and assigning to Li Pi at Todd's request.





[jira] [Commented] (HBASE-5267) Add a configuration to disable the slab cache by default

2012-01-24 Thread Li Pi (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192029#comment-13192029
 ] 

Li Pi commented on HBASE-5267:
--

Fixed. Block cache now defaults to disabled.
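
As a rough sketch of what "disabled by default" means for a deployment: the issue 
text names an hbase.offheapcachesize setting, but the exact key and value type 
used by the final patch may differ, so treat this as an assumption.

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class SlabCacheConfigCheck {
  public static void main(String[] args) {
    Configuration conf = HBaseConfiguration.create();
    // Assumed key from the issue description; unless it is set explicitly,
    // the off-heap slab cache stays off even with -XX:MaxDirectMemorySize set.
    boolean offHeapCacheEnabled = conf.getBoolean("hbase.offheapcachesize", false);
    System.out.println("off-heap slab cache enabled: " + offHeapCacheEnabled);
  }
}
{code}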

> Add a configuration to disable the slab cache by default
> 
>
> Key: HBASE-5267
> URL: https://issues.apache.org/jira/browse/HBASE-5267
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Li Pi
>Priority: Blocker
> Fix For: 0.94.0, 0.92.1
>
> Attachments: 5267.txt
>
>
> From what I commented at the tail of HBASE-4027:
> {quote}
> I changed the release note, the patch doesn't have a "hbase.offheapcachesize" 
> configuration and it's enabled as soon as you set -XX:MaxDirectMemorySize 
> (which is actually a big problem when you consider this: 
> http://hbase.apache.org/book.html#trouble.client.oome.directmemory.leak). 
> {quote}
> We need to add hbase.offheapcachesize and set it to false by default.
> Marking as a blocker for 0.92.1 and assigning to Li Pi at Todd's request.





[jira] [Commented] (HBASE-4608) HLog Compression

2012-01-24 Thread Li Pi (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192033#comment-13192033
 ] 

Li Pi commented on HBASE-4608:
--

@Lars

Unless we know exactly when the dictionary is flushed, we can't rebuild the 
original HLog, can we?

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Attachments: 4608v1.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 
> 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.





[jira] [Commented] (HBASE-5267) Add a configuration to disable the slab cache by default

2012-01-24 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192051#comment-13192051
 ] 

Hadoop QA commented on HBASE-5267:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511652/5267.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 84 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.coprocessor.TestMasterObserver

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/846//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/846//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/846//console

This message is automatically generated.

> Add a configuration to disable the slab cache by default
> 
>
> Key: HBASE-5267
> URL: https://issues.apache.org/jira/browse/HBASE-5267
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Li Pi
>Priority: Blocker
> Fix For: 0.94.0, 0.92.1
>
> Attachments: 5267.txt
>
>
> From what I commented at the tail of HBASE-4027:
> {quote}
> I changed the release note, the patch doesn't have a "hbase.offheapcachesize" 
> configuration and it's enabled as soon as you set -XX:MaxDirectMemorySize 
> (which is actually a big problem when you consider this: 
> http://hbase.apache.org/book.html#trouble.client.oome.directmemory.leak). 
> {quote}
> We need to add hbase.offheapcachesize and set it to false by default.
> Marking as a blocker for 0.92.1 and assigning to Li Pi at Todd's request.





[jira] [Created] (HBASE-5269) IllegalMonitorStateException while retryin HLog split in 0.90 branch.

2012-01-24 Thread ramkrishna.s.vasudevan (Created) (JIRA)
IllegalMonitorStateException while retryin HLog split in 0.90 branch.
-

 Key: HBASE-5269
 URL: https://issues.apache.org/jira/browse/HBASE-5269
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.90.5
Reporter: ramkrishna.s.vasudevan
Assignee: ramkrishna.s.vasudevan
 Fix For: 0.90.6


This bug was introduced as part of the HBASE-5137 fix. The splitLogLock is released 
in the finally block inside the do-while loop, so when the loop executes a second 
time the unlock of the splitLogLock throws an IllegalMonitorStateException.
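
A minimal sketch of the pattern being described, with a plain ReentrantLock 
standing in for splitLogLock (the actual master code differs):

{code}
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantLock;

public class SplitLogRetryPattern {
  private final Lock splitLogLock = new ReentrantLock();

  // Broken: the lock is acquired once, but unlock() sits in a finally block
  // inside the retry loop, so the second iteration unlocks a lock the thread
  // no longer holds and ReentrantLock throws IllegalMonitorStateException.
  void splitLogBroken(int retries) {
    splitLogLock.lock();
    int attempt = 0;
    do {
      try {
        // ... attempt the log split ...
      } finally {
        splitLogLock.unlock();
      }
      attempt++;
    } while (attempt < retries);
  }

  // Fixed: acquire and release the lock symmetrically on every iteration.
  void splitLogFixed(int retries) {
    int attempt = 0;
    do {
      splitLogLock.lock();
      try {
        // ... attempt the log split ...
      } finally {
        splitLogLock.unlock();
      }
      attempt++;
    } while (attempt < retries);
  }
}
{code}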







[jira] [Updated] (HBASE-5269) IllegalMonitorStateException while retryin HLog split in 0.90 branch.

2012-01-24 Thread ramkrishna.s.vasudevan (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ramkrishna.s.vasudevan updated HBASE-5269:
--

Attachment: HBASE-5269.patch

> IllegalMonitorStateException while retryin HLog split in 0.90 branch.
> -
>
> Key: HBASE-5269
> URL: https://issues.apache.org/jira/browse/HBASE-5269
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.6
>
> Attachments: HBASE-5269.patch
>
>
> This bug was introduced as part of the HBASE-5137 fix. The splitLogLock is 
> released in the finally block inside the do-while loop, so when the loop 
> executes a second time the unlock of the splitLogLock throws an 
> IllegalMonitorStateException.





[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-24 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192179#comment-13192179
 ] 

Mikhail Bautin commented on HBASE-5262:
---

I was thinking of a lightly-loaded system table different from .META.. This way 
it is easier to control this feature, and we can even make it optional based on 
the configuration. Storing event log data in HBase makes sense to me because we 
don't expect a huge number of records compared to the user's data itself, and 
we probably don't want to implement another logging framework with HDFS file 
management, cleaning up old files, etc. The multiple writers to one file 
question is also there. If we store this data in HBase, then yes, we can use 
TTL to set event log retention to e.g. one month or so, or make it configurable.
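
As a sketch of how such a table could be set up with TTL-based retention (table 
and family names here are made up, and whether this would be a dedicated system 
table is exactly the open question above):

{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class EventLogTableSetup {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    // Hypothetical table and family names; retention is handled by the family TTL.
    HTableDescriptor desc = new HTableDescriptor("eventlog");
    HColumnDescriptor fam = new HColumnDescriptor("e");
    fam.setTimeToLive(30 * 24 * 60 * 60); // roughly one month, in seconds
    desc.addFamily(fam);

    if (!admin.tableExists("eventlog")) {
      admin.createTable(desc);
    }
  }
}
{code}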

> Structured event log for HBase for monitoring and auto-tuning performance
> -
>
> Key: HBASE-5262
> URL: https://issues.apache.org/jira/browse/HBASE-5262
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> Creating this JIRA to open a discussion about a structured (machine-readable) 
> log that will record events such as compaction start/end times, compaction 
> input/output files, their sizes, the same for flushes, etc. This can be 
> stored e.g. in a new system table in HBase itself. The data from this log can 
> then be analyzed and used to optimize compactions at run time, or otherwise 
> auto-tune HBase configuration to reduce the number of knobs the user has to 
> configure.





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192204#comment-13192204
 ] 

Zhihong Yu commented on HBASE-5231:
---

@Stack:
The scope of the original getAssignments() is cluster-wide.
If Master calls getAssignments() directly, its signature would be changed to:
{code}
Map>> getAssignments()
{code}
This would imply a per-table assignment.

Please confirm this is what you wanted.
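
The generics in the snippet above were stripped by the mail formatting; assuming 
the outer key is the table name, the per-table signature would look roughly like 
this (types assumed here, not taken from the actual patch):

{code}
import java.util.List;
import java.util.Map;

import org.apache.hadoop.hbase.HRegionInfo;
import org.apache.hadoop.hbase.ServerName;

interface PerTableAssignments {
  // Table name -> (server -> regions of that table assigned to that server).
  Map<String, Map<ServerName, List<HRegionInfo>>> getAssignments();
}
{code}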

> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.90





[jira] [Commented] (HBASE-5179) Concurrent processing of processFaileOver and ServerShutdownHandler may cause region to be assigned before log splitting is completed, causing data loss

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192208#comment-13192208
 ] 

Zhihong Yu commented on HBASE-5179:
---

I created HBASE-5270.

Let's integrate patch v18 to 0.90 for this JIRA.

> Concurrent processing of processFaileOver and ServerShutdownHandler may cause 
> region to be assigned before log splitting is completed, causing data loss
> 
>
> Key: HBASE-5179
> URL: https://issues.apache.org/jira/browse/HBASE-5179
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.2
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5179-90.txt, 5179-90v10.patch, 5179-90v11.patch, 
> 5179-90v12.patch, 5179-90v13.txt, 5179-90v14.patch, 5179-90v15.patch, 
> 5179-90v16.patch, 5179-90v17.txt, 5179-90v18.txt, 5179-90v2.patch, 
> 5179-90v3.patch, 5179-90v4.patch, 5179-90v5.patch, 5179-90v6.patch, 
> 5179-90v7.patch, 5179-90v8.patch, 5179-90v9.patch, 5179-92v17.patch, 
> 5179-v11-92.txt, 5179-v11.txt, 5179-v2.txt, 5179-v3.txt, 5179-v4.txt, 
> Errorlog, hbase-5179.patch, hbase-5179v10.patch, hbase-5179v12.patch, 
> hbase-5179v17.patch, hbase-5179v5.patch, hbase-5179v6.patch, 
> hbase-5179v7.patch, hbase-5179v8.patch, hbase-5179v9.patch
>
>
> If the master's failover processing and ServerShutdownHandler's processing 
> happen concurrently, the following case may appear:
> 1. The master completes splitLogAfterStartup().
> 2. RegionserverA restarts, and ServerShutdownHandler is processing.
> 3. The master starts to rebuildUserRegions, and RegionserverA is considered a 
> dead server.
> 4. The master starts to assign regions of RegionserverA because it is a dead 
> server by step 3.
> However, when doing step 4 (assigning regions), ServerShutdownHandler may still 
> be splitting the log; therefore, it may cause data loss.





[jira] [Created] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler

2012-01-24 Thread Zhihong Yu (Created) (JIRA)
Handle potential data loss due to concurrent processing of processFaileOver and 
ServerShutdownHandler
-

 Key: HBASE-5270
 URL: https://issues.apache.org/jira/browse/HBASE-5270
 Project: HBase
  Issue Type: Sub-task
Reporter: Zhihong Yu
 Fix For: 0.94.0, 0.92.1


This JIRA continues the effort from HBASE-5179. Starting with Stack's comments 
about patches for 0.92 and TRUNK:

Reviewing 0.92v17

isDeadServerInProgress is a new public method in ServerManager but it does not 
seem to be used anywhere.

Does isDeadRootServerInProgress need to be public? Ditto for meta version.

This method's param names are not right ('definitiveRootServer'); what is meant by 
definitive? Do they need this qualifier?

Is there anything in place to stop us expiring a server twice if it's carrying 
root and meta?

What is difference between asking assignment manager isCarryingRoot and this 
variable that is passed in? Should be doc'd at least. Ditto for meta.

I think I've asked for this a few times - onlineServers needs to be 
explained... either in javadoc or in comment. This is the param passed into 
joinCluster. How does it arise? I think I know but am unsure. God love the poor 
noob that comes awandering this code trying to make sense of it all.

It looks like we get the list by trawling zk for regionserver znodes that have 
not checked in. Don't we do this operation earlier in master setup? Are we 
doing it again here?

Though distributed log splitting is configured, with this patch we will do 
single-process splitting in the master under some conditions. It's not explained 
in the code why we would do this. Why do we think master log splitting is 'high 
priority' when it could very well be slower? Should we only go this route if 
distributed splitting is not going on? Do we know if concurrent distributed log 
splitting and master splitting works?

Why would we have dead servers in progress here in master startup? Because a 
servershutdownhandler fired?

This patch is different to the patch for 0.90. Should go into trunk first with 
tests, then 0.92. Should it be in this issue? This issue is really hard to 
follow now. Maybe this issue is for 0.90.x and new issue for more work on this 
trunk patch?

This patch needs to have the v18 differences applied.






[jira] [Commented] (HBASE-5210) HFiles are missing from an incremental load

2012-01-24 Thread Lawrence Simpson (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5210?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192217#comment-13192217
 ] 

Lawrence Simpson commented on HBASE-5210:
-

@Todd:
Two questions about your solution:
1. If we were to form a file name from just the numeric digits of the task 
attempt ID, that would be 23 digits.  As I look at the file names for HBase 
tables, they seem to be 18-19 digits long.  Do you know if there are any 
assumptions made in other HBase code about the length of file names for store 
files?
2. In the unlikely event that there was a name conflict with an HFile created 
by a reducer, what should happen then?  (The job number looks like it might 
roll at 1 jobs - I don't know if anyone has gotten that far without 
restarting Map/Reduce.)  

It still seems to me that the safest solution is a change to HFileOutputFormat 
to use a new output committer class that adds rename logic to 
moveTaskOutputs().  These changes could be implemented strictly in the HBase 
code tree without having to involve the underlying Hadoop implementation. 
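
A small sketch of the rename idea (not the actual patch): before moving a task's 
HFile into the final output folder, a committer could pick a non-conflicting name.

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HFileCommitHelper {
  /**
   * Pick a target path in the final output folder that does not collide with
   * an existing HFile; on conflict, append a numeric suffix. A custom output
   * committer could call this while moving task output into place.
   */
  static Path nonConflictingTarget(FileSystem fs, Path outputDir, String fileName)
      throws IOException {
    Path target = new Path(outputDir, fileName);
    int suffix = 0;
    while (fs.exists(target)) {
      target = new Path(outputDir, fileName + "_" + (++suffix));
    }
    return target;
  }
}
{code}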

> HFiles are missing from an incremental load
> ---
>
> Key: HBASE-5210
> URL: https://issues.apache.org/jira/browse/HBASE-5210
> Project: HBase
>  Issue Type: Bug
>  Components: mapreduce
>Affects Versions: 0.90.2
> Environment: HBase 0.90.2 with Hadoop-0.20.2 (with durable sync).  
> RHEL 2.6.18-164.15.1.el5.  4 node cluster (1 master, 3 slaves)
>Reporter: Lawrence Simpson
> Attachments: HBASE-5210-crazy-new-getRandomFilename.patch
>
>
> We run an overnight map/reduce job that loads data from an external source 
> and adds that data to an existing HBase table.  The input files have been 
> loaded into hdfs.  The map/reduce job uses the HFileOutputFormat (and the 
> TotalOrderPartitioner) to create HFiles which are subsequently added to the 
> HBase table.  On at least two separate occasions (that we know of), a range 
> of output would be missing for a given day.  The range of keys for the 
> missing values corresponded to those of a particular region.  This implied 
> that a complete HFile somehow went missing from the job.  Further 
> investigation revealed the following:
>  * Two different reducers (running in separate JVMs and thus separate class 
> loaders)
>  * in the same server can end up using the same file names for their
>  * HFiles.  The scenario is as follows:
>  *1.  Both reducers start near the same time.
>  *2.  The first reducer reaches the point where it wants to write its 
> first file.
>  *3.  It uses the StoreFile class which contains a static Random 
> object 
>  *which is initialized by default using a timestamp.
>  *4.  The file name is generated using the random number generator.
>  *5.  The file name is checked against other existing files.
>  *6.  The file is written into temporary files in a directory named
>  *after the reducer attempt.
>  *7.  The second reduce task reaches the same point, but its 
> StoreClass
>  *(which is now in the file system's cache) gets loaded within the
>  *time resolution of the OS and thus initializes its Random()
>  *object with the same seed as the first task.
>  *8.  The second task also checks for an existing file with the name
>  *generated by the random number generator and finds no conflict
>  *because each task is writing files in its own temporary folder.
>  *9.  The first task finishes and gets its temporary files committed
>  *to the "real" folder specified for output of the HFiles.
>  * 10.The second task then reaches its own conclusion and commits its
>  *files (moveTaskOutputs).  The released Hadoop code just 
> overwrites
>  *any files with the same name.  No warning messages or anything.
>  *The first task's HFiles just go missing.
>  * 
>  *  Note:  The reducers here are NOT different attempts at the same 
>  *reduce task.  They are different reduce tasks so data is
>  *really lost.
> I am currently testing a fix in which I have added code to the Hadoop 
> FileOutputCommitter.moveTaskOutputs method to check for a conflict with
> an existing file in the final output folder and to rename the HFile if
> needed.  This may not be appropriate for all uses of FileOutputFormat.
> So I have put this into a new class which is then used by a subclass of
> HFileOutputFormat.  Subclassing of FileOutputCommitter itself was a bit 
> more of a problem due to private declarations.
> I don't know if my approach is the best fix for the problem.  If someone
> more knowledgeable than myself deems that it is, I will be happy to share
> what I hav

[jira] [Commented] (HBASE-5269) IllegalMonitorStateException while retryin HLog split in 0.90 branch.

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192229#comment-13192229
 ] 

Zhihong Yu commented on HBASE-5269:
---

Patch looks good.

> IllegalMonitorStateException while retryin HLog split in 0.90 branch.
> -
>
> Key: HBASE-5269
> URL: https://issues.apache.org/jira/browse/HBASE-5269
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.6
>
> Attachments: HBASE-5269.patch
>
>
> This bug was introduced as part of the HBASE-5137 fix. The splitLogLock is 
> released in the finally block inside the do-while loop, so when the loop 
> executes a second time the unlock of the splitLogLock throws an 
> IllegalMonitorStateException.





[jira] [Commented] (HBASE-5269) IllegalMonitorStateException while retryin HLog split in 0.90 branch.

2012-01-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192247#comment-13192247
 ] 

stack commented on HBASE-5269:
--

+1

> IllegalMonitorStateException while retryin HLog split in 0.90 branch.
> -
>
> Key: HBASE-5269
> URL: https://issues.apache.org/jira/browse/HBASE-5269
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.6
>
> Attachments: HBASE-5269.patch
>
>
> This bug was introduced as part of the HBASE-5137 fix. The splitLogLock is 
> released in the finally block inside the do-while loop, so when the loop 
> executes a second time the unlock of the splitLogLock throws an 
> IllegalMonitorStateException.





[jira] [Commented] (HBASE-5269) IllegalMonitorStateException while retryin HLog split in 0.90 branch.

2012-01-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5269?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192248#comment-13192248
 ] 

stack commented on HBASE-5269:
--

+1

> IllegalMonitorStateException while retryin HLog split in 0.90 branch.
> -
>
> Key: HBASE-5269
> URL: https://issues.apache.org/jira/browse/HBASE-5269
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.5
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.6
>
> Attachments: HBASE-5269.patch
>
>
> This bug was introduced as part of the HBASE-5137 fix. The splitLogLock is 
> released in the finally block inside the do-while loop, so when the loop 
> executes a second time the unlock of the splitLogLock throws an 
> IllegalMonitorStateException.





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192251#comment-13192251
 ] 

stack commented on HBASE-5231:
--

It is not necessarily by table; it could be by server.

What's this:

{code}
+result.put("ensemble", getAssignments());
{code}

What's 'ensemble'?

> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.90





[jira] [Commented] (HBASE-4608) HLog Compression

2012-01-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192252#comment-13192252
 ] 

Lars Hofhansl commented on HBASE-4608:
--

The flush will place a special WAL entry. See HLog.completeCacheFlush(...).
The compressor could take this as a flag to reset the dictionary.
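
A toy sketch of that idea (illustration only, not the HBASE-4608 implementation): 
repeated values such as table name, region id, and cf name are replaced by a 
dictionary index, and the dictionary is reset whenever a flush marker entry is 
seen, so a reader starting from the flush point can rebuild it deterministically.

{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ResettableWalDictionary {
  private final Map<String, Integer> toIndex = new HashMap<String, Integer>();
  private final List<String> entries = new ArrayList<String>();

  // Returns the dictionary index for a value, adding it if unseen.
  public int encode(String value) {
    Integer idx = toIndex.get(value);
    if (idx == null) {
      idx = entries.size();
      entries.add(value);
      toIndex.put(value, idx);
    }
    return idx;
  }

  public String decode(int index) {
    return entries.get(index);
  }

  // Called when a flush marker (e.g. the completeCacheFlush entry) is written or read.
  public void reset() {
    toIndex.clear();
    entries.clear();
  }
}
{code}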

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Attachments: 4608v1.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 
> 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.





[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192253#comment-13192253
 ] 

stack commented on HBASE-5262:
--

@Mikhail Sounds good to me

> Structured event log for HBase for monitoring and auto-tuning performance
> -
>
> Key: HBASE-5262
> URL: https://issues.apache.org/jira/browse/HBASE-5262
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> Creating this JIRA to open a discussion about a structured (machine-readable) 
> log that will record events such as compaction start/end times, compaction 
> input/output files, their sizes, the same for flushes, etc. This can be 
> stored e.g. in a new system table in HBase itself. The data from this log can 
> then be analyzed and used to optimize compactions at run time, or otherwise 
> auto-tune HBase configuration to reduce the number of knobs the user has to 
> configure.





[jira] [Commented] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192255#comment-13192255
 ] 

Lars Hofhansl commented on HBASE-5268:
--

I'll work on a fully correct patch today, and then we can decide whether it is 
worth the extra complexity. I find this first patch compelling, because (1) it is 
very small and easy to verify and (2) it keeps column prefix delete markers 
sorted (by timestamp) with the KVs they affect.
In the bigger patch, neither will be true anymore.


> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out that this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5267) Add a configuration to disable the slab cache by default

2012-01-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192257#comment-13192257
 ] 

stack commented on HBASE-5267:
--

lgtm

> Add a configuration to disable the slab cache by default
> 
>
> Key: HBASE-5267
> URL: https://issues.apache.org/jira/browse/HBASE-5267
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Li Pi
>Priority: Blocker
> Fix For: 0.94.0, 0.92.1
>
> Attachments: 5267.txt
>
>
> From what I commented at the tail of HBASE-4027:
> {quote}
> I changed the release note, the patch doesn't have a "hbase.offheapcachesize" 
> configuration and it's enabled as soon as you set -XX:MaxDirectMemorySize 
> (which is actually a big problem when you consider this: 
> http://hbase.apache.org/book.html#trouble.client.oome.directmemory.leak). 
> {quote}
> We need to add hbase.offheapcachesize and set it to false by default.
> Marking as a blocker for 0.92.1 and assigning to Li Pi at Todd's request.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Issue Comment Edited] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Issue Comment Edited) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192255#comment-13192255
 ] 

Lars Hofhansl edited comment on HBASE-5268 at 1/24/12 4:27 PM:
---

I'll work on a fully correct patch today, and then we can decide whether it is 
worth the extra complexity. I find this first patch compelling, because (1) it is 
very small and easy to verify and (2) it keeps column prefix delete markers 
sorted (by timestamp) with the KVs they affect.
In the bigger patch, neither will be true anymore.


  was (Author: lhofhansl):
I'll work on a fully correct patch today, and then we can decide whether it 
is worth the extra complexity. I find this first patch compelling, because (1) 
is very small and easy to verify and (2) it keeps column prefix delete markers 
sorted (by timestamp) with the KVs they affect.
In the bigger patch, neither will be try anymore.

  
> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out that this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Ghais Issa (Created) (JIRA)
Result.getValue and Result.getColumnLatest return the wrong column.
---

 Key: HBASE-5271
 URL: https://issues.apache.org/jira/browse/HBASE-5271
 Project: HBase
  Issue Type: Bug
  Components: client
Affects Versions: 0.90.5
Reporter: Ghais Issa


In the following example result.getValue returns the wrong column

KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
Bytes.toBytes("2"), Bytes.toBytes(7L));
Result result = new Result(new KeyValue[] { kv });
System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
Bytes.toBytes("2"; //prints 7.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5204) Backward compatibility fixes for 0.92

2012-01-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192267#comment-13192267
 ] 

stack commented on HBASE-5204:
--

Yes, agree, adding the patch to trunk would be silly.  Let's drop it.  The codes 
should never have been changed.  Hopefully your warning...

{code}
1231985  stack 
1231985  stack // WARNING: Please do not insert, remove or swap any line in this static  //
1231985  stack // block.  Doing so would change or shift all the codes used to serialize //
1231985  stack // objects, which makes backwards compatibility very hard for clients.//
1231985  stack // New codes should always be added at the end. Code removal is   //
1231985  stack // discouraged because code is a short now.   //
1231985  stack 
{code}

 will help w/ that.

Can we check in an asynchbase unit test that exercises all the APIs you need, so we 
fail fast in case we mess up again? (HBase has 16 committers now and it's hard to 
keep them all on message.)
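
As a rough illustration of why only appending is safe (a hypothetical code table, not the real HbaseObjectWritable one): codes are positional, so inserting or removing an entry in the middle silently shifts the meaning of every later code on the wire, while appending keeps old codes stable.

{code}
import java.util.ArrayList;
import java.util.List;

public class CodeTableSketch {
  static final List<Class<?>> CODE_TO_CLASS = new ArrayList<Class<?>>();
  static {
    CODE_TO_CLASS.add(byte[].class);   // code 0 - existing, must keep its position
    CODE_TO_CLASS.add(String.class);   // code 1 - existing, must keep its position
    CODE_TO_CLASS.add(Long.class);     // code 2 - safe: new entries only go at the end
  }

  static short codeFor(Class<?> clazz) {
    return (short) CODE_TO_CLASS.indexOf(clazz);
  }

  public static void main(String[] args) {
    // Old clients that learned "String == 1" still agree with new servers.
    System.out.println(codeFor(String.class));
  }
}
{code}
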

> Backward compatibility fixes for 0.92
> -
>
> Key: HBASE-5204
> URL: https://issues.apache.org/jira/browse/HBASE-5204
> Project: HBase
>  Issue Type: Bug
>  Components: ipc
>Reporter: Benoit Sigoure
>Assignee: Benoit Sigoure
>Priority: Blocker
>  Labels: backwards-compatibility
> Fix For: 0.92.0
>
> Attachments: 
> 0001-Add-some-backward-compatible-support-for-reading-old.patch, 
> 0002-Make-sure-that-a-connection-always-uses-a-protocol.patch, 
> 0003-Change-the-code-used-when-serializing-HTableDescript.patch, 5204-92.txt, 
> 5204-trunk.txt, 5204.addendum
>
>
> Attached are 3 patches that are necessary to allow compatibility between 
> HBase 0.90.x (and previous releases) and HBase 0.92.0.
> First of all, I'm well aware that 0.92.0 RC4 has been thumbed up by a lot of 
> people and would probably wind up being released as 0.92.0 tomorrow, so I 
> sincerely apologize for creating this issue so late in the process.  I spent 
> a lot of time trying to work around the quirks of 0.92 but once I realized 
> that with a few very quasi-trivial changes compatibility would be made 
> significantly easier, I immediately sent these 3 patches to Stack, who 
> suggested I create this issue.
> The first patch is required as without it clients sending a 0.90-style RPC to 
> a 0.92-style server causes the server to die uncleanly.  It seems that 0.92 
> ships with {{\-XX:OnOutOfMemoryError="kill \-9 %p"}}, and when a 0.92 server 
> fails to deserialize a 0.90-style RPC, it attempts to allocate a large buffer 
> because it doesn't read fields of 0.90-style RPCs properly.  This allocation 
> attempt immediately triggers an OOME, which causes the JVM to die abruptly of 
> a {{SIGKILL}}.  So whenever a 0.90.x client attempts to connect to HBase, it 
> kills whichever RS is hosting the {{\-ROOT-}} region.
> The second patch fixes a bug introduced by HBASE-2002, which added support 
> for letting clients specify what "protocol" they want to speak.  If a client 
> doesn't properly specify what protocol to use, the connection's {{protocol}} 
> field will be left {{null}}, which causes any subsequent RPC on that 
> connection to trigger an NPE in the server, even though the connection was 
> successfully established from the client's point of view.  The fix is to 
> simply give the connection a default protocol, by assuming the client meant 
> to speak to a RegionServer.
> The third patch fixes an oversight that slipped in HBASE-451, where a change 
> to {{HbaseObjectWritable}} caused all the codes used to serialize 
> {{Writables}} to shift by one.  This was carefully avoided in other changes 
> such as HBASE-1502, which cleanly removed entries for {{HMsg}} and 
> {{HMsg[]}}, so I don't think this breakage in HBASE-451 was intended.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192274#comment-13192274
 ] 

Zhihong Yu commented on HBASE-5231:
---

The String key in the outer map represents the table name, which could be a 
placeholder such as "ensemble".
It shouldn't be the server name because, at the master level, load balancing 
involves more than one server.

"ensemble" means the collection of all the tables.

> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.92

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192280#comment-13192280
 ] 

Zhihong Yu commented on HBASE-5262:
---

One aspect we should consider is the role for opentsdb.
Would there be redundancy between the event log table and what opentsdb stores 
in HBase?

> Structured event log for HBase for monitoring and auto-tuning performance
> -
>
> Key: HBASE-5262
> URL: https://issues.apache.org/jira/browse/HBASE-5262
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> Creating this JIRA to open a discussion about a structured (machine-readable) 
> log that will record events such as compaction start/end times, compaction 
> input/output files, their sizes, the same for flushes, etc. This can be 
> stored e.g. in a new system table in HBase itself. The data from this log can 
> then be analyzed and used to optimize compactions at run time, or otherwise 
> auto-tune HBase configuration to reduce the number of knobs the user has to 
> configure.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192282#comment-13192282
 ] 

Zhihong Yu commented on HBASE-5271:
---

Can you write up a unit test which demonstrates this issue?

Thanks

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2"; //prints 7.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192284#comment-13192284
 ] 

Zhihong Yu commented on HBASE-5270:
---

https://reviews.apache.org/r/3601 is created to track the review process for 
this JIRA.

> Handle potential data loss due to concurrent processing of processFaileOver 
> and ServerShutdownHandler
> -
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
>  Issue Type: Sub-task
>  Components: master
>Reporter: Zhihong Yu
> Fix For: 0.94.0, 0.92.1
>
>
> This JIRA continues the effort from HBASE-5179. Starting with Stack's 
> comments about patches for 0.92 and TRUNK:
> Reviewing 0.92v17
> isDeadServerInProgress is a new public method in ServerManager but it does 
> not seem to be used anywhere.
> Does isDeadRootServerInProgress need to be public? Ditto for meta version.
> This method param names are not right 'definitiveRootServer'; what is meant 
> by definitive? Do they need this qualifier?
> Is there anything in place to stop us expiring a server twice if its carrying 
> root and meta?
> What is difference between asking assignment manager isCarryingRoot and this 
> variable that is passed in? Should be doc'd at least. Ditto for meta.
> I think I've asked for this a few times - onlineServers needs to be 
> explained... either in javadoc or in comment. This is the param passed into 
> joinCluster. How does it arise? I think I know but am unsure. God love the 
> poor noob that comes awandering this code trying to make sense of it all.
> It looks like we get the list by trawling zk for regionserver znodes that 
> have not checked in. Don't we do this operation earlier in master setup? Are 
> we doing it again here?
> Though distributed split log is configured, we will do in master single 
> process splitting under some conditions with this patch. Its not explained in 
> code why we would do this. Why do we think master log splitting 'high 
> priority' when it could very well be slower. Should we only go this route if 
> distributed splitting is not going on. Do we know if concurrent distributed 
> log splitting and master splitting works?
> Why would we have dead servers in progress here in master startup? Because a 
> servershutdownhandler fired?
> This patch is different to the patch for 0.90. Should go into trunk first 
> with tests, then 0.92. Should it be in this issue? This issue is really hard 
> to follow now. Maybe this issue is for 0.90.x and new issue for more work on 
> this trunk patch?
> This patch needs to have the v18 differences applied.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192290#comment-13192290
 ] 

Zhihong Yu commented on HBASE-4720:
---

Integrated to TRUNK.

Thanks for the patch Mubarak.

Thanks for the review comments, Andrew.

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Ghais Issa (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ghais Issa updated HBASE-5271:
--

Attachment: testGetValue.diff

Added a unit test in the TestResult.java class.

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Attachments: testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2"; //prints 7.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5267) Add a configuration to disable the slab cache by default

2012-01-24 Thread Jean-Daniel Cryans (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5267?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192304#comment-13192304
 ] 

Jean-Daniel Cryans commented on HBASE-5267:
---

I'd like to see this configuration added to hbase-default.xml with some 
documentation; it should also be documented in the book. hbase-env.sh also 
needs some fixin:

bq. # Set hbase.offheapcachesize in hbase-site.xml
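
A sketch of the hbase-default.xml entry being asked for (the property name is taken from this issue; the description wording is illustrative):

{code}
<property>
  <name>hbase.offheapcachesize</name>
  <value>false</value>
  <description>
    Enable the off-heap (slab) block cache. Off by default, so setting
    -XX:MaxDirectMemorySize alone no longer turns the slab cache on.
  </description>
</property>
{code}
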

> Add a configuration to disable the slab cache by default
> 
>
> Key: HBASE-5267
> URL: https://issues.apache.org/jira/browse/HBASE-5267
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Jean-Daniel Cryans
>Assignee: Li Pi
>Priority: Blocker
> Fix For: 0.94.0, 0.92.1
>
> Attachments: 5267.txt
>
>
> From what I commented at the tail of HBASE-4027:
> {quote}
> I changed the release note, the patch doesn't have a "hbase.offheapcachesize" 
> configuration and it's enabled as soon as you set -XX:MaxDirectMemorySize 
> (which is actually a big problem when you consider this: 
> http://hbase.apache.org/book.html#trouble.client.oome.directmemory.leak). 
> {quote}
> We need to add hbase.offheapcachesize and set it to false by default.
> Marking as a blocker for 0.92.1 and assigning to Li Pi at Todd's request.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-24 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192307#comment-13192307
 ] 

Andrew Purtell commented on HBASE-4720:
---

Ted, I am going to revert your commit.

{quote}
bq. The URI format for requests is '/table/row/ ...'. This violates that by 
adding, just for check-and cases, a prefix. Having a special case like that 
should be avoided. What about handling this in TableResource, with a query 
parameter? '/table/row/...?check' E.g. then you won't need 
CheckAndXTableResource classes. Additionally use the appropriate HTTP 
operations. PUT/POST for check-and-put. DELETE for check-and-delete. The spec 
does not forbid bodies in DELETE requests. (I am unsure if Jetty/Jersey will 
support it however.)

We have discussed the design choices earlier (refer to the comments in the same JIRA); 
Stack and Ted voted for option #2 (the /checkandput, /checkanddelete prefixes). 
If I have to go back to option #1 then I will have to re-work most of the stuff 
here.
{quote}

This has not changed, therefore -1.
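
As a rough illustration of the query-parameter alternative quoted above (a hypothetical JAX-RS sketch, not the committed CheckAnd* resource classes): keep the existing /table/row/... path and select check-and-put behaviour via ?check instead of a /checkandput/... prefix.

{code}
import javax.ws.rs.PUT;
import javax.ws.rs.Path;
import javax.ws.rs.PathParam;
import javax.ws.rs.QueryParam;
import javax.ws.rs.core.Response;

@Path("/{table}/{row}")
public class RowResourceSketch {
  @PUT
  public Response put(@PathParam("table") String table,
                      @PathParam("row") String row,
                      @QueryParam("check") String check) {
    if ("put".equals(check)) {
      // perform an atomic check-and-put here instead of a plain put
      return Response.ok("check-and-put on " + table + "/" + row).build();
    }
    return Response.ok("put on " + table + "/" + row).build();
  }
}
{code}
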


> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-24 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192309#comment-13192309
 ] 

Andrew Purtell commented on HBASE-4720:
---

Either the current semantics for REST paths remain the same, that is 
/table/row/ ..., or every operation is changed to specify the operation first, 
e.g. /operation/table/row ... . 

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-24 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192308#comment-13192308
 ] 

Andrew Purtell commented on HBASE-4720:
---

Actually Ted you should now honor my -1 and revert that commit. Next time, allow 
me a chance to find out there is even a discussion happening. Stack brought this 
to my attention only late last night.

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5270) Handle potential data loss due to concurrent processing of processFaileOver and ServerShutdownHandler

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192317#comment-13192317
 ] 

Zhihong Yu commented on HBASE-5270:
---

Updated patch to address Stack's comments up till 'Is there anything in place 
to stop us expiring a server twice if its carrying root and meta?'

> Handle potential data loss due to concurrent processing of processFaileOver 
> and ServerShutdownHandler
> -
>
> Key: HBASE-5270
> URL: https://issues.apache.org/jira/browse/HBASE-5270
> Project: HBase
>  Issue Type: Sub-task
>  Components: master
>Reporter: Zhihong Yu
> Fix For: 0.94.0, 0.92.1
>
>
> This JIRA continues the effort from HBASE-5179. Starting with Stack's 
> comments about patches for 0.92 and TRUNK:
> Reviewing 0.92v17
> isDeadServerInProgress is a new public method in ServerManager but it does 
> not seem to be used anywhere.
> Does isDeadRootServerInProgress need to be public? Ditto for meta version.
> This method param names are not right 'definitiveRootServer'; what is meant 
> by definitive? Do they need this qualifier?
> Is there anything in place to stop us expiring a server twice if its carrying 
> root and meta?
> What is difference between asking assignment manager isCarryingRoot and this 
> variable that is passed in? Should be doc'd at least. Ditto for meta.
> I think I've asked for this a few times - onlineServers needs to be 
> explained... either in javadoc or in comment. This is the param passed into 
> joinCluster. How does it arise? I think I know but am unsure. God love the 
> poor noob that comes awandering this code trying to make sense of it all.
> It looks like we get the list by trawling zk for regionserver znodes that 
> have not checked in. Don't we do this operation earlier in master setup? Are 
> we doing it again here?
> Though distributed split log is configured, we will do in master single 
> process splitting under some conditions with this patch. Its not explained in 
> code why we would do this. Why do we think master log splitting 'high 
> priority' when it could very well be slower. Should we only go this route if 
> distributed splitting is not going on. Do we know if concurrent distributed 
> log splitting and master splitting works?
> Why would we have dead servers in progress here in master startup? Because a 
> servershutdownhandler fired?
> This patch is different to the patch for 0.90. Should go into trunk first 
> with tests, then 0.92. Should it be in this issue? This issue is really hard 
> to follow now. Maybe this issue is for 0.90.x and new issue for more work on 
> this trunk patch?
> This patch needs to have the v18 differences applied.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192323#comment-13192323
 ] 

Zhihong Yu commented on HBASE-4720:
---

@Andrew:
Glad to see your response.

Mubarak responded to your comment at the end of:
https://issues.apache.org/jira/browse/HBASE-4720?focusedCommentId=13184647&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13184647

which refers to Stack's comment: 
https://issues.apache.org/jira/browse/HBASE-4720?focusedCommentId=13169969&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13169969

I have been requesting your comment since Jan. 12th: 
https://issues.apache.org/jira/browse/HBASE-4720?focusedCommentId=13185150&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13185150

I didn't see a -1 until this morning.

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-24 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192328#comment-13192328
 ] 

Andrew Purtell commented on HBASE-4720:
---

Ted, I raised an objection on this issue and you committed it with that 
objection unaddressed before I had a chance to ack the commit candidate. I 
apologize if it was not clear my objection earlier was a -1. It was. I thought 
it clear enough, my mistake.

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-24 Thread Andrew Purtell (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192329#comment-13192329
 ] 

Andrew Purtell commented on HBASE-4720:
---

Anyway, my apologies, but this commit must be reverted. It is poor and IMO 
unacceptable design to have such an inconsistency in how request URLs should be 
constructed. Either the current semantics for REST paths remain the same, that 
is /table/row/ ..., or every operation is changed to specify the operation 
first, e.g. /operation/table/row ... .

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5268:
-

Attachment: 5268-v2.txt

This version has no anomalies. And it actually isn't so bad. The only part I do 
not like is the change to the sorting in KeyValue.KeyComparator.

Please let me know what you think.

> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268-v2.txt, 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out that this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-5265) Fix 'revoke' shell command

2012-01-24 Thread Eugene Koontz (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Eugene Koontz reassigned HBASE-5265:


Assignee: Eugene Koontz  (was: Andrew Purtell)

> Fix 'revoke' shell command
> --
>
> Key: HBASE-5265
> URL: https://issues.apache.org/jira/browse/HBASE-5265
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.92.0
>Reporter: Andrew Purtell
>Assignee: Eugene Koontz
> Fix For: 0.94.0, 0.92.1
>
>
> The 'revoke' shell command needs to be reworked for the AccessControlProtocol 
> implementation that was finalized for 0.92. The permissions being removed 
> must exactly match what was previously granted. No wildcard matching is done 
> server side.
> Allow two forms of the command in the shell for convenience:
> Revocation of a specific grant:
> {code}
> revoke <user>, <table>, <column family> [, <column qualifier> ]
> {code}
> Have the shell automatically do so for all permissions on a table for a given 
> user:
> {code}
> revoke <user>, <table>
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad, region server should calculate disparity of loaded coprocessors among regions and send report through HServerLoad

2012-01-24 Thread Eugene Koontz (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192336#comment-13192336
 ] 

Eugene Koontz commented on HBASE-5258:
--

@Zhihong, I think it should be ok to raise coprocessor reporting out of 
HServerLoad.HRegionLoad to just HServerLoad. 

How do you see this as it relates to HBASE-4660? Does this fix make it 
(HBASE-4660) easier?
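
A rough sketch of the shape being proposed (illustrative classes, not the real HServerLoad/HRegionLoad): report one aggregated set of coprocessor names per server instead of repeating the set inside every region's load.

{code}
import java.util.Arrays;
import java.util.Collection;
import java.util.Set;
import java.util.TreeSet;

public class ServerLoadSketch {
  static class RegionLoadSketch {
    final Set<String> coprocessors;           // today: serialized once per region
    RegionLoadSketch(Set<String> cps) { this.coprocessors = cps; }
  }

  // Proposed: a single server-level set, built from whatever the regions loaded.
  static Set<String> aggregateCoprocessors(Collection<RegionLoadSketch> regions) {
    Set<String> all = new TreeSet<String>();
    for (RegionLoadSketch r : regions) {
      all.addAll(r.coprocessors);
    }
    return all;
  }

  public static void main(String[] args) {
    Set<String> a = new TreeSet<String>();
    a.add("org.example.MyObserver");  // hypothetical coprocessor class name
    Set<String> b = new TreeSet<String>();
    b.add("org.example.MyEndpoint");  // hypothetical coprocessor class name
    System.out.println(aggregateCoprocessors(
        Arrays.asList(new RegionLoadSketch(a), new RegionLoadSketch(b))));
  }
}
{code}
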

> Move coprocessors set out of RegionLoad, region server should calculate 
> disparity of loaded coprocessors among regions and send report through 
> HServerLoad
> --
>
> Key: HBASE-5258
> URL: https://issues.apache.org/jira/browse/HBASE-5258
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>
> When I worked on HBASE-5256, I revisited the code related to Ser/De of 
> coprocessors set in RegionLoad.
> I think the rationale for embedding coprocessors set is for maximum 
> flexibility where each region can load different coprocessors.
> This flexibility is causing extra cost in the region server to Master 
> communication and increasing the footprint of Master heap.
> Would HServerLoad be a better place for this set ?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192335#comment-13192335
 ] 

Zhihong Yu commented on HBASE-4720:
---

That is Okay, Andy.
I have reverted the patch.

Let's start from the beginning.

@Mubarak:
Now that Andy has chosen option #1, can you work up a new patch ?

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5258) Move coprocessors set out of RegionLoad, region server should calculate disparity of loaded coprocessors among regions and send report through HServerLoad

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5258?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192338#comment-13192338
 ] 

Zhihong Yu commented on HBASE-5258:
---

I was looking for HBASE-4660 yesterday, thanks for finding it.

Please work on this JIRA before tackling webuiport.

> Move coprocessors set out of RegionLoad, region server should calculate 
> disparity of loaded coprocessors among regions and send report through 
> HServerLoad
> --
>
> Key: HBASE-5258
> URL: https://issues.apache.org/jira/browse/HBASE-5258
> Project: HBase
>  Issue Type: Task
>Reporter: Zhihong Yu
>
> When I worked on HBASE-5256, I revisited the code related to Ser/De of 
> coprocessors set in RegionLoad.
> I think the rationale for embedding coprocessors set is for maximum 
> flexibility where each region can load different coprocessors.
> This flexibility is causing extra cost in the region server to Master 
> communication and increasing the footprint of Master heap.
> Would HServerLoad be a better place for this set ?

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5249) NPE getting a rowlock, 'Error obtaining row lock (fsOk: true)'

2012-01-24 Thread Benoit Sigoure (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192348#comment-13192348
 ] 

Benoit Sigoure commented on HBASE-5249:
---

I just ran into this again, still on JD's 0.92 test cluster.

From the logs of OpenTSDB:
{code}
2012-01-24 18:32:17,605 ERROR [New I/O server worker #1-1] UniqueId: Put 
failed, attempts left=5 (retrying in 800 ms), put=PutRequest(table="tsdb-uid", 
key="\x00", family="id", qualifier="metrics
", value=[0, 0, 0, 0, 0, 0, 0, 1], lockid=-1, durable=true, bufferable=false, 
attempt=0, region=RegionInfo(table="tsdb-uid", 
region_name="tsdb-uid,,1327429528678.c421780d32aae9959a1b821a441fca86.", 
stop_key=""))
org.hbase.async.RemoteException: java.io.IOException: 
java.lang.NullPointerException

at 
org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1076)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.convertThrowableToIOE(HRegionServer.java:1065)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1815)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at 
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1348)
Caused by: java.lang.NullPointerException
at 
java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:922)
at 
org.apache.hadoop.hbase.regionserver.HRegion.releaseRowLock(HRegion.java:2639)
at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1658)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1813)
... 6 more
{code}

From the logs of the RegionServer:
{code}
2012-01-24 18:31:47,545 DEBUG 
org.apache.hadoop.hbase.regionserver.HRegionServer: Row lock 
-8669450923246717741 explicitly acquired by client
2012-01-24 18:32:17,554 ERROR 
org.apache.hadoop.hbase.regionserver.HRegionServer: 
java.lang.NullPointerException
at 
java.util.concurrent.ConcurrentHashMap.remove(ConcurrentHashMap.java:922)
at 
org.apache.hadoop.hbase.regionserver.HRegion.releaseRowLock(HRegion.java:2639)
at org.apache.hadoop.hbase.regionserver.HRegion.put(HRegion.java:1658)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.put(HRegionServer.java:1813)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at 
org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
at 
org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1348)
2012-01-24 18:32:47,549 INFO 
org.apache.hadoop.hbase.regionserver.HRegionServer: Row Lock 
-8669450923246717741 lease expired
2012-01-24 18:32:47,576 WARN org.apache.hadoop.ipc.HBaseServer: 
(operationTooSlow): 
{"processingtimems":29166,"client":"10.4.13.49:47018","starttimems":1327429938406,"queuetimems":0,"class":"HRegionServer","responsesize":0,"method":"put","totalColumns":1,"table":"tsdb-uid","families":{"id":[{"timestamp":-8669450923246717741,"qualifier":"metrics","vlen":8}]},"row":"\\x00"}
{code}
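
One JDK detail that may help read these traces (a fact about ConcurrentHashMap, not HBase-specific code): it rejects null keys and values, so a remove(null) or put(null, ...) call, as would happen if a lock id or row key comes back null on the addRowLock/releaseRowLock path, fails with exactly this kind of NullPointerException. A trivial demonstration:

{code}
import java.util.concurrent.ConcurrentHashMap;

public class NullKeyNpeDemo {
  public static void main(String[] args) {
    ConcurrentHashMap<Long, String> locks = new ConcurrentHashMap<Long, String>();
    try {
      locks.remove(null);          // ConcurrentHashMap: NullPointerException
    } catch (NullPointerException expected) {
      System.out.println("remove(null) throws NPE, unlike HashMap which returns null");
    }
  }
}
{code}
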

> NPE getting a rowlock, 'Error obtaining row lock (fsOk: true)'
> --
>
> Key: HBASE-5249
> URL: https://issues.apache.org/jira/browse/HBASE-5249
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 0.92.1
>
>
> See 
> http://search-hadoop.com/m/ZTJxL1S7Hq61/Error+obtaining+row+lock+%2528fsOk%253A+true%2529&subj=Re+NPE+while+obtaining+row+lock
> Benoit just ran into this too testing tsdb against 0.92:
> {code}
> 2012-01-20 17:09:54,074 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Error obtaining
> row lock (fsOk: true)
> java.lang.NullPointerException
>at 
> java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
>at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.addRowLock(HRegionServer.java:2313)
>at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.lockRow(HRegionServer.java:2299)
>at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImp

[jira] [Updated] (HBASE-5199) Delete out of TTL store files before compaction selection

2012-01-24 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5199:
---

Attachment: D1311.4.patch

Liyin updated the revision "[jira][HBASE-5199] Delete out of TTL store files 
before compaction selection ".
Reviewers: Kannan, JIRA, khemani, aaiyer, Karthik

  During minor compaction selection, if there are any expired store files, they 
will be selected for compaction directly.
  Since these files have already expired, the compaction itself is a no-op and the 
files will simply be deleted afterwards.
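
  As a rough sketch of the selection step (illustrative types, not the actual Store/CompactSelection code): a store file whose newest cell is already past the family TTL can be picked up front, and compacting it simply drops everything in it.

{code}
import java.util.ArrayList;
import java.util.List;

public class ExpiredStoreFileSelectionSketch {
  static class StoreFileInfo {
    final String name;
    final long maxTimestamp;                 // newest cell in the file
    StoreFileInfo(String name, long maxTimestamp) {
      this.name = name;
      this.maxTimestamp = maxTimestamp;
    }
  }

  // Pick files in which every cell is already out of TTL.
  static List<StoreFileInfo> selectExpired(List<StoreFileInfo> files, long now, long ttlMs) {
    List<StoreFileInfo> expired = new ArrayList<StoreFileInfo>();
    for (StoreFileInfo f : files) {
      if (f.maxTimestamp < now - ttlMs) {
        expired.add(f);
      }
    }
    return expired;
  }

  public static void main(String[] args) {
    long now = System.currentTimeMillis();
    List<StoreFileInfo> files = new ArrayList<StoreFileInfo>();
    files.add(new StoreFileInfo("old-hfile", now - 10000L));
    files.add(new StoreFileInfo("fresh-hfile", now - 1000L));
    System.out.println(selectExpired(files, now, 5000L).size()); // prints 1
  }
}
{code}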

REVISION DETAIL
  https://reviews.facebook.net/D1311

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  
src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactSelection.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java


> Delete out of TTL store files before compaction selection
> -
>
> Key: HBASE-5199
> URL: https://issues.apache.org/jira/browse/HBASE-5199
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1311.1.patch, D1311.2.patch, D1311.3.patch, 
> D1311.4.patch
>
>
> Currently, HBase deletes the out of TTL store files after compaction. We can 
> change the sequence to delete the out of TTL store files before selecting 
> store files for compactions. 
> In this way, HBase can keep deleting the old invalid store files without 
> compaction, and also prevent unnecessary compactions since the out of 
> TTL store files will be deleted before the compaction selection.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4720) Implement atomic update operations (checkAndPut, checkAndDelete) for REST client/server

2012-01-24 Thread Mubarak Seyed (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192349#comment-13192349
 ] 

Mubarak Seyed commented on HBASE-4720:
--

Sure, will do. Thanks.

> Implement atomic update operations (checkAndPut, checkAndDelete) for REST 
> client/server 
> 
>
> Key: HBASE-4720
> URL: https://issues.apache.org/jira/browse/HBASE-4720
> Project: HBase
>  Issue Type: Improvement
>Reporter: Daniel Lord
>Assignee: Mubarak Seyed
> Fix For: 0.94.0
>
> Attachments: HBASE-4720.trunk.v1.patch, HBASE-4720.trunk.v2.patch, 
> HBASE-4720.trunk.v3.patch, HBASE-4720.trunk.v4.patch, 
> HBASE-4720.trunk.v5.patch, HBASE-4720.trunk.v6.patch, HBASE-4720.v1.patch, 
> HBASE-4720.v3.patch
>
>
> I have several large application/HBase clusters where an application node 
> will occasionally need to talk to HBase from a different cluster.  In order 
> to help ensure some of my consistency guarantees I have a sentinel table that 
> is updated atomically as users interact with the system.  This works quite 
> well for the "regular" hbase client but the REST client does not implement 
> the checkAndPut and checkAndDelete operations.  This exposes the application 
> to some race conditions that have to be worked around.  It would be ideal if 
> the same checkAndPut/checkAndDelete operations could be supported by the REST 
> client.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5249) NPE getting a rowlock, 'Error obtaining row lock (fsOk: true)'

2012-01-24 Thread Benoit Sigoure (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benoit Sigoure updated HBASE-5249:
--

Priority: Blocker  (was: Critical)

> NPE getting a rowlock, 'Error obtaining row lock (fsOk: true)'
> --
>
> Key: HBASE-5249
> URL: https://issues.apache.org/jira/browse/HBASE-5249
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.92.1
>
>
> See 
> http://search-hadoop.com/m/ZTJxL1S7Hq61/Error+obtaining+row+lock+%2528fsOk%253A+true%2529&subj=Re+NPE+while+obtaining+row+lock
> Benoit just ran into this too testing tsdb against 0.92:
> {code}
> 2012-01-20 17:09:54,074 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Error obtaining
> row lock (fsOk: true)
> java.lang.NullPointerException
>at 
> java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
>at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.addRowLock(HRegionServer.java:2313)
>at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.lockRow(HRegionServer.java:2299)
>at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>at java.lang.reflect.Method.invoke(Method.java:597)
>at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
>at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1327)
> It happened only once out of thousands of RPCs that grabbed and
> released a row lock.
> {code}





[jira] [Commented] (HBASE-5199) Delete out of TTL store files before compaction selection

2012-01-24 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192352#comment-13192352
 ] 

Phabricator commented on HBASE-5199:


Liyin has requested a review of the revision "[jira][HBASE-5199] Delete out of 
TTL store files before compaction selection ".

  Hi @Karthik,
  The new diff has changed the logic a little bit. There is no need to take the 
write lock to delete these expired store files. It is much easier to put these 
expired store files directly into the compaction, which will do the deletion 
job for free.

  Would you mind reviewing it again? Thanks

REVISION DETAIL
  https://reviews.facebook.net/D1311


> Delete out of TTL store files before compaction selection
> -
>
> Key: HBASE-5199
> URL: https://issues.apache.org/jira/browse/HBASE-5199
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1311.1.patch, D1311.2.patch, D1311.3.patch, 
> D1311.4.patch
>
>
> Currently, HBase deletes the out of TTL store files after compaction. We can 
> change the sequence to delete the out of TTL store files before selecting 
> store files for compactions. 
> In this way, HBase can keep deleting the old invalid store files without 
> compaction, and also prevent unnecessary compactions since the out of 
> TTL store files will be deleted before the compaction selection.





[jira] [Commented] (HBASE-5249) NPE getting a rowlock, 'Error obtaining row lock (fsOk: true)'

2012-01-24 Thread Benoit Sigoure (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192351#comment-13192351
 ] 

Benoit Sigoure commented on HBASE-5249:
---

I can actually consistently reproduce this bug on a fresh deployment of 
OpenTSDB with HBase 0.92.

> NPE getting a rowlock, 'Error obtaining row lock (fsOk: true)'
> --
>
> Key: HBASE-5249
> URL: https://issues.apache.org/jira/browse/HBASE-5249
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Fix For: 0.92.1
>
>
> See 
> http://search-hadoop.com/m/ZTJxL1S7Hq61/Error+obtaining+row+lock+%2528fsOk%253A+true%2529&subj=Re+NPE+while+obtaining+row+lock
> Benoit just ran into this too testing tsdb against 0.92:
> {code}
> 2012-01-20 17:09:54,074 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Error obtaining
> row lock (fsOk: true)
> java.lang.NullPointerException
>at 
> java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
>at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.addRowLock(HRegionServer.java:2313)
>at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.lockRow(HRegionServer.java:2299)
>at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>at java.lang.reflect.Method.invoke(Method.java:597)
>at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
>at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1327)
> It happened only once out of thousands of RPCs that grabbed and
> released a row lock.
> {code}





[jira] [Commented] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192368#comment-13192368
 ] 

Zhihong Yu commented on HBASE-5271:
---

@Ghais:
Do you want to provide a patch that fixes this?

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Attachments: testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2"; //prints 7.





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192372#comment-13192372
 ] 

stack commented on HBASE-5231:
--

It's unexplained in the code, it's just introduced, and it's very odd. Wrong even.

> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.92





[jira] [Commented] (HBASE-5199) Delete out of TTL store files before compaction selection

2012-01-24 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192378#comment-13192378
 ] 

Phabricator commented on HBASE-5199:


Karthik has commented on the revision "[jira][HBASE-5199] Delete out of TTL 
store files before compaction selection ".

INLINE COMMENTS
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java:1118 Could we 
return a new CompactSelection object instead of modifying the existing one and 
returning a boolean? We can return null or an empty compact selection if there 
are no such files...

  CompactSelection expiredFiles = compactSelection.selectExpiredStoreFiles(...)
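
A hedged sketch of the shape being suggested here; the method, field and accessor
names are assumptions for illustration, not the actual D1311 code:

{code}
// Return the expired subset as a new CompactSelection instead of mutating the
// receiver and returning a boolean. Callers get null when nothing has expired.
public CompactSelection selectExpiredStoreFiles(long maxExpiredTimeStamp) {
  List<StoreFile> expired = new ArrayList<StoreFile>();
  for (StoreFile file : filesToCompact) {            // 'filesToCompact': assumed field name
    if (file.getMaxTimestamp() < maxExpiredTimeStamp) {
      expired.add(file);                             // every cell is already past its TTL
    }
  }
  return expired.isEmpty() ? null : new CompactSelection(expired);
}
{code}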


REVISION DETAIL
  https://reviews.facebook.net/D1311


> Delete out of TTL store files before compaction selection
> -
>
> Key: HBASE-5199
> URL: https://issues.apache.org/jira/browse/HBASE-5199
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1311.1.patch, D1311.2.patch, D1311.3.patch, 
> D1311.4.patch
>
>
> Currently, HBase deletes the out of TTL store files after compaction. We can 
> change the sequence to delete the out of TTL store files before selecting 
> store files for compactions. 
> In this way, HBase can keep deleting the old invalid store files without 
> compaction, and also prevent unnecessary compactions since the out of 
> TTL store files will be deleted before the compaction selection.





[jira] [Updated] (HBASE-5199) Delete out of TTL store files before compaction selection

2012-01-24 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5199:
---

Attachment: D1311.5.patch

Liyin updated the revision "[jira][HBASE-5199] Delete out of TTL store files 
before compaction selection ".
Reviewers: Kannan, JIRA, khemani, aaiyer, Karthik

  Address @Karthik's comments.

REVISION DETAIL
  https://reviews.facebook.net/D1311

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  
src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactSelection.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java


> Delete out of TTL store files before compaction selection
> -
>
> Key: HBASE-5199
> URL: https://issues.apache.org/jira/browse/HBASE-5199
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1311.1.patch, D1311.2.patch, D1311.3.patch, 
> D1311.4.patch, D1311.5.patch
>
>
> Currently, HBase deletes the out of TTL store files after compaction. We can 
> change the sequence to delete the out of TTL store files before selecting 
> store files for compactions. 
> In this way, HBase can keep deleting the old invalid store files without 
> compaction, and also prevent unnecessary compactions since the out of 
> TTL store files will be deleted before the compaction selection.





[jira] [Commented] (HBASE-5199) Delete out of TTL store files before compaction selection

2012-01-24 Thread Phabricator (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192414#comment-13192414
 ] 

Phabricator commented on HBASE-5199:


Karthik has accepted the revision "[jira][HBASE-5199] Delete out of TTL store 
files before compaction selection ".

  Looks good!

REVISION DETAIL
  https://reviews.facebook.net/D1311


> Delete out of TTL store files before compaction selection
> -
>
> Key: HBASE-5199
> URL: https://issues.apache.org/jira/browse/HBASE-5199
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1311.1.patch, D1311.2.patch, D1311.3.patch, 
> D1311.4.patch, D1311.5.patch
>
>
> Currently, HBase deletes the out of TTL store files after compaction. We can 
> change the sequence to delete the out of TTL store files before selecting 
> store files for compactions. 
> In this way, HBase can keep deleting the old invalid store files without 
> compaction, and also prevent unnecessary compactions since the out of 
> TTL store files will be deleted before the compaction selection.





[jira] [Commented] (HBASE-4608) HLog Compression

2012-01-24 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192420#comment-13192420
 ] 

Todd Lipcon commented on HBASE-4608:


Why reset on flush? Seems to me we need to reset on log roll, but not flush.

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Attachments: 4608v1.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 
> 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.
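
To make the dictionary idea concrete, a toy sketch; this is not the actual HLog
format, and the token scheme is invented purely for illustration:

{code}
// Toy write-side dictionary: the first time a repeated value (table name, CF name,
// region id, ...) is written, emit it in full and remember its slot; afterwards
// emit only the small slot index.
class ToyWalDictionary {
  private final Map<String, Integer> slots = new HashMap<String, Integer>();
  private final List<String> entries = new ArrayList<String>();

  String encode(String value) {
    Integer slot = slots.get(value);
    if (slot != null) {
      return "#" + slot;               // repeat: emit the index only
    }
    slots.put(value, entries.size());
    entries.add(value);
    return "+" + value;                // first occurrence: emit verbatim, register it
  }
}
{code}

The reset-on-roll question above comes from exactly this kind of state: a reader can
only resolve the index tokens if it rebuilds the same dictionary, so the dictionary
has to be reset at a point the reader can identify, such as the start of a log file.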





[jira] [Assigned] (HBASE-4940) hadoop-metrics.properties can include configuration of the "rest" context for ganglia

2012-01-24 Thread Mubarak Seyed (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubarak Seyed reassigned HBASE-4940:


Assignee: Mubarak Seyed

> hadoop-metrics.properties can include configuration of the "rest" context for 
> ganglia
> -
>
> Key: HBASE-4940
> URL: https://issues.apache.org/jira/browse/HBASE-4940
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.90.5
> Environment: HBase-0.90.1
>Reporter: Mubarak Seyed
>Assignee: Mubarak Seyed
>Priority: Minor
>  Labels: hbase-rest
> Fix For: 0.94.0
>
> Attachments: HBASE-4940.patch, HBASE-4940.trunk.v1.patch
>
>
> It appears from hadoop-metrics.properties that the configuration for the rest 
> context is missing. It would be good to add the rest context commented out, so 
> that anyone running the rest-server who wants to monitor it using the ganglia 
> context can simply uncomment those lines and use them for rest-server 
> monitoring with ganglia.
> {code}
> # Configuration of the "rest" context for ganglia
> #rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext
> #rest.period=10
> #rest.servers=ganglia-metad-hostname:port
> {code}
> Working on the patch, will submit it.





[jira] [Updated] (HBASE-4940) hadoop-metrics.properties can include configuration of the "rest" context for ganglia

2012-01-24 Thread Mubarak Seyed (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mubarak Seyed updated HBASE-4940:
-

Attachment: HBASE-4940.trunk.v2.patch

The attached file (HBASE-4940.trunk.v2.patch) updates the patch with an 
explanation for GMETADHOST_IP.

> hadoop-metrics.properties can include configuration of the "rest" context for 
> ganglia
> -
>
> Key: HBASE-4940
> URL: https://issues.apache.org/jira/browse/HBASE-4940
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.90.5
> Environment: HBase-0.90.1
>Reporter: Mubarak Seyed
>Assignee: Mubarak Seyed
>Priority: Minor
>  Labels: hbase-rest
> Fix For: 0.94.0
>
> Attachments: HBASE-4940.patch, HBASE-4940.trunk.v1.patch, 
> HBASE-4940.trunk.v2.patch
>
>
> It appears from hadoop-metrics.properties that the configuration for the rest 
> context is missing. It would be good to add the rest context commented out, so 
> that anyone running the rest-server who wants to monitor it using the ganglia 
> context can simply uncomment those lines and use them for rest-server 
> monitoring with ganglia.
> {code}
> # Configuration of the "rest" context for ganglia
> #rest.class=org.apache.hadoop.metrics.ganglia.GangliaContext
> #rest.period=10
> #rest.servers=ganglia-metad-hostname:port
> {code}
> Working on the patch, will submit it.





[jira] [Commented] (HBASE-4608) HLog Compression

2012-01-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192431#comment-13192431
 ] 

Lars Hofhansl commented on HBASE-4608:
--

On recovery we'd always have to scan the entire log from the beginning. Maybe 
that's not a big deal, because log size is limited?

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Attachments: 4608v1.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 
> 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.





[jira] [Commented] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192435#comment-13192435
 ] 

Lars Hofhansl commented on HBASE-5271:
--

Good find. From the code it looks like only a prefix of the family needs to 
match. D'oh.

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Attachments: testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2"; //prints 7.





[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-24 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192441#comment-13192441
 ] 

Mikhail Bautin commented on HBASE-5262:
---

I think we could implement something small-scale in HBase itself, so that it 
can eventually be used for making optimization decisions for compactions and 
caching. OpenTSDB is probably better for large-scale time series collection. 
Also, we don't want to introduce a circular dependency between HBase and 
OpenTSDB.

> Structured event log for HBase for monitoring and auto-tuning performance
> -
>
> Key: HBASE-5262
> URL: https://issues.apache.org/jira/browse/HBASE-5262
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> Creating this JIRA to open a discussion about a structured (machine-readable) 
> log that will record events such as compaction start/end times, compaction 
> input/output files, their sizes, the same for flushes, etc. This can be 
> stored e.g. in a new system table in HBase itself. The data from this log can 
> then be analyzed and used to optimize compactions at run time, or otherwise 
> auto-tune HBase configuration to reduce the number of knobs the user has to 
> configure.





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192440#comment-13192440
 ] 

stack commented on HBASE-5231:
--

In a method named getAssignmentsByTable we return regions by table IFF the 
configuration hbase.master.loadbalance.bytable is true. Otherwise we return 
all regions belonging to a 'table' named 'ensemble'. Nowhere is 'ensemble' 
explained. How is a noob who follows along after us trying to make sense of 
this supposed to figure out what's going on? A method named getAssignmentsByTable 
should return assignments by table... not assignments by table and then, if some 
flag in the config is set, assignments by some arbitrary pseudo-table. At a 
minimum it needs to be explained in comments and in javadoc. But really it 
says to me that this new feature is not well thought through. Why do we worry 
about regions by table outside of the balancer invocation? Shouldn't 
balance-by-table be asking about a region's table down in the balancer guts 
rather than up here high in the master?

Looking more at what is going on: when the balance finishes, do we have a 
balanced cluster? There is no test to prove it, and thinking on it, given that we 
invoke the balancer per table, if there are lots of tables with different 
region-count skew, I'd think it could throw off the basic cluster balance 
(regions per regionserver).
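
For readers following along, a minimal standalone model of the behavior being
described (plain Java with assumed names; this is not the HBase source):

{code}
// Group region assignments per table only when the by-table flag is set;
// otherwise lump every region under a single pseudo-table named "ensemble".
Map<String, List<String>> getAssignmentsByTable(Map<String, String> regionToTable,
                                                boolean balanceByTable) {
  Map<String, List<String>> assignments = new HashMap<String, List<String>>();
  for (Map.Entry<String, String> entry : regionToTable.entrySet()) {
    String table = balanceByTable ? entry.getValue() : "ensemble";
    List<String> regions = assignments.get(table);
    if (regions == null) {
      regions = new ArrayList<String>();
      assignments.put(table, regions);
    }
    regions.add(entry.getKey());
  }
  return assignments;
}
{code}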



> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.92





[jira] [Commented] (HBASE-5249) NPE getting a rowlock, 'Error obtaining row lock (fsOk: true)'

2012-01-24 Thread Benoit Sigoure (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192443#comment-13192443
 ] 

Benoit Sigoure commented on HBASE-5249:
---

Related asynchbase issue: https://github.com/stumbleupon/asynchbase/issues/16

I'm now wondering whether the problem is really in HBase.  Could be a bug in 
asynchbase too.  I'm going to investigate after lunch.  Even if it's a bug in 
asynchbase, HBase shouldn't NPE like that.  There's some error handling that's 
missing in the RegionServer.

But if this turns out to be an asynchbase-only bug, we can lower the severity 
of this issue.
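
For illustration only (the real HRegionServer code is not reproduced here):
ConcurrentHashMap rejects null keys and values, so the missing error handling is
roughly a guard of this shape, with all names assumed:

{code}
// Hypothetical guard, not the actual fix: reject a bad lock request up front
// instead of letting ConcurrentHashMap.put() throw a NullPointerException.
private final ConcurrentMap<String, Integer> rowLocks =
    new ConcurrentHashMap<String, Integer>();

void addRowLock(String lockName, Integer lockId) throws IOException {
  if (lockName == null || lockId == null) {
    throw new IOException("Invalid row lock: name=" + lockName + ", id=" + lockId);
  }
  rowLocks.put(lockName, lockId);
}
{code}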

> NPE getting a rowlock, 'Error obtaining row lock (fsOk: true)'
> --
>
> Key: HBASE-5249
> URL: https://issues.apache.org/jira/browse/HBASE-5249
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Fix For: 0.92.1
>
>
> See 
> http://search-hadoop.com/m/ZTJxL1S7Hq61/Error+obtaining+row+lock+%2528fsOk%253A+true%2529&subj=Re+NPE+while+obtaining+row+lock
> Benoit just ran into this too testing tsdb against 0.92:
> {code}
> 2012-01-20 17:09:54,074 ERROR
> org.apache.hadoop.hbase.regionserver.HRegionServer: Error obtaining
> row lock (fsOk: true)
> java.lang.NullPointerException
>at 
> java.util.concurrent.ConcurrentHashMap.put(ConcurrentHashMap.java:881)
>at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.addRowLock(HRegionServer.java:2313)
>at 
> org.apache.hadoop.hbase.regionserver.HRegionServer.lockRow(HRegionServer.java:2299)
>at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>at java.lang.reflect.Method.invoke(Method.java:597)
>at 
> org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:364)
>at 
> org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1327)
> It happened only once out of thousands of RPCs that grabbed and
> released a row lock.
> {code}





[jira] [Updated] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-24 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5231:
--

Attachment: 5231.addendum

> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.addendum, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.92





[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192452#comment-13192452
 ] 

Zhihong Yu commented on HBASE-5231:
---

bq. There is no test to prove it
I confined the impact of this feature to within the balancer slop limit, which 
was introduced more than 6 months ago. This means there is no real change in 
the overall balancing for the whole cluster.

The addendum adds javadoc to getAssignmentsByTable().

Part of the reason for putting some of the balance-by-table logic in the Master 
is to limit the complexity of DefaultLoadBalancer.

There may be better ways to simplify the logic in DefaultLoadBalancer. I will 
create a separate JIRA for that.

One intrinsic limitation of regions-per-regionserver being the sole balancing 
criterion is that the actual load on the regions is not taken into account.
I have work pending open-sourcing which addresses this limitation.

> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.addendum, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.92





[jira] [Created] (HBASE-5272) Simplify the logic in DefaultLoadBalancer

2012-01-24 Thread Zhihong Yu (Created) (JIRA)
Simplify the logic in DefaultLoadBalancer
-

 Key: HBASE-5272
 URL: https://issues.apache.org/jira/browse/HBASE-5272
 Project: HBase
  Issue Type: Improvement
Reporter: Zhihong Yu


Currently the body of DefaultLoadBalancer.balanceCluster() is 250 lines long.
It involves multiple iterations which deal with various kinds of corner cases.

We should simplify this part of the code.





[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192464#comment-13192464
 ] 

Zhihong Yu commented on HBASE-5262:
---

Can we approach this from the perspective of what information is needed for 
compaction and caching decision making?

We should also provide an abstraction for supplying such information, so that 
the decision making can be based on multiple sources of information.

> Structured event log for HBase for monitoring and auto-tuning performance
> -
>
> Key: HBASE-5262
> URL: https://issues.apache.org/jira/browse/HBASE-5262
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> Creating this JIRA to open a discussion about a structured (machine-readable) 
> log that will record events such as compaction start/end times, compaction 
> input/output files, their sizes, the same for flushes, etc. This can be 
> stored e.g. in a new system table in HBase itself. The data from this log can 
> then be analyzed and used to optimize compactions at run time, or otherwise 
> auto-tune HBase configuration to reduce the number of knobs the user has to 
> configure.





[jira] [Updated] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5268:
-

Attachment: 5268-v3.txt

Sorting prefix markers first has another problem.
Placing a prefix marker for 123 at T and then another for 1234 at T+1 would 
ignore the 2nd one even though it has a newer timestamp. To fix that, 
ScanDeleteTracker would need to keep track of delete markers that are prefixes 
of each other, which would increase memory requirements.

So I am back to the first approach. It is simpler, does not mess with the sort 
order, has lower memory requirements, and is easier to grok.
Delete.deleteColumnsByPrefix now advises against mixing this with deleteColumn 
for the *same* qualifiers.

I added more tests and propose this for commit.
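
A hedged usage sketch of the proposed client call; the method name comes from this
discussion, but the exact signature and the table/column names are assumed:

{code}
// Marks every column in family "cf" whose qualifier starts with "123"
// (e.g. 123, 1234, 12345) as deleted, using a single prefix delete marker.
Configuration conf = HBaseConfiguration.create();
HTable table = new HTable(conf, "t");
Delete d = new Delete(Bytes.toBytes("row1"));
d.deleteColumnsByPrefix(Bytes.toBytes("cf"), Bytes.toBytes("123"));
table.delete(d);
{code}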

> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268-v2.txt, 5268-v3.txt, 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out that this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.





[jira] [Commented] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192477#comment-13192477
 ] 

Zhihong Yu commented on HBASE-5268:
---

bq. Delete.deleteColumnsByPrefix now advises against mixing this with 
deleteColumn for the same qualifiers.
Shall we poll user@ and dev@ about the above constraint?

> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268-v2.txt, 5268-v3.txt, 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out that this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.





[jira] [Updated] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Ghais Issa (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ghais Issa updated HBASE-5271:
--

Attachment: fixKeyValueMatchingColumn.diff

I believe this patch fixes the issue.
It looks like KeyValue.matchingColumn was incorrectly passing in 
family.length instead of the actual column's family length.

Is there some reviewboard I should submit to, or is there some page which 
explains the submission process?
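
An illustration of the length mix-up as a standalone sketch (simplified; this is
not the actual KeyValue code):

{code}
// Buggy shape: comparing only 'family.length' bytes of the stored family means any
// requested family that is a prefix of the stored one "matches" ("2" vs "24").
boolean matchingFamilyBuggy(byte[] storedFamily, byte[] family) {
  return Bytes.compareTo(storedFamily, 0, family.length, family, 0, family.length) == 0;
}

// Fixed shape: lengths must be equal and the whole stored family must match.
boolean matchingFamilyFixed(byte[] storedFamily, byte[] family) {
  return storedFamily.length == family.length
      && Bytes.compareTo(storedFamily, 0, storedFamily.length, family, 0, family.length) == 0;
}
{code}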

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Attachments: fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2"; //prints 7.





[jira] [Commented] (HBASE-5230) Unit test to ensure compactions don't cache data on write

2012-01-24 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5230?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192479#comment-13192479
 ] 

Mikhail Bautin commented on HBASE-5230:
---

Re-running failed unit tests locally.

Running org.apache.hadoop.hbase.master.TestSplitLogManager
Tests run: 12, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 15.748 sec
Running org.apache.hadoop.hbase.replication.TestReplication
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 128.254 sec
Running org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 215.328 sec
Running org.apache.hadoop.hbase.mapreduce.TestImportTsv
Tests run: 8, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 85.367 sec
Running org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 95.431 sec
Running org.apache.hadoop.hbase.mapred.TestTableMapReduce
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 63.762 sec

Results :

Tests run: 39, Failures: 0, Errors: 0, Skipped: 

> Unit test to ensure compactions don't cache data on write
> -
>
> Key: HBASE-5230
> URL: https://issues.apache.org/jira/browse/HBASE-5230
> Project: HBase
>  Issue Type: Test
>Reporter: Mikhail Bautin
>Assignee: Mikhail Bautin
>Priority: Minor
> Attachments: D1353.1.patch, D1353.2.patch, D1353.3.patch, 
> D1353.4.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-21_00_53_54.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_10_23_45.patch, 
> Don-t-cache-data-blocks-on-compaction-2012-01-23_15_27_23.patch
>
>
> Create a unit test for HBASE-3976 (making sure we don't cache data blocks on 
> write during compactions even if cache-on-write is generally 
> enabled). This is because we have very different implementations of 
> HBASE-3976 without HBASE-4422 CacheConfig (on top of 89-fb, created by Liyin) 
> and with CacheConfig (presumably it's there but not sure if it even works, 
> since the patch in HBASE-3976 may not have been committed). We need to create 
> a unit test to verify that we don't cache data blocks on write during 
> compactions, and resolve HBASE-3976 so that this new unit test does not fail.





[jira] [Commented] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192481#comment-13192481
 ] 

Zhihong Yu commented on HBASE-5271:
---

See http://hbase.apache.org/book.html#developer

For a large patch, https://reviews.apache.org can be used for review.

You can push the 'Submit Patch' button to let Hadoop QA run through the tests.

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Attachments: fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2"; //prints 7.





[jira] [Updated] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Ghais Issa (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ghais Issa updated HBASE-5271:
--

Status: Patch Available  (was: Open)

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Attachments: fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2"; //prints 7.





[jira] [Updated] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5271:
--

Fix Version/s: 0.92.1
   0.90.6
   0.94.0
 Hadoop Flags: Reviewed

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2"; //prints 7.





[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-24 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192491#comment-13192491
 ] 

Mikhail Bautin commented on HBASE-5262:
---

Some of the information needed for compaction decision making:

 * Compaction start time
 * For each compaction input file:
   * Size
   * Number of key/value pairs
   * Average key/value size
   * other metadata
 * Compaction end time
 * Compaction status (success, failure)
 * The same metadata as above for the compaction output file

I am not yet sure what information we would like to collect for caching 
decision making—that needs more thinking.

We could also collect "region history", e.g. region open / close events:

 * Region name 
 * Event type
 * Server where the region was opened or closed
 * Reason

That would allow us to automatically detect problematic regions that move from 
machine to machine.

I agree that it would make sense to isolate information collection logic from 
decision making logic, so that external adaptive cluster tuning tools and/or 
external sources of information could be plugged in.
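
As a rough illustration, a structured record for the compaction fields listed above
might look like this; all field names are assumptions, not a committed schema:

{code}
// One event in the hypothetical event log, serialized as a single machine-readable line.
class CompactionEvent {
  long startTimeMs;
  long endTimeMs;
  boolean success;
  List<String> inputFiles;        // store files that went into the compaction
  long totalInputBytes;
  long totalInputKeyValues;
  String outputFile;
  long outputBytes;

  String toLogLine() {
    return startTimeMs + "\t" + endTimeMs + "\t" + success + "\t" + inputFiles + "\t"
        + totalInputBytes + "\t" + totalInputKeyValues + "\t" + outputFile + "\t" + outputBytes;
  }
}
{code}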


> Structured event log for HBase for monitoring and auto-tuning performance
> -
>
> Key: HBASE-5262
> URL: https://issues.apache.org/jira/browse/HBASE-5262
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> Creating this JIRA to open a discussion about a structured (machine-readable) 
> log that will record events such as compaction start/end times, compaction 
> input/output files, their sizes, the same for flushes, etc. This can be 
> stored e.g. in a new system table in HBase itself. The data from this log can 
> then be analyzed and used to optimize compactions at run time, or otherwise 
> auto-tune HBase configuration to reduce the number of knobs the user has to 
> configure.





[jira] [Commented] (HBASE-4608) HLog Compression

2012-01-24 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192494#comment-13192494
 ] 

Todd Lipcon commented on HBASE-4608:


Don't we already have to scan the entire log from the beginning on recovery? 
Log splitting splits entire segments, afaik. Am I forgetting about some index 
structure or something?

> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Attachments: 4608v1.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 
> 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.





[jira] [Commented] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192496#comment-13192496
 ] 

Todd Lipcon commented on HBASE-5268:


This seems a little complicated to me... if this is meant for the purpose of 
building higher-level transactional capabilities, maybe we should start a 
branch for that exploration? I'm wary that this introduces a lot of complexity 
around optimizing the read path, for the sake of optimizing a delete path 
(which in my experience is rarely used).

> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268-v2.txt, 5268-v3.txt, 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out that this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.





[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-24 Thread Todd Lipcon (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192500#comment-13192500
 ] 

Todd Lipcon commented on HBASE-5262:


Also we should isolate the writing of this table to a separate thread with a 
bounded queue (which drops off the end if it becomes full). Otherwise we may 
run into issues where the logging of information may actually cause operations 
to fail, or cause deadlocks, etc. (e.g. a write to a region causes a flush which 
causes a write to the eventlog table on the same server, but all of the IPC 
threads are blocked because of the ongoing flush, so the eventlog write never 
completes). So, we should always assume that the information in the event log 
is potentially lossy, and only use it for heuristics and diagnosis, not for 
anything critical to operation.
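
A minimal sketch of that pattern in plain Java (illustrative names, not proposed
code):

{code}
// Events go onto a bounded queue; offer() never blocks and simply drops the event
// when the queue is full, so logging can never stall or deadlock an IPC handler.
class LossyEventLog {
  private final BlockingQueue<String> queue = new ArrayBlockingQueue<String>(1024);

  void log(String event) {
    queue.offer(event);                  // dropped silently if the queue is full
  }

  void startWriterThread() {
    Thread writer = new Thread(new Runnable() {
      public void run() {
        try {
          while (true) {
            String event = queue.take();
            // persist 'event' to the event table / external sink here
          }
        } catch (InterruptedException ie) {
          Thread.currentThread().interrupt();
        }
      }
    });
    writer.setDaemon(true);
    writer.start();
  }
}
{code}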

> Structured event log for HBase for monitoring and auto-tuning performance
> -
>
> Key: HBASE-5262
> URL: https://issues.apache.org/jira/browse/HBASE-5262
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> Creating this JIRA to open a discussion about a structured (machine-readable) 
> log that will record events such as compaction start/end times, compaction 
> input/output files, their sizes, the same for flushes, etc. This can be 
> stored e.g. in a new system table in HBase itself. The data from this log can 
> then be analyzed and used to optimize compactions at run time, or otherwise 
> auto-tune HBase configuration to reduce the number of knobs the user has to 
> configure.





[jira] [Commented] (HBASE-5262) Structured event log for HBase for monitoring and auto-tuning performance

2012-01-24 Thread Mikhail Bautin (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192504#comment-13192504
 ] 

Mikhail Bautin commented on HBASE-5262:
---

@Todd: I totally agree. We should also disable event logging for the event log 
table itself.

> Structured event log for HBase for monitoring and auto-tuning performance
> -
>
> Key: HBASE-5262
> URL: https://issues.apache.org/jira/browse/HBASE-5262
> Project: HBase
>  Issue Type: Improvement
>Reporter: Mikhail Bautin
>
> Creating this JIRA to open a discussion about a structured (machine-readable) 
> log that will record events such as compaction start/end times, compaction 
> input/output files, their sizes, the same for flushes, etc. This can be 
> stored e.g. in a new system table in HBase itself. The data from this log can 
> then be analyzed and used to optimize compactions at run time, or otherwise 
> auto-tune HBase configuration to reduce the number of knobs the user has to 
> configure.





[jira] [Commented] (HBASE-5199) Delete out of TTL store files before compaction selection

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192518#comment-13192518
 ] 

Zhihong Yu commented on HBASE-5199:
---

@Liyin:
Can you try adding JIRA to the CCs?

I have to deal with 6 copies of each review email in my InBox.

Thanks

> Delete out of TTL store files before compaction selection
> -
>
> Key: HBASE-5199
> URL: https://issues.apache.org/jira/browse/HBASE-5199
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1311.1.patch, D1311.2.patch, D1311.3.patch, 
> D1311.4.patch, D1311.5.patch
>
>
> Currently, HBase deletes the out of TTL store files after compaction. We can 
> change the sequence to delete the out of TTL store files before selecting 
> store files for compactions. 
> In this way, HBase can keep deleting the old invalid store files without 
> compaction, and also prevent unnecessary compactions since the out of 
> TTL store files will be deleted before the compaction selection.





[jira] [Commented] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192524#comment-13192524
 ] 

Zhihong Yu commented on HBASE-5271:
---

+1 on patch.

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2"; //prints 7.





[jira] [Commented] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192527#comment-13192527
 ] 

Hadoop QA commented on HBASE-5271:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12511722/fixKeyValueMatchingColumn.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 javadoc.  The javadoc tool appears to have generated -145 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 156 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.replication.TestReplication
  org.apache.hadoop.hbase.mapreduce.TestImportTsv
  org.apache.hadoop.hbase.mapred.TestTableMapReduce
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/847//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/847//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/847//console

This message is automatically generated.

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2")))); //prints 7.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192529#comment-13192529
 ] 

Lars Hofhansl commented on HBASE-5268:
--

@Todd: Are you talking about v2 or v3? I agree v2 is too complicated.
v3 just adds this to the read path (in ScanDeleteTracker):
{code}
+// check whether the delete marker is a prefix
+if (deleteType == KeyValue.Type.DeleteColumnPrefix.getCode()) {
+  if (Bytes.compareTo(deleteBuffer, deleteOffset, deleteLength, buffer,
+      qualifierOffset, deleteLength) == 0) {
+    if (timestamp <= deleteTimestamp) {
+      return DeleteResult.COLUMN_DELETED;
+    }
+  } else {
+    // past the prefix marker
+    deleteBuffer = null;
+  }
{code}
The rest is indentation (which makes the change to ScanDeleteTracker look big), 
tests, and two more methods on Delete.java.
If prefix markers are not used there is no performance impact.
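
For illustration, here is a minimal standalone sketch of the same prefix check, 
using plain byte arrays instead of the ScanDeleteTracker state (the names and 
the length guard are illustrative, not the patch code):
{code}
import org.apache.hadoop.hbase.util.Bytes;

public class PrefixDeleteCheck {
  // True if the qualifier starts with the prefix and the cell's timestamp is
  // not newer than the prefix delete marker's timestamp.
  static boolean coveredByPrefixMarker(byte[] prefix, long markerTs,
      byte[] qualifier, long cellTs) {
    if (qualifier.length < prefix.length) {
      return false; // qualifier shorter than the prefix: cannot match
    }
    boolean prefixMatches = Bytes.compareTo(
        prefix, 0, prefix.length,
        qualifier, 0, prefix.length) == 0;
    return prefixMatches && cellTs <= markerTs;
  }

  public static void main(String[] args) {
    byte[] prefix = Bytes.toBytes("123");
    long markerTs = 100L;
    System.out.println(coveredByPrefixMarker(prefix, markerTs, Bytes.toBytes("1234"), 90L));  // true
    System.out.println(coveredByPrefixMarker(prefix, markerTs, Bytes.toBytes("124"), 90L));   // false: not a prefix match
    System.out.println(coveredByPrefixMarker(prefix, markerTs, Bytes.toBytes("1234"), 200L)); // false: put is newer than the marker
  }
}
{code}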

@Ted: Sure. Although if we cannot live with this constraint, this becomes too 
complicated to be useful.


> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268-v2.txt, 5268-v3.txt, 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out that this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192532#comment-13192532
 ] 

Zhihong Yu commented on HBASE-5271:
---

There was no hanging test in PreCommit build 847.

Will integrate the patch tonight.

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2")))); //prints 7.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Zhihong Yu (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192541#comment-13192541
 ] 

Zhihong Yu commented on HBASE-5268:
---

I think the idea in patch v3 is useful in handling wide rows.

I suggest making the selection of columns by prefix generic (in place of the 
Bytes.compareTo call); e.g., a user may want to use a regex to select the 
columns to be deleted.
How hard would it be to accommodate that case?

> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268-v2.txt, 5268-v3.txt, 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out that this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192555#comment-13192555
 ] 

Lars Hofhansl commented on HBASE-5268:
--

I think that would be difficult, because we could not sort the delete markers 
correctly. Delete by prefix works because I can sort those markers at the right 
place w.r.t. the cells that they affect (delete by suffix, for example, would 
not work).
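
To make the sorting argument concrete, here is a small illustration (not patch 
code): qualifiers sort lexicographically, so a prefix such as "123" sits 
immediately before every qualifier it covers, while qualifiers sharing a suffix 
do not sort next to each other.
{code}
import java.util.Arrays;

import org.apache.hadoop.hbase.util.Bytes;

public class SortOrderDemo {
  public static void main(String[] args) {
    // "123" is immediately followed by "1234" and "12345", so a prefix delete
    // marker can be placed ahead of every cell it affects. The suffix "345"
    // has no such position: "12345" and "99345" do not sort next to each other.
    byte[][] quals = { Bytes.toBytes("12345"), Bytes.toBytes("99345"),
        Bytes.toBytes("123"), Bytes.toBytes("1234") };
    Arrays.sort(quals, Bytes.BYTES_COMPARATOR);
    for (byte[] q : quals) {
      System.out.println(Bytes.toString(q)); // prints: 123, 1234, 12345, 99345
    }
  }
}
{code}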

I think what you have in mind would be the work of some kind of "delete filter" 
that needs to scan all the columns.

The goal I want to reach is that we have the same flexibility for columns that 
we have for row-keys, so that key schema in HBase is not limited to the row-key.

> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268-v2.txt, 5268-v3.txt, 5268-v4.txt, 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out that this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5268:
-

Attachment: 5268-v4.txt

One more test, plus a guard for short column identifiers that follow longer 
prefix delete markers.

> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268-v2.txt, 5268-v3.txt, 5268-v4.txt, 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out that this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5199) Delete out of TTL store files before compaction selection

2012-01-24 Thread Phabricator (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5199?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-5199:
---

Attachment: D1311.5.patch

Liyin added you to the CC list for the revision "[jira][HBASE-5199] Delete out 
of TTL store files before compaction selection ".
Reviewers: Kannan, JIRA, khemani, aaiyer, Karthik

  Currently, HBase deletes out-of-TTL store files only after a compaction. We can 
change the sequence so that out-of-TTL store files are deleted before store files 
are selected for compaction.
  In this way, HBase keeps removing old, expired store files without compacting 
them, and it also avoids unnecessary compactions, since the out-of-TTL store 
files are already gone by the time compaction selection runs.
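
  A rough, self-contained sketch of the proposed ordering (class and method names 
are illustrative only, not the actual Store/CompactSelection API):
{code}
import java.util.ArrayList;
import java.util.List;

public class TtlBeforeSelectionSketch {
  static class FileInfo {
    final String name;
    final long maxTimestamp; // newest cell timestamp in the store file
    FileInfo(String name, long maxTimestamp) {
      this.name = name;
      this.maxTimestamp = maxTimestamp;
    }
  }

  // Step 1: drop files whose newest cell is already past the TTL.
  // Step 2 (not shown): run the normal compaction selection on what remains.
  static List<FileInfo> expireBeforeSelection(List<FileInfo> files, long ttlMillis, long now) {
    List<FileInfo> survivors = new ArrayList<FileInfo>();
    for (FileInfo f : files) {
      if (now - f.maxTimestamp <= ttlMillis) {
        survivors.add(f); // still within TTL: remains a compaction candidate
      }
      // expired files are simply deleted without ever being compacted
    }
    return survivors;
  }

  public static void main(String[] args) {
    long now = 1000000L;
    List<FileInfo> files = new ArrayList<FileInfo>();
    files.add(new FileInfo("expired-file", 1000L));   // long past the TTL
    files.add(new FileInfo("recent-file", 999000L));  // still live
    System.out.println(expireBeforeSelection(files, 100000L, now).size()); // prints 1
  }
}
{code}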

TEST PLAN
  TestStore

REVISION DETAIL
  https://reviews.facebook.net/D1311

AFFECTED FILES
  src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
  src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
  
src/main/java/org/apache/hadoop/hbase/regionserver/compactions/CompactSelection.java
  src/test/java/org/apache/hadoop/hbase/regionserver/TestStore.java


> Delete out of TTL store files before compaction selection
> -
>
> Key: HBASE-5199
> URL: https://issues.apache.org/jira/browse/HBASE-5199
> Project: HBase
>  Issue Type: Improvement
>Reporter: Liyin Tang
>Assignee: Liyin Tang
> Attachments: D1311.1.patch, D1311.2.patch, D1311.3.patch, 
> D1311.4.patch, D1311.5.patch, D1311.5.patch
>
>
> Currently, HBase deletes out-of-TTL store files only after a compaction. We can 
> change the sequence so that out-of-TTL store files are deleted before store 
> files are selected for compaction. 
> In this way, HBase keeps removing old, expired store files without compacting 
> them, and it also avoids unnecessary compactions, since the out-of-TTL store 
> files are already gone by the time compaction selection runs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5268) Add delete column prefix delete marker

2012-01-24 Thread Lars Hofhansl (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5268:
-

Status: Patch Available  (was: Open)

> Add delete column prefix delete marker
> --
>
> Key: HBASE-5268
> URL: https://issues.apache.org/jira/browse/HBASE-5268
> Project: HBase
>  Issue Type: Improvement
>  Components: client, regionserver
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.94.0
>
> Attachments: 5268-v2.txt, 5268-v3.txt, 5268-v4.txt, 5268.txt
>
>
> This is another part missing in the "wide row challenge".
> Currently entire families of a row can be deleted or individual columns or 
> versions.
> There is no facility to mark multiple columns for deletion by column prefix.
> Turns out that this can be achieved with very little code (it's possible that I missed 
> some of the new delete bloom filter code, so please review this thoroughly). 
> I'll attach a patch soon, just working on some tests now.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Zhihong Yu (Updated) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu updated HBASE-5271:
--

Attachment: 5271-90.txt
5271-v2.txt

Updated patch by wrapping a long line.

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5271-90.txt, 5271-v2.txt, 
> fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2")))); //prints 7.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4608) HLog Compression

2012-01-24 Thread Lars Hofhansl (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4608?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192580#comment-13192580
 ] 

Lars Hofhansl commented on HBASE-4608:
--

You know more about that than I do :)
I'm saying that we do not need to scan the entire log, especially if we add 
some custom log replaying tools (for example, replaying per region).
If we're not careful now, we shut ourselves out of future optimizations.
It might not be a big deal, as the logs are rolled anyway and that naturally 
limits how far back through the WALEdits we have to scan to find a dictionary.
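
For anyone who has not followed the patch, here is a toy sketch of the dictionary 
idea (illustration only, not the actual HLog format or classes): the first 
occurrence of a repeated value such as a table name is written in full and 
assigned an index, later occurrences are written as just the index, and a reader 
can only resolve an index if it has already seen the entry that defined it.
{code}
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class WalDictionarySketch {
  private final Map<String, Integer> indexByValue = new HashMap<String, Integer>();
  private final List<String> valueByIndex = new ArrayList<String>();

  // Encode a value: full text the first time, a short index afterwards.
  String encode(String value) {
    Integer idx = indexByValue.get(value);
    if (idx != null) {
      return "#" + idx; // repeated value: emit the index only
    }
    indexByValue.put(value, valueByIndex.size());
    valueByIndex.add(value);
    return value;       // first occurrence: emit in full
  }

  // Decode an index back to its value; this only works if the earlier entries
  // that defined the mapping have already been processed.
  String decode(String token) {
    return token.startsWith("#")
        ? valueByIndex.get(Integer.parseInt(token.substring(1)))
        : token;
  }

  public static void main(String[] args) {
    WalDictionarySketch dict = new WalDictionarySketch();
    System.out.println(dict.encode("usertable")); // usertable (written in full)
    System.out.println(dict.encode("usertable")); // #0 (index only)
    System.out.println(dict.decode("#0"));        // usertable
  }
}
{code}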


> HLog Compression
> 
>
> Key: HBASE-4608
> URL: https://issues.apache.org/jira/browse/HBASE-4608
> Project: HBase
>  Issue Type: New Feature
>Reporter: Li Pi
>Assignee: Li Pi
> Attachments: 4608v1.txt, 4608v5.txt, 4608v6.txt, 4608v7.txt, 
> 4608v8fixed.txt
>
>
> The current bottleneck to HBase write speed is replicating the WAL appends 
> across different datanodes. We can speed up this process by compressing the 
> HLog. Current plan involves using a dictionary to compress table name, region 
> id, cf name, and possibly other bits of repeated data. Also, HLog format may 
> be changed in other ways to produce a smaller HLog.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Zhihong Yu (Assigned) (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhihong Yu reassigned HBASE-5271:
-

Assignee: Ghais Issa

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
>Assignee: Ghais Issa
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5271-90.txt, 5271-v2.txt, 
> fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2")))); //prints 7.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5271) Result.getValue and Result.getColumnLatest return the wrong column.

2012-01-24 Thread Hadoop QA (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5271?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192581#comment-13192581
 ] 

Hadoop QA commented on HBASE-5271:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12511739/5271-90.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/848//console

This message is automatically generated.

> Result.getValue and Result.getColumnLatest return the wrong column.
> ---
>
> Key: HBASE-5271
> URL: https://issues.apache.org/jira/browse/HBASE-5271
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.5
>Reporter: Ghais Issa
> Fix For: 0.94.0, 0.90.6, 0.92.1
>
> Attachments: 5271-90.txt, 5271-v2.txt, 
> fixKeyValueMatchingColumn.diff, testGetValue.diff
>
>
> In the following example result.getValue returns the wrong column
> KeyValue kv = new KeyValue(Bytes.toBytes("r"), Bytes.toBytes("24"), 
> Bytes.toBytes("2"), Bytes.toBytes(7L));
> Result result = new Result(new KeyValue[] { kv });
> System.out.println(Bytes.toLong(result.getValue(Bytes.toBytes("2"), 
> Bytes.toBytes("2")))); //prints 7.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-5231) Backport HBASE-3373 (per-table load balancing) to 0.92

2012-01-24 Thread stack (Commented) (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13192588#comment-13192588
 ] 

stack commented on HBASE-5231:
--

bq. I confine the impact for this feature within the limit of balancer slop 
which was introduced more than 6 months ago. This means there is no real change 
in the overall balancing for the whole cluster.

I do not know what that means (it seems out of scope for this patch).

bq. The addendum adds javadoc to getAssignmentsByTable().

The addendum documents an internal implementation detail. Why should caller 
even care about this hackery?

bq. Part of the reason for putting some of balance-by-table logic in Master is 
to limit the complexity of DefaultLoadBalancer.

What does balance-by-table have to do w/ default balancer?  That is written.  
Done.  Balance-by-table should be something else.

bq. There may be better ways to simplify the logic in DefaultLoadBalancer. I 
will create a separate JIRA for that.

Yes.  The default balancer code is too hard to grok.

bq. One intrinsic limitation of regions per regionserver being the sole 
balancing criterion is that the actual load on the regions is not taken into 
account.

Isn't this a balancer implementation detail?  It would figure out how to ask for 
this.  If it can't, then we need to beef up the balancer interface API so this 
stuff is passed in.



> Backport HBASE-3373 (per-table load balancing) to 0.92
> --
>
> Key: HBASE-5231
> URL: https://issues.apache.org/jira/browse/HBASE-5231
> Project: HBase
>  Issue Type: Improvement
>Reporter: Zhihong Yu
> Fix For: 0.92.1
>
> Attachments: 5231-v2.txt, 5231.addendum, 5231.txt
>
>
> This JIRA backports per-table load balancing to 0.92

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira



