[jira] [Commented] (HBASE-5783) Faster HBase bulk loader

2012-10-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476780#comment-13476780
 ] 

Lars Hofhansl commented on HBASE-5783:
--

The tracking cookie is a very interesting idea!
Do we need to track every single cookie? Or can we just track the highest one 
per region? If that one made it to disk, all previous ones are on disk too.

By MR Bulk Loader, do you mean LoadIncrementalHFiles? Or Import/ImportTsv?



 Faster HBase bulk loader
 

 Key: HBASE-5783
 URL: https://issues.apache.org/jira/browse/HBASE-5783
 Project: HBase
  Issue Type: New Feature
  Components: Client, IPC/RPC, Performance, regionserver
Reporter: Karthik Ranganathan
Assignee: Amitanand Aiyer

 We can get a 3x to 4x gain over the MR bulk loader for very large data sets, 
 based on a (hacky) prototype demonstrating this approach, by doing 
 the following:
 1. Do direct multi-puts from the HBase client using GZIP-compressed RPCs
 2. Turn off the WAL (we will ensure no data loss in another way)
 3. For each bulk load client, we need to:
 3.1 do a put
 3.2 get back a tracking cookie (memstoreTs or HLogSequenceId) per put
 3.3 be able to ask the RS if the tracking cookie has been flushed to disk
 4. For each client, it succeeds if the tracking cookie for the last put 
 it did (on every RS) made it to disk. Otherwise the map task fails and is 
 retried.
 5. If the last put has not made it to disk within a timeout (say a second or 
 so), we issue a manual flush.
 Enhancements:
 - Increase the memstore size so that we flush larger files
 - Decrease the compaction ratios (say increase the number of files to compact)
 Quick background:
 The bottlenecks in the multiput approach are that the data is transferred 
 *uncompressed* twice over the top-of-rack: once from the client to the RS (on 
 the multi-put call) and again because of the WAL (HDFS replication). We reduce 
 the former with RPC compression and eliminate the latter as above, while still 
 guaranteeing that data won't be lost.
 This is better than the MR bulk loader at a high level because we don't need 
 to merge-sort all the files for a given region and then turn them into an HFile - 
 that's the equivalent of bulk loading AND major compacting in one shot. Also 
 there is much more disk I/O involved in the MR method (sort/spill).
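 The numbered protocol (and the question of tracking only the highest cookie
 per region) can be sketched as a small simulation. This is a hypothetical,
 self-contained model, not HBase code: sequence ids are handed out
 monotonically and a flush persists everything assigned so far, so tracking
 the highest cookie per region is sufficient.

```java
// Hypothetical, self-contained model of steps 3-5 above (not HBase code).
class FakeRegionServer {
    private long nextSeq = 0;       // stands in for memstoreTs / HLogSequenceId
    private long flushedUpTo = -1;  // every edit <= this id has reached disk

    // 3.1/3.2: a put returns its tracking cookie (the assigned sequence id).
    synchronized long put(String row, String value) {
        return nextSeq++;
    }

    // A flush persists every edit assigned so far.
    synchronized void flush() {
        flushedUpTo = nextSeq - 1;
    }

    // 3.3: the client asks whether a given cookie has reached disk.
    synchronized boolean isFlushed(long cookie) {
        return cookie <= flushedUpTo;
    }
}

public class BulkLoadClientSketch {
    public static void main(String[] args) {
        FakeRegionServer rs = new FakeRegionServer();
        // Sequence ids are monotonically increasing and a flush persists a
        // prefix of them, so tracking only the highest cookie per region
        // server is enough: if it is on disk, all earlier ones are too.
        long highestCookie = -1;
        for (int i = 0; i < 100; i++) {
            highestCookie = Math.max(highestCookie, rs.put("row-" + i, "v-" + i));
        }
        if (!rs.isFlushed(highestCookie)) {
            rs.flush(); // step 5: issue a manual flush after a timeout
        }
        System.out.println(rs.isFlushed(highestCookie)); // prints: true
    }
}
```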

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476804#comment-13476804
 ] 

Anoop Sam John commented on HBASE-6942:
---

API signature-wise I am okay, Lars. On passing in Scan attributes, I was in two 
minds all along. :)
BulkDeleteResponse delete(Scan scan, DeleteType type, Long timestamp)
We need to pass the rowBatchSize too. This is needed to accumulate the rows 
for a batched delete.
Do we need a Request object taking up the attributes, something in line with 
protobufs? Just asking.

Yes, the above suggestions sound good to me.
bq.Documenting this will be tricky. I can have a shot at that (if you like, 
Anoop. If you prefer to do that, that's fine too).
Yes Lars, you can do that if you like. :)

bq.Eventually, since we made it so general now, I can see this as an official 
API in HTable... But let's do that in an another jira (if others agree).
+1
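The two API shapes under discussion can be illustrated side by side. Everything
below is a hedged sketch; BulkDeleteRequest and its field names are
hypothetical, not the committed HBASE-6942 API.

```java
// Option 1 discussed above: pass the four values directly, e.g.
//   BulkDeleteResponse delete(Scan scan, DeleteType type, Long timestamp,
//                             int rowBatchSize)
// Option 2: a single request object, in line with how protobuf services take
// one request message. A minimal illustration (all names hypothetical):
class BulkDeleteRequest {
    final String scanSpec;    // stands in for the HBase Scan (range, filters)
    final String deleteType;  // e.g. ROW, FAMILY, COLUMN, VERSION
    final Long timestamp;     // null = delete all versions
    final int rowBatchSize;   // rows accumulated per batched server-side delete

    BulkDeleteRequest(String scanSpec, String deleteType, Long timestamp,
                      int rowBatchSize) {
        if (rowBatchSize <= 0) {
            throw new IllegalArgumentException("rowBatchSize must be > 0");
        }
        this.scanSpec = scanSpec;
        this.deleteType = deleteType;
        this.timestamp = timestamp;
        this.rowBatchSize = rowBatchSize;
    }
}

public class BulkDeleteApiSketch {
    public static void main(String[] args) {
        // The batch size caps how many rows are buffered before one
        // server-side Delete batch is issued.
        BulkDeleteRequest req = new BulkDeleteRequest("col1 < 10", "ROW", null, 500);
        System.out.println(req.rowBatchSize); // prints: 500
    }
}
```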


 Endpoint implementation for bulk delete rows
 

 Key: HBASE-6942
 URL: https://issues.apache.org/jira/browse/HBASE-6942
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Performance
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6942.patch, HBASE-6942_V2.patch, 
 HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch


 We can provide an endpoint implementation for doing a bulk deletion of 
 rows (based on a scan) at the server side. This can reduce the time taken for 
 such an operation, since right now the client needs to scan the rows and issue 
 delete(s) using the row keys.
 Think of a query like  delete from table1 where...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476807#comment-13476807
 ] 

Anoop Sam John commented on HBASE-6942:
---

bq.Just worried that N will necessarily be (much?) larger than M.
We do not have any way to get only the 1st KV from each family of a row, 
right? We only have FirstKeyOnlyFilter now.
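The distinction matters for the N vs. M concern quoted above: a
first-KV-per-row filter (what FirstKeyOnlyFilter does) returns fewer cells
than a hypothetical first-KV-per-family filter would. A plain-Java simulation
of the two behaviors, not the HBase API:

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Simulation of the two filter behaviors over an already-sorted KV list.
public class FirstKvSketch {
    record Kv(String row, String family, String qualifier) {}

    // What FirstKeyOnlyFilter does: keep only the first KV of each row.
    static List<Kv> firstKvPerRow(List<Kv> sorted) {
        List<Kv> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (Kv kv : sorted) if (seen.add(kv.row())) out.add(kv);
        return out;
    }

    // Hypothetical filter: keep the first KV of each (row, family) pair.
    static List<Kv> firstKvPerFamily(List<Kv> sorted) {
        List<Kv> out = new ArrayList<>();
        Set<String> seen = new HashSet<>();
        for (Kv kv : sorted) if (seen.add(kv.row() + "/" + kv.family())) out.add(kv);
        return out;
    }

    public static void main(String[] args) {
        List<Kv> kvs = List.of(
            new Kv("r1", "f1", "a"), new Kv("r1", "f1", "b"), new Kv("r1", "f2", "a"),
            new Kv("r2", "f1", "a"), new Kv("r2", "f2", "a"), new Kv("r2", "f2", "b"));
        System.out.println(firstKvPerRow(kvs).size());    // 2: one per row
        System.out.println(firstKvPerFamily(kvs).size()); // 4: one per (row, family)
    }
}
```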


 Endpoint implementation for bulk delete rows
 

 Key: HBASE-6942
 URL: https://issues.apache.org/jira/browse/HBASE-6942
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Performance
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6942.patch, HBASE-6942_V2.patch, 
 HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch


 We can provide an endpoint implementation for doing a bulk deletion of 
 rows (based on a scan) at the server side. This can reduce the time taken for 
 such an operation, since right now the client needs to scan the rows and issue 
 delete(s) using the row keys.
 Think of a query like  delete from table1 where...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6786) Convert MultiRowMutationProtocol to protocol buffer service

2012-10-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6786?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476809#comment-13476809
 ] 

Hadoop QA commented on HBASE-6786:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12549266/6786-1.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
82 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3056//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3056//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3056//console

This message is automatically generated.

 Convert MultiRowMutationProtocol to protocol buffer service
 ---

 Key: HBASE-6786
 URL: https://issues.apache.org/jira/browse/HBASE-6786
 Project: HBase
  Issue Type: Sub-task
  Components: Coprocessors
Reporter: Gary Helmling
Assignee: Devaraj Das
 Fix For: 0.96.0

 Attachments: 6786-1.patch


 With coprocessor endpoints now exposed as protobuf defined services, we 
 should convert over all of our built-in endpoints to PB services.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6979) recovered.edits file should not break distributed log splitting

2012-10-16 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476812#comment-13476812
 ] 

stack commented on HBASE-6979:
--

I'm ok w/ this.  /tmp is better than nowhere.  And this is a mess when folks 
run into it -- if they ever do.  This is good cleanup in that case.

 recovered.edits file should not break distributed log splitting
 ---

 Key: HBASE-6979
 URL: https://issues.apache.org/jira/browse/HBASE-6979
 Project: HBase
  Issue Type: Improvement
  Components: master
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-6979.patch


 Distributed log splitting fails in creating the recovered.edits folder during 
 upgrade because there is a file called recovered.edits there.
 Instead of checking only whether the path exists, we need to check whether it 
 exists and is a directory.
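 A minimal sketch of the check the description calls for, using java.nio.file
 as a stand-in for the Hadoop FileSystem API that HBase actually uses; moving
 the stray file aside (e.g. to /tmp, as discussed on this issue) is one option:

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class RecoveredEditsCheck {
    // Ensure the recovered.edits *directory* exists, tolerating a stray
    // *file* of the same name left over from an upgrade.
    static void ensureDirectory(Path dir) throws Exception {
        if (Files.exists(dir) && !Files.isDirectory(dir)) {
            // A file is squatting on the directory's name: move it aside
            // instead of letting the mkdir fail and break log splitting.
            Path aside = dir.resolveSibling(dir.getFileName() + ".moved");
            Files.move(dir, aside, StandardCopyOption.REPLACE_EXISTING);
        }
        Files.createDirectories(dir); // no-op if it already exists as a dir
    }

    public static void main(String[] args) throws Exception {
        Path tmp = Files.createTempDirectory("re");
        Path dir = tmp.resolve("recovered.edits");
        Files.write(dir, new byte[]{1}); // simulate the stray file from upgrade
        ensureDirectory(dir);
        System.out.println(Files.isDirectory(dir)); // prints: true
    }
}
```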

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476814#comment-13476814
 ] 

Lars Hofhansl commented on HBASE-6942:
--

bq. We do not have any way to get only the 1st KVs from all the families of row 
right? We have FirstKeyOnlyFilter now.
Right. I can't see any way to only get the 1st KV for a CF. I think we can live 
with that for now.

bq. We need to pass the rowBatchSize too
Yes, forgot about that.

bq. Do we need a Request object taking up the attributes? Yes some thing inline 
of protobufs?
Not sure. I like just passing the four parameters needed. (If that is not 
possible with protobufs, we should use a request object.)


 Endpoint implementation for bulk delete rows
 

 Key: HBASE-6942
 URL: https://issues.apache.org/jira/browse/HBASE-6942
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Performance
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6942.patch, HBASE-6942_V2.patch, 
 HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch


 We can provide an endpoint implementation for doing a bulk deletion of 
 rows (based on a scan) at the server side. This can reduce the time taken for 
 such an operation, since right now the client needs to scan the rows and issue 
 delete(s) using the row keys.
 Think of a query like  delete from table1 where...

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-5843) Improve HBase MTTR - Mean Time To Recover

2012-10-16 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476830#comment-13476830
 ] 

nkeywal commented on HBASE-5843:


bq. Not sure what you mean here. Are the rows the same or not? Are there are 
just more flushes on the 10M case?
Yes, it's exactly the same rows for 1M puts and 10M puts.

 Improve HBase MTTR - Mean Time To Recover
 -

 Key: HBASE-5843
 URL: https://issues.apache.org/jira/browse/HBASE-5843
 Project: HBase
  Issue Type: Umbrella
Affects Versions: 0.96.0
Reporter: nkeywal
Assignee: nkeywal

 A part of the approach is described here: 
 https://docs.google.com/document/d/1z03xRoZrIJmg7jsWuyKYl6zNournF_7ZHzdi0qz_B4c/edit
 The ideal target is:
 - failures impact client applications only by adding a delay to execute a 
 query, whatever the failure;
 - this delay is always less than 1 second.
 We're not going to achieve that immediately...
 Priority will be given to the most frequent issues.
 Short term:
 - software crashes
 - standard administrative tasks such as stop/start of a cluster.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6962) Upgrade hadoop 1 dependency to hadoop 1.1

2012-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476935#comment-13476935
 ] 

Hudson commented on HBASE-6962:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #222 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/222/])
HBASE-6962 Upgrade hadoop 1 dependency to hadoop 1.1 (Revision 1398580)

 Result = FAILURE
enis : 
Files : 
* /hbase/trunk/pom.xml


 Upgrade hadoop 1 dependency to hadoop 1.1
 -

 Key: HBASE-6962
 URL: https://issues.apache.org/jira/browse/HBASE-6962
 Project: HBase
  Issue Type: Bug
 Environment: hadoop 1.1 contains multiple important fixes, including 
 HDFS-3703
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.96.0

 Attachments: 6962.txt




--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-6998) Uncaught exception in main() makes the HMaster/HRegionServer process hang

2012-10-16 Thread liang xie (JIRA)
liang xie created HBASE-6998:


 Summary: Uncaught exception in main() makes the 
HMaster/HRegionServer process hang
 Key: HBASE-6998
 URL: https://issues.apache.org/jira/browse/HBASE-6998
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.2, 0.96.0
 Environment: CentOS6.2 + CDH4.1 HDFS  + hbase0.94.2
Reporter: liang xie
Assignee: liang xie


I am trying the HDFS QJM feature in our test env. After a misconfiguration, I 
found the HMaster/HRegionServer process stays up even though the main thread is 
dead. Here is the stack trace:

Exception in thread "main" java.net.UnknownHostException: unknown host: cluster1
at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:214)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1196)
at org.apache.hadoop.ipc.Client.call(Client.java:1050)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy8.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at 
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3647)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3631)
at 
org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:61)
at 
org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:75)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:3691)

Then I need to kill the process manually each time, which is annoying.
After applying the attached patch, the process exits as expected, and I am 
happy again :)
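Why the process hangs, in one sketch: the JVM exits only when all non-daemon
threads finish, so if main() dies from an uncaught exception while non-daemon
service threads are still running, the process lingers. One common fix (a
sketch, not the attached patch) is to catch Throwable at the top of main() and
force an exit:

```java
public class GuardedMain {
    // Returns the exit code instead of calling System.exit(), so the
    // behavior can be exercised without killing the JVM.
    static int runGuarded(Runnable startup) {
        try {
            startup.run();
            return 0;
        } catch (Throwable t) { // e.g. the UnknownHostException above
            t.printStackTrace();
            return 1; // real code would call System.exit(1) here, which
                      // tears down any lingering non-daemon threads
        }
    }

    public static void main(String[] args) {
        int code = runGuarded(() -> {
            throw new RuntimeException("unknown host: cluster1");
        });
        System.out.println(code); // prints: 1
    }
}
```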

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6998) Uncaught exception in main() makes the HMaster/HRegionServer process hang

2012-10-16 Thread liang xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liang xie updated HBASE-6998:
-

Attachment: HBASE-6998.patch

 Uncaught exception in main() makes the HMaster/HRegionServer process hang
 -

 Key: HBASE-6998
 URL: https://issues.apache.org/jira/browse/HBASE-6998
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.2, 0.96.0
 Environment: CentOS6.2 + CDH4.1 HDFS  + hbase0.94.2
Reporter: liang xie
Assignee: liang xie
 Attachments: HBASE-6998.patch


 I am trying the HDFS QJM feature in our test env. After a misconfiguration, I 
 found the HMaster/HRegionServer process stays up even though the main thread 
 is dead. Here is the stack trace:
 Exception in thread "main" java.net.UnknownHostException: unknown host: 
 cluster1
 at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:214)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1196)
 at org.apache.hadoop.ipc.Client.call(Client.java:1050)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
 at $Proxy8.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
 at 
 org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
 at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
 at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
 at 
 org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
 at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3647)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3631)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:61)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:75)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:3691)
 Then I need to kill the process manually each time, which is annoying.
 After applying the attached patch, the process exits as expected, and I 
 am happy again :)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6998) Uncaught exception in main() makes the HMaster/HRegionServer process hang

2012-10-16 Thread liang xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liang xie updated HBASE-6998:
-

Description: 
I am trying the HDFS QJM feature in our test env. After a misconfiguration, I 
found the HMaster/HRegionServer process stays up even though the main thread is 
dead. Here is the stack trace:

Exception in thread "main" java.net.UnknownHostException: unknown host: cluster1
at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:214)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1196)
at org.apache.hadoop.ipc.Client.call(Client.java:1050)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy8.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at 
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3647)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3631)
at 
org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:61)
at 
org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:75)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:3691)

Then I need to kill the process manually to clean up each time, which is annoying.
After applying the attached patch, the process exits as expected, and I am 
happy again :)

  was:
I am trying HDFS QJM feature in our test env. after a misconfig, i found the 
HMaster/HRegionServer process still up if the main thread is dead. Here is the 
stack trace:

xception in thread main java.net.UnknownHostException: unknown host: cluster1
at org.apache.hadoop.ipc.Client$Connection.init(Client.java:214)
at org.apache.hadoop.ipc.Client.getConnection(Client.java:1196)
at org.apache.hadoop.ipc.Client.call(Client.java:1050)
at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
at $Proxy8.getProtocolVersion(Unknown Source)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
at 
org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:238)
at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:203)
at 
org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
at 
org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3647)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3631)
at 
org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:61)
at 
org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:75)
at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
at 
org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
at 
org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:3691)

Then i need to kill the process manually each time, so annoyed.
After applied the attached patch, the process will exist as expected, then i am 
happy again :)


 Uncaught exception in main() makes the HMaster/HRegionServer process hang
 -

 Key: HBASE-6998
 URL: https://issues.apache.org/jira/browse/HBASE-6998
 Project: 

[jira] [Commented] (HBASE-6974) Metric for blocked updates

2012-10-16 Thread Michael Drzal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13476960#comment-13476960
 ] 

Michael Drzal commented on HBASE-6974:
--

I'll try to get a patch up later today.

 Metric for blocked updates
 --

 Key: HBASE-6974
 URL: https://issues.apache.org/jira/browse/HBASE-6974
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Michael Drzal
Priority: Critical
 Fix For: 0.94.3, 0.96.0


 When the disk subsystem cannot keep up with a sustained high write load, a 
 region will eventually block updates to throttle clients 
 (HRegion.checkResources).
 It would be nice to have a metric for this, so that these occurrences can be 
 tracked.
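 A hedged sketch of the metric being requested: a monotonically increasing
 counter bumped whenever the resource check decides to block an update. Class
 and method names below are illustrative, not HBase's metrics API.

```java
import java.util.concurrent.atomic.AtomicLong;

public class RegionMetricsSketch {
    // A monotonic counter; the rate of blocked updates can then be graphed
    // from periodic snapshots of this value.
    final AtomicLong blockedUpdateCount = new AtomicLong();

    // Mirrors the decision HRegion.checkResources makes: block when the
    // memstore has grown past the blocking threshold.
    boolean checkResources(long memstoreSizeBytes, long blockThresholdBytes) {
        if (memstoreSizeBytes > blockThresholdBytes) {
            blockedUpdateCount.incrementAndGet(); // record the occurrence
            return true; // caller would now wait for flushes to catch up
        }
        return false;
    }

    public static void main(String[] args) {
        RegionMetricsSketch m = new RegionMetricsSketch();
        m.checkResources(512, 1024);  // under threshold: not blocked
        m.checkResources(2048, 1024); // over threshold: blocked and counted
        System.out.println(m.blockedUpdateCount.get()); // prints: 1
    }
}
```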

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6998) Uncaught exception in main() makes the HMaster/HRegionServer process hang

2012-10-16 Thread liang xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liang xie updated HBASE-6998:
-

Attachment: (was: HBASE-6998.patch)

 Uncaught exception in main() makes the HMaster/HRegionServer process hang
 -

 Key: HBASE-6998
 URL: https://issues.apache.org/jira/browse/HBASE-6998
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.2, 0.96.0
 Environment: CentOS6.2 + CDH4.1 HDFS  + hbase0.94.2
Reporter: liang xie
Assignee: liang xie
 Attachments: HBASE-6998.patch


 I am trying the HDFS QJM feature in our test env. After a misconfiguration, I 
 found the HMaster/HRegionServer process stays up even though the main thread 
 is dead. Here is the stack trace:
 Exception in thread "main" java.net.UnknownHostException: unknown host: 
 cluster1
 at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:214)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1196)
 at org.apache.hadoop.ipc.Client.call(Client.java:1050)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
 at $Proxy8.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
 at 
 org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
 at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
 at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
 at 
 org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
 at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3647)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3631)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:61)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:75)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:3691)
 Then I need to kill the process manually to clean up each time, which is annoying.
 After applying the attached patch, the process exits as expected, and I 
 am happy again :)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6998) Uncaught exception in main() makes the HMaster/HRegionServer process hang

2012-10-16 Thread liang xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liang xie updated HBASE-6998:
-

Attachment: HBASE-6998.patch

 Uncaught exception in main() makes the HMaster/HRegionServer process hang
 -

 Key: HBASE-6998
 URL: https://issues.apache.org/jira/browse/HBASE-6998
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.2, 0.96.0
 Environment: CentOS6.2 + CDH4.1 HDFS  + hbase0.94.2
Reporter: liang xie
Assignee: liang xie
 Attachments: HBASE-6998.patch


 I am trying the HDFS QJM feature in our test env. After a misconfiguration, I 
 found the HMaster/HRegionServer process stays up even though the main thread 
 is dead. Here is the stack trace:
 Exception in thread "main" java.net.UnknownHostException: unknown host: 
 cluster1
 at org.apache.hadoop.ipc.Client$Connection.<init>(Client.java:214)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1196)
 at org.apache.hadoop.ipc.Client.call(Client.java:1050)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
 at $Proxy8.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
 at 
 org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
 at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:238)
 at org.apache.hadoop.hdfs.DFSClient.<init>(DFSClient.java:203)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
 at 
 org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
 at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3647)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3631)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:61)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:75)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:3691)
 Then I need to kill the process manually to clean up each time, which is annoying.
 After applying the attached patch, the process exits as expected, and I 
 am happy again :)

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6998) Uncaught exception in main() makes the HMaster/HRegionServer process hang

2012-10-16 Thread liang xie (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

liang xie updated HBASE-6998:
-

Status: Patch Available  (was: Open)

 Uncatched exception in main() makes the HMaster/HRegionServer process suspend
 -

 Key: HBASE-6998
 URL: https://issues.apache.org/jira/browse/HBASE-6998
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.2, 0.96.0
 Environment: CentOS6.2 + CDH4.1 HDFS  + hbase0.94.2
Reporter: liang xie
Assignee: liang xie
 Attachments: HBASE-6998.patch


 I am trying the HDFS QJM feature in our test env. After a misconfiguration, I 
 found the HMaster/HRegionServer process stays up even though the main thread 
 is dead. Here is the stack trace:
 Exception in thread "main" java.net.UnknownHostException: unknown host: 
 cluster1
 at org.apache.hadoop.ipc.Client$Connection.init(Client.java:214)
 at org.apache.hadoop.ipc.Client.getConnection(Client.java:1196)
 at org.apache.hadoop.ipc.Client.call(Client.java:1050)
 at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:225)
 at $Proxy8.getProtocolVersion(Unknown Source)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:396)
 at org.apache.hadoop.ipc.RPC.getProxy(RPC.java:379)
 at 
 org.apache.hadoop.hdfs.DFSClient.createRPCNamenode(DFSClient.java:119)
 at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:238)
 at org.apache.hadoop.hdfs.DFSClient.init(DFSClient.java:203)
 at 
 org.apache.hadoop.hdfs.DistributedFileSystem.initialize(DistributedFileSystem.java:89)
 at 
 org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1386)
 at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:66)
 at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1404)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:254)
 at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:123)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3647)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.startRegionServer(HRegionServer.java:3631)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.start(HRegionServerCommandLine.java:61)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServerCommandLine.run(HRegionServerCommandLine.java:75)
 at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
 at 
 org.apache.hadoop.hbase.util.ServerCommandLine.doMain(ServerCommandLine.java:76)
 at 
 org.apache.hadoop.hbase.regionserver.HRegionServer.main(HRegionServer.java:3691)
 Then I need to kill the process manually to clean up each time, which is 
 annoying. After applying the attached patch, the process exits as expected, 
 and I am happy again :)
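A minimal sketch of the kind of fix described here: catch the otherwise-uncaught exception in main() and force the JVM down, so non-daemon threads started before the failure cannot keep the process alive. All names are hypothetical stand-ins; the real fix is in the attached patch.

```java
public class ServerMain {

    // Simulated startup routine that fails the way the stack trace above
    // does (hypothetical stand-in for the real region server startup).
    static void run(String[] args) {
        throw new RuntimeException("unknown host: cluster1");
    }

    // Returns an exit status instead of letting the exception escape, so
    // main() can call System.exit() and take the whole JVM down with it.
    static int runSafely(String[] args) {
        try {
            run(args);
            return 0;
        } catch (Throwable t) {
            t.printStackTrace();
            return 1;
        }
    }

    public static void main(String[] args) {
        // Without the explicit exit, non-daemon threads would keep the
        // process alive even though main() is dead, as reported above.
        System.exit(runSafely(args));
    }
}
```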

--


[jira] [Commented] (HBASE-6965) Generic MXBean Utility class to support all JDK vendors

2012-10-16 Thread Kumar Ravi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13476983#comment-13476983
 ] 

Kumar Ravi commented on HBASE-6965:
---

As suggested, I have combined the two patches into one. I ran the above core 
tests with multiple JDKs before and after submitting the patch and have been 
unable to recreate the problems in the above builds. I understand that the 
problem reported in the above pre-commit build is in the hbase-server area, 
whereas the patch above applies to the hbase-common module. Can someone 
suggest next steps on how to resolve this issue?

 Generic MXBean Utility class to support all JDK vendors
 ---

 Key: HBASE-6965
 URL: https://issues.apache.org/jira/browse/HBASE-6965
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.94.1
Reporter: Kumar Ravi
Assignee: Kumar Ravi
  Labels: patch
 Fix For: 0.94.3

 Attachments: HBASE-6965.patch


 This issue is related to JIRA 
 https://issues.apache.org/jira/browse/HBASE-6945. This issue is opened to 
 propose the use of a newly created generic 
 org.apache.hadoop.hbase.util.OSMXBean class that can be used by other 
 classes. JIRA HBASE-6945 contains a patch for the class 
 org.apache.hadoop.hbase.ResourceChecker that uses OSMXBean. With the 
 inclusion of this new class, HBase can be built and become functional with 
 JDKs and JREs other than what is provided by Oracle.
  This class uses reflection to determine the JVM vendor (Sun, IBM) and the 
 platform (Linux or Windows), and contains other methods that return the OS 
 properties - 1. Number of Open File descriptors;  2. Maximum number of File 
 Descriptors.
  This class compiles without any problems with IBM JDK 7, OpenJDK 6, as well 
 as Oracle JDK 6. JUnit tests (runDevTests category) completed without any 
 failures or errors when tested on all three JDKs. The builds and tests 
 were attempted on branch hbase-0.94, revision 1396305.

--


[jira] [Commented] (HBASE-6965) Generic MXBean Utility class to support all JDK vendors

2012-10-16 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13476997#comment-13476997
 ] 

nkeywal commented on HBASE-6965:


Hi Kumar,

The issue above is very unlikely to be caused by your patch. Precommit is an 
environment where all tests are executed before a patch is integrated, but 
some tests are, unfortunately, flaky, and that is likely the cause here. The 
next step for your patch is to be reviewed and committed to trunk. Ted 
reviewed it already, so if he's OK with it he will commit it. If he does not 
look at it within two days I will have a look at it myself (or another 
committer will take the lead in between). Thanks for your contribution :-)

 Generic MXBean Utility class to support all JDK vendors
 ---

 Key: HBASE-6965
 URL: https://issues.apache.org/jira/browse/HBASE-6965
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.94.1
Reporter: Kumar Ravi
Assignee: Kumar Ravi
  Labels: patch
 Fix For: 0.94.3

 Attachments: HBASE-6965.patch



--


[jira] [Commented] (HBASE-6998) Uncatched exception in main() makes the HMaster/HRegionServer process suspend

2012-10-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477013#comment-13477013
 ] 

Hadoop QA commented on HBASE-6998:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12549299/HBASE-6998.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
82 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3057//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3057//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3057//console

This message is automatically generated.

 Uncatched exception in main() makes the HMaster/HRegionServer process suspend
 -

 Key: HBASE-6998
 URL: https://issues.apache.org/jira/browse/HBASE-6998
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.2, 0.96.0
 Environment: CentOS6.2 + CDH4.1 HDFS  + hbase0.94.2
Reporter: liang xie
Assignee: liang xie
 Attachments: HBASE-6998.patch



[jira] [Commented] (HBASE-6965) Generic MXBean Utility class to support all JDK vendors

2012-10-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477023#comment-13477023
 ] 

Ted Yu commented on HBASE-6965:
---

{code}
+ * It will decide to use the sun api or its own implementation
{code}
I think it's better to replace "sun" with "Oracle".
{code}
+public class OSMXBean
{code}
Please add annotation for audience and stability for the above class.
{code}
+   * Check if the OS is unix. If using the IBM java runtime, this
+   * will only work for linux.
{code}
Do you need to mention IBM in the above javadoc?
{code}
+  public boolean getUnix() {
{code}
Rename method to isUnix().
{code}
+  private Long getOSUnixMXBeanMethod (String mBeanMethodName)
{code}
Rename the above method to runUnixMXBeanMethod().
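For context, a self-contained sketch (plain JDK, hypothetical class and method names following the renames suggested above) of the two methods under review, using reflection against the OS MXBean the way the issue description says OSMXBean does:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.lang.reflect.Method;

public class OsMetrics {
    private static final OperatingSystemMXBean OS =
        ManagementFactory.getOperatingSystemMXBean();

    // True on Unix-like platforms; getName() mirrors the "os.name" property.
    public static boolean isUnix() {
        return !OS.getName().toLowerCase().startsWith("windows");
    }

    // Reflectively call a vendor-specific MXBean method such as
    // "getOpenFileDescriptorCount"; returns null if this JVM vendor
    // does not expose it, instead of failing the build at compile time.
    public static Long runUnixMXBeanMethod(String name) {
        try {
            for (Class<?> iface : OS.getClass().getInterfaces()) {
                try {
                    Method m = iface.getMethod(name);
                    return ((Number) m.invoke(OS)).longValue();
                } catch (NoSuchMethodException ignored) {
                    // keep looking on the other implemented interfaces
                }
            }
            return null;
        } catch (Exception e) {
            return null;
        }
    }
}
```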


 Generic MXBean Utility class to support all JDK vendors
 ---

 Key: HBASE-6965
 URL: https://issues.apache.org/jira/browse/HBASE-6965
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.94.1
Reporter: Kumar Ravi
Assignee: Kumar Ravi
  Labels: patch
 Fix For: 0.94.3

 Attachments: HBASE-6965.patch



--


[jira] [Updated] (HBASE-6998) Uncaught exception in main() makes the HMaster/HRegionServer process suspend

2012-10-16 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-6998:
--

Summary: Uncaught exception in main() makes the HMaster/HRegionServer 
process suspend  (was: Uncatched exception in main() makes the 
HMaster/HRegionServer process suspend)

 Uncaught exception in main() makes the HMaster/HRegionServer process suspend
 

 Key: HBASE-6998
 URL: https://issues.apache.org/jira/browse/HBASE-6998
 Project: HBase
  Issue Type: Bug
  Components: master, regionserver
Affects Versions: 0.94.2, 0.96.0
 Environment: CentOS6.2 + CDH4.1 HDFS  + hbase0.94.2
Reporter: liang xie
Assignee: liang xie
 Attachments: HBASE-6998.patch



--


[jira] [Commented] (HBASE-4955) Use the official versions of surefire & junit

2012-10-16 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477031#comment-13477031
 ] 

nkeywal commented on HBASE-4955:


From the JUnit mailing list (15th Oct): "I am happy to announce the release of 
JUnit 4.11-beta-1. There have been a lot of contributions by a full cast of 
contributors." So, with some luck, we will have the release this quarter.
Surefire: still waiting for #800. Maybe it will make it into 2.13. No date.

 Use the official versions of surefire & junit
 -

 Key: HBASE-4955
 URL: https://issues.apache.org/jira/browse/HBASE-4955
 Project: HBase
  Issue Type: Improvement
  Components: test
Affects Versions: 0.94.0
 Environment: all
Reporter: nkeywal
Assignee: nkeywal
Priority: Minor

 We currently use private versions of Surefire & JUnit since HBASE-4763.
 This JIRA tracks what we need in order to move to the official versions.
 Surefire 2.11 is just out but, after some tests, it does not contain all 
 that we need.
 JUnit: could be for JUnit 4.11. Issue to monitor:
 https://github.com/KentBeck/junit/issues/359: fixed in our version, no 
 feedback for an integration on trunk
 Surefire: could be for Surefire 2.12. Issues to monitor are:
 329 (category support): fixed, we use the official implementation from the 
 trunk
 786 (@Category with forkMode=always): fixed, we use the official 
 implementation from the trunk
 791 (incorrect elapsed time on test failure): fixed, we use the official 
 implementation from the trunk
 793 (incorrect time in the XML report): not fixed (reopened) on trunk, fixed 
 in our version
 760 (does not take into account the test method): fixed in trunk, not fixed 
 in our version
 798 (print immediately the test class name): not fixed in trunk, not fixed in 
 our version
 799 (allow test parallelization when forkMode=always): not fixed in trunk, 
 not fixed in our version
 800 (redirectTestOutputToFile not taken into account): not yet fixed on 
 trunk, fixed in our version
 800 & 793 are the most important to monitor; they are the only ones that are 
 fixed in our version but not on trunk.

--


[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477064#comment-13477064
 ] 

Ted Yu commented on HBASE-6942:
---

{code}
+long noOfRowsDeleted = invokeBulkDeleteProtocol(tableName, new Scan(), 
500, DeleteType.ROW,
+null);
{code}
I think the test should also cover the case where batchSize is smaller than the 
number of rows to be deleted.

 Endpoint implementation for bulk delete rows
 

 Key: HBASE-6942
 URL: https://issues.apache.org/jira/browse/HBASE-6942
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Performance
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6942.patch, HBASE-6942_V2.patch, 
 HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch


 We can provide an endpoint implementation for doing a bulk deletion of rows 
 (based on a scan) at the server side. This can reduce the time taken for such 
 an operation, as right now it needs to scan the rows back to the client and 
 issue delete(s) using row keys.
 A query like "delete from table1 where..."

--


[jira] [Work started] (HBASE-6974) Metric for blocked updates

2012-10-16 Thread Michael Drzal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Work on HBASE-6974 started by Michael Drzal.

 Metric for blocked updates
 --

 Key: HBASE-6974
 URL: https://issues.apache.org/jira/browse/HBASE-6974
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Michael Drzal
Priority: Critical
 Fix For: 0.94.3, 0.96.0


 When the disk subsystem cannot keep up with a sustained high write load, a 
 region will eventually block updates to throttle clients 
 (HRegion.checkResources).
 It would be nice to have a metric for this, so that these occurrences can be 
 tracked.
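A minimal sketch of what such a metric could look like; the class and method names are hypothetical, and the actual patch would register the counter with HBase's metrics framework rather than expose it as a bare getter:

```java
import java.util.concurrent.atomic.AtomicLong;

public class RegionMetrics {
    // Cumulative count of times updates were blocked (hypothetical name).
    private final AtomicLong blockedUpdates = new AtomicLong();

    // Would be called from the equivalent of HRegion.checkResources() when
    // the memstore is over the blocking threshold and writers must wait.
    public void incrementBlockedUpdates() {
        blockedUpdates.incrementAndGet();
    }

    public long getBlockedUpdatesCount() {
        return blockedUpdates.get();
    }
}
```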

--


[jira] [Commented] (HBASE-6965) Generic MXBean Utility class to support all JDK vendors

2012-10-16 Thread Kumar Ravi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477069#comment-13477069
 ] 

Kumar Ravi commented on HBASE-6965:
---

Ted and nkeywal - Thanks for your comments and your patience working with me 
on this JIRA. 

Ted, I'm working on addressing the concerns you raised above. I do have one 
question: I'm not sure what you mean by "Please add annotation for audience 
and stability for the above class."

I think by "audience" you mean when this class gets invoked? I am not clear 
about what you mean by "stability".
If you could point me to some examples in the HBase code, that would be great.
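For reference, the audience/stability annotations Ted refers to look roughly like this on other HBase utility classes (the package name and annotation values below are assumptions from the Hadoop classification annotations of that era; check a neighboring class in the util package for the exact import and values):

```java
import org.apache.hadoop.classification.InterfaceAudience;
import org.apache.hadoop.classification.InterfaceStability;

// Audience: who may depend on this class (Public / LimitedPrivate / Private).
// Stability: how freely its API may change between releases.
@InterfaceAudience.Private
@InterfaceStability.Evolving
public class OSMXBean {
    // ...
}
```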


 Generic MXBean Utility class to support all JDK vendors
 ---

 Key: HBASE-6965
 URL: https://issues.apache.org/jira/browse/HBASE-6965
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.94.1
Reporter: Kumar Ravi
Assignee: Kumar Ravi
  Labels: patch
 Fix For: 0.94.3

 Attachments: HBASE-6965.patch



--


[jira] [Commented] (HBASE-6965) Generic MXBean Utility class to support all JDK vendors

2012-10-16 Thread Kumar Ravi (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6965?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477089#comment-13477089
 ] 

Kumar Ravi commented on HBASE-6965:
---

I looked at the other classes in the util sub-directory and I now understand 
what is meant by "audience" and "stability". Please ignore my earlier comment.
Thanks

 Generic MXBean Utility class to support all JDK vendors
 ---

 Key: HBASE-6965
 URL: https://issues.apache.org/jira/browse/HBASE-6965
 Project: HBase
  Issue Type: Improvement
  Components: build
Affects Versions: 0.94.1
Reporter: Kumar Ravi
Assignee: Kumar Ravi
  Labels: patch
 Fix For: 0.94.3

 Attachments: HBASE-6965.patch



--


[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477128#comment-13477128
 ] 

Ted Yu commented on HBASE-6942:
---

{code}
BulkDeleteResponse delete(Scan scan, DeleteType type, Long timestamp, int 
batchSize)
{code}
What if the user wants to delete more than one column family? How would he/she 
formulate that in a single request?

 Endpoint implementation for bulk delete rows
 

 Key: HBASE-6942
 URL: https://issues.apache.org/jira/browse/HBASE-6942
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Performance
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6942.patch, HBASE-6942_V2.patch, 
 HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch



--


[jira] [Commented] (HBASE-5783) Faster HBase bulk loader

2012-10-16 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477137#comment-13477137
 ] 

Karthik Ranganathan commented on HBASE-5783:


No, we track only the last (highest) one per region. Also, in the actual 
implementation, we did it with just timestamps from the RS. So, after doing all 
the puts, the loader gets the time on the RS (t1). The server tracks the start 
time of the last successfully completed flush (t2). Querying that and making 
sure t2 > t1 is enough. Of course, if the region has moved gracefully, that's 
considered a success too, as an optimization.

We used the term "MR Bulk Loader" simply to say that the load of the data 
should be repeatable in case of failure (as opposed to an online use case).
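The per-regionserver check Karthik describes can be modeled in a few lines. This is a sketch under stated assumptions: the interface below is hypothetical (not an HBase API), standing in for "ask the RS for its clock and for the start time of its last completed flush", with the manual-flush fallback from step 5 of the description.

```java
import java.util.concurrent.TimeUnit;

public class FlushCheck {
    /** Hypothetical view of a regionserver: RS clock, start time of the
     *  last successfully completed flush (t2), and a manual flush hook. */
    interface RegionServerView {
        long currentTimeMillis();
        long lastCompletedFlushStart();
        void requestFlush();
    }

    /** After the puts, capture the RS clock (t1); succeed once the start
     *  time of the last completed flush (t2) passes t1, i.e. t2 > t1.
     *  If the timeout expires, issue one manual flush and wait again;
     *  on a second timeout the caller (map task) fails and retries. */
    static boolean waitForDurability(RegionServerView rs, long timeoutMs)
            throws InterruptedException {
        long t1 = rs.currentTimeMillis();
        long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
        boolean flushRequested = false;
        while (rs.lastCompletedFlushStart() <= t1) {
            if (System.nanoTime() >= deadline) {
                if (flushRequested) return false; // give up; task retries
                rs.requestFlush();                // manual flush fallback
                flushRequested = true;
                deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
            }
            Thread.sleep(10);
        }
        return true; // t2 > t1: everything this client put is on disk
    }
}
```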

 Faster HBase bulk loader
 

 Key: HBASE-5783
 URL: https://issues.apache.org/jira/browse/HBASE-5783
 Project: HBase
  Issue Type: New Feature
  Components: Client, IPC/RPC, Performance, regionserver
Reporter: Karthik Ranganathan
Assignee: Amitanand Aiyer

 We can get a 3x to 4x gain, based on a prototype demonstrating this approach 
 in effect (hackily), over the MR bulk loader for very large data sets by doing 
 the following:
 1. Do direct multi-puts from the HBase client using GZIP-compressed RPCs
 2. Turn off the WAL (we will ensure no data loss in another way)
 3. For each bulk load client, we need to:
 3.1 do a put
 3.2 get back a tracking cookie (memstoreTs or HLogSequenceId) per put
 3.3 be able to ask the RS if the tracking cookie has been flushed to disk
 4. For each client, we can succeed it if the tracking cookie for the last put 
 it did (for every RS) makes it to disk. Otherwise the map task fails and is 
 retried.
 5. If the last put did not make it to disk within a timeout (say a second or 
 so), we issue a manual flush.
 Enhancements:
 - Increase the memstore size so that we flush larger files
 - Decrease the compaction ratios (i.e., increase the number of files to 
 compact)
 Quick background:
 The bottlenecks in the multiput approach are that the data is transferred 
 *uncompressed* twice over the top-of-rack: once from the client to the RS (on 
 the multi-put call) and again because of the WAL (HDFS replication). We 
 reduced the former with RPC compression and eliminated the latter above while 
 still guaranteeing that data won't be lost.
 This is better than the MR bulk loader at a high level because we don't need 
 to merge-sort all the files for a given region and then turn them into an 
 HFile - that's the equivalent of bulk loading AND major-compacting in one 
 shot. Also there is much more disk involved in the MR method (sort/spill).

--


[jira] [Commented] (HBASE-6979) recovered.edits file should not break distributed log splitting

2012-10-16 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6979?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477141#comment-13477141
 ] 

Jimmy Xiang commented on HBASE-6979:


Thanks a lot for the review. I will commit this to trunk tomorrow if there is 
no objection.

 recovered.edits file should not break distributed log splitting
 ---

 Key: HBASE-6979
 URL: https://issues.apache.org/jira/browse/HBASE-6979
 Project: HBase
  Issue Type: Improvement
  Components: master
Reporter: Jimmy Xiang
Assignee: Jimmy Xiang
 Attachments: trunk-6979.patch


 Distributed log splitting fails in creating the recovered.edits folder during 
 upgrade because there is a file called recovered.edits there.
 Instead of checking if the path exists, we need to check if it exists and is 
 a directory.



[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477143#comment-13477143
 ] 

Anoop Sam John commented on HBASE-6942:
---

bq. What if user wants to delete more than one column family? How would he/she 
formulate that through one request?
When the type is FAMILY, we will delete all the families coming as part of the 
scan result. So add N families to the Scan.

Yes, I will add that test case as well.
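The family-delete behavior described above can be sketched as follows (a
simplified, self-contained model of the endpoint logic, not the actual
coprocessor code - the tuple shape and function name are assumptions): for
each row in the scan result, every family that appears becomes a delete
target, so selecting N families in the Scan deletes N families.

```python
# Simplified model: derive per-row family deletes from scanned KVs.
def build_family_deletes(scan_result):
    """scan_result: list of (row, family, qualifier) tuples from next()."""
    deletes = {}  # row -> set of families to delete for that row
    for row, family, _qualifier in scan_result:
        deletes.setdefault(row, set()).add(family)
    return deletes


kvs = [("r1", "cf1", "q1"), ("r1", "cf2", "q1"), ("r2", "cf1", "q5")]
assert build_family_deletes(kvs) == {"r1": {"cf1", "cf2"}, "r2": {"cf1"}}
```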


 Endpoint implementation for bulk delete rows
 

 Key: HBASE-6942
 URL: https://issues.apache.org/jira/browse/HBASE-6942
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Performance
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6942.patch, HBASE-6942_V2.patch, 
 HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch


 We can provide an endpoint implementation for doing a bulk deletion of 
 rows (based on a scan) at the server side. This can reduce the time taken for 
 such an operation, as right now the client needs to scan the rows and then 
 issue delete(s) using the row keys.
 Query like: delete from table1 where ...



[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477156#comment-13477156
 ] 

Ted Yu commented on HBASE-6942:
---

bq. as part of the scan result
List<KeyValue> is returned from scanner.next(). It would be easier to 
understand the user's intention through Scan.getFamilyMap() instead of 
analyzing the scan result.




[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477174#comment-13477174
 ] 

Lars Hofhansl commented on HBASE-6942:
--

I don't necessarily agree here, Ted. Analyzing the scan result is the whole 
point of this jira. Passing a template family map will not make this easier for 
a user (IMHO).




[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477177#comment-13477177
 ] 

Ted Yu commented on HBASE-6942:
---

bq. Just worried that N will necessarily be (much?) larger than M.
How do we correlate the scan result with which column families to delete?




[jira] [Commented] (HBASE-6980) Parallel Flushing Of Memstores

2012-10-16 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477194#comment-13477194
 ] 

Karthik Ranganathan commented on HBASE-6980:


@ramakrishna - this should not be necessary for ensuring no data loss, right? 
Once we have a snapshot memstore, we automatically should know the max seq id 
up to which it has data - that would never change.

1. From what I remember of the code (when I was looking into something 
unrelated), we track the *min* seq id from the current memstore instead of the 
max seq id from the snapshot memstore to put into the HLog when it's rolled 
after a flush. So this synchronization becomes necessary - if we store the max 
seq id along with the memstore that is flushed, we should be able to eliminate 
the locks.

2. Also, it's arguable whether we need the absolutely correct max-seq-id 
flushed. In a very small % of cases, we would end up rolling logs a bit 
slower. As long as we are conservative with updating the max seq id in the 
HLog we should be good, right?

 Parallel Flushing Of Memstores
 --

 Key: HBASE-6980
 URL: https://issues.apache.org/jira/browse/HBASE-6980
 Project: HBase
  Issue Type: New Feature
Reporter: Kannan Muthukkaruppan
Assignee: Kannan Muthukkaruppan

 For write-dominated workloads, single-threaded memstore flushing is an 
 unnecessary bottleneck. With a single flusher thread, we are basically not 
 set up to take advantage of the aggregate throughput that multi-disk nodes 
 provide.
 * For puts with WAL enabled, the bottleneck is more likely the single WAL 
 per region server. So this particular fix may not buy as much unless we 
 unlock that bottleneck with multiple commit logs per region server. (Topic 
 for a separate JIRA-- HBASE-6981).
 * But for puts with WAL disabled (e.g., when using HBASE-5783 style fast bulk 
 imports), we should be able to support much better ingest rates with parallel 
 flushing of memstores.
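The core change proposed here can be sketched with a worker pool (a minimal
model under the assumption that flushes of distinct memstores are independent;
flush_memstore is a hypothetical stand-in, not an HBase API):

```python
# Simplified model: flush several memstores concurrently instead of
# through a single flusher thread.
from concurrent.futures import ThreadPoolExecutor

def flush_memstore(name):
    # Stand-in for writing one memstore snapshot out as an HFile.
    return f"{name}-hfile"

def flush_all(memstores, workers=4):
    # pool.map preserves input order in its results.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(flush_memstore, memstores))


assert flush_all(["m1", "m2", "m3"]) == ["m1-hfile", "m2-hfile", "m3-hfile"]
```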



[jira] [Commented] (HBASE-6949) Automatically delete empty directories in CleanerChore

2012-10-16 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477207#comment-13477207
 ] 

Jesse Yates commented on HBASE-6949:


[~stack], [~lhofhansl] what do you guys think? Good to go?

 Automatically delete empty directories in CleanerChore
 --

 Key: HBASE-6949
 URL: https://issues.apache.org/jira/browse/HBASE-6949
 Project: HBase
  Issue Type: Bug
Affects Versions: 0.94.3, 0.96.0
Reporter: Jesse Yates
Assignee: Jesse Yates
 Fix For: 0.94.3, 0.96.0

 Attachments: hbase-6949-v0.patch, hbase-6949-v1.patch


 Currently the CleanerChore asks cleaner delegates whether both directories 
 and files should be deleted. However, this leads to somewhat odd behavior in 
 some delegates - you don't actually care whether the directory hierarchy is 
 preserved, just the files; this means you always delete directories and then 
 implement the logic you actually want for preserving files. Instead we can 
 handle this logic one layer higher in the CleanerChore and let the delegates 
 just worry about preserving files.
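The proposed split of responsibilities can be sketched as follows (a
simplified, self-contained model, not the actual CleanerChore code - the
function names are assumptions): the chore consults the delegate only about
files, then prunes directories that end up empty on its own.

```python
# Simplified model: delegates decide file retention; the chore itself
# removes directories that become empty, bottom-up.
import os
import tempfile

def clean(root, keep_file):
    """Delete files the delegate rejects, then prune empty directories."""
    for dirpath, dirnames, filenames in os.walk(root, topdown=False):
        for f in filenames:
            if not keep_file(f):
                os.remove(os.path.join(dirpath, f))
        if dirpath != root and not os.listdir(dirpath):
            os.rmdir(dirpath)


root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "empty", "nested"))
open(os.path.join(root, "keep.hfile"), "w").close()
clean(root, keep_file=lambda name: name.endswith(".hfile"))
assert os.listdir(root) == ["keep.hfile"]  # empty dirs pruned, file kept
```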



[jira] [Updated] (HBASE-6797) TestHFileCleaner#testHFileCleaning sometimes fails in trunk

2012-10-16 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6797?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-6797:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

 TestHFileCleaner#testHFileCleaning sometimes fails in trunk
 ---

 Key: HBASE-6797
 URL: https://issues.apache.org/jira/browse/HBASE-6797
 Project: HBase
  Issue Type: Bug
Reporter: Ted Yu
Assignee: Jesse Yates
 Attachments: hbase-6797-v0.patch, hbase-6797-v1.patch


 In build #3334, I saw:
 {code}
 java.lang.AssertionError: expected:1 but was:0
   at org.junit.Assert.fail(Assert.java:93)
   at org.junit.Assert.failNotEquals(Assert.java:647)
   at org.junit.Assert.assertEquals(Assert.java:128)
   at org.junit.Assert.assertEquals(Assert.java:472)
   at org.junit.Assert.assertEquals(Assert.java:456)
   at 
 org.apache.hadoop.hbase.master.cleaner.TestHFileCleaner.testHFileCleaning(TestHFileCleaner.java:88)
 {code}



[jira] [Updated] (HBASE-6858) Fix the incorrect BADVERSION checking in the recoverable zookeeper

2012-10-16 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6858:
---

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Integrated into trunk. Thanks all for the review.

 Fix the incorrect BADVERSION checking in the recoverable zookeeper
 --

 Key: HBASE-6858
 URL: https://issues.apache.org/jira/browse/HBASE-6858
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Liyin Tang
Assignee: Liyin Tang
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-6858.patch, HBASE-6858_v2.patch, 
 HBASE-6858_v3.patch, trunk-6858.patch, trunk-6858_v2.patch, 
 trunk-6858_v3.patch


 Thanks to Stack and Kaka for reporting that there is a bug in the recoverable 
 zookeeper when handling a BADVERSION exception from setData(). It shall 
 compare the ID payload of the data in zk with its own identifier.



[jira] [Resolved] (HBASE-6986) Reenable TestClientTimeouts for security build

2012-10-16 Thread Gregory Chanan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gregory Chanan resolved HBASE-6986.
---

Resolution: Won't Fix

Marking as Won't Fix.  Getting it to work with both the secure and non-secure 
builds is difficult.  The issue is you don't seem to be able to change the 
invocation handler for a proxy once it's been set.  I want to set my own 
handler and dispatch through the actual invocation handler for the RpcEngine, 
but I don't know how to create the InvocationHandler for an arbitrary 
RpcEngine.  I can maintain a mapping for each type of RpcEngine, but that code 
ended up looking pretty ugly.

I still think having an RpcEngine that throws random SocketTimeoutExceptions is 
useful for testing, but I'll investigate doing it only on trunk via HBASE-6987.

 Reenable TestClientTimeouts for security build
 --

 Key: HBASE-6986
 URL: https://issues.apache.org/jira/browse/HBASE-6986
 Project: HBase
  Issue Type: Sub-task
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.94.3


 TestClientTimeouts was disabled to get 0.94.2 out the door because it didn't 
 work in security build.  Investigate and reenable.



[jira] [Commented] (HBASE-6858) Fix the incorrect BADVERSION checking in the recoverable zookeeper

2012-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477306#comment-13477306
 ] 

Hudson commented on HBASE-6858:
---

Integrated in HBase-TRUNK #3450 (See 
[https://builds.apache.org/job/HBase-TRUNK/3450/])
HBASE-6858 Fix the incorrect BADVERSION checking in the recoverable 
zookeeper (Revision 1398920)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java


 Fix the incorrect BADVERSION checking in the recoverable zookeeper
 --

 Key: HBASE-6858
 URL: https://issues.apache.org/jira/browse/HBASE-6858
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Liyin Tang
Assignee: Liyin Tang
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-6858.patch, HBASE-6858_v2.patch, 
 HBASE-6858_v3.patch, trunk-6858.patch, trunk-6858_v2.patch, 
 trunk-6858_v3.patch


 Thanks to Stack and Kaka for reporting that there is a bug in the recoverable 
 zookeeper when handling a BADVERSION exception from setData(). It shall 
 compare the ID payload of the data in zk with its own identifier.



[jira] [Commented] (HBASE-6894) Adding metadata to a table in the shell is both arcane and painful

2012-10-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477352#comment-13477352
 ] 

Sergey Shelukhin commented on HBASE-6894:
-

ping? :)

 Adding metadata to a table in the shell is both arcane and painful
 --

 Key: HBASE-6894
 URL: https://issues.apache.org/jira/browse/HBASE-6894
 Project: HBase
  Issue Type: Bug
  Components: shell
Affects Versions: 0.96.0
Reporter: stack
Assignee: Sergey Shelukhin
  Labels: noob
 Attachments: HBASE-6894.patch, HBASE-6894.patch, HBASE-6894.patch


 In production we have hundreds of tables w/ whack names like 'aliaserv', 
 'ashish_bulk', 'age_gender_topics', etc.  It be grand if you could look in 
 master UI and see stuff like owner, eng group responsible, miscellaneous 
 description, etc.
 Now, HTD has support for this; each carries a dictionary.  What's a PITA 
 though is adding attributes to the dictionary.  Here is what seems to work on 
 trunk (though I do not trust it is doing the right thing):
 {code}
 hbase> create 'SOME_TABLENAME', {NAME => 'd', VERSION => 1, COMPRESSION => 
 'LZO'}
 hbase> # Here is how I added metadata
 hbase> disable 'SOME_TABLENAME'
 hbase> alter 'SOME_TABLENAME', METHOD => 'table_att', OWNER => 'SOMEON', 
 CONFIG => {'ENVIRONMENT' => 'BLAH BLAH', 'SIZING' => 'The size should be 
 between 0-10K most of the time with new URLs coming in and getting removed as 
 they are processed unless the pipeline has fallen behind', 'MISCELLANEOUS' => 
 'Holds the list of URLs waiting to be processed in the parked page detection 
 analyzer in ingestion pipeline.'}
 ...
 describe...
 enable...
 {code}
 The above doesn't work in 0.94.  Complains about the CONFIG, the keyword we 
 are using for the HTD dictionary.
 It works in 0.96 though I'd have to poke around some more to ensure it is 
 doing the right thing.
 But this METHOD => 'table_att' stuff is really ugly; can we fix it?
 And I can't add table attributes on table create seemingly.
 A little bit of thought and a bit of ruby could clean this all up.



[jira] [Commented] (HBASE-6583) Enhance Hbase load test tool to automatically create cf's if not present

2012-10-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477358#comment-13477358
 ] 

Sergey Shelukhin commented on HBASE-6583:
-

as in, it creates columns if not present... although to actually use this 
functionality some other change would need to be made - I noticed columns are 
hardcoded

 Enhance Hbase load test tool to automatically create cf's if not present
 

 Key: HBASE-6583
 URL: https://issues.apache.org/jira/browse/HBASE-6583
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Karthik Ranganathan
Assignee: Sergey Shelukhin
  Labels: noob
 Attachments: HBASE-6583.patch, HBASE-6583.patch


 The load test tool currently disables the table and applies any changes to 
 the cf descriptor if any, but does not create the cf if not present.



[jira] [Commented] (HBASE-6583) Enhance Hbase load test tool to automatically create cf's if not present

2012-10-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477357#comment-13477357
 ] 

Sergey Shelukhin commented on HBASE-6583:
-

verified it actually works

 Enhance Hbase load test tool to automatically create cf's if not present
 

 Key: HBASE-6583
 URL: https://issues.apache.org/jira/browse/HBASE-6583
 Project: HBase
  Issue Type: Bug
  Components: test
Reporter: Karthik Ranganathan
Assignee: Sergey Shelukhin
  Labels: noob
 Attachments: HBASE-6583.patch, HBASE-6583.patch


 The load test tool currently disables the table and applies any changes to 
 the cf descriptor if any, but does not create the cf if not present.



[jira] [Updated] (HBASE-6974) Metric for blocked updates

2012-10-16 Thread Michael Drzal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Drzal updated HBASE-6974:
-

Attachment: HBASE-6974.patch

First shot at this.

 Metric for blocked updates
 --

 Key: HBASE-6974
 URL: https://issues.apache.org/jira/browse/HBASE-6974
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Michael Drzal
Priority: Critical
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6974.patch


 When the disc subsystem cannot keep up with a sustained high write load, a 
 region will eventually block updates to throttle clients.
 (HRegion.checkResources).
 It would be nice to have a metric for this, so that these occurrences can be 
 tracked.



[jira] [Updated] (HBASE-6974) Metric for blocked updates

2012-10-16 Thread Michael Drzal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Drzal updated HBASE-6974:
-

Status: Patch Available  (was: In Progress)




[jira] [Reopened] (HBASE-6577) RegionScannerImpl.nextRow() should seek to next row

2012-10-16 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-6577:
--


 RegionScannerImpl.nextRow() should seek to next row
 ---

 Key: HBASE-6577
 URL: https://issues.apache.org/jira/browse/HBASE-6577
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Attachments: 6577-0.94.txt, 6577.txt, 6577-v2.txt


 RegionScannerImpl.nextRow() is called when a filter filters the entire row. 
 In that case we should seek to the next row rather than iterating over all 
 versions of all columns to get there.



[jira] [Commented] (HBASE-6577) RegionScannerImpl.nextRow() should seek to next row

2012-10-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477379#comment-13477379
 ] 

Lars Hofhansl commented on HBASE-6577:
--

This just came up on the mailing list again:
{code}
at
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$EncodedScannerV2.loadBlockAndSeekToKey(HFileReaderV2.java:1027)
at
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.seekTo(HFileReaderV2.java:461)
at
org.apache.hadoop.hbase.io.hfile.HFileReaderV2$AbstractScannerV2.reseekTo(HFileReaderV2.java:493)
at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseekAtOrAfter(StoreFileScanner.java:242)
at
org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(StoreFileScanner.java:167)
at
org.apache.hadoop.hbase.regionserver.NonLazyKeyValueScanner.doRealSeek(NonLazyKeyValueScanner.java:54)
at
org.apache.hadoop.hbase.regionserver.StoreScanner.reseek(StoreScanner.java:521)
- locked 0x00059584fab8 (a
org.apache.hadoop.hbase.regionserver.StoreScanner)
at
org.apache.hadoop.hbase.regionserver.StoreScanner.next(StoreScanner.java:402)
- locked 0x00059584fab8 (a
org.apache.hadoop.hbase.regionserver.StoreScanner)
at
org.apache.hadoop.hbase.regionserver.HRegion$RegionScannerImpl.nextRow(HRegion.java:3507)
at
...
{code}

zahoor mentioned there that his KVs have very many versions (1500+).
Presumably each new column (likely) starts on a new (HBase) block because of 
the many versions, which is why we see a lot of seeking.

I wonder whether a solution like the following would work:
In HRegionScannerImpl.nextRow(...) we try the current naive iteration for N 
KVs (let's say 100). If by then we have not reached the next row, we'll issue a 
direct seek.
That way if there are few versions we avoid unnecessary seeks, but with many 
versions we can seek past a lot of KVs (and thus also avoid unnecessary seeks).

I can make a patch for that.
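The hybrid heuristic above can be sketched as follows (a simplified,
self-contained model over a flat KV list, not the actual RegionScannerImpl
code - the function shape is an assumption, and N=100 is the value suggested
in the comment):

```python
# Simplified model of nextRow(): iterate over at most n KVs looking for the
# next row; if still on the same row after that, fall back to a direct seek.
def next_row(kvs, pos, row, n=100):
    """Return (index of first KV past `row`, which strategy was used)."""
    for steps in range(n):
        if pos + steps >= len(kvs) or kvs[pos + steps][0] != row:
            return pos + steps, "iterated"
    # Many versions: seek directly instead of touching every KV/block.
    while pos < len(kvs) and kvs[pos][0] == row:
        pos += 1   # stands in for a reseek to (row+1, first column)
    return pos, "seeked"


kvs = [("r1", v) for v in range(5)] + [("r2", 0)]
assert next_row(kvs, 0, "r1", n=100) == (5, "iterated")
assert next_row(kvs, 0, "r1", n=3) == (5, "seeked")
```

Rows with few versions stay on the cheap iteration path; rows with very many
versions pay one seek instead of touching every version.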

[~jdcryans] Would you be able to recreate the issue you saw with the initial 
version of this patch in production?

 RegionScannerImpl.nextRow() should seek to next row
 ---

 Key: HBASE-6577
 URL: https://issues.apache.org/jira/browse/HBASE-6577
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.3, 0.96.0

 Attachments: 6577-0.94.txt, 6577.txt, 6577-v2.txt


 RegionScannerImpl.nextRow() is called when a filter filters the entire row. 
 In that case we should seek to the next row rather than iterating over all 
 versions of all columns to get there.



[jira] [Updated] (HBASE-6577) RegionScannerImpl.nextRow() should seek to next row

2012-10-16 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6577:
-

Fix Version/s: 0.96.0
   0.94.3

 RegionScannerImpl.nextRow() should seek to next row
 ---

 Key: HBASE-6577
 URL: https://issues.apache.org/jira/browse/HBASE-6577
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.3, 0.96.0

 Attachments: 6577-0.94.txt, 6577.txt, 6577-v2.txt


 RegionScannerImpl.nextRow() is called when a filter filters the entire row. 
 In that case we should seek to the next row rather than iterating over all 
 versions of all columns to get there.



[jira] [Commented] (HBASE-6974) Metric for blocked updates

2012-10-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477397#comment-13477397
 ] 

Lars Hofhansl commented on HBASE-6974:
--

Looks good. Few minor comments:
* I think you snuck a divide-by-1024 in there to convert from ms to s :)
* We should also collect another metric when this situation happens in the 
memstore flusher (here it happens because of global memory pressure)
* Let's use EnvironmentEdge.currentTimeMillis()
* Nit: a call to currentTimeMillis is not free; we should only call it in the 
!blocked part inside the while loop (which means it cannot be final, has to be 
initialized with 0, etc.)
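The nit can be sketched as follows (a minimal model of the blocked-updates
timing in checkResources, not the actual HRegion code - function names are
assumptions, and time.monotonic() stands in for EnvironmentEdge):

```python
# Simplified model: only read the clock once we actually block, via a
# lazily initialized start time, and record the blocked duration on exit.
import time

def check_resources(is_blocked, record_blocked_ms):
    start = 0.0                 # not final; set on the first blocked pass
    blocked = False
    while is_blocked():
        if not blocked:
            blocked = True
            start = time.monotonic()   # clock read only when blocking
        # ... wait for a flush to free up memstore space ...
    if blocked:
        record_blocked_ms((time.monotonic() - start) * 1000.0)


calls = iter([True, True, False])
recorded = []
check_resources(lambda: next(calls), recorded.append)
assert len(recorded) == 1 and recorded[0] >= 0
```

On the common, never-blocked path no clock call is made at all.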





[jira] [Updated] (HBASE-6577) RegionScannerImpl.nextRow() should seek to next row

2012-10-16 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6577:
-

Status: Patch Available  (was: Reopened)

 RegionScannerImpl.nextRow() should seek to next row
 ---

 Key: HBASE-6577
 URL: https://issues.apache.org/jira/browse/HBASE-6577
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.3, 0.96.0

 Attachments: 6577-0.94.txt, 6577.txt, 6577-v2.txt, 6577-v3.txt


 RegionScannerImpl.nextRow() is called when a filter filters the entire row. 
 In that case we should seek to the next row rather than iterating over all 
 versions of all columns to get there.



[jira] [Updated] (HBASE-6577) RegionScannerImpl.nextRow() should seek to next row

2012-10-16 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6577:
-

Attachment: 6577-v3.txt

Something like this.
The 100 should probably be configurable.

This should take care of both the case of a few versions and the case of very 
many versions.

... let me know what you think.

 RegionScannerImpl.nextRow() should seek to next row
 ---

 Key: HBASE-6577
 URL: https://issues.apache.org/jira/browse/HBASE-6577
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.3, 0.96.0

 Attachments: 6577-0.94.txt, 6577.txt, 6577-v2.txt, 6577-v3.txt


 RegionScannerImpl.nextRow() is called when a filter filters the entire row. 
 In that case we should seek to the next row rather than iterating over all 
 versions of all columns to get there.



[jira] [Commented] (HBASE-6974) Metric for blocked updates

2012-10-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477416#comment-13477416
 ] 

Hadoop QA commented on HBASE-6974:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12549387/HBASE-6974.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
82 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3058//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3058//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3058//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3058//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3058//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3058//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3058//console

This message is automatically generated.

 Metric for blocked updates
 --

 Key: HBASE-6974
 URL: https://issues.apache.org/jira/browse/HBASE-6974
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Michael Drzal
Priority: Critical
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6974.patch


 When the disc subsystem cannot keep up with a sustained high write load, a 
 region will eventually block updates to throttle clients.
 (HRegion.checkResources).
 It would be nice to have a metric for this, so that these occurrences can be 
 tracked.
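The metric itself can be as simple as a monotonically increasing counter bumped from the write path when the region blocks an update. A minimal sketch (the class and method names are illustrative, not the actual HRegion/metrics2 code):

```java
import java.util.concurrent.atomic.AtomicLong;

public class BlockedUpdatesMetric {
    private final AtomicLong blockedUpdatesCount = new AtomicLong();

    // Called when the write path decides to block an update because the
    // memstore is too large (analogous to HRegion.checkResources()).
    public void onUpdateBlocked() {
        blockedUpdatesCount.incrementAndGet();
    }

    public long getBlockedUpdatesCount() {
        return blockedUpdatesCount.get();
    }

    public static void main(String[] args) {
        BlockedUpdatesMetric m = new BlockedUpdatesMetric();
        m.onUpdateBlocked();
        m.onUpdateBlocked();
        System.out.println(m.getBlockedUpdatesCount()); // 2
    }
}
```

An AtomicLong keeps the counter correct under concurrent handler threads without any locking.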



[jira] [Commented] (HBASE-5355) Compressed RPC's for HBase

2012-10-16 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477428#comment-13477428
 ] 

Devaraj Das commented on HBASE-5355:


[~lhofhansl], do you think it is okay to commit the patch since this can be 
configured to be off anyway? From the comments on this jira and from the 
Facebook reviewboard, it seems like Facebook folks have stood to gain from this 
feature - https://reviews.facebook.net/D1671#summary (and hence this could help 
other similar deployments too). What do you think?

 Compressed RPC's for HBase
 --

 Key: HBASE-5355
 URL: https://issues.apache.org/jira/browse/HBASE-5355
 Project: HBase
  Issue Type: Improvement
  Components: IPC/RPC
Affects Versions: 0.89.20100924
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
 Attachments: HBASE-5355-0.94.patch


 Some applications need the ability to do large batched writes and reads from a 
 remote MR cluster. These eventually get bottlenecked on the network, and the 
 payloads are also often quite compressible.
 The aim here is to add the ability to do compressed calls to the server on 
 both the send and receive paths.
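The payoff on the send path can be sketched with the JDK's GZIP streams (this is a standalone illustration of why batched, repetitive payloads compress well, not the patch's RPC engine code):

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

public class RpcCompression {
    public static byte[] compress(byte[] payload) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bos)) {
            gz.write(payload);
        }
        return bos.toByteArray();
    }

    public static byte[] decompress(byte[] compressed) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (GZIPInputStream gz = new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            byte[] buf = new byte[4096];
            int n;
            while ((n = gz.read(buf)) > 0) {
                bos.write(buf, 0, n);
            }
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // Batched writes are often highly repetitive, so they shrink a lot.
        byte[] payload = new String(new char[1000]).replace('\0', 'a').getBytes();
        byte[] compressed = compress(payload);
        System.out.println(compressed.length < payload.length); // true
    }
}
```

The trade-off, as discussed below in this thread, is the extra CPU and latency spent compressing and decompressing each call.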



[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2

2012-10-16 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477455#comment-13477455
 ] 

Elliott Clark commented on HBASE-6410:
--

https://reviews.apache.org/r/7616/

 Move RegionServer Metrics to metrics2
 -

 Key: HBASE-6410
 URL: https://issues.apache.org/jira/browse/HBASE-6410
 Project: HBase
  Issue Type: Sub-task
  Components: metrics
Affects Versions: 0.96.0
Reporter: Elliott Clark
Assignee: Elliott Clark
Priority: Blocker
 Attachments: HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410.patch


 Move RegionServer Metrics to metrics2



[jira] [Commented] (HBASE-5355) Compressed RPC's for HBase

2012-10-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477456#comment-13477456
 ] 

Lars Hofhansl commented on HBASE-5355:
--

I would need to digest the patch some more, but I do not see any principled reason 
against it. It would need to support the SecureRpcEngine too.
Also, it would be nice to get some numbers on how much latency is increased.

 Compressed RPC's for HBase
 --

 Key: HBASE-5355
 URL: https://issues.apache.org/jira/browse/HBASE-5355
 Project: HBase
  Issue Type: Improvement
  Components: IPC/RPC
Affects Versions: 0.89.20100924
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
 Attachments: HBASE-5355-0.94.patch


 Some applications need the ability to do large batched writes and reads from a 
 remote MR cluster. These eventually get bottlenecked on the network, and the 
 payloads are also often quite compressible.
 The aim here is to add the ability to do compressed calls to the server on 
 both the send and receive paths.



[jira] [Commented] (HBASE-6577) RegionScannerImpl.nextRow() should seek to next row

2012-10-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477458#comment-13477458
 ] 

Hadoop QA commented on HBASE-6577:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12549397/6577-v3.txt
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
82 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 5 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3059//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3059//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3059//console

This message is automatically generated.

 RegionScannerImpl.nextRow() should seek to next row
 ---

 Key: HBASE-6577
 URL: https://issues.apache.org/jira/browse/HBASE-6577
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.94.3, 0.96.0

 Attachments: 6577-0.94.txt, 6577.txt, 6577-v2.txt, 6577-v3.txt


 RegionScannerImpl.nextRow() is called when a filter filters the entire row. 
 In that case we should seek to the next row rather than iterating over all 
 versions of all columns to get there.



[jira] [Commented] (HBASE-5355) Compressed RPC's for HBase

2012-10-16 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5355?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477465#comment-13477465
 ] 

Devaraj Das commented on HBASE-5355:


Thanks, [~lhofhansl], I have done the required work for making it work in trunk 
(via HBASE-6966). As far as I am concerned, I'd like to get the patch in 0.96. 
I'll try to get some latency numbers soon using that patch.

 Compressed RPC's for HBase
 --

 Key: HBASE-5355
 URL: https://issues.apache.org/jira/browse/HBASE-5355
 Project: HBase
  Issue Type: Improvement
  Components: IPC/RPC
Affects Versions: 0.89.20100924
Reporter: Karthik Ranganathan
Assignee: Karthik Ranganathan
 Attachments: HBASE-5355-0.94.patch


 Some applications need the ability to do large batched writes and reads from a 
 remote MR cluster. These eventually get bottlenecked on the network, and the 
 payloads are also often quite compressible.
 The aim here is to add the ability to do compressed calls to the server on 
 both the send and receive paths.



[jira] [Updated] (HBASE-6577) RegionScannerImpl.nextRow() should seek to next row

2012-10-16 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6577:
-

Fix Version/s: (was: 0.94.3)

I tried to reproduce the issue from the mailing list (using a PrefixFilter), but I 
couldn't. Probably too risky at this point for 0.94.

 RegionScannerImpl.nextRow() should seek to next row
 ---

 Key: HBASE-6577
 URL: https://issues.apache.org/jira/browse/HBASE-6577
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0

 Attachments: 6577-0.94.txt, 6577.txt, 6577-v2.txt, 6577-v3.txt


 RegionScannerImpl.nextRow() is called when a filter filters the entire row. 
 In that case we should seek to the next row rather than iterating over all 
 versions of all columns to get there.



[jira] [Commented] (HBASE-6858) Fix the incorrect BADVERSION checking in the recoverable zookeeper

2012-10-16 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6858?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477494#comment-13477494
 ] 

Hudson commented on HBASE-6858:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #223 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/223/])
HBASE-6858 Fix the incorrect BADVERSION checking in the recoverable 
zookeeper (Revision 1398920)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/RecoverableZooKeeper.java


 Fix the incorrect BADVERSION checking in the recoverable zookeeper
 --

 Key: HBASE-6858
 URL: https://issues.apache.org/jira/browse/HBASE-6858
 Project: HBase
  Issue Type: Bug
  Components: Zookeeper
Reporter: Liyin Tang
Assignee: Liyin Tang
Priority: Critical
 Fix For: 0.96.0

 Attachments: HBASE-6858.patch, HBASE-6858_v2.patch, 
 HBASE-6858_v3.patch, trunk-6858.patch, trunk-6858_v2.patch, 
 trunk-6858_v3.patch


 Thanks to Stack and Kaka for reporting that there is a bug in the recoverable 
 zookeeper when handling a BADVERSION exception for setData(). It shall compare 
 the ID payload of the data in zk with its own identifier.
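The comparison the description calls for can be sketched like this (a plain-Java illustration, not the actual RecoverableZooKeeper code; the helper name and id-prefix payload layout are assumptions):

```java
// After a retried setData() fails with BADVERSION, read the node back and
// compare the identifier embedded in its payload with this client's own id:
// if the data already carries our id, the earlier attempt actually succeeded
// and the BADVERSION is not a real conflict.
public class RecoverableSetData {
    public static boolean earlierAttemptSucceeded(byte[] currentData, byte[] myIdPrefix) {
        if (currentData == null || currentData.length < myIdPrefix.length) {
            return false;
        }
        for (int i = 0; i < myIdPrefix.length; i++) {
            if (currentData[i] != myIdPrefix[i]) {
                return false; // someone else wrote the node: genuine conflict
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(earlierAttemptSucceeded(
                "id42:payload".getBytes(), "id42:".getBytes())); // true
    }
}
```

The bug being fixed is precisely in getting this comparison right, so that a retried write is not mistaken for a conflicting one.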



[jira] [Created] (HBASE-6999) Start/end row should be configurable in TableInputFormat

2012-10-16 Thread Mikhail Bautin (JIRA)
Mikhail Bautin created HBASE-6999:
-

 Summary: Start/end row should be configurable in TableInputFormat
 Key: HBASE-6999
 URL: https://issues.apache.org/jira/browse/HBASE-6999
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Priority: Minor






[jira] [Updated] (HBASE-6815) [WINDOWS] Provide hbase scripts in order to start HBASE on Windows in a single user mode

2012-10-16 Thread Enis Soztutar (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Enis Soztutar updated HBASE-6815:
-

Assignee: Slavik Krassovsky

 [WINDOWS] Provide hbase scripts in order to start HBASE on Windows in a 
 single user mode
 

 Key: HBASE-6815
 URL: https://issues.apache.org/jira/browse/HBASE-6815
 Project: HBase
  Issue Type: Sub-task
Affects Versions: 0.94.3, 0.96.0
Reporter: Enis Soztutar
Assignee: Slavik Krassovsky

 Provide .cmd scripts in order to start HBASE on Windows in a single user mode



[jira] [Updated] (HBASE-6793) Make hbase-examples module

2012-10-16 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-6793:


Attachment: HBASE-6793.patch

Here's patch #1.
I'd prefer to move the remaining examples into the module as a next step, 
because this patch is already too big. Or I can keep working on the same patch, 
but if something intervenes there's a risk of this sitting around in JIRA :)

The patch:
1) Creates the module.
2) Ports the mapreduce examples as-is.
3) Ports, adds thrift-generated code for, and touches up the thrift (not thrift2 
yet) examples:
  a) The Java example builds and runs out of the box. 
  b) The Perl/PHP/Ruby/Python examples run, but some of them are out of date: 
they bail out when something that should produce an error doesn't. I found some 
old JIRAs to fix that for Java and CPP; these should be updated similarly. I 
think this should also be a separate JIRA (or JIRAs).
  c) The CPP example cannot be built in mvn in the absence of native 
thrift/boost, so it is copied into the sources; the user can set up the above 
and run `make`.

 Make hbase-examples module
 --

 Key: HBASE-6793
 URL: https://issues.apache.org/jira/browse/HBASE-6793
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Sergey Shelukhin
  Labels: noob
 Attachments: HBASE-6793.patch


 There are some examples under /examples/, which are not compiled as a part of 
 the build. We can move them to an hbase-examples module.



[jira] [Commented] (HBASE-6793) Make hbase-examples module

2012-10-16 Thread Sergey Shelukhin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477531#comment-13477531
 ] 

Sergey Shelukhin commented on HBASE-6793:
-

https://reviews.apache.org/r/7626/

 Make hbase-examples module
 --

 Key: HBASE-6793
 URL: https://issues.apache.org/jira/browse/HBASE-6793
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Sergey Shelukhin
  Labels: noob
 Attachments: HBASE-6793.patch


 There are some examples under /examples/, which are not compiled as a part of 
 the build. We can move them to an hbase-examples module.



[jira] [Resolved] (HBASE-4962) Optimize time range scans using a delete Bloom filter

2012-10-16 Thread Mikhail Bautin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin resolved HBASE-4962.
---

Resolution: Duplicate

 Optimize time range scans using a delete Bloom filter
 -

 Key: HBASE-4962
 URL: https://issues.apache.org/jira/browse/HBASE-4962
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Assignee: Pritam Damania
Priority: Minor

 To speed up time range scans we need to seek to the maximum timestamp of the 
 requested range, instead of going to the first KV of the (row, column) pair 
 and iterating from there. If we don't know the (row, column), e.g. if it is 
 not specified in the query, we need to go to end of the current row/column 
 pair first, get a KV from there, and do another seek to (row', column', 
 timerange_max) from there. We can only skip over to the timerange_max 
 timestamp when we know that there are no DeleteColumn records at the top of 
 that row/column with a higher timestamp. We can utilize another Bloom filter 
 keyed on (row, column) to quickly find that out.
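The Bloom filter check described above can be sketched with a toy filter (this is a standalone illustration, not HBase's Bloom filter implementation; the class name, sizing, and hash functions are all illustrative):

```java
import java.util.BitSet;

public class DeleteBloomFilterSketch {
    private static final int SIZE = 1 << 16;
    private final BitSet bits = new BitSet(SIZE);

    // Two cheap hash functions over the concatenated (row, column) key.
    private int[] hashes(byte[] row, byte[] col) {
        int h1 = 17, h2 = 31;
        for (byte b : row) { h1 = h1 * 31 + b; h2 = h2 * 37 + b; }
        for (byte b : col) { h1 = h1 * 31 + b; h2 = h2 * 37 + b; }
        return new int[] { (h1 & 0x7fffffff) % SIZE, (h2 & 0x7fffffff) % SIZE };
    }

    // Called when a DeleteColumn marker is written for (row, column).
    public void recordDelete(byte[] row, byte[] col) {
        for (int h : hashes(row, col)) bits.set(h);
    }

    // false => definitely no delete marker for this (row, column), so the
    // scan may seek straight to timerange_max without visiting newer KVs.
    public boolean mayHaveDelete(byte[] row, byte[] col) {
        for (int h : hashes(row, col)) if (!bits.get(h)) return false;
        return true;
    }
}
```

As with any Bloom filter, a `true` answer can be a false positive (forcing the slower path), but a `false` answer is definitive, which is exactly what makes the seek optimization safe.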



[jira] [Updated] (HBASE-5032) Add other DELETE type information into the delete bloom filter to optimize the time range query

2012-10-16 Thread Mikhail Bautin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-5032:
--

Assignee: Adela Maznikar  (was: Liyin Tang)

 Add other DELETE type information into the delete bloom filter to optimize 
 the time range query
 ---

 Key: HBASE-5032
 URL: https://issues.apache.org/jira/browse/HBASE-5032
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Adela Maznikar

 To speed up time range scans we need to seek to the maximum timestamp of the 
 requested range, instead of going to the first KV of the (row, column) pair 
 and iterating from there. If we don't know the (row, column), e.g. if it is 
 not specified in the query, we need to go to end of the current row/column 
 pair first, get a KV from there, and do another seek to (row', column', 
 timerange_max) from there. We can only skip over to the timerange_max 
 timestamp when we know that there are no DeleteColumn records at the top of 
 that row/column with a higher timestamp. We can utilize another Bloom filter 
 keyed on (row, column) to quickly find that out. (From HBASE-4962)
 So the motivation is to save seek ops for scanning time-range queries if we 
 know there is no delete for this row/column. 
 From the implementation prospective, we have already had a delete family 
 bloom filter which contains all the delete family key values. So we can reuse 
 the same bloom filter for all other kinds of delete information such as 
 delete columns or delete. 



[jira] [Commented] (HBASE-6793) Make hbase-examples module

2012-10-16 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6793?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477545#comment-13477545
 ] 

Jesse Yates commented on HBASE-6793:


[~sershe] Good stuff, Sergey! It's definitely a massive patch - feel free to file 
follow-ons for the rest of the stuff.

 Make hbase-examples module
 --

 Key: HBASE-6793
 URL: https://issues.apache.org/jira/browse/HBASE-6793
 Project: HBase
  Issue Type: Improvement
Affects Versions: 0.96.0
Reporter: Enis Soztutar
Assignee: Sergey Shelukhin
  Labels: noob
 Attachments: HBASE-6793.patch


 There are some examples under /examples/, which are not compiled as a part of 
 the build. We can move them to an hbase-examples module.



[jira] [Updated] (HBASE-6983) Metric for unencoded size of cached blocks

2012-10-16 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HBASE-6983:
---

Attachment: D5979.1.patch

mbautin requested code review of [jira] [HBASE-6983] [89-fb] Metric for 
unencoded size of cached blocks.
Reviewers: Kannan, Karthik, Liyin, aaiyer, mcorgan, JIRA

  We need to measure the amount of unencoded data in the block cache when data 
block encoding is enabled.

TEST PLAN
  Unit tests

REVISION DETAIL
  https://reviews.facebook.net/D5979

AFFECTED FILES
  
src/main/java/org/apache/hadoop/hbase/io/encoding/BufferedDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/CopyKeyDataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/encoding/DataBlockEncoder.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/BlockCache.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/CachedBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/LruBlockCache.java
  src/main/java/org/apache/hadoop/hbase/io/hfile/SimpleBlockCache.java
  src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
  src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
  src/test/java/org/apache/hadoop/hbase/io/hfile/TestCacheOnWrite.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/EncodedSeekPerformanceTest.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java
  
src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java

To: JIRA


 Metric for unencoded size of cached blocks
 --

 Key: HBASE-6983
 URL: https://issues.apache.org/jira/browse/HBASE-6983
 Project: HBase
  Issue Type: Improvement
Reporter: Mikhail Bautin
Priority: Minor
 Attachments: D5979.1.patch


 We need to measure the amount of unencoded data in the block cache when data 
 block encoding is enabled.



[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477554#comment-13477554
 ] 

Anoop Sam John commented on HBASE-6942:
---

When the user says he wants to delete families cf1 and cf2 (whether passing a TS 
or not), the user needs to create the Scan object appropriately: include cf1 and 
cf2 in the Scan.
Then, from the KVs, we can create the Delete object:
{code}
case FAMILY:
Set<byte[]> families = new TreeSet<byte[]>(Bytes.BYTES_COMPARATOR);
for (KeyValue kv : deleteRow) {
  if (families.add(kv.getFamily())) {
    delete.deleteFamily(kv.getFamily(), ts);
  }
}
break;
{code}
This adds the family of every KV into the Delete; a Set is used to avoid 
duplicate deleteFamily() calls.
Am I making it clear, Ted?
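The dedup trick in the snippet above relies on a TreeSet with a byte-array comparator, since byte[] has no usable equals()/hashCode(). A self-contained sketch (the BYTES comparator here is a stand-in for HBase's Bytes.BYTES_COMPARATOR):

```java
import java.util.Comparator;
import java.util.Set;
import java.util.TreeSet;

public class FamilyDedup {
    // Unsigned lexicographic comparison of byte arrays, like
    // Bytes.BYTES_COMPARATOR in HBase.
    static final Comparator<byte[]> BYTES = (a, b) -> {
        int n = Math.min(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int d = (a[i] & 0xff) - (b[i] & 0xff);
            if (d != 0) return d;
        }
        return a.length - b.length;
    };

    public static void main(String[] args) {
        Set<byte[]> families = new TreeSet<>(BYTES);
        // add() returns false for a duplicate family, so deleteFamily()
        // would be issued only once per family.
        System.out.println(families.add("cf1".getBytes())); // true
        System.out.println(families.add("cf1".getBytes())); // false
        System.out.println(families.add("cf2".getBytes())); // true
    }
}
```

The boolean returned by add() is what gates the deleteFamily() call in the endpoint code.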

 Endpoint implementation for bulk delete rows
 

 Key: HBASE-6942
 URL: https://issues.apache.org/jira/browse/HBASE-6942
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Performance
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6942.patch, HBASE-6942_V2.patch, 
 HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch


 We can provide an endpoint implementation for doing a bulk deletion of 
 rows (based on a scan) at the server side. This can reduce the time taken for 
 such an operation, as right now the client needs to do a scan and then issue 
 delete(s) using the rowkeys.
 Query like  delete from table1 where...
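The idea can be modeled in plain Java (this is not the coprocessor Endpoint API; the class and its TreeMap "region" are illustrative stand-ins): scan and delete in one server-side pass, so no rowkeys ever cross the network.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

public class BulkDeleteSketch {
    // Toy model of region data: rowkey -> value.
    private final TreeMap<String, String> region = new TreeMap<>();

    public void put(String row, String value) { region.put(row, value); }

    public int size() { return region.size(); }

    // Server-side "endpoint": scan matching rows and delete them in one
    // pass, returning only a count to the caller.
    public int bulkDelete(Predicate<Map.Entry<String, String>> matches) {
        int deleted = 0;
        Iterator<Map.Entry<String, String>> it = region.entrySet().iterator();
        while (it.hasNext()) {
            if (matches.test(it.next())) {
                it.remove();
                deleted++;
            }
        }
        return deleted;
    }

    public static void main(String[] args) {
        BulkDeleteSketch s = new BulkDeleteSketch();
        s.put("row1", "a");
        s.put("row2", "b");
        System.out.println(s.bulkDelete(e -> e.getKey().equals("row1"))); // 1
    }
}
```

In the real feature, the predicate's role is played by the Scan (and its filters) shipped to the endpoint, and the deletes are applied by the region itself.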



[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477558#comment-13477558
 ] 

Anoop Sam John commented on HBASE-6942:
---

One other question:
Has anyone tried passing an enum type via the CP Endpoints?
I think it won't work. I was checking why, and found that it is due to the code 
in HbaseObjectWritable.

In the kernel code, I think only one enum is passed across the wire, 
i.e. RegionOpeningState.
This one is specifically added to the CODE_TO_CLASS and CLASS_TO_CODE maps in 
HbaseObjectWritable.

Is this a bug we need to address? Or do we state somewhere that enums cannot 
be used?

 Endpoint implementation for bulk delete rows
 

 Key: HBASE-6942
 URL: https://issues.apache.org/jira/browse/HBASE-6942
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Performance
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6942.patch, HBASE-6942_V2.patch, 
 HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch


 We can provide an endpoint implementation for doing a bulk deletion of 
 rows (based on a scan) at the server side. This can reduce the time taken for 
 such an operation, as right now the client needs to do a scan and then issue 
 delete(s) using the rowkeys.
 Query like  delete from table1 where...



[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477566#comment-13477566
 ] 

Lars Hofhansl commented on HBASE-6942:
--

You can send the enum's ordinal number across.
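The ordinal workaround looks like this (a generic sketch; RegionOpeningState is reused here only as a familiar example, not HBase's actual wire code):

```java
public class EnumOverWire {
    enum RegionOpeningState { OPENED, ALREADY_OPENED, FAILED_OPENING }

    // Serialize the enum as its ordinal int, which HbaseObjectWritable
    // can already carry.
    public static int toWire(RegionOpeningState s) {
        return s.ordinal();
    }

    public static RegionOpeningState fromWire(int ordinal) {
        return RegionOpeningState.values()[ordinal];
    }

    public static void main(String[] args) {
        int wire = toWire(RegionOpeningState.FAILED_OPENING);
        System.out.println(fromWire(wire)); // FAILED_OPENING
    }
}
```

The caveat is that ordinals depend on declaration order, so both sides of the wire must agree on the enum definition; reordering or inserting constants silently breaks compatibility.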

 Endpoint implementation for bulk delete rows
 

 Key: HBASE-6942
 URL: https://issues.apache.org/jira/browse/HBASE-6942
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Performance
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6942.patch, HBASE-6942_V2.patch, 
 HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch


 We can provide an endpoint implementation for doing a bulk deletion of 
 rows (based on a scan) at the server side. This can reduce the time taken for 
 such an operation, as right now the client needs to do a scan and then issue 
 delete(s) using the rowkeys.
 Query like  delete from table1 where...



[jira] [Updated] (HBASE-5032) Add other DELETE type information into the delete bloom filter to optimize the time range query

2012-10-16 Thread Kannan Muthukkaruppan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kannan Muthukkaruppan updated HBASE-5032:
-

Description: 
To speed up time range scans we need to seek to the maximum timestamp of the 
requested range, instead of going to the first KV of the (row, column) pair and 
iterating from there. If we don't know the (row, column), e.g. if it is not 
specified in the query, we need to go to end of the current row/column pair 
first, get a KV from there, and do another seek to (row', column', 
timerange_max) from there. We can only skip over to the timerange_max timestamp 
when we know that there are no DeleteColumn records at the top of that 
row/column with a higher timestamp. We can utilize another Bloom filter keyed 
on (row, column) to quickly find that out. (From HBASE-4962)

So the motivation is to save seek ops for scanning time-range queries if we 
know there is no delete for this row/column. 

From the implementation perspective, we have already had a delete family bloom 
filter which contains all the delete family key values. So we can reuse the 
same bloom filter for all other kinds of delete information such as delete 
columns or delete. 







  was:
To speed up time range scans we need to seek to the maximum timestamp of the 
requested range,instead of going to the first KV of the (row, column) pair and 
iterating from there. If we don't know the (row, column), e.g. if it is not 
specified in the query, we need to go to end of the current row/column pair 
first, get a KV from there, and do another seek to (row', column', 
timerange_max) from there. We can only skip over to the timerange_max timestamp 
when we know that there are no DeleteColumn records at the top of that 
row/column with a higher timestamp. We can utilize another Bloom filter keyed 
on (row, column) to quickly find that out. (From HBASE-4962)

So the motivation is to save seek ops for scanning time-range queries if we 
know there is no delete for this row/column. 

From the implementation prospective, we have already had a delete family bloom 
filter which contains all the delete family key values. So we can reuse the 
same bloom filter for all other kinds of delete information such as delete 
columns or delete. 








 Add other DELETE type information into the delete bloom filter to optimize 
 the time range query
 ---

 Key: HBASE-5032
 URL: https://issues.apache.org/jira/browse/HBASE-5032
 Project: HBase
  Issue Type: Improvement
Reporter: Liyin Tang
Assignee: Adela Maznikar

 To speed up time range scans we need to seek to the maximum timestamp of the 
 requested range, instead of going to the first KV of the (row, column) pair 
 and iterating from there. If we don't know the (row, column), e.g. if it is 
 not specified in the query, we need to go to end of the current row/column 
 pair first, get a KV from there, and do another seek to (row', column', 
 timerange_max) from there. We can only skip over to the timerange_max 
 timestamp when we know that there are no DeleteColumn records at the top of 
 that row/column with a higher timestamp. We can utilize another Bloom filter 
 keyed on (row, column) to quickly find that out. (From HBASE-4962)
 So the motivation is to save seek ops for scanning time-range queries if we 
 know there is no delete for this row/column. 
 From the implementation perspective, we already have a delete family Bloom 
 filter which contains all the delete family key values, so we can reuse the 
 same Bloom filter for all other kinds of delete information, such as delete 
 columns or deletes. 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477567#comment-13477567
 ] 

Ted Yu commented on HBASE-6942:
---

Do we support delete family cf1 and delete column qualifier cq2 of family cf2 
in one request ?

 Endpoint implementation for bulk delete rows
 

 Key: HBASE-6942
 URL: https://issues.apache.org/jira/browse/HBASE-6942
 Project: HBase
  Issue Type: Improvement
  Components: Coprocessors, Performance
Reporter: Anoop Sam John
Assignee: Anoop Sam John
 Fix For: 0.94.3, 0.96.0

 Attachments: HBASE-6942.patch, HBASE-6942_V2.patch, 
 HBASE-6942_V3.patch, HBASE-6942_V4.patch, HBASE-6942_V5.patch


 We can provide an endpoint implementation for doing a bulk deletion of 
 rows (based on a scan) at the server side. This can reduce the time taken for 
 such an operation, as right now the client needs to scan the rows back and 
 issue delete(s) using the rowkeys.
 Query like "delete from table1 where..."
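The server-side idea can be sketched without HBase classes. `BulkDeleteSketch` and its `TreeMap` "region" below are illustrative stand-ins, not the patch's actual BulkDeleteEndpoint API: instead of streaming every matching rowkey back to the client and issuing per-rowkey Deletes, the server scans and deletes in place and returns only a count.

```java
import java.util.Iterator;
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Predicate;

// Illustrative sketch only, not the HBASE-6942 endpoint. The sorted TreeMap
// stands in for a region; the Predicate stands in for the Scan's filter.
public class BulkDeleteSketch {

  /** Delete every row whose key matches the "scan" filter; return rows deleted. */
  public static int bulkDelete(TreeMap<String, String> region, Predicate<String> scanFilter) {
    int deleted = 0;
    Iterator<Map.Entry<String, String>> it = region.entrySet().iterator();
    while (it.hasNext()) {
      Map.Entry<String, String> e = it.next();
      if (scanFilter.test(e.getKey())) {
        it.remove(); // server-side delete: no rowkey round-trip to the client
        deleted++;
      }
    }
    return deleted;
  }
}
```

The saving is in the wire traffic: only the final count crosses the network, rather than one scan result plus one Delete per matching row.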



[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477568#comment-13477568
 ] 

Anoop Sam John commented on HBASE-6942:
---

bq.Do we support delete family cf1 and delete column qualifier cq2 of family 
cf2 in one request ?
No Ted, with this we cannot do that.




[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477570#comment-13477570
 ] 

Anoop Sam John commented on HBASE-6942:
---

See my previous comment
https://issues.apache.org/jira/browse/HBASE-6942?focusedCommentId=13476126&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13476126




[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477571#comment-13477571
 ] 

Anoop Sam John commented on HBASE-6942:
---

bq. You can send the enum's ordinal number across.
Yes Lars. But then we cannot accept Enum types as a parameter in the CP 
Endpoint, so the user also needs to pass the ordinal (as we don't have a 
client-side wrapper API to call this Endpoint as of now).
I shall do it like that now.
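The ordinal idea under discussion can be sketched as follows. `DeleteType` and the method names here are hypothetical, not the identifiers in the patch: the client sends `deleteType.ordinal()` as a plain int, and the server maps it back with `values()[ordinal]`.

```java
// Illustrative sketch of passing an enum's ordinal across an endpoint call.
public class OrdinalSketch {
  public enum DeleteType { ROW, FAMILY, COLUMN, VERSION }

  // Client side: the int that actually crosses the wire.
  public static int encode(DeleteType type) {
    return type.ordinal();
  }

  // Server side: recover the enum, rejecting out-of-range ordinals.
  public static DeleteType decode(int ordinal) {
    DeleteType[] values = DeleteType.values();
    if (ordinal < 0 || ordinal >= values.length) {
      throw new IllegalArgumentException("Unknown delete type ordinal: " + ordinal);
    }
    return values[ordinal];
  }
}
```

One caveat of ordinal-based wire formats: reordering or inserting enum constants later silently changes the meaning of previously valid ordinals, which is one reason explicit int constants were also suggested.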





[jira] [Commented] (HBASE-6986) Reenable TestClientTimeouts for security build

2012-10-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477572#comment-13477572
 ] 

Lars Hofhansl commented on HBASE-6986:
--

+1 on punting here.
We do not want to introduce more fragility into HBase just so it can be 
tested.
Maybe this is a case for a more powerful mocking framework, so that the RPC 
engines can be mocked with failure injection.


 Reenable TestClientTimeouts for security build
 --

 Key: HBASE-6986
 URL: https://issues.apache.org/jira/browse/HBASE-6986
 Project: HBase
  Issue Type: Sub-task
Reporter: Gregory Chanan
Assignee: Gregory Chanan
Priority: Minor
 Fix For: 0.94.3


 TestClientTimeouts was disabled to get 0.94.2 out the door because it didn't 
 work in the security build. Investigate and reenable.



[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477573#comment-13477573
 ] 

Ted Yu commented on HBASE-6942:
---

If we pass a Scan and a Delete to the endpoint, we can handle arbitrary 
deletion requests.

@Anoop:
I clicked on the link above; it is not obvious which comment you were 
referring to.
Please refer to the comment by its timestamp.

Thanks




[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477574#comment-13477574
 ] 

Lars Hofhansl commented on HBASE-6942:
--

Alternatively we use four integer constants, or do what you had suggested 
earlier and pass a Delete template object (although I still think that would be 
confusing).

Since these are endpoints, it is also possible to just have 4 different 
endpoints that share some methods between them.





[jira] [Commented] (HBASE-6980) Parallel Flushing Of Memstores

2012-10-16 Thread Kannan Muthukkaruppan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477575#comment-13477575
 ] 

Kannan Muthukkaruppan commented on HBASE-6980:
--

Ramakrishna,

Thanks for your email.

#1. It is not clear why we even write a META entry for flushes...

{code}
private WALEdit completeCacheFlushLogEdit() {
  KeyValue kv = new KeyValue(METAROW, METAFAMILY, null,
      System.currentTimeMillis(), COMPLETE_CACHE_FLUSH);
  WALEdit e = new WALEdit();
  e.add(kv);
  return e;
}
{code}

The replayRecoveredEdits() logic skips over these entries anyway. And the only 
reference I see for this special entry in HLog is in unit tests.

#2. Yes, currently there are a lot of comments (related to lastSeqWritten) 
before the function HLog.java:startCacheFlush(), but the logic is not very 
clear to me. The changes were committed as part of HBASE-3845. I think we 
should be able to simplify that logic. I think I see some potential bugs there 
even as it stands now; I will need to spend some more time looking at this, and 
will write down an update here.

But bottom line, I still don't see any good fundamental reason why we need to 
hold this lock for the duration of the entire flush (even given the 
lastSeqWritten map logic).


 Parallel Flushing Of Memstores
 --

 Key: HBASE-6980
 URL: https://issues.apache.org/jira/browse/HBASE-6980
 Project: HBase
  Issue Type: New Feature
Reporter: Kannan Muthukkaruppan
Assignee: Kannan Muthukkaruppan

 For write-dominated workloads, single-threaded memstore flushing is an 
 unnecessary bottleneck. With a single flusher thread, we are basically not 
 set up to take advantage of the aggregate throughput that multi-disk nodes 
 provide.
 * For puts with WAL enabled, the bottleneck is more likely the single WAL 
 per region server. So this particular fix may not buy as much unless we 
 unlock that bottleneck with multiple commit logs per region server. (Topic 
 for a separate JIRA-- HBASE-6981).
 * But for puts with WAL disabled (e.g., when using HBASE-5783 style fast bulk 
 imports), we should be able to support much better ingest rates with parallel 
 flushing of memstores.
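The proposal above can be sketched with a plain thread pool. This is illustrative only, not HBase's MemStoreFlusher: each `List<String>` stands in for one memstore snapshot, and "flushing" it just counts its cells where the real code would write an HFile.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch of parallel memstore flushing: several snapshots flushed by a small
// thread pool instead of a single flusher thread.
public class ParallelFlushSketch {

  /** Flush all snapshots concurrently; return the total number of cells flushed. */
  public static int flushAll(List<List<String>> snapshots, int threads) throws Exception {
    ExecutorService pool = Executors.newFixedThreadPool(threads);
    try {
      List<Callable<Integer>> tasks = new ArrayList<>();
      for (List<String> snapshot : snapshots) {
        tasks.add(() -> snapshot.size()); // a real flusher would write an HFile here
      }
      int total = 0;
      for (Future<Integer> f : pool.invokeAll(tasks)) {
        total += f.get(); // propagates any per-flush failure
      }
      return total;
    } finally {
      pool.shutdown();
    }
  }
}
```

With one flusher thread per disk (roughly), independent snapshots can be written concurrently, which is where the aggregate-throughput claim in the description comes from.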



[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477579#comment-13477579
 ] 

Anoop Sam John commented on HBASE-6942:
---

Lars
It will be better to pass the type constants (int constants).




[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477578#comment-13477578
 ] 

Anoop Sam John commented on HBASE-6942:
---

Ted
Some drawbacks of not taking a Delete object:
1. When it is a timestamp-based delete, the same TS has to be used for all the 
columns, whereas in a normal delete different TSs can be used.
2. Types cannot be mixed. In a normal delete, one CF's delete, another CF's 
column delete, and yet another's version delete can be combined.




[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477581#comment-13477581
 ] 

Lars Hofhansl commented on HBASE-6942:
--

Yes. I *really* do not want to make this more complicated than it is.
If somebody wants to delete a couple of column families and a couple of 
columns, it can be done with multiple round trips.

Now, if the code can be simplified by passing a Delete object, then we should 
do that.





[jira] [Commented] (HBASE-6942) Endpoint implementation for bulk delete rows

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6942?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477583#comment-13477583
 ] 

Anoop Sam John commented on HBASE-6942:
---

I will make a patch based on the delete template also, so it will be easy to 
compare. I will make those today.
Sorry, I was busy with meetings yesterday.




[jira] [Commented] (HBASE-6577) RegionScannerImpl.nextRow() should seek to next row

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477600#comment-13477600
 ] 

Anoop Sam John commented on HBASE-6577:
---

bq. In HRegionScannerImpl.nextRow(...) we try the current naive iteration for 
N KVs (let's say 100). If by then we have not reached the next row, we'll issue 
a direct seek.
That way, if there are few versions, we avoid unnecessary seeks.

Lars, with HBASE-6032 in trunk, will it be a problem with calls to seek? I hope 
with this change a seek within the same block will not have overhead, so maybe 
we do not need a configurable KV number (like 100). Please correct me if my 
understanding is wrong.
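The hybrid heuristic quoted above can be sketched as follows. This is an illustration, not the HBase scanner: keys are plain "row/column/ts" strings, and the second phase stands in for a real reseek past the row's last key.

```java
import java.util.List;

// Sketch of the two-phase nextRow() heuristic: naive iteration for up to
// maxIterations KVs (cheap when rows have few versions), then fall back to a
// direct "seek" to the next row.
public class NextRowSketch {

  /** Return the index of the first KV belonging to a different row, or keys.size(). */
  public static int nextRow(List<String> keys, int start, int maxIterations) {
    String row = keys.get(start).split("/")[0];
    int i = start;
    // Phase 1: bounded naive iteration.
    for (int steps = 0; i < keys.size() && steps < maxIterations; i++, steps++) {
      if (!keys.get(i).split("/")[0].equals(row)) return i;
    }
    // Phase 2: "seek" past the remaining versions of the row in one jump.
    while (i < keys.size() && keys.get(i).split("/")[0].equals(row)) i++;
    return i;
  }
}
```

The tunable here is `maxIterations` (the "N KVs, let's say 100" in the comment); if seeks within the same block become cheap, as suggested, the bound matters much less.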

 RegionScannerImpl.nextRow() should seek to next row
 ---

 Key: HBASE-6577
 URL: https://issues.apache.org/jira/browse/HBASE-6577
 Project: HBase
  Issue Type: Bug
Reporter: Lars Hofhansl
Assignee: Lars Hofhansl
 Fix For: 0.96.0

 Attachments: 6577-0.94.txt, 6577.txt, 6577-v2.txt, 6577-v3.txt


 RegionScannerImpl.nextRow() is called when a filter filters the entire row. 
 In that case we should seek to the next row rather than iterating over all 
 versions of all columns to get there.



[jira] [Updated] (HBASE-6991) Escape \ in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

2012-10-16 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore updated HBASE-6991:
--

Summary: Escape \ in Bytes.toStringBinary() and its counterpart 
Bytes.toBytesBinary()  (was: Bytes.toStringBinary() and its counterpart 
Bytes.toBytesBinary() are not always consistant)

 Escape \ in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()
 --

 Key: HBASE-6991
 URL: https://issues.apache.org/jira/browse/HBASE-6991
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore

 Since \ is used to escape non-printable characters but is not itself treated 
 as a special character in conversion, it can lead to unexpected conversion.
 For example, please consider the following code snippet.
 {code}
 public void testConversion() {
   byte[] original = {
   '\\', 'x', 'A', 'D'
   };
   String stringFromBytes = Bytes.toStringBinary(original);
   byte[] converted = Bytes.toBytesBinary(stringFromBytes);
   System.out.println("Original: " + Arrays.toString(original));
   System.out.println("Converted: " + Arrays.toString(converted));
   System.out.println("Reversible?: " + (Bytes.compareTo(original, converted) 
 == 0));
 }
 Output:
 ---
 Original: [92, 120, 65, 68]
 Converted: [-83]
 Reversible?: false
 {code}
 The \ character needs to be treated as special and must be encoded as a 
 non-printable character (\x5C) to avoid any ambiguity during 
 conversion.



[jira] [Updated] (HBASE-6991) Escape \ in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

2012-10-16 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore updated HBASE-6991:
--

Attachment: HBASE-6991_trunk.patch

Attaching the patch, which modifies toStringBinary() to treat \ as a 
non-printable character and translate it to \x5C.
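The behavior of the fix can be sketched in isolation. `StringBinarySketch` below is a minimal stand-in, not the real `Bytes.toStringBinary()`: every byte outside the printable set, which now excludes `\` itself, is emitted as `\xNN`, making the encoding reversible.

```java
// Illustrative sketch of the escaping rule after the fix (not HBase's Bytes).
public class StringBinarySketch {
  // Printable non-alphanumeric characters; '\' is deliberately NOT in this set.
  static final String PRINTABLE = " `~!@#$%^&*()-_=+[]{}|;:'\",.<>/?";

  public static String toStringBinary(byte[] b) {
    StringBuilder sb = new StringBuilder();
    for (byte value : b) {
      int ch = value & 0xFF;
      if ((ch >= '0' && ch <= '9') || (ch >= 'A' && ch <= 'Z')
          || (ch >= 'a' && ch <= 'z') || PRINTABLE.indexOf(ch) >= 0) {
        sb.append((char) ch);            // printable: pass through
      } else {
        sb.append(String.format("\\x%02X", ch)); // '\' now falls through to here
      }
    }
    return sb.toString();
  }
}
```

With this rule the problem bytes from the issue description, `{'\\', 'x', 'A', 'D'}`, encode to the unambiguous string `\x5CxAD` instead of the literal `\xAD`, so decoding can no longer mistake them for the single byte 0xAD.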




[jira] [Commented] (HBASE-6032) Port HFileBlockIndex improvement from HBASE-5987

2012-10-16 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477610#comment-13477610
 ] 

Lars Hofhansl commented on HBASE-6032:
--

How come we missed this for 0.94?
This looks like an important performance improvement.


 Port HFileBlockIndex improvement from HBASE-5987
 

 Key: HBASE-6032
 URL: https://issues.apache.org/jira/browse/HBASE-6032
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.96.0

 Attachments: 6032-ports-5987.txt, 6032-ports-5987-v2.txt


 Excerpt from HBASE-5987:
 First, we propose to look ahead one more block index entry so that the 
 HFileScanner would know the start key value of the next data block. So if the 
 target key value for the scan (reSeekTo) is smaller than the start KV of the 
 next data block, the target key value is very likely in the current data block 
 (if not in the current data block, then the start KV of the next data block 
 should be returned; +indexing on the start key has some defects here+) and it 
 shall NOT query the HFileBlockIndex in this case. On the contrary, if the 
 target key value is bigger, then it shall query the HFileBlockIndex. This 
 improvement shall help reduce the hotness of the HFileBlockIndex and avoid 
 some unnecessary IdLock contention or Index Block Cache lookups.
 This JIRA is to port the fix to HBase trunk, etc.
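The lookahead decision can be sketched as follows. This is illustrative only, not the HBASE-5987 code: each block is represented by just its start key, and the expensive, contended block-index query is counted so its avoidance is visible.

```java
import java.util.List;

// Sketch of the lookahead rule: consult the block index only when the reSeekTo
// target is >= the next block's start key; otherwise the target, if present at
// all, must be in the current block.
public class BlockLookaheadSketch {

  /** Returns {block that should hold target, number of index lookups paid}. */
  public static int[] reSeekTo(List<String> blockStartKeys, int currentBlock, String target) {
    int indexLookups = 0;
    boolean hasNext = currentBlock + 1 < blockStartKeys.size();
    if (hasNext && target.compareTo(blockStartKeys.get(currentBlock + 1)) >= 0) {
      // Target is past this block: pay for a real block-index query.
      indexLookups++;
      int block = currentBlock;
      while (block + 1 < blockStartKeys.size()
          && target.compareTo(blockStartKeys.get(block + 1)) >= 0) {
        block++;
      }
      return new int[] { block, indexLookups };
    }
    // Stayed in the current block: no index hit, no IdLock contention.
    return new int[] { currentBlock, indexLookups };
  }
}
```

Since reSeekTo targets during a scan usually land in the current block, most calls take the zero-lookup path, which is where the reduced index hotness comes from.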



[jira] [Commented] (HBASE-6032) Port HFileBlockIndex improvement from HBASE-5987

2012-10-16 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477612#comment-13477612
 ] 

ramkrishna.s.vasudevan commented on HBASE-6032:
---

We need this for 0.94, I think, as per the changes in HBASE-6577.




[jira] [Updated] (HBASE-6991) Escape \ in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()

2012-10-16 Thread Aditya Kishore (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Aditya Kishore updated HBASE-6991:
--

Fix Version/s: 0.96.0
 Hadoop Flags: Incompatible change
   Status: Patch Available  (was: Open)

The patch include the following changes:

1. Gets rid of the unnecessary byte[]-to-String conversion. The ISO-8859-1 
charset does not do any transformation anyway. This also does away with the 
need for a try-catch block.
{code}
-String first = new String(b, off, len, "ISO-8859-1");
-for (int i = 0; i < first.length() ; ++i ) {
-  int ch = first.charAt(i) & 0xFF;

+for (int i = off; i < off + len ; ++i ) {
+  int ch = b[i] & 0xFF;
{code}

2. Removed \ from the set of printable non-alphanumeric characters so that it 
can be escaped using the \xXX format.
{code}
-  || " `~!@#$%^&*()-_=+[]{}\\|;:'\",.<>/?".indexOf(ch) >= 0 ) {

+  || " `~!@#$%^&*()-_=+[]{}|;:'\",.<>/?".indexOf(ch) >= 0 ) {
{code}

3. Added a new test case to verify that the conversion is reversible for a 
random array of bytes. Without this change the test always fails. The test adds 
1 extra second to the test run.

{code:title=hbase-common/src/test/java/org/apache/hadoop/hbase/util/TestBytes.java}
+  public void testToStringBytesBinaryReversible() {
+    // let's run the test with 1000 randomly generated byte arrays
+    Random rand = new Random(System.currentTimeMillis());
+    byte[] randomBytes = new byte[1000];
+    for (int i = 0; i < 1000; i++) {
+      rand.nextBytes(randomBytes);
+      verifyReversibleForBytes(randomBytes);
+    }
+
+    // some specific cases
+    verifyReversibleForBytes(new byte[] {});
+    verifyReversibleForBytes(new byte[] {'\\', 'x', 'A', 'D'});
+    verifyReversibleForBytes(new byte[] {'\\', 'x', 'A', 'D', '\\'});
+  }
+
+  private void verifyReversibleForBytes(byte[] originalBytes) {
+    String convertedString = Bytes.toStringBinary(originalBytes);
+    byte[] convertedBytes = Bytes.toBytesBinary(convertedString);
+    if (Bytes.compareTo(originalBytes, convertedBytes) != 0) {
+      fail("Not reversible for\nbyte[]: " + Arrays.toString(originalBytes) +
+          ",\nStringBinary: " + convertedString);
+    }
+  }
{code}

4. And finally, fixes the two test cases which were breaking because they 
assumed that \ is encoded as itself.
{code}
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlock.java
hbase-server/src/test/java/org/apache/hadoop/hbase/io/hfile/TestHFileBlockCompatibility.java

-+ "\\xD46\\xEA5\\xEA3\\xEA7\\xE7\\x00LI\\s\\xA0\\x0F\\x00\\x00"
++ "\\xD46\\xEA5\\xEA3\\xEA7\\xE7\\x00LI\\x5Cs\\xA0\\x0F\\x00\\x00"
{code}

Setting the Incompatible change flag, since any other code that makes the 
same assumption as the two test cases will need a fix.

 Escape \ in Bytes.toStringBinary() and its counterpart Bytes.toBytesBinary()
 --

 Key: HBASE-6991
 URL: https://issues.apache.org/jira/browse/HBASE-6991
 Project: HBase
  Issue Type: Bug
  Components: util
Affects Versions: 0.96.0
Reporter: Aditya Kishore
Assignee: Aditya Kishore
 Fix For: 0.96.0

 Attachments: HBASE-6991_trunk.patch


 Since \ is used to escape non-printable character but not treated as 
 special character in conversion, it could lead to unexpected conversion.
 For example, please consider the following code snippet.
 {code}
 public void testConversion() {
   byte[] original = {
   '\\', 'x', 'A', 'D'
   };
   String stringFromBytes = Bytes.toStringBinary(original);
   byte[] converted = Bytes.toBytesBinary(stringFromBytes);
   System.out.println(Original:  + Arrays.toString(original));
   System.out.println(Converted:  + Arrays.toString(converted));
   System.out.println(Reversible?:  + (Bytes.compareTo(original, converted) 
 == 0));
 }
 Output:
 ---
 Original: [92, 120, 65, 68]
 Converted: [-83]
 Reversible?: false
 {code}
 The \ character needs to be treated as special and must be encoded as a 
 non-printable character (\x5C) to avoid any kind of unambiguity during 
 conversion.



[jira] [Commented] (HBASE-6032) Port HFileBlockIndex improvement from HBASE-5987

2012-10-16 Thread Anoop Sam John (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477624#comment-13477624
 ] 

Anoop Sam John commented on HBASE-6032:
---

+1 for having this in the 0.94 version.
In fact, I was planning to make a port, test it, and then raise a new issue 
for the porting.

 Port HFileBlockIndex improvement from HBASE-5987
 

 Key: HBASE-6032
 URL: https://issues.apache.org/jira/browse/HBASE-6032
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.96.0

 Attachments: 6032-ports-5987.txt, 6032-ports-5987-v2.txt


 Excerpt from HBASE-5987:
 First, we propose to look ahead by one more block index entry so that the 
 HFileScanner knows the start key of the next data block. If the target key 
 of the scan (reSeekTo) is smaller than the start key of the next data block, 
 the target key very likely lies in the current data block (if it is not in 
 the current data block, the start key of the next data block should be 
 returned; +indexing on the start key has some defects here+), so the scanner 
 shall NOT query the HFileBlockIndex in this case. On the contrary, if the 
 target key is bigger, it shall query the HFileBlockIndex. This improvement 
 helps reduce the hotness of the HFileBlockIndex and avoid some unnecessary 
 IdLock contention and Index Block Cache lookups.
 This JIRA is to port the fix to HBase trunk, etc.
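The lookahead check described in the excerpt can be sketched as follows. The class and method names here are illustrative, not HBase's actual API; the point is only the comparison that decides whether the block index must be consulted:

```java
import java.util.Arrays;

public class LookaheadSeek {

    // Simplified model of the heuristic: the scanner caches the first key of
    // the block *after* the current one. If the target sorts before that key,
    // it can only live in the current block, so the (IdLock-guarded) block
    // index query and index-block cache lookup are both skipped.
    static boolean needsIndexLookup(byte[] targetKey, byte[] nextBlockStartKey) {
        // >= 0 means the target is at or past the next block's first key,
        // so the scanner must go back to the HFileBlockIndex.
        return Arrays.compare(targetKey, nextBlockStartKey) >= 0;
    }

    public static void main(String[] args) {
        byte[] nextBlockStart = "row-500".getBytes();
        // Target before the next block's start key: stay in the current block.
        System.out.println(needsIndexLookup("row-123".getBytes(), nextBlockStart)); // false
        // Target at or past it: consult the block index.
        System.out.println(needsIndexLookup("row-700".getBytes(), nextBlockStart)); // true
    }
}
```

Because reSeekTo during a scan usually targets a key in or near the current block, this single byte-comparison avoids most index queries on the hot path.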



[jira] [Updated] (HBASE-6032) Port HFileBlockIndex improvement from HBASE-5987

2012-10-16 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6032:
-

Attachment: 6032.094.txt
6032v3.txt

A version of the patch that will apply to 0.94.

 Port HFileBlockIndex improvement from HBASE-5987
 

 Key: HBASE-6032
 URL: https://issues.apache.org/jira/browse/HBASE-6032
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.96.0

 Attachments: 6032.094.txt, 6032-ports-5987.txt, 
 6032-ports-5987-v2.txt, 6032v3.txt


 Excerpt from HBASE-5987:
 First, we propose to look ahead by one more block index entry so that the 
 HFileScanner knows the start key of the next data block. If the target key 
 of the scan (reSeekTo) is smaller than the start key of the next data block, 
 the target key very likely lies in the current data block (if it is not in 
 the current data block, the start key of the next data block should be 
 returned; +indexing on the start key has some defects here+), so the scanner 
 shall NOT query the HFileBlockIndex in this case. On the contrary, if the 
 target key is bigger, it shall query the HFileBlockIndex. This improvement 
 helps reduce the hotness of the HFileBlockIndex and avoid some unnecessary 
 IdLock contention and Index Block Cache lookups.
 This JIRA is to port the fix to HBase trunk, etc.



[jira] [Commented] (HBASE-6032) Port HFileBlockIndex improvement from HBASE-5987

2012-10-16 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13477638#comment-13477638
 ] 

stack commented on HBASE-6032:
--

I've not run the tests, but +1 on commit if all tests pass (nice tests 
included w/ this patch).

 Port HFileBlockIndex improvement from HBASE-5987
 

 Key: HBASE-6032
 URL: https://issues.apache.org/jira/browse/HBASE-6032
 Project: HBase
  Issue Type: Task
Reporter: Ted Yu
Assignee: Ted Yu
 Fix For: 0.96.0

 Attachments: 6032.094.txt, 6032-ports-5987.txt, 
 6032-ports-5987-v2.txt, 6032v3.txt


 Excerpt from HBASE-5987:
 First, we propose to look ahead by one more block index entry so that the 
 HFileScanner knows the start key of the next data block. If the target key 
 of the scan (reSeekTo) is smaller than the start key of the next data block, 
 the target key very likely lies in the current data block (if it is not in 
 the current data block, the start key of the next data block should be 
 returned; +indexing on the start key has some defects here+), so the scanner 
 shall NOT query the HFileBlockIndex in this case. On the contrary, if the 
 target key is bigger, it shall query the HFileBlockIndex. This improvement 
 helps reduce the hotness of the HFileBlockIndex and avoid some unnecessary 
 IdLock contention and Index Block Cache lookups.
 This JIRA is to port the fix to HBase trunk, etc.
