[jira] [Resolved] (HBASE-4073) Don't open a region we already have opened!

2011-07-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4073.
--

Resolution: Invalid

HBASE-4083 added the check I want, including handling of the case where the region is 
already open when we ask to open it.

> Don't open a region we already have opened!
> ---
>
> Key: HBASE-4073
> URL: https://issues.apache.org/jira/browse/HBASE-4073
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 0.92.0
>
>
> See this thread: 
> http://search-hadoop.com/?q=Errors+after+major+compaction&fc_project=HBase
> In it, Eran has snippets from a log that show us being asked to open a region we 
> already have opened.
> We need to make sure that the root issue is addressed -- the races around 
> assignment and its timeouts -- and then, after that, do something like a check 
> whether we already have the region open before we queue an open (we can't return 
> a message to the master from down inside the regionserver event handlers).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4083) If Enable table is not completed and is partial, then scanning of the table is not working

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071532#comment-13071532
 ] 

stack commented on HBASE-4083:
--

Applied to TRUNK.  Thanks Ram.

(I swapped the order under openRegion so we check if it's in transition first and THEN 
check if it's open, rather than the other way round.)

> If Enable table is not completed and is partial, then scanning of the table 
> is not working 
> ---
>
> Key: HBASE-4083
> URL: https://issues.apache.org/jira/browse/HBASE-4083
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.5
>
> Attachments: HBASE-4083-1.patch, HBASE-4083_0.90.patch, 
> HBASE-4083_0.90_1.patch, HBASE-4083_trunk.patch, HBASE-4083_trunk_1.patch
>
>
> Consider the following scenario
> Start the Master, Backup master and RegionServer.
> Create a table which in turn creates a region.
> Disable the table.
> Enable the table again. 
> Kill the Active master exactly at the point before the actual region 
> assignment is started.
> Restart or switch master.
> Scan the table.
> NotServingRegionException is thrown.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071531#comment-13071531
 ] 

stack commented on HBASE-3899:
--

Vlad, I have a test failing after applying this.  Does it fail for you?

{code}
Running org.apache.hadoop.hbase.master.TestHMasterRPCException
Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.716 sec <<< 
FAILURE!
{code}

Thanks.


> enhance HBase RPC to support free-ing up server handler threads even if 
> response is not ready
> -
>
> Key: HBASE-3899
> URL: https://issues.apache.org/jira/browse/HBASE-3899
> Project: HBase
>  Issue Type: Improvement
>  Components: ipc
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.92.0
>
> Attachments: HBASE-3899-2.patch, HBASE-3899.patch, asyncRpc.txt, 
> asyncRpc.txt
>
>
> In the current implementation, the server handler thread picks up an item 
> from the incoming callqueue, processes it and then wraps the response as a 
> Writable and sends it back to the IPC server module. This wastes 
> thread-resources when the thread is blocked for disk IO (transaction logging, 
> read into block cache, etc).
> It would be nice if we can make the RPC Server Handler threads pick up a call 
> from the IPC queue, hand it over to the application (e.g. HRegion), the 
> application can queue it to be processed asynchronously and send a response 
> back to the IPC server module saying that the response is not ready. The RPC 
> Server Handler thread is now ready to pick up another request from the 
> incoming callqueue. When the queued call is processed by the application, it 
> indicates to the IPC module that the response is now ready to be sent back to 
> the client.
> The RPC client continues to experience the same behaviour as before. An RPC 
> client is synchronous and blocks till the response arrives.
> This RPC enhancement allows us to do very powerful things with the 
> RegionServer. In the future, we can enhance the RegionServer's threading 
> model to a message-passing model for better performance. We will not be 
> limited by the number of threads in the RegionServer.
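
Below is a minimal, self-contained sketch of the handler/worker decoupling described 
above; the names (DelayedResponse, newResponse) are hypothetical stand-ins, not the 
actual Delayable/HBaseServer API from the patch:

{code}
import java.util.concurrent.*;

public class DelayedRpcSketch {
  /** Callback the handler leaves behind so a worker thread can answer later. */
  interface DelayedResponse {
    void sendResponse(String result); // hands the result back to the IPC layer
  }

  // Simulated IPC layer: completes a future when the response is finally ready.
  static DelayedResponse newResponse(CompletableFuture<String> client) {
    return client::complete;
  }

  public static void main(String[] args) throws Exception {
    ExecutorService handlers = Executors.newFixedThreadPool(1);   // "RPC handler" pool
    ExecutorService appWorkers = Executors.newFixedThreadPool(4); // application threads

    CompletableFuture<String> clientSide = new CompletableFuture<>();
    DelayedResponse response = newResponse(clientSide);

    // Handler thread: queue the work and return immediately, freeing the handler.
    handlers.submit(() -> appWorkers.submit(() -> {
      // ... blocking work (WAL sync, block cache read, etc.) happens here ...
      response.sendResponse("done");
    }));

    // The client is still synchronous: it blocks until the response arrives.
    System.out.println(clientSide.get(5, TimeUnit.SECONDS));
    handlers.shutdown();
    appWorkers.shutdown();
  }
}
{code}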

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4073) Don't open a region we already have opened!

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4073?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071517#comment-13071517
 ] 

stack commented on HBASE-4073:
--

I see we have a check for whether a region is in transition on the current regionserver, 
but we should also check if it's already open on this server before queuing a 
new open executor (I seem to recall reviewing a patch lately that added this 
but can't think what it was).
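
A toy illustration of the ordering being discussed -- check regions-in-transition 
first, then already-open -- using hypothetical in-memory sets in place of the real 
regionserver state:

{code}
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class OpenRegionGuard {
  private final Set<String> regionsInTransition = ConcurrentHashMap.newKeySet();
  private final Set<String> onlineRegions = ConcurrentHashMap.newKeySet();

  /** Returns true only if it is safe to queue a new open for this region. */
  boolean shouldQueueOpen(String encodedRegionName) {
    if (regionsInTransition.contains(encodedRegionName)) {
      return false; // an open/close is already in flight for this region
    }
    if (onlineRegions.contains(encodedRegionName)) {
      return false; // region is already open on this server
    }
    return regionsInTransition.add(encodedRegionName);
  }

  public static void main(String[] args) {
    OpenRegionGuard guard = new OpenRegionGuard();
    System.out.println(guard.shouldQueueOpen("abc123")); // true, first request wins
    System.out.println(guard.shouldQueueOpen("abc123")); // false, already in transition
  }
}
{code}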

> Don't open a region we already have opened!
> ---
>
> Key: HBASE-4073
> URL: https://issues.apache.org/jira/browse/HBASE-4073
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 0.92.0
>
>
> See this thread: 
> http://search-hadoop.com/?q=Errors+after+major+compaction&fc_project=HBase
> In it, Eran has snippets from a log that show us being asked to open a region we 
> already have opened.
> We need to make sure that the root issue is addressed -- the races around 
> assignment and its timeouts -- and then, after that, do something like a check 
> whether we already have the region open before we queue an open (we can't return 
> a message to the master from down inside the regionserver event handlers).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4003) Cleanup Calls Conservatively On Timeout

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4003?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071516#comment-13071516
 ] 

stack commented on HBASE-4003:
--

What shall we do with this one Karthick?  Do you have a new version?

> Cleanup Calls Conservatively On Timeout
> ---
>
> Key: HBASE-4003
> URL: https://issues.apache.org/jira/browse/HBASE-4003
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.3
>Reporter: Karthick Sankarachary
>Assignee: Karthick Sankarachary
> Fix For: 0.92.0
>
> Attachments: HBASE-4003.patch
>
>
> In the event of a socket timeout, the {{HBaseClient}} iterates over the 
> outstanding calls (on that socket), and notifies them that a 
> {{SocketTimeoutException}} has occurred. Ideally, we should be cleaning up 
> just those calls that have been outstanding for longer than the specified 
> socket timeout.
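
A small sketch, under the assumption that outstanding calls are keyed by id with a 
recorded start time, of failing only the calls older than the socket timeout 
(hypothetical types, not the HBaseClient internals):

{code}
import java.util.*;
import java.util.concurrent.ConcurrentHashMap;

public class StaleCallCleaner {
  static final class Call {
    final int id;
    final long startMillis;
    Call(int id, long startMillis) { this.id = id; this.startMillis = startMillis; }
  }

  /** Fail only the calls that have been outstanding longer than the socket timeout. */
  static List<Call> cleanupStaleCalls(Map<Integer, Call> calls, long timeoutMillis, long now) {
    List<Call> expired = new ArrayList<>();
    for (Iterator<Call> it = calls.values().iterator(); it.hasNext();) {
      Call c = it.next();
      if (now - c.startMillis >= timeoutMillis) {
        it.remove();       // notify/fail this call only
        expired.add(c);
      }                    // younger calls stay and may still succeed
    }
    return expired;
  }

  public static void main(String[] args) {
    Map<Integer, Call> calls = new ConcurrentHashMap<>();
    long now = System.currentTimeMillis();
    calls.put(1, new Call(1, now - 70_000)); // older than a 60s timeout
    calls.put(2, new Call(2, now - 1_000));  // recent, should survive
    System.out.println(cleanupStaleCalls(calls, 60_000, now).size()); // 1
    System.out.println(calls.size());                                 // 1
  }
}
{code}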

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4008) Problem while stopping HBase

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071515#comment-13071515
 ] 

stack commented on HBASE-4008:
--

I think isAborted belongs in the Abortable Interface.  That'd be a big change 
though.  Wouldn't wish that on you (Could do it in another issue -- this is not 
the first time isAborted has been wanted).

Otherwise, do you think it should be protected rather than public, since it is 
only used by HMasterCommandLine, at least currently?

Else, patch looks great.  Does it work for you Akash?
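
A minimal sketch of the interface shape being suggested, assuming a hypothetical 
Abortable with an isAborted() accessor (not the committed HBase API):

{code}
public class AbortableSketch {
  interface Abortable {
    void abort(String why, Throwable e);
    boolean isAborted();
  }

  static class Master implements Abortable {
    private volatile boolean aborted;
    public void abort(String why, Throwable e) {
      aborted = true;
      System.out.println("Aborting: " + why);
    }
    public boolean isAborted() { return aborted; }
  }

  public static void main(String[] args) {
    Master m = new Master();
    m.abort("test", null);
    // A caller such as a command-line stop script could poll isAborted()
    // instead of assuming the server finished initializing.
    System.out.println(m.isAborted()); // true
  }
}
{code}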



> Problem while stopping HBase
> 
>
> Key: HBASE-4008
> URL: https://issues.apache.org/jira/browse/HBASE-4008
> Project: HBase
>  Issue Type: Bug
>  Components: scripts
>Reporter: Akash Ashok
>Assignee: Akash Ashok
>  Labels: HMaster
> Fix For: 0.92.0
>
> Attachments: HBase-4008-v2.patch, HBase-4008.patch
>
>
> stop-hbase.sh stops the server successfully if and only if the server is 
> instantiated properly. 
> When you run 
> start-hbase.sh; sleep 10; stop-hbase.sh; (this works totally fine and has no 
> issues)
> Whereas when you run 
> start-hbase.sh; stop-hbase.sh; (this never stops the server, and the server 
> never gets initialized and started properly)

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4014) Coprocessors: Flag the presence of coprocessors in logged exceptions

2011-07-26 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071514#comment-13071514
 ] 

jirapos...@reviews.apache.org commented on HBASE-4014:
--


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/969/#review1198
---


Some comments below, Eugene.  This thing looks useful and almost done.  Let's get 
it in!


src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorEnvironment.java


Do we need to add this?  Doesn't every object inherit Object and so have a 
toString?



src/main/java/org/apache/hadoop/hbase/coprocessor/CoprocessorHost.java


Do you think this is needed, Eugene?  Is coprocessors a List?  What if you 
toString'd it?  Maybe it'll do the right thing (with square bracket delimiters rather 
than curly ones, but that might be ok)



src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java


What's masterServices?  I think it subclasses Server?  If you do 
getServerName or something, that'll give you something better than 'master'.  
It'll include the port and startcode, which could be important for debugging (more 
important for the RS case than for the Master, but could matter if there are 
multiple masters).



src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java


Nice comment



src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java


Abort seems radical



src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java


The convention is to put } catch { on the one line rather than line break 
after the } (no biggie)



src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java


Good



src/main/java/org/apache/hadoop/hbase/regionserver/RegionServerServices.java


We are importing but we don't seem to use the imports, is that so?



src/main/java/org/apache/hadoop/hbase/zookeeper/ZooKeeperWatcher.java


Is there any diff in the above changes?



src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java


Same here



src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterCoprocessorException.java


No need for these extra new lines



src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java


Unused import?



src/test/java/org/apache/hadoop/hbase/master/TestActiveMasterManager.java


Added line we don't need?



src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java


Unused import



src/test/java/org/apache/hadoop/hbase/master/TestClockSkewDetection.java


Unused import?



src/test/java/org/apache/hadoop/hbase/master/TestLogsCleaner.java


Unused import?



src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java


Unused import?



src/test/java/org/apache/hadoop/hbase/regionserver/handler/TestOpenRegionHandler.java


Addition of commented out code



src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java


Unused import


- Michael


On 2011-07-14 23:39:07, Eugene Koontz wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/969/
bq.  ---
bq.  
bq.  (Updated 2011-07-14 23:39:07)
bq.  
bq.  
bq.  Review request for hbase, Gary Helmling and Mingjie Lai.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  https://issues.apache.org/jira/browse/HBASE-4014 Coprocessors: Flag the 
presence of coprocessors in logged exceptions
bq.  
bq.  The general gist here is to wrap each of 
{Master,RegionServer}CoprocessorHost's coprocessor call inside a 
bq.  
bq.  "try { ... } catch (Throwable e) { handleCoprocessorThrowable(e) }"
bq.  
bq.  block. 
bq.  
bq.  handleCoprocessorThrowable() is responsible for either passing 'e' along 
to th
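
A minimal, self-contained illustration of the wrapping pattern described in the 
summary above; the Environment type and hook signature here are hypothetical, not the 
actual CoprocessorHost code:

{code}
public class CoprocessorWrapSketch {
  static class Environment {
    final String coprocessorName;
    Environment(String name) { this.coprocessorName = name; }
  }

  /** Flag which coprocessor produced the error before handling it. */
  static void handleCoprocessorThrowable(Environment env, Throwable e) {
    System.err.println("Coprocessor [" + env.coprocessorName + "] threw " + e);
  }

  static void invokeHook(Environment env, Runnable hook) {
    try {
      hook.run();
    } catch (Throwable e) {
      handleCoprocessorThrowable(env, e);
    }
  }

  public static void main(String[] args) {
    Environment env = new Environment("ExampleObserver");
    invokeHook(env, () -> { throw new IllegalStateException("boom"); });
  }
}
{code}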

[jira] [Commented] (HBASE-3810) Registering a Coprocessor at HTableDescriptor should be less strict

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071510#comment-13071510
 ] 

stack commented on HBASE-3810:
--

@Mingjie Any chance of you carrying this one over the finish line?  You've run 
20 of the 26 miles on this one!

> Registering a Coprocessor at HTableDescriptor should be less strict
> ---
>
> Key: HBASE-3810
> URL: https://issues.apache.org/jira/browse/HBASE-3810
> Project: HBase
>  Issue Type: Improvement
>  Components: coprocessors
>Affects Versions: 0.92.0
> Environment: all
>Reporter: Joerg Schad
>Assignee: Mingjie Lai
>Priority: Minor
> Fix For: 0.92.0
>
>   Original Estimate: 2h
>  Remaining Estimate: 2h
>
> Registering a Coprocessor in the following way will fail as the "Coprocessor$1" 
> keyword is case sensitive (instead, COPROCESSOR$1 works fine). Removing this 
> restriction would improve usability.
> HTableDescriptor desc = new HTableDescriptor(tName);
> desc.setValue("Coprocessor$1",
>   path.toString() + ":" + full_class_name +
>   ":" + Coprocessor.Priority.USER);

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4113) Add createAsync and splits by start and end key to the shell

2011-07-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4113?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4113:
-

Fix Version/s: (was: 0.92.0)
   0.94.0

> Add createAsync and splits by start and end key to the shell
> 
>
> Key: HBASE-4113
> URL: https://issues.apache.org/jira/browse/HBASE-4113
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.92.0
>Reporter: Lars George
>Priority: Minor
> Fix For: 0.94.0
>
> Attachments: HBASE-4113-v2.patch, HBASE-4113.patch
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4024) Major compaction may not be triggered, even though region server log says it is triggered

2011-07-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4024:
-

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

This was committed a while back.

> Major compaction may not be triggered, even though region server log says it 
> is triggered
> -
>
> Key: HBASE-4024
> URL: https://issues.apache.org/jira/browse/HBASE-4024
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: Suraj Varma
>Assignee: Ted Yu
>Priority: Trivial
>  Labels: newbie
> Fix For: 0.92.0
>
> Attachments: 4024-v2.txt, 4024-v3.txt, 4024.txt
>
>
> The trunk version of regionserver/Store.java, method List<StoreFile> 
> compactSelection(List<StoreFile> candidates), has this code to determine 
> whether major compaction should be done or not: 
> // major compact on user action or age (caveat: we have too many files)
> boolean majorcompaction = (forcemajor || 
> isMajorCompaction(filesToCompact))
>   && filesToCompact.size() < this.maxFilesToCompact;
> The isMajorCompaction(filesToCompact) method internally determines whether or 
> not major compaction is required (and logs this as a "Major compaction 
> triggered ..." message). However, after the call, the compactSelection 
> method subsequently applies the filesToCompact.size() < 
> this.maxFilesToCompact check, which can turn off major compaction. 
> This would result in a "Major compaction triggered" log message without 
> actually triggering a major compaction.
> The filesToCompact.size() check should probably be moved inside the 
> isMajorCompaction(filesToCompact) method.
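
A rough sketch of the suggested restructuring, with the size cap folded into the 
major-compaction decision so the "Major compaction triggered" log only fires when a 
major compaction will actually run; the names and the age check below are simplified 
stand-ins, not the real Store code:

{code}
import java.util.List;

public class CompactionSelectionSketch {
  static final int MAX_FILES_TO_COMPACT = 10;

  /** Age/user-action based decision, as described in the report (stand-in for the real age check). */
  static boolean ageOrUserSaysMajor(List<String> files, boolean forceMajor) {
    return forceMajor || !files.isEmpty();
  }

  /** Fold the size cap into the decision so the log message matches what actually happens. */
  static boolean isMajorCompaction(List<String> files, boolean forceMajor) {
    boolean major = ageOrUserSaysMajor(files, forceMajor)
        && files.size() < MAX_FILES_TO_COMPACT;
    if (major) {
      System.out.println("Major compaction triggered on " + files.size() + " file(s)");
    }
    return major;
  }

  public static void main(String[] args) {
    System.out.println(isMajorCompaction(List.of("f1", "f2"), false));                    // true
    System.out.println(isMajorCompaction(java.util.Collections.nCopies(12, "f"), false)); // false, no misleading log
  }
}
{code}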

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4083) If Enable table is not completed and is partial, then scanning of the table is not working

2011-07-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-4083:
-

  Description: 
Consider the following scenario
Start the Master, Backup master and RegionServer.
Create a table which in turn creates a region.
Disable the table.
Enable the table again. 
Kill the Active master exactly at the point before the actual region assignment 
is started.
Restart or switch master.
Scan the table.
NotServingRegionException is thrown.


  was:

Consider the following scenario
Start the Master, Backup master and RegionServer.
Create a table which in turn creates a region.
Disable the table.
Enable the table again. 
Kill the Active master exactly at the point before the actual region assignment 
is started.
Restart or switch master.
Scan the table.
NotServingRegionException is thrown.


Fix Version/s: (was: 0.94.0)
   0.90.5

> If Enable table is not completed and is partial, then scanning of the table 
> is not working 
> ---
>
> Key: HBASE-4083
> URL: https://issues.apache.org/jira/browse/HBASE-4083
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.5
>
> Attachments: HBASE-4083-1.patch, HBASE-4083_0.90.patch, 
> HBASE-4083_0.90_1.patch, HBASE-4083_trunk.patch, HBASE-4083_trunk_1.patch
>
>
> Consider the following scenario
> Start the Master, Backup master and RegionServer.
> Create a table which in turn creates a region.
> Disable the table.
> Enable the table again. 
> Kill the Active master exactly at the point before the actual region 
> assignment is started.
> Restart or switch master.
> Scan the table.
> NotServingRegionException is thrown.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4136) Load balancer may not have a chance to run due to RegionsInTransition being non-empty

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071508#comment-13071508
 ] 

stack commented on HBASE-4136:
--

Yeah, the balancer will not run while regions are in transition.  We don't want the 
balancer moving stuff when the regions in flight might be making things right, at 
least not yet, not till our balancer algorithm gets a bit smarter.

> Load balancer may not have a chance to run due to RegionsInTransition being 
> non-empty
> -
>
> Key: HBASE-4136
> URL: https://issues.apache.org/jira/browse/HBASE-4136
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.3
>Reporter: Ted Yu
>
> I observed in our staging cluster that load balancer didn't run for a long 
> period of time.
> I saw the following in master log:
> {code}
> 2011-07-24 15:56:32,333 DEBUG org.apache.hadoop.hbase.master.HMaster: Not 
> running balancer because 2 region(s) in transition: 
> {4e7416833e3bbd6a8ade26f6986529bf=TABLE-1311419946465,E'\xFD\xDDu\xC3\x894\xF4$\xC0K\xA3!\x82\xB9\xD0\x7F|>\xAC\xDA81\xB6\x92\xED\xA9\x9C\xA6^\xF4,1311419961631.4e7416833e3bbd6a8ade26f6986529bf.
>  state=PENDING_CLOSE, ts=...
> {code}
> This means we need to find a better way of permitting one balance run at a 
> time. In HMaster.balance():
> {code}
>   if (this.assignmentManager.isRegionsInTransition()) {
> {code}

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4134) The total number of regions was more than the actual region count after the hbck fix

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071507#comment-13071507
 ] 

stack commented on HBASE-4134:
--

Do you want to bring it back into 0.92 Feng Xu?

> The total number of regions was more than the actual region count after the 
> hbck fix
> 
>
> Key: HBASE-4134
> URL: https://issues.apache.org/jira/browse/HBASE-4134
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.3
>Reporter: feng xu
> Fix For: 0.94.0
>
>
> 1. I found the problem(some regions were multiply assigned) while running 
> hbck to check the cluster's health. Here's the result:
> {noformat}
> ERROR: Region test1,230778,1311216270050.fff783529fcd983043610eaa1cc5c2fe. is 
> listed in META on region server 158-1-91-101:20020 but is multiply assigned 
> to region servers 158-1-91-101:20020, 158-1-91-105:20020 
> ERROR: Region test1,252103,1311216293671.fff9ed2cb69bdce535451a07686c0db5. is 
> listed in META on region server 158-1-91-101:20020 but is multiply assigned 
> to region servers 158-1-91-101:20020, 158-1-91-105:20020 
> ERROR: Region test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. is 
> listed in META on region server 158-1-91-103:20020 but is multiply assigned 
> to region servers 158-1-91-103:20020, 158-1-91-105:20020 
> Summary: 
>   -ROOT- is okay. 
> Number of regions: 1 
> Deployed on: 158-1-91-105:20020 
>   .META. is okay. 
> Number of regions: 1 
> Deployed on: 158-1-91-103:20020 
>   test1 is okay. 
> Number of regions: 25297 
> Deployed on: 158-1-91-101:20020 158-1-91-103:20020 158-1-91-105:20020 
> 14829 inconsistencies detected. 
> Status: INCONSISTENT 
> {noformat}
> 2. Then I tried to use "hbck -fix" to fix the problem. Everything seemed ok. 
> But I found that the total number of regions reported by load balancer 
> (35029) was more than the actual region count(25299) after the fixing.
> Here's the related logs snippet:
> {noformat}
> 2011-07-22 02:19:02,866 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
> Skipping load balancing.  servers=3 regions=25299 average=8433.0 
> mostloaded=8433 
> 2011-07-22 03:06:11,832 INFO org.apache.hadoop.hbase.master.LoadBalancer: 
> Skipping load balancing.  servers=3 regions=35029 average=11676.333 
> mostloaded=11677 leastloaded=11676
> {noformat}
> 3. I tracked one region's behavior during the time. Taking the region of 
> "test1,282187,1311216322104.52782c0241a598b3e37ca8729da0." as example:
> (1) It was assigned to "158-1-91-101" at first. 
> (2) HBCK sent a closing request to the RegionServer, and the RegionServer closed it 
> silently without notifying the HMaster.
> (3) The region was still carried by RS "158-1-91-103", which was known to 
> the HMaster.
> (4) HBCK then triggered a new assignment.
> The fact is, the region was assigned again, but the old assignment 
> information still remained in AM#regions and AM#servers.
> That's why the problem of "region count was larger than the actual number" 
> occurred.  
> {noformat}
> Line 178967: 2011-07-22 02:47:51,247 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling new unassigned 
> node: /hbase/unassigned/52782c0241a598b3e37ca8729da0 
> (region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
> server=HBCKServerName, state=M_ZK_REGION_OFFLINE)
> Line 178968: 2011-07-22 02:47:51,247 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling HBCK triggered 
> transition=M_ZK_REGION_OFFLINE, server=HBCKServerName, 
> region=52782c0241a598b3e37ca8729da0
> Line 178969: 2011-07-22 02:47:51,248 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: HBCK repair is triggering 
> assignment of 
> region=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0.
> Line 178970: 2011-07-22 02:47:51,248 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
> was found (or we are ignoring an existing plan) for 
> test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. so generated a 
> random one; hri=test1,282187,1311216322104.52782c0241a598b3e37ca8729da0., 
> src=, dest=158-1-91-101,20020,1311231878544; 3 (online=3, exclude=null) 
> available servers
> Line 178971: 2011-07-22 02:47:51,248 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> test1,282187,1311216322104.52782c0241a598b3e37ca8729da0. to 
> 158-1-91-101,20020,1311231878544
> Line 178983: 2011-07-22 02:47:51,285 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=158-1-91-101,20020,1311231878544, 
> region=52782c0241a598b3e37ca8729da0
> Line 179001: 2011-07-22 02:47:51,318 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENED, 

[jira] [Commented] (HBASE-4058) Extend TestHBaseFsck with a complete .META. recovery scenario

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071506#comment-13071506
 ] 

stack commented on HBASE-4058:
--

Assigning myself. This is a pretty critical one.  The online merges I just added 
back to 0.92 are a prereq for this one.

> Extend TestHBaseFsck with a complete .META. recovery scenario
> -
>
> Key: HBASE-4058
> URL: https://issues.apache.org/jira/browse/HBASE-4058
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: stack
> Fix For: 0.94.0
>
>
> We should have a unit test that launches a minicluster and constructs a few 
> tables, then deletes META files on disk, then bounces the master, then 
> recovers the result with HBCK. Perhaps it is possible to extend TestHBaseFsck 
> to do this.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-1621) merge tool should work on online cluster, but disabled table

2011-07-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-1621:
-

Fix Version/s: (was: 0.94.0)
   0.92.0
 Assignee: stack

Pulling this back into 0.92 and assigning myself.  This is a pretty critical 
one.  I'd like to try to do it for 0.92 (thanks Ted for being more of a man than 
me in moving the issues out of 0.92... I was unable to cut more).

> merge tool should work on online cluster, but disabled table
> 
>
> Key: HBASE-1621
> URL: https://issues.apache.org/jira/browse/HBASE-1621
> Project: HBase
>  Issue Type: Bug
>Reporter: ryan rawson
>Assignee: stack
>Priority: Critical
> Fix For: 0.92.0
>
> Attachments: 1621-trunk.txt, HBASE-1621-v2.patch, HBASE-1621.patch, 
> hbase-onlinemerge.patch
>
>
> Taking down the entire cluster to merge 2 regions is a pain; I don't see why 
> the table or the regions specifically couldn't be taken offline, then merged, then 
> brought back up.
> This might need a new API to the regionservers so they can take direction 
> from not just the master.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Assigned] (HBASE-4058) Extend TestHBaseFsck with a complete .META. recovery scenario

2011-07-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack reassigned HBASE-4058:


Assignee: stack

> Extend TestHBaseFsck with a complete .META. recovery scenario
> -
>
> Key: HBASE-4058
> URL: https://issues.apache.org/jira/browse/HBASE-4058
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: stack
> Fix For: 0.94.0
>
>
> We should have a unit test that launches a minicluster and constructs a few 
> tables, then deletes META files on disk, then bounces the master, then 
> recovers the result with HBCK. Perhaps it is possible to extend TestHBaseFsck 
> to do this.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3857) Change the HFile Format

2011-07-26 Thread Mikhail Bautin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-3857:
--

Attachment: hfile_format_v2_design_draft_0.4.odt

Attaching the HFile v2 spec in the OpenOffice format.

> Change the HFile Format
> ---
>
> Key: HBASE-3857
> URL: https://issues.apache.org/jira/browse/HBASE-3857
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.90.4
>Reporter: Liyin Tang
>Assignee: Mikhail Bautin
> Attachments: 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 
> 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 
> hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf, 
> hfile_format_v2_design_draft_0.4.odt
>
>
> In order to support HBASE-3763 and HBASE-3856, we need to change the format 
> of the HFile. The new format proposal is attached here. Thanks to Mikhail 
> Bautin for the documentation. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3807) Fix units in RS UI metrics

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3807?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3807:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> Fix units in RS UI metrics
> --
>
> Key: HBASE-3807
> URL: https://issues.apache.org/jira/browse/HBASE-3807
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
> Fix For: 0.94.0
>
>
> Currently the metrics are a mix of MB and bytes.  It's confusing.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-1621) merge tool should work on online cluster, but disabled table

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-1621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-1621:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> merge tool should work on online cluster, but disabled table
> 
>
> Key: HBASE-1621
> URL: https://issues.apache.org/jira/browse/HBASE-1621
> Project: HBase
>  Issue Type: Bug
>Reporter: ryan rawson
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: 1621-trunk.txt, HBASE-1621-v2.patch, HBASE-1621.patch, 
> hbase-onlinemerge.patch
>
>
> Taking down the entire cluster to merge 2 regions is a pain; I don't see why 
> the table or the regions specifically couldn't be taken offline, then merged, then 
> brought back up.
> This might need a new API to the regionservers so they can take direction 
> from not just the master.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4058) Extend TestHBaseFsck with a complete .META. recovery scenario

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4058?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4058:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

Moving to 0.94 since there is no owner for this issue at the moment.

> Extend TestHBaseFsck with a complete .META. recovery scenario
> -
>
> Key: HBASE-4058
> URL: https://issues.apache.org/jira/browse/HBASE-4058
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
> Fix For: 0.94.0
>
>
> We should have a unit test that launches a minicluster and constructs a few 
> tables, then deletes META files on disk, then bounces the master, then 
> recovers the result with HBCK. Perhaps it is possible to extend TestHBaseFsck 
> to do this.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-1938) Make in-memory table scanning faster

2011-07-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-1938:
-

Fix Version/s: 0.92.0
 Assignee: nkeywal  (was: stack)
 Hadoop Flags: [Reviewed]
   Status: Patch Available  (was: Open)

Assigning nkeywal. Marking patch available against 0.92.

> Make in-memory table scanning faster
> 
>
> Key: HBASE-1938
> URL: https://issues.apache.org/jira/browse/HBASE-1938
> Project: HBase
>  Issue Type: Improvement
>  Components: performance
>Reporter: stack
>Assignee: nkeywal
>Priority: Blocker
> Fix For: 0.92.0
>
> Attachments: 20110726_1938_KeyValueSkipListSet.patch, 
> 20110726_1938_MemStore.patch, 20110726_1938_MemStoreScanPerformance.java, 
> MemStoreScanPerformance.java, MemStoreScanPerformance.java, 
> caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling hbase to see if I can make hbase scans run 
> faster when all is up in memory.  Talking to some users, they are seeing 
> about 1/4 million rows a second.  It should be able to go faster than this 
> (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071486#comment-13071486
 ] 

stack commented on HBASE-1938:
--

+1 on 20110726_1938_KeyValueSkipListSet.patch.  It looks great.  Will commit it 
when I commit 20110726_1938_MemStore.patch so I can then close this issue.  Nice 
work nkeywal.

> Make in-memory table scanning faster
> 
>
> Key: HBASE-1938
> URL: https://issues.apache.org/jira/browse/HBASE-1938
> Project: HBase
>  Issue Type: Improvement
>  Components: performance
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Attachments: 20110726_1938_KeyValueSkipListSet.patch, 
> 20110726_1938_MemStore.patch, 20110726_1938_MemStoreScanPerformance.java, 
> MemStoreScanPerformance.java, MemStoreScanPerformance.java, 
> caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling hbase to see if I can make hbase scans run 
> faster when all is up in memory.  Talking to some users, they are seeing 
> about 1/4 million rows a second.  It should be able to go faster than this 
> (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071484#comment-13071484
 ] 

stack commented on HBASE-1938:
--

On 20110726_1938_MemStore.patch:

FYI, in future, just remove code rather than comment it out: i.e. +  
//long readPoint = ReadWriteConsistencyControl.getThreadReadPoint();

The 'theNext' data member looks fine to me.  Worst that could happen is that we 
lag the read point slightly, though that's unlikely ('theNext' ain't the best name, but I 
see you are just taking the old local variable name, so it's not your malnaming... no 
worries).

Any chance of our calling a 'next' without doing a 'seek' first (or a reseek)?  
Am worried we'd trip on a null theNext.

If seek has a synchronized, yeah, reseek should too -- good one.

I'm good with committing this.  We have a bunch of tests that should vomit if 
stuff comes back out of order or not what we expect (if this does prove to 
break things, then we are lacking key coverage and let's address it then).

I'll let it hang out a day.  Someone else might have an opinion in here.
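
For illustration, a tiny self-contained scanner with a cached 'theNext' member and a 
reseek that takes the same lock as seek, along the lines discussed above (a sketch 
with toy types, not the actual MemStore scanner):

{code}
import java.util.concurrent.ConcurrentSkipListSet;

public class CachedNextScannerSketch {
  private final ConcurrentSkipListSet<String> kvs = new ConcurrentSkipListSet<>();
  private String theNext; // cached peek, refreshed by seek/reseek

  public synchronized boolean seek(String key) {
    theNext = kvs.ceiling(key);
    return theNext != null;
  }

  public synchronized boolean reseek(String key) { // same locking as seek
    return seek(key);
  }

  public synchronized String next() {
    String ret = theNext;
    if (ret != null) {
      theNext = kvs.higher(ret);
    }
    return ret; // null if next() is called before any seek
  }

  public static void main(String[] args) {
    CachedNextScannerSketch s = new CachedNextScannerSketch();
    s.kvs.add("a"); s.kvs.add("b"); s.kvs.add("c");
    s.seek("b");
    System.out.println(s.next()); // b
    System.out.println(s.next()); // c
  }
}
{code}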

> Make in-memory table scanning faster
> 
>
> Key: HBASE-1938
> URL: https://issues.apache.org/jira/browse/HBASE-1938
> Project: HBase
>  Issue Type: Improvement
>  Components: performance
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Attachments: 20110726_1938_KeyValueSkipListSet.patch, 
> 20110726_1938_MemStore.patch, 20110726_1938_MemStoreScanPerformance.java, 
> MemStoreScanPerformance.java, MemStoreScanPerformance.java, 
> caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling hbase to see if I can make hbase scans run 
> faster when all is up in memory.  Talking to some users, they are seeing 
> about 1/4 million rows a second.  It should be able to go faster than this 
> (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3857) Change the HFile Format

2011-07-26 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071474#comment-13071474
 ] 

Ted Yu commented on HBASE-3857:
---

+1 on putting this into 0.92

> Change the HFile Format
> ---
>
> Key: HBASE-3857
> URL: https://issues.apache.org/jira/browse/HBASE-3857
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.90.4
>Reporter: Liyin Tang
>Assignee: Mikhail Bautin
> Attachments: 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 
> 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 
> hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf
>
>
> In order to support HBASE-3763 and HBASE-3856, we need to change the format 
> of the HFile. The new format proposal is attached here. Thanks to Mikhail 
> Bautin for the documentation. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3857) Change the HFile Format

2011-07-26 Thread Mikhail Bautin (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071473#comment-13071473
 ] 

Mikhail Bautin commented on HBASE-3857:
---

@Michael:

1. From my conversations with Nicolas and Jonathan I got a sense that HFile v2 
would be part of 0.94, but if we can get it into 0.92, that would be even 
better, because I think in that case we will have to do less merging further 
down the road. I am also doing some load-testing of the patch on a 5-node 
cluster this week.
2. I will attach the spec document in the OpenOffice format later tonight (as 
soon as I can download OpenOffice and convert the doc).

Thank you!
--Mikhail


> Change the HFile Format
> ---
>
> Key: HBASE-3857
> URL: https://issues.apache.org/jira/browse/HBASE-3857
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.90.4
>Reporter: Liyin Tang
>Assignee: Mikhail Bautin
> Attachments: 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 
> 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 
> hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf
>
>
> In order to support HBASE-3763 and HBASE-3856, we need to change the format 
> of the HFile. The new format proposal is attached here. Thanks to Mikhail 
> Bautin for the documentation. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3857) Change the HFile Format

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071467#comment-13071467
 ] 

stack commented on HBASE-3857:
--

@Mikhail Thanks for updated patch.  Do you have a couple of answers for my 
questions above at 
https://issues.apache.org/jira/browse/HBASE-3857?focusedCommentId=13068151&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13068151?

Thanks

> Change the HFile Format
> ---
>
> Key: HBASE-3857
> URL: https://issues.apache.org/jira/browse/HBASE-3857
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.90.4
>Reporter: Liyin Tang
>Assignee: Mikhail Bautin
> Attachments: 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 
> 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 
> hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf
>
>
> In order to support HBASE-3763 and HBASE-3856, we need to change the format 
> of the HFile. The new format proposal is attached here. Thanks to Mikhail 
> Bautin for the documentation. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3857) Change the HFile Format

2011-07-26 Thread Mikhail Bautin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mikhail Bautin updated HBASE-3857:
--

Attachment: 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch

Adding a new patch, which should be identical to 
https://reviews.apache.org/r/1134/ (except for a couple whitespace/comment 
fixes). Please download the patch from here and not from ReviewBoard, because 
ReviewBoard seems to ignore binary changes created by git format-patch.

> Change the HFile Format
> ---
>
> Key: HBASE-3857
> URL: https://issues.apache.org/jira/browse/HBASE-3857
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 0.90.4
>Reporter: Liyin Tang
>Assignee: Mikhail Bautin
> Attachments: 0001-review_hfile-v2-r1144693_2011-07-15_11_14_44.patch, 
> 0001-review_hfile-v2-r1147350_2011-07-26_11_55_59.patch, 
> hfile_format_v2_design_draft_0.1.pdf, hfile_format_v2_design_draft_0.3.pdf
>
>
> In order to support HBASE-3763 and HBASE-3856, we need to change the format 
> of the HFile. The new format proposal is attached here. Thanks to Mikhail 
> Bautin for the documentation. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3669) Region in PENDING_OPEN keeps being bounced between RS and master

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3669:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> Region in PENDING_OPEN keeps being bounced between RS and master
> 
>
> Key: HBASE-3669
> URL: https://issues.apache.org/jira/browse/HBASE-3669
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.1
>Reporter: Jean-Daniel Cryans
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: HBASE-3669-debug-v1.patch
>
>
> After going crazy killing region servers after HBASE-3668, most of the 
> cluster recovered except for 3 regions that kept being refused by the region 
> servers.
> One the master I would see:
> {code}
> 2011-03-17 22:23:14,828 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Regions in transition timed 
> out:  
> supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21.
>  state=PENDING_OPEN, ts=1300400554826
> 2011-03-17 22:23:14,828 INFO 
> org.apache.hadoop.hbase.master.AssignmentManager: Region has been 
> PENDING_OPEN for too long, reassigning 
> region=supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21.
> 2011-03-17 22:23:14,828 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Forcing OFFLINE; 
> was=supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21.
>  state=PENDING_OPEN, ts=1300400554826
> 2011-03-17 22:23:14,828 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: No previous transition plan 
> was found (or we are ignoring an existing plan) for 
> supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21.
>  so generated a random one; 
> hri=supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21.,
>  src=, dest=sv2borg171,60020,1300399357135; 17 (online=17, exclude=null) 
> available servers
> 2011-03-17 22:23:14,828 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21.
>  to sv2borg171,60020,1300399357135
> {code}
> Then on the region server:
> {code}
> 2011-03-17 22:23:14,829 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:60020-0x22d627c142707d2 Attempting to transition node 
> f11849557c64c4efdbe0498f3fe97a21 from M_ZK_REGION_OFFLINE to 
> RS_ZK_REGION_OPENING
> 2011-03-17 22:23:14,832 DEBUG org.apache.hadoop.hbase.zookeeper.ZKUtil: 
> regionserver:60020-0x22d627c142707d2 Retrieved 166 byte(s) of data from znode 
> /hbase/unassigned/f11849557c64c4efdbe0498f3fe97a21; 
> data=region=supr_rss_items,ea0a3ac6c8779dab:872333599:ed1a7ad00f076fd98fcd3adcd98b62c6,1285707378709.f11849557c64c4efdbe0498f3fe97a21.,
>  server=sv2borg180,60020,1300384550966, state=RS_ZK_REGION_OPENING
> 2011-03-17 22:23:14,832 WARN org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> regionserver:60020-0x22d627c142707d2 Attempt to transition the unassigned 
> node for f11849557c64c4efdbe0498f3fe97a21 from M_ZK_REGION_OFFLINE to 
> RS_ZK_REGION_OPENING failed, the node existed but was in the state 
> RS_ZK_REGION_OPENING
> 2011-03-17 22:23:14,832 WARN 
> org.apache.hadoop.hbase.regionserver.handler.OpenRegionHandler: Failed 
> transition from OFFLINE to OPENING for region=f11849557c64c4efdbe0498f3fe97a21
> {code}
> I'm not sure I fully understand what was going on... the master was supposed 
> to OFFLINE the znode but then that's not what the region server was seeing? 
> In any case, I was able to recover by doing a force unassign for each region 
> and then assign.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4139) [stargate] Update ScannerModel with support for filter package additions

2011-07-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071406#comment-13071406
 ] 

Hudson commented on HBASE-4139:
---

Integrated in HBase-TRUNK #2054 (See 
[https://builds.apache.org/job/HBase-TRUNK/2054/])
HBASE-4139  [stargate] Update ScannerModel with support for filter package 
additions

apurtell : 
Files : 
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/TimestampsFilter.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/rest/model/ScannerModel.java
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/BitComparator.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnRangeFilter.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/ColumnPaginationFilter.java
* 
/hbase/trunk/src/main/java/org/apache/hadoop/hbase/filter/DependentColumnFilter.java


> [stargate] Update ScannerModel with support for filter package additions
> 
>
> Key: HBASE-4139
> URL: https://issues.apache.org/jira/browse/HBASE-4139
> Project: HBase
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.90.4, 0.92.0
>
> Attachments: HBASE-4139.patch
>
>
> Filters have been added to the o.a.h.h.filters package without updating 
> o.a.h.h.rest.model.ScannerModel. Bring ScannerModel up to date.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4142) Advise against large batches in javadoc for HTable#put(List)

2011-07-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071405#comment-13071405
 ] 

Hudson commented on HBASE-4142:
---

Integrated in HBase-TRUNK #2054 (See 
[https://builds.apache.org/job/HBase-TRUNK/2054/])
HBASE-4142 Advise against large batches in javadoc for HTable#put(List)

apurtell : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/client/HTableInterface.java
* /hbase/trunk/CHANGES.txt


> Advise against large batches in javadoc for HTable#put(List)
> -
>
> Key: HBASE-4142
> URL: https://issues.apache.org/jira/browse/HBASE-4142
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 0.90.4, 0.92.0
>
> Attachments: HBASE-4142.patch
>
>
> This came up with an internal dev group.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3809) .META. may not come back online if > number of executors servers crash and one of those > number of executors was carrying meta

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3809?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3809:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> .META. may not come back online if > number of executors servers crash and 
> one of those > number of executors was carrying meta
> ---
>
> Key: HBASE-3809
> URL: https://issues.apache.org/jira/browse/HBASE-3809
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Priority: Critical
> Fix For: 0.94.0
>
>
> This is a duplicate of another issue but at the moment I cannot find the 
> original.
> If you had a 700 node cluster and then you ran something on the cluster which 
> killed 100 nodes, and .META. had been running on one of those downed nodes, 
> well, you'll have all of your master executors processing ServerShutdowns and 
> more than likely none of the currently processing executors will be servicing 
> the shutdown of the server that was carrying .META.
> Well, for server shutdown to complete at the moment, an online .META. is 
> required.  So, in the above case, we'll be stuck. The current executors will 
> not be able to clear to make space for the processing of the server carrying 
> .META. because they need .META. to complete.
> We can make the master handlers have no bound so the pool will expand to accommodate 
> all crashed servers -- so it'll have the one .META. in its queue -- or we can 
> change it so shutdown handling doesn't require .META. to be online (it's used 
> to figure out the regions the server was carrying); we could use the master's 
> in-memory picture of the cluster (but IIRC, there may be holes TBD).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3643) Close the filesystem handle when HRS is aborting

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3643:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> Close the filesystem handle when HRS is aborting
> 
>
> Key: HBASE-3643
> URL: https://issues.apache.org/jira/browse/HBASE-3643
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.90.1
>Reporter: Jean-Daniel Cryans
>Priority: Critical
> Fix For: 0.94.0
>
>
> I thought of a way to fix HBASE-3515 that has a very broad impact, so I'm 
> creating this jira to *raise awareness* and gather comments.
> Currently when we call HRS.abort, it's still possible to do HDFS operations 
> like rolling logs and flushing files. It also has the impact that some 
> threads cannot write to ZK (like the situation described in HBASE-3515) but 
> then can still write to HDFS. Since that call is so central, I think we 
> should {color:red} add fs.close() inside the abort method{color}.
> The impact of this is that everything else that happens after the close call, 
> like closing files or appending, will fail in the most horrible ways. On the 
> bright side, this means less disruptive changes on HDFS.
> Todd pointed at HBASE-2231 as related, but I think my solution is still too 
> sloppy as we could still finish a compaction and immediately close the 
> filesystem after that (damage's done).
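
A minimal sketch of the proposal, assuming a generic Closeable stand-in for the HDFS 
FileSystem handle rather than the real HRegionServer abort path:

{code}
import java.io.Closeable;
import java.io.IOException;

public class AbortClosesFilesystemSketch {
  private final Closeable fs; // stand-in for the HDFS FileSystem handle
  private volatile boolean aborted;

  AbortClosesFilesystemSketch(Closeable fs) { this.fs = fs; }

  /** Abort closes the filesystem handle so later writes fail fast. */
  void abort(String why, Throwable cause) {
    aborted = true;
    System.out.println("Aborting: " + why);
    try {
      fs.close(); // everything after this point can no longer touch HDFS
    } catch (IOException e) {
      System.err.println("Failed closing filesystem: " + e);
    }
  }

  public static void main(String[] args) {
    AbortClosesFilesystemSketch rs =
        new AbortClosesFilesystemSketch(() -> System.out.println("filesystem closed"));
    rs.abort("zookeeper session lost", null);
    System.out.println(rs.aborted); // true
  }
}
{code}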

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4139) [stargate] Update ScannerModel with support for filter package additions

2011-07-26 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-4139:
--

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Committed to 0.90 branch and trunk.

> [stargate] Update ScannerModel with support for filter package additions
> 
>
> Key: HBASE-4139
> URL: https://issues.apache.org/jira/browse/HBASE-4139
> Project: HBase
>  Issue Type: Bug
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
> Fix For: 0.90.4, 0.92.0
>
> Attachments: HBASE-4139.patch
>
>
> Filters have been added to the o.a.h.h.filters package without updating 
> o.a.h.h.rest.model.ScannerModel. Bring ScannerModel up to date.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3433) Remove the KV copy of every KV in Scan; introduced by HBASE-3232 (why doesn't keyonlyfilter make copies rather than mutate -- HBASE-3211)?

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3433:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> Remove the KV copy of every KV in Scan; introduced by HBASE-3232 (why doesn't 
> keyonlyfilter make copies rather than mutate -- HBASE-3211)?
> --
>
> Key: HBASE-3433
> URL: https://issues.apache.org/jira/browse/HBASE-3433
> Project: HBase
>  Issue Type: Improvement
>  Components: performance, regionserver
>Reporter: stack
>Priority: Critical
> Fix For: 0.94.0
>
> Attachments: HBASE-3433-sidenote.patch
>
>
> Here is offending code from inside in StoreScanner#next:
> {code}
>   // kv is no longer immutable due to KeyOnlyFilter! use copy for safety
>   KeyValue copyKv = new KeyValue(kv.getBuffer(), kv.getOffset(), 
> kv.getLength());
> {code}
> This looks wrong given the philosophy up to this point has been avoidance of 
> garbage-making copies.
> Maybe this has been looked into before and this is the only thing to be done 
> but why is KeyOnlyFilter not making copies rather than mutating originals?
> Making this critical against 0.92.
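
For comparison, a toy sketch of the copy-instead-of-mutate alternative suggested 
above, using a simplified stand-in KeyValue rather than the real class:

{code}
import java.util.Arrays;

public class KeyOnlyCopySketch {
  static final class KeyValue {
    final byte[] buf;
    KeyValue(byte[] buf) { this.buf = buf; }
  }

  /** Return a trimmed copy instead of mutating the scanner's KeyValue in place. */
  static KeyValue toKeyOnly(KeyValue kv, int keyLength) {
    return new KeyValue(Arrays.copyOf(kv.buf, keyLength));
  }

  public static void main(String[] args) {
    KeyValue original = new KeyValue("rowkey+value".getBytes());
    KeyValue keyOnly = toKeyOnly(original, 6);
    System.out.println(new String(keyOnly.buf));   // rowkey
    System.out.println(new String(original.buf));  // rowkey+value, left untouched
  }
}
{code}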

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready

2011-07-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071379#comment-13071379
 ] 

Hudson commented on HBASE-3899:
---

Integrated in HBase-TRUNK #2053 (See 
[https://builds.apache.org/job/HBase-TRUNK/2053/])
HBASE-3899 enhance HBase RPC to support free-ing up server handler threads 
even if response is not ready

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java
* /hbase/trunk/CHANGES.txt
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/ipc
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java


> enhance HBase RPC to support free-ing up server handler threads even if 
> response is not ready
> -
>
> Key: HBASE-3899
> URL: https://issues.apache.org/jira/browse/HBASE-3899
> Project: HBase
>  Issue Type: Improvement
>  Components: ipc
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.92.0
>
> Attachments: HBASE-3899-2.patch, HBASE-3899.patch, asyncRpc.txt, 
> asyncRpc.txt
>
>
> In the current implementation, the server handler thread picks up an item 
> from the incoming callqueue, processes it and then wraps the response as a 
> Writable and sends it back to the IPC server module. This wastes 
> thread-resources when the thread is blocked for disk IO (transaction logging, 
> read into block cache, etc).
> It would be nice if we can make the RPC Server Handler threads pick up a call 
> from the IPC queue, hand it over to the application (e.g. HRegion), the 
> application can queue it to be processed asynchronously and send a response 
> back to the IPC server module saying that the response is not ready. The RPC 
> Server Handler thread is now ready to pick up another request from the 
> incoming callqueue. When the queued call is processed by the application, it 
> indicates to the IPC module that the response is now ready to be sent back to 
> the client.
> The RPC client continues to experience the same behaviour as before. A RPC 
> client is synchronous and blocks till the response arrives.
> This RPC enhancement allows us to do very powerful things with the 
> RegionServer. In future, we can enhance the RegionServer's threading 
> model to a message-passing model for better performance. We will not be 
> limited by the number of threads in the RegionServer.
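
A rough sketch of the server-side flow described above, assuming a hypothetical 
DelayableCall interface; the names are illustrative and may differ from the Delayable 
API referenced in the committed file list.

{code}
import java.util.concurrent.Executor;

// Hypothetical shapes, for illustration only.
interface DelayableCall {
  void startDelay();            // the handler thread may go back to the call queue after this
  void endDelay(Object result); // the application signals that the response is ready
}

class DelayedHandlingSketch {
  // Instead of blocking an RPC handler thread on disk IO, park the call and hand
  // the slow work to an application-side executor; the handler is free again at once.
  void handle(final DelayableCall call, final Runnable slowWork, Executor appExecutor) {
    call.startDelay();
    appExecutor.execute(new Runnable() {
      public void run() {
        slowWork.run();          // e.g. a log sync or a read that misses the block cache
        call.endDelay("done");   // the IPC layer now writes the response back to the client
      }
    });
  }
}
{code}

The client is unaffected: it still blocks on the socket until the delayed response is 
eventually written.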

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4142) Advise against large batches in javadoc for HTable#put(List)

2011-07-26 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-4142:
--

  Resolution: Fixed
Hadoop Flags: [Reviewed]
  Status: Resolved  (was: Patch Available)

Committed to trunk and 0.90 branch.

> Advise against large batches in javadoc for HTable#put(List)
> -
>
> Key: HBASE-4142
> URL: https://issues.apache.org/jira/browse/HBASE-4142
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 0.90.4, 0.92.0
>
> Attachments: HBASE-4142.patch
>
>
> This came up with an internal dev group.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-1938) Make in-memory table scanning faster

2011-07-26 Thread nkeywal (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

nkeywal updated HBASE-1938:
---

Attachment: 20110726_1938_MemStoreScanPerformance.java
20110726_1938_MemStore.patch
20110726_1938_KeyValueSkipListSet.patch

20110726_1938_KeyValueSkipListSet.patch : use the native ValueIterator 
instead of a wrapper around EntryIterator, removing one object creation per 
call to the iterator.
20110726_1938_MemStoreScanPerformance.java : simple test case to measure 
scan performance.
20110726_1938_MemStore.patch : multiple small performance improvements in 
MemStoreScanner.


> Make in-memory table scanning faster
> 
>
> Key: HBASE-1938
> URL: https://issues.apache.org/jira/browse/HBASE-1938
> Project: HBase
>  Issue Type: Improvement
>  Components: performance
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Attachments: 20110726_1938_KeyValueSkipListSet.patch, 
> 20110726_1938_MemStore.patch, 20110726_1938_MemStoreScanPerformance.java, 
> MemStoreScanPerformance.java, MemStoreScanPerformance.java, 
> caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling hbase to see if I can make hbase scans run 
> faster when all is up in memory.  Talking to some users, they are seeing 
> about 1/4 million rows a second.  It should be able to go faster than this 
> (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

2011-07-26 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071363#comment-13071363
 ] 

nkeywal commented on HBASE-1938:


Thanks Stack! It could be considered a JDK bug, as it makes the 
EntryIterator needlessly expensive when iterating over large collections. From an 
HBase point of view there is in any case no need for an EntryIterator, so the change is simple.

Here are the patches.
- 20110726_1938_KeyValueSkipListSet.patch : use the native ValueIterator 
instead of a wrapper around EntryIterator, removing one object creation per 
call to the iterator.
- 20110726_1938_MemStoreScanPerformance.java : simple test case to measure scan 
performance.
- 20110726_1938_MemStore.patch : multiple small performance improvements in 
MemStoreScanner.

It's obviously worth a review, especially the last one... Unit tests run fine.

> Make in-memory table scanning faster
> 
>
> Key: HBASE-1938
> URL: https://issues.apache.org/jira/browse/HBASE-1938
> Project: HBase
>  Issue Type: Improvement
>  Components: performance
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Attachments: MemStoreScanPerformance.java, 
> MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling hbase to see if I can make hbase scans run 
> faster when all is up in memory.  Talking to some users, they are seeing 
> about 1/4 million rows a second.  It should be able to go faster than this 
> (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4140) book: Update our hadoop vendor section

2011-07-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071353#comment-13071353
 ] 

Hudson commented on HBASE-4140:
---

Integrated in HBase-TRUNK #2052 (See 
[https://builds.apache.org/job/HBase-TRUNK/2052/])
HBASE-4140 book: Update our hadoop vendor section

stack : 
Files : 
* /hbase/trunk/src/docbkx/configuration.xml


> book: Update our hadoop vendor section
> --
>
> Key: HBASE-4140
> URL: https://issues.apache.org/jira/browse/HBASE-4140
> Project: HBase
>  Issue Type: Improvement
>Reporter: stack
>Assignee: stack
> Fix For: 0.92.0
>
> Attachments: hadoop.txt
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HBASE-3266) Master does not seem to properly scan ZK for running RS during startup

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu resolved HBASE-3266.
---

Resolution: Not A Problem

From Todd:
3266 is probably no longer valid given heartbeats don't exist in trunk.

> Master does not seem to properly scan ZK for running RS during startup
> --
>
> Key: HBASE-3266
> URL: https://issues.apache.org/jira/browse/HBASE-3266
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.0
>Reporter: Todd Lipcon
>Priority: Critical
> Fix For: 0.92.0
>
>
> I was in the situation described by HBASE-3265, where I had a number of RS 
> waiting on ROOT, but the master hadn't seen any RS checkins, so was waiting 
> on checkins. To get past this, I restarted one of the region servers. The 
> restarted server checked in, and the master began its startup.
> At this point the master started scanning /hbase/.logs for things to split. 
> It correctly identified that the RS on haus01 was running (this is the one I 
> restarted):
> 2010-11-23 00:21:25,595 INFO org.apache.hadoop.hbase.master.MasterFileSystem: 
> Log folder 
> hdfs://haus01.sf.cloudera.com:11020/hbase-normal/.logs/haus01.sf.cloudera.com,60020,1290500443143
>  belongs to an existing region server
> but then incorrectly decided that the RS on haus02 was down:
> 2010-11-23 00:21:25,595 INFO org.apache.hadoop.hbase.master.MasterFileSystem: 
> Log folder 
> hdfs://haus01.sf.cloudera.com:11020/hbase-normal/.logs/haus02.sf.cloudera.com,60020,1290498411450
>  doesn't belong to a known region server, splitting
> However ZK shows that this RS is up:
> [zk: haus01.sf.cloudera.com:(CONNECTED) 3] ls /hbase/rs
> [haus04.sf.cloudera.com,60020,1290498411533, 
> haus05.sf.cloudera.com,60020,1290498411520, 
> haus03.sf.cloudera.com,60020,1290498411518, 
> haus01.sf.cloudera.com,60020,1290500443143, 
> haus02.sf.cloudera.com,60020,1290498411450]
> splitLogsAfterStartup seems to check ServerManager.onlineServers, which best 
> I can tell is derived from heartbeats and not from ZK (sorry if I got some of 
> this wrong, still new to this new codebase)
> Of course, the master went into an infinite splitting loop at this point 
> since haus02 is up and renewing its DFS lease on its logs.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-2445) Clean up client retry policies

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-2445:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> Clean up client retry policies
> --
>
> Key: HBASE-2445
> URL: https://issues.apache.org/jira/browse/HBASE-2445
> Project: HBase
>  Issue Type: Improvement
>  Components: client
>Reporter: Todd Lipcon
>Priority: Critical
>  Labels: moved_from_0_20_5
> Fix For: 0.94.0
>
>
> Right now almost all retry behavior is governed by a single parameter that 
> determines the number of retries. In a few places, there are also conf for 
> the number of millis to sleep between retries. This isn't quite flexible 
> enough. If we can refactor some of the retry logic into a RetryPolicy class, 
> we could introduce exponential backoff where appropriate, clean up some of 
> the config, etc
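
A sketch of the kind of RetryPolicy the description asks for, with exponential backoff; 
the interface and class names are hypothetical, not an existing HBase API.

{code}
// Hypothetical shape of a pluggable retry policy with exponential backoff.
interface RetryPolicy {
  boolean shouldRetry(int attempt);   // attempt is 0-based
  long sleepMillis(int attempt);
}

class ExponentialBackoffPolicy implements RetryPolicy {
  private final int maxAttempts;
  private final long baseSleepMs;
  private final long maxSleepMs;

  ExponentialBackoffPolicy(int maxAttempts, long baseSleepMs, long maxSleepMs) {
    this.maxAttempts = maxAttempts;
    this.baseSleepMs = baseSleepMs;
    this.maxSleepMs = maxSleepMs;
  }

  public boolean shouldRetry(int attempt) {
    return attempt < maxAttempts;
  }

  // base, 2x base, 4x base, ... capped at maxSleepMs
  public long sleepMillis(int attempt) {
    return Math.min(baseSleepMs * (1L << Math.min(attempt, 30)), maxSleepMs);
  }
}
{code}

A client loop would then ask the policy whether to retry and how long to sleep, instead 
of reading the raw retry count and sleep settings at every call site.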

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4076) hbase should pick up HADOOP_CONF_DIR on its classpath

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-4076:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> hbase should pick up HADOOP_CONF_DIR on its classpath
> -
>
> Key: HBASE-4076
> URL: https://issues.apache.org/jira/browse/HBASE-4076
> Project: HBase
>  Issue Type: Improvement
>  Components: scripts
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
> Fix For: 0.94.0
>
>
> Currently, hbase doesn't automatically include the hadoop config on its 
> classpath. It usually works out OK since we specify hbase.rootdir with a 
> fully qualified hdfs://nnhost/path URL. But, in secure environments for 
> example, there are some other Hadoop configs that need to be available to 
> HBase (eg NN krb5 principal info).
> We should change the HBase scripts to automatically pick up HADOOP_CONF_DIR 
> or HADOOP_HOME/conf on the classpath when available.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready

2011-07-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-3899.
--

   Resolution: Fixed
Fix Version/s: (was: 0.94.0)
   0.92.0

Applied to TRUNK.  Thanks for the patch, Vlad (and to the reviewers).

> enhance HBase RPC to support free-ing up server handler threads even if 
> response is not ready
> -
>
> Key: HBASE-3899
> URL: https://issues.apache.org/jira/browse/HBASE-3899
> Project: HBase
>  Issue Type: Improvement
>  Components: ipc
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.92.0
>
> Attachments: HBASE-3899-2.patch, HBASE-3899.patch, asyncRpc.txt, 
> asyncRpc.txt
>
>
> In the current implementation, the server handler thread picks up an item 
> from the incoming callqueue, processes it and then wraps the response as a 
> Writable and sends it back to the IPC server module. This wastes 
> thread-resources when the thread is blocked for disk IO (transaction logging, 
> read into block cache, etc).
> It would be nice if we can make the RPC Server Handler threads pick up a call 
> from the IPC queue, hand it over to the application (e.g. HRegion), the 
> application can queue it to be processed asynchronously and send a response 
> back to the IPC server module saying that the response is not ready. The RPC 
> Server Handler thread is now ready to pick up another request from the 
> incoming callqueue. When the queued call is processed by the application, it 
> indicates to the IPC module that the response is now ready to be sent back to 
> the client.
> The RPC client continues to experience the same behaviour as before. A RPC 
> client is synchronous and blocks till the response arrives.
> This RPC enhancement allows us to do very powerful things with the 
> RegionServer. In future, we can enhance the RegionServer's threading 
> model to a message-passing model for better performance. We will not be 
> limited by the number of threads in the RegionServer.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-2730) Expose RS work queue contents on web UI

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2730?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-2730:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> Expose RS work queue contents on web UI
> ---
>
> Key: HBASE-2730
> URL: https://issues.apache.org/jira/browse/HBASE-2730
> Project: HBase
>  Issue Type: New Feature
>  Components: regionserver
>Reporter: Todd Lipcon
>Priority: Critical
> Fix For: 0.94.0
>
>
> Would be nice to be able to see the contents of the various work queues - eg 
> to know what regions are pending compaction/split/flush/etc. This is handy 
> for debugging why a region might be blocked, etc.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3929) Add option to HFile tool to produce basic stats

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3929:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> Add option to HFile tool to produce basic stats
> ---
>
> Key: HBASE-3929
> URL: https://issues.apache.org/jira/browse/HBASE-3929
> Project: HBase
>  Issue Type: New Feature
>  Components: io
>Affects Versions: 0.92.0
>Reporter: Todd Lipcon
>Assignee: Todd Lipcon
> Fix For: 0.94.0
>
> Attachments: hbase-3929-draft.txt
>
>
> In looking at HBASE-3421 I wrote a small tool to scan an HFile and produce 
> some basic statistics about it:
> - min/mean/max key size, value size (uncompressed)
> - min/mean/max number of columns per row (uncompressed)
> - min/mean/max number of bytes per row (uncompressed)
> - the key of the largest row
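
All of the statistics listed above can be gathered in a single pass over the HFile with 
a small accumulator per metric; a minimal sketch (not the code in the attached draft 
patch) might look like this:

{code}
// Illustrative accumulator; one instance per metric (key size, value size,
// columns per row, bytes per row, ...).
class MinMeanMax {
  private long count, sum;
  private long min = Long.MAX_VALUE;
  private long max = Long.MIN_VALUE;

  void add(long value) {
    count++;
    sum += value;
    if (value < min) min = value;
    if (value > max) max = value;
  }

  long getMin()    { return count == 0 ? 0 : min; }
  long getMax()    { return count == 0 ? 0 : max; }
  double getMean() { return count == 0 ? 0.0 : (double) sum / count; }
}
{code}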

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4142) Advise against large batches in javadoc for HTable#put(List)

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071339#comment-13071339
 ] 

stack commented on HBASE-4142:
--

+1

> Advise against large batches in javadoc for HTable#put(List)
> -
>
> Key: HBASE-4142
> URL: https://issues.apache.org/jira/browse/HBASE-4142
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 0.90.4, 0.92.0
>
> Attachments: HBASE-4142.patch
>
>
> This came up with an internal dev group.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready

2011-07-26 Thread Vlad Dogaru (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vlad Dogaru updated HBASE-3899:
---

Attachment: HBASE-3899-2.patch

Latest patch from Review Board.

> enhance HBase RPC to support free-ing up server handler threads even if 
> response is not ready
> -
>
> Key: HBASE-3899
> URL: https://issues.apache.org/jira/browse/HBASE-3899
> Project: HBase
>  Issue Type: Improvement
>  Components: ipc
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.94.0
>
> Attachments: HBASE-3899-2.patch, HBASE-3899.patch, asyncRpc.txt, 
> asyncRpc.txt
>
>
> In the current implementation, the server handler thread picks up an item 
> from the incoming callqueue, processes it and then wraps the response as a 
> Writable and sends it back to the IPC server module. This wastes 
> thread-resources when the thread is blocked for disk IO (transaction logging, 
> read into block cache, etc).
> It would be nice if we can make the RPC Server Handler threads pick up a call 
> from the IPC queue, hand it over to the application (e.g. HRegion), the 
> application can queue it to be processed asynchronously and send a response 
> back to the IPC server module saying that the response is not ready. The RPC 
> Server Handler thread is now ready to pick up another request from the 
> incoming callqueue. When the queued call is processed by the application, it 
> indicates to the IPC module that the response is now ready to be sent back to 
> the client.
> The RPC client continues to experience the same behaviour as before. A RPC 
> client is synchronous and blocks till the response arrives.
> This RPC enhancement allows us to do very powerful things with the 
> RegionServer. In future, we can enhance the RegionServer's threading 
> model to a message-passing model for better performance. We will not be 
> limited by the number of threads in the RegionServer.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3899) enhance HBase RPC to support free-ing up server handler threads even if response is not ready

2011-07-26 Thread jirapos...@reviews.apache.org (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3899?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071286#comment-13071286
 ] 

jirapos...@reviews.apache.org commented on HBASE-3899:
--



bq.  On 2011-07-26 01:38:40, Todd Lipcon wrote:
bq.  > Looks good. Have you run the full test suite with the current iteration 
of the patch?

Yes, tests are fine with the current patch.


- Vlad


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1174/#review1185
---


On 2011-07-26 01:19:52, Vlad Dogaru wrote:
bq.  
bq.  ---
bq.  This is an automatically generated e-mail. To reply, visit:
bq.  https://reviews.apache.org/r/1174/
bq.  ---
bq.  
bq.  (Updated 2011-07-26 01:19:52)
bq.  
bq.  
bq.  Review request for hbase.
bq.  
bq.  
bq.  Summary
bq.  ---
bq.  
bq.  Free up RPC server Handler thread if the called routine specifies the call 
should be delayed. The RPC client sees no difference, changes are server-side 
only. This is based on the previous submitted patch from Dhruba.
bq.  
bq.  
bq.  This addresses bug HBASE-3899.
bq.  https://issues.apache.org/jira/browse/HBASE-3899
bq.  
bq.  
bq.  Diffs
bq.  -
bq.  
bq.src/main/java/org/apache/hadoop/hbase/ipc/Delayable.java PRE-CREATION 
bq.src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java 61d3915 
bq.src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java 0da7f9e 
bq.src/test/java/org/apache/hadoop/hbase/ipc/TestDelayedRpc.java 
PRE-CREATION 
bq.  
bq.  Diff: https://reviews.apache.org/r/1174/diff
bq.  
bq.  
bq.  Testing
bq.  ---
bq.  
bq.  Unit tests run. Also, the patch includes a new unit test.
bq.  
bq.  
bq.  Thanks,
bq.  
bq.  Vlad
bq.  
bq.



> enhance HBase RPC to support free-ing up server handler threads even if 
> response is not ready
> -
>
> Key: HBASE-3899
> URL: https://issues.apache.org/jira/browse/HBASE-3899
> Project: HBase
>  Issue Type: Improvement
>  Components: ipc
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.94.0
>
> Attachments: HBASE-3899.patch, asyncRpc.txt, asyncRpc.txt
>
>
> In the current implementation, the server handler thread picks up an item 
> from the incoming callqueue, processes it and then wraps the response as a 
> Writable and sends it back to the IPC server module. This wastes 
> thread-resources when the thread is blocked for disk IO (transaction logging, 
> read into block cache, etc).
> It would be nice if we can make the RPC Server Handler threads pick up a call 
> from the IPC queue, hand it over to the application (e.g. HRegion), the 
> application can queue it to be processed asynchronously and send a response 
> back to the IPC server module saying that the response is not ready. The RPC 
> Server Handler thread is now ready to pick up another request from the 
> incoming callqueue. When the queued call is processed by the application, it 
> indicates to the IPC module that the response is now ready to be sent back to 
> the client.
> The RPC client continues to experience the same behaviour as before. A RPC 
> client is synchronous and blocks till the response arrives.
> This RPC enhancement allows us to do very powerful things with the 
> RegionServer. In future, we can enhance the RegionServer's threading 
> model to a message-passing model for better performance. We will not be 
> limited by the number of threads in the RegionServer.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4142) Advise against large batches in javadoc for HTable#put(List)

2011-07-26 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-4142:
--

Fix Version/s: 0.92.0
   0.90.4
   Status: Patch Available  (was: Open)

> Advise against large batches in javadoc for HTable#put(List)
> -
>
> Key: HBASE-4142
> URL: https://issues.apache.org/jira/browse/HBASE-4142
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Fix For: 0.90.4, 0.92.0
>
> Attachments: HBASE-4142.patch
>
>
> This came up with an internal dev group.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-4142) Advise against large batches in javadoc for HTable#put(List)

2011-07-26 Thread Andrew Purtell (JIRA)
Advise against large batches in javadoc for HTable#put(List)
-

 Key: HBASE-4142
 URL: https://issues.apache.org/jira/browse/HBASE-4142
 Project: HBase
  Issue Type: Improvement
Reporter: Andrew Purtell
Assignee: Andrew Purtell
Priority: Trivial
 Attachments: HBASE-4142.patch

This came up with an internal dev group.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-4142) Advise against large batches in javadoc for HTable#put(List)

2011-07-26 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-4142:
--

Attachment: HBASE-4142.patch

> Advise against large batches in javadoc for HTable#put(List)
> -
>
> Key: HBASE-4142
> URL: https://issues.apache.org/jira/browse/HBASE-4142
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Trivial
> Attachments: HBASE-4142.patch
>
>
> This came up with an internal dev group.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071252#comment-13071252
 ] 

Hudson commented on HBASE-3845:
---

Integrated in HBase-TRUNK #2051 (See 
[https://builds.apache.org/job/HBase-TRUNK/2051/])
HBASE-3845 data loss because lastSeqWritten can miss memstore edits

stack : 
Files : 
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java
* /hbase/trunk/CHANGES.txt
* 
/hbase/trunk/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestWALReplay.java
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java


> data loss because lastSeqWritten can miss memstore edits
> 
>
> Key: HBASE-3845
> URL: https://issues.apache.org/jira/browse/HBASE-3845
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.3
>Reporter: Prakash Khemani
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch, 
> HBASE-3845_1.patch, HBASE-3845_2.patch, HBASE-3845_4.patch, 
> HBASE-3845_5.patch, HBASE-3845_6.patch, HBASE-3845__trunk.patch, 
> HBASE-3845_trunk_2.patch, HBASE-3845_trunk_3.patch
>
>
> (I don't have a test case to prove this yet but I have run it by Dhruba and 
> Kannan internally and wanted to put this up for some feedback.)
> In this discussion let us assume that the region has only one column family. 
> That way I can use region/memstore interchangeably.
> After a memstore flush it is possible for lastSeqWritten to have a 
> log-sequence-id for a region that is not the earliest log-sequence-id for 
> that region's memstore.
> HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
> that we only keep track  of the earliest log-sequence-number that is present 
> in the memstore.
> Every time the memstore is flushed we remove the region's entry in 
> lastSequenceWritten and wait for the next append to populate this entry 
> again. This is where the problem happens.
> step 1:
> flusher.prepare() snapshots the memstore under 
> HRegion.updatesLock.writeLock().
> step 2 :
> as soon as the updatesLock.writeLock() is released new entries will be added 
> into the memstore.
> step 3 :
> wal.completeCacheFlush() is called. This method removes the region's entry 
> from lastSeqWritten.
> step 4:
> the next append will create a new entry for the region in lastSeqWritten(). 
> But this will be the log seq id of the current append. All the edits that 
> were added in step 2 are missing.
> ==
> as a temporary measure, instead of removing the region's entry in step 3 I 
> will replace it with the log-seq-id of the region-flush-event.
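
A minimal sketch of the bookkeeping described above and of the proposed temporary 
measure; the map is keyed by a region name String purely for illustration, this is not 
the actual HLog code.

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Simplified model of lastSeqWritten, for illustration only.
class LastSeqWrittenSketch {
  private final ConcurrentMap<String, Long> lastSeqWritten =
      new ConcurrentHashMap<String, Long>();

  // On append: remember only the earliest outstanding log sequence id per region.
  void onAppend(String region, long seqId) {
    lastSeqWritten.putIfAbsent(region, seqId);
  }

  // Step 3, old behaviour: lastSeqWritten.remove(region), which loses the edits
  // added in step 2.  Temporary measure from the description: replace the entry
  // with the seq id of the region-flush event instead of removing it.
  void onCompleteCacheFlush(String region, long flushSeqId) {
    lastSeqWritten.put(region, flushSeqId);
  }
}
{code}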

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071251#comment-13071251
 ] 

stack commented on HBASE-3845:
--

I applied the patch to trunk.   Waiting till 0.90.4 clears its blockers before 
applying this to the 0.90 branch for 0.90.5.  Thanks for the patches Prakash and Ramakrishna.

> data loss because lastSeqWritten can miss memstore edits
> 
>
> Key: HBASE-3845
> URL: https://issues.apache.org/jira/browse/HBASE-3845
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.3
>Reporter: Prakash Khemani
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch, 
> HBASE-3845_1.patch, HBASE-3845_2.patch, HBASE-3845_4.patch, 
> HBASE-3845_5.patch, HBASE-3845_6.patch, HBASE-3845__trunk.patch, 
> HBASE-3845_trunk_2.patch, HBASE-3845_trunk_3.patch
>
>
> (I don't have a test case to prove this yet but I have run it by Dhruba and 
> Kannan internally and wanted to put this up for some feedback.)
> In this discussion let us assume that the region has only one column family. 
> That way I can use region/memstore interchangeably.
> After a memstore flush it is possible for lastSeqWritten to have a 
> log-sequence-id for a region that is not the earliest log-sequence-id for 
> that region's memstore.
> HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
> that we only keep track  of the earliest log-sequence-number that is present 
> in the memstore.
> Every time the memstore is flushed we remove the region's entry in 
> lastSequenceWritten and wait for the next append to populate this entry 
> again. This is where the problem happens.
> step 1:
> flusher.prepare() snapshots the memstore under 
> HRegion.updatesLock.writeLock().
> step 2 :
> as soon as the updatesLock.writeLock() is released new entries will be added 
> into the memstore.
> step 3 :
> wal.completeCacheFlush() is called. This method removes the region's entry 
> from lastSeqWritten.
> step 4:
> the next append will create a new entry for the region in lastSeqWritten(). 
> But this will be the log seq id of the current append. All the edits that 
> were added in step 2 are missing.
> ==
> as a temporary measure, instead of removing the region's entry in step 3 I 
> will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4140) book: Update our hadoop vendor section

2011-07-26 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071253#comment-13071253
 ] 

Hudson commented on HBASE-4140:
---

Integrated in HBase-TRUNK #2051 (See 
[https://builds.apache.org/job/HBase-TRUNK/2051/])
HBASE-4140 book: Update our hadoop vendor section

stack : 
Files : 
* /hbase/trunk/src/docbkx/configuration.xml


> book: Update our hadoop vendor section
> --
>
> Key: HBASE-4140
> URL: https://issues.apache.org/jira/browse/HBASE-4140
> Project: HBase
>  Issue Type: Improvement
>Reporter: stack
>Assignee: stack
> Fix For: 0.92.0
>
> Attachments: hadoop.txt
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071250#comment-13071250
 ] 

stack commented on HBASE-3845:
--

Nice explanatory comments.  This is radical '+  
Runtime.getRuntime().halt(1);' but I can live with it (it should never happen, it 
seems).  getSnapshotName could use the Bytes utility for copying bytes, but it's fine as 
is.

I'm game for applying this version.  The patches do similar things, but this one is a 
little more thorough and better explained.  Sounds like it got a bit of airing 
on a real cluster too.



> data loss because lastSeqWritten can miss memstore edits
> 
>
> Key: HBASE-3845
> URL: https://issues.apache.org/jira/browse/HBASE-3845
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.3
>Reporter: Prakash Khemani
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch, 
> HBASE-3845_1.patch, HBASE-3845_2.patch, HBASE-3845_4.patch, 
> HBASE-3845_5.patch, HBASE-3845_6.patch, HBASE-3845__trunk.patch, 
> HBASE-3845_trunk_2.patch, HBASE-3845_trunk_3.patch
>
>
> (I don't have a test case to prove this yet but I have run it by Dhruba and 
> Kannan internally and wanted to put this up for some feedback.)
> In this discussion let us assume that the region has only one column family. 
> That way I can use region/memstore interchangeably.
> After a memstore flush it is possible for lastSeqWritten to have a 
> log-sequence-id for a region that is not the earliest log-sequence-id for 
> that region's memstore.
> HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
> that we only keep track  of the earliest log-sequence-number that is present 
> in the memstore.
> Every time the memstore is flushed we remove the region's entry in 
> lastSequenceWritten and wait for the next append to populate this entry 
> again. This is where the problem happens.
> step 1:
> flusher.prepare() snapshots the memstore under 
> HRegion.updatesLock.writeLock().
> step 2 :
> as soon as the updatesLock.writeLock() is released new entries will be added 
> into the memstore.
> step 3 :
> wal.completeCacheFlush() is called. This method removes the region's entry 
> from lastSeqWritten.
> step 4:
> the next append will create a new entry for the region in lastSeqWritten(). 
> But this will be the log seq id of the current append. All the edits that 
> were added in step 2 are missing.
> ==
> as a temporary measure, instead of removing the region's entry in step 3 I 
> will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Resolved] (HBASE-4140) book: Update our hadoop vendor section

2011-07-26 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-4140.
--

   Resolution: Fixed
Fix Version/s: 0.92.0
 Assignee: stack
 Hadoop Flags: [Reviewed]

Committed to trunk w/ Ted's amendment (thanks for the review).  Let me push this 
out to the website in an hour or so.

> book: Update our hadoop vendor section
> --
>
> Key: HBASE-4140
> URL: https://issues.apache.org/jira/browse/HBASE-4140
> Project: HBase
>  Issue Type: Improvement
>Reporter: stack
>Assignee: stack
> Fix For: 0.92.0
>
> Attachments: hadoop.txt
>
>


--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071227#comment-13071227
 ] 

stack commented on HBASE-1938:
--

@nkeywal I'm not as lazy this morning as I was yesterday, so I took a look at the 
Java src.  Indeed, that looks like a nice optimization.  Great stuff.

> Make in-memory table scanning faster
> 
>
> Key: HBASE-1938
> URL: https://issues.apache.org/jira/browse/HBASE-1938
> Project: HBase
>  Issue Type: Improvement
>  Components: performance
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Attachments: MemStoreScanPerformance.java, 
> MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling hbase to see if I can make hbase scans run 
> faster when all is up in memory.  Talking to some users, they are seeing 
> about 1/4 million rows a second.  It should be able to go faster than this 
> (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

2011-07-26 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071209#comment-13071209
 ] 

stack commented on HBASE-1938:
--

@nkeywal nice one!

> Make in-memory table scanning faster
> 
>
> Key: HBASE-1938
> URL: https://issues.apache.org/jira/browse/HBASE-1938
> Project: HBase
>  Issue Type: Improvement
>  Components: performance
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Attachments: MemStoreScanPerformance.java, 
> MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling hbase to see if I can make hbase scans run 
> faster when all is up in memory.  Talking to some users, they are seeing 
> about 1/4 million rows a second.  It should be able to go faster than this 
> (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

2011-07-26 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071196#comment-13071196
 ] 

nkeywal commented on HBASE-1938:


I have an improvement that could make a real difference.

In HBase there is an iterator called MapEntryIterator that in reality acts as 
a value iterator:
{noformat}static class MapEntryIterator implements Iterator<KeyValue> {
  private final Iterator<Map.Entry<KeyValue, KeyValue>> iterator;

  public KeyValue next() {
    return this.iterator.next().getValue();
  }
}
{noformat}

However, with the current implementation of the JDK, there is an important 
difference between an iterator on values and an iterator on entries. From 
java.util.concurrent we can see:


The ValueIterator is straightforward:
{noformat}final class ValueIterator extends Iter<V> {
    public V next() {
        V v = nextValue;
        advance();
        return v;
    }
}{noformat}

The EntryIterator, on the other hand, does some defensive programming and 
creates an immutable entry object on every call:
{noformat}final class EntryIterator extends Iter<Map.Entry<K,V>> {
    public Map.Entry<K,V> next() {
        Node<K,V> n = next;
        V v = nextValue;
        advance();
        return new AbstractMap.SimpleImmutableEntry<K,V>(n.key, v);
    }
}{noformat}

As a consequence, there is at least one object creation for every KeyValue returned by 
the HBase scanner. This creation is useless, as we throw the object away immediately, 
so several GCs occur during the test. I modified the MapEntryIterator implementation 
to iterate over the values instead.

{noformat}static class MapEntryIterator implements Iterator<KeyValue> {
  private final Iterator<KeyValue> iterator;

  public KeyValue next() {
    return this.iterator.next();
  }
}{noformat}

The scan time is divided by 3 in the test. The exact ratio is somewhat arbitrary, 
since it is driven by GC activity, but the change should be valuable in production 
as well.

I am currently running the unit tests; I will attach the patch if the run is 
clean.
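
To see the effect outside HBase, here is a small stand-alone snippet (not the attached 
MemStoreScanPerformance.java) comparing iteration over values() and entrySet() of a 
ConcurrentSkipListMap; the entry iterator allocates a SimpleImmutableEntry per step, 
which is exactly the garbage the patch removes.

{code}
import java.util.Map;
import java.util.concurrent.ConcurrentSkipListMap;

// Stand-alone illustration of the allocation difference; exact timings vary with GC.
public class IteratorAllocationSketch {
  public static void main(String[] args) {
    ConcurrentSkipListMap<Long, Long> map = new ConcurrentSkipListMap<Long, Long>();
    for (long i = 0; i < 1000000; i++) map.put(i, i);

    long sum = 0;
    long t0 = System.nanoTime();
    for (Long v : map.values()) sum += v;                                // no Entry object per step
    long t1 = System.nanoTime();
    for (Map.Entry<Long, Long> e : map.entrySet()) sum += e.getValue();  // one Entry allocated per step
    long t2 = System.nanoTime();

    System.out.println("values(): " + (t1 - t0) / 1000000 + " ms, entrySet(): "
        + (t2 - t1) / 1000000 + " ms, sum=" + sum);
  }
}
{code}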



> Make in-memory table scanning faster
> 
>
> Key: HBASE-1938
> URL: https://issues.apache.org/jira/browse/HBASE-1938
> Project: HBase
>  Issue Type: Improvement
>  Components: performance
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Attachments: MemStoreScanPerformance.java, 
> MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling hbase to see if I can make hbase scans run 
> faster when all is up in memory.  Talking to some users, they are seeing 
> about 1/4 million rows a second.  It should be able to go faster than this 
> (Scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4138) If zookeeper.znode.parent is not specified explicitly in Client code then HTable object loops continuously waiting for the root region by using /hbase as the base node.

2011-07-26 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071180#comment-13071180
 ] 

Ted Yu commented on HBASE-4138:
---

I think the above plan looks good.
Prepare patch for TRUNK for now. If users complain about this in 0.90.4, we can 
consider back porting.

> If zookeeper.znode.parent is not specified explicitly in Client code then 
> HTable object loops continuously waiting for the root region by using /hbase 
> as the base node.
> ---
>
> Key: HBASE-4138
> URL: https://issues.apache.org/jira/browse/HBASE-4138
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.4
>
>
> Change the zookeeper.znode.parent property (default is /hbase).
> Now do not specify this change in the client code.
> Use the HTable Object.
> The HTable is not able to find the root region and keeps continuously looping.
> Find the stack trace:
> 
> Object.wait(long) line: not available [native method]  
> RootRegionTracker(ZooKeeperNodeTracker).blockUntilAvailable(long) line: 122
> RootRegionTracker.waitRootRegionLocation(long) line: 73
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[], boolean) line: 578
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[]) line: 558
> HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
> byte[], byte[], boolean, Object) line: 687
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[], boolean) line: 589
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[]) line: 558
> HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
> byte[], byte[], boolean, Object) line: 687
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[], boolean) line: 593
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[]) line: 558
> HTable.<init>(Configuration, byte[]) line: 171 
> HTable.<init>(Configuration, String) line: 145 
> HBaseTest.test() line: 45

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-4138) If zookeeper.znode.parent is not specified explicitly in Client code then HTable object loops continuously waiting for the root region by using /hbase as the base node.

2011-07-26 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071153#comment-13071153
 ] 

ramkrishna.s.vasudevan commented on HBASE-4138:
---

I am planning to make the following changes.
If the ZooKeeperWatcher is created by the HMaster, a new constructor will allow it 
to create the base node.
For all other components (RS, HBaseAdmin, HTable, including the replication-related 
code) base node creation will not be allowed.
This ensures that only the master creates the base node and no other component 
can create it.
Is this change fine? I have tested it in my test environment.
Some related test cases had to be changed, so the patch touches a few more places.
If it is fine, should I prepare the patch for trunk and the 0.90.x branch, or only 
for trunk? I can prepare the patch and upload it ASAP.
Thanks in advance.
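
A rough sketch of what the plan above could look like; the constructor flag and method 
names are hypothetical and do not reflect the actual ZooKeeperWatcher code.

{code}
// Hypothetical sketch only; names are illustrative.
class ZooKeeperWatcherSketch {
  private final boolean canCreateBaseZNode;

  // The master would pass true; RS, HBaseAdmin, HTable and the replication code pass false.
  ZooKeeperWatcherSketch(boolean canCreateBaseZNode) {
    this.canCreateBaseZNode = canCreateBaseZNode;
  }

  void ensureBaseZNode(String parentZNode) {
    if (baseZNodeExists(parentZNode)) {
      return;
    }
    if (canCreateBaseZNode) {
      createZNode(parentZNode);   // only the master ever reaches this branch
    } else {
      // Fail fast instead of looping: the client points at a parent znode the
      // master never created, e.g. a mismatched zookeeper.znode.parent.
      throw new IllegalStateException("Base znode " + parentZNode + " does not exist");
    }
  }

  private boolean baseZNodeExists(String znode) { return false; }  // placeholder
  private void createZNode(String znode) { }                       // placeholder
}
{code}

With such a split, an HTable configured with the wrong zookeeper.znode.parent would get 
an immediate error instead of waiting forever for the root region.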


> If zookeeper.znode.parent is not specified explicitly in Client code then 
> HTable object loops continuously waiting for the root region by using /hbase 
> as the base node.
> ---
>
> Key: HBASE-4138
> URL: https://issues.apache.org/jira/browse/HBASE-4138
> Project: HBase
>  Issue Type: Bug
>  Components: client
>Affects Versions: 0.90.3
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.90.4
>
>
> Change the zookeeper.znode.parent property (default is /hbase).
> Now do not specify this change in the client code.
> Use the HTable Object.
> The HTable is not able to find the root region and keeps continuously looping.
> Find the stack trace:
> 
> Object.wait(long) line: not available [native method]  
> RootRegionTracker(ZooKeeperNodeTracker).blockUntilAvailable(long) line: 122
> RootRegionTracker.waitRootRegionLocation(long) line: 73
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[], boolean) line: 578
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[]) line: 558
> HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
> byte[], byte[], boolean, Object) line: 687
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[], boolean) line: 589
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[]) line: 558
> HConnectionManager$HConnectionImplementation.locateRegionInMeta(byte[],
> byte[], byte[], boolean, Object) line: 687
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[], boolean) line: 593
> HConnectionManager$HConnectionImplementation.locateRegion(byte[],
> byte[]) line: 558
> HTable.<init>(Configuration, byte[]) line: 171 
> HTable.<init>(Configuration, String) line: 145 
> HBaseTest.test() line: 45

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3890) Scheduled tasks in distributed log splitting not in sync with ZK

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3890?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3890:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

Lars agrees to punt this one.

> Scheduled tasks in distributed log splitting not in sync with ZK
> 
>
> Key: HBASE-3890
> URL: https://issues.apache.org/jira/browse/HBASE-3890
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Lars George
> Fix For: 0.94.0
>
>
> This is in continuation to HBASE-3889:
> Note that there must be something more that is slightly off here. Although the 
> splitlogs znode is now empty, the master is still stuck here:
> {noformat}
> Doing distributed log split in 
> hdfs://localhost:8020/hbase/.logs/10.0.0.65,60020,1305406356765
> - Waiting for distributed tasks to finish. scheduled=2 done=1 error=0   4380s
> Master startup
> - Splitting logs after master startup   4388s
> {noformat}
> There seems to be an issue with what is in ZK versus what the TaskBatch holds. 
> In my case it could be related to the fact that the task was already in ZK 
> after many faulty restarts because of the NPE. Maybe it was added once (since 
> it is keyed by path, which is unique on my machine), but the reference 
> count was bumped twice? Now that the real one is done, the done counter has been 
> increased, but it will never match the scheduled count.
> The code could also check if ZK is actually depleted, and therefore treat the 
> scheduled task as bogus? This of course only treats the symptom, not the root 
> cause of this condition. 

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3462) Fix table.jsp in regards to splitting a region/table with an optional splitkey

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3462:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> Fix table.jsp in regards to splitting a region/table with an optional splitkey
> --
>
> Key: HBASE-3462
> URL: https://issues.apache.org/jira/browse/HBASE-3462
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Affects Versions: 0.90.0
>Reporter: Lars George
> Fix For: 0.94.0
>
>
> After HBASE-3328 and HBASE-3437 went in there is also the table.jsp that 
> needs updating to support the same features. Also, at the same time update 
> the wording, for example 
> {quote}
> This action will force a split of all eligible regions of the table, or, if a 
> key is supplied, only the region containing the given key. An eligible region 
> is one that does not contain any references to other regions. Split requests 
> for noneligible regions will be ignored.
> {quote}
> I think it means it splits either all regions (that are splittable) or a 
> specific one. It says though "the region containing the given key", that 
> seems wrong in any event. Currently we do a split on the tablename when 
> nothing was specified or else do an internal get(region), which is an exact 
> match on the rows in .META.. In other words you need to match the region name 
> exactly or else it fails. It reports it has accepted the request but logs 
> internally
> {code}
> 2011-01-21 15:37:24,340 INFO org.apache.hadoop.hbase.client.HBaseAdmin: No 
> server in .META. for csfsef; pair=null
> {code}
> Error reporting could be better but because of the async nature this is more 
> difficult, yet it would be nice there is some concept of a Future to be able 
> to poll the result if needed.
> Finally, when you go back to the previous page after submitting the split the 
> entered values show up in the "compact" input fields, at least on my Chrome. 
> The inputs in both forms are named the same so it seems to confuse it. This 
> could be improved a lot by making the landing page reload the main one 
> automatically or refresh on reload instead of submitting the request again.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3891) TaskMonitor is used wrong in some places

2011-07-26 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3891?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-3891:
--

Fix Version/s: (was: 0.92.0)
   0.94.0

> TaskMonitor is used wrong in some places
> 
>
> Key: HBASE-3891
> URL: https://issues.apache.org/jira/browse/HBASE-3891
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Affects Versions: 0.92.0
>Reporter: Lars George
> Fix For: 0.94.0
>
>
> I have a long running log replay in progress but none of the updates show. 
> This is caused by reusing the MonitorTask references wrong, and manifests 
> itself like this in the logs:
> {noformat}
> 2011-05-16 15:22:18,127 WARN org.apache.hadoop.hbase.monitoring.TaskMonitor: 
> Status org.apache.hadoop.hbase.monitoring.MonitoredTaskImpl@51bfa303 appears 
> to have been leaked
> 2011-05-16 15:22:18,128 DEBUG 
> org.apache.hadoop.hbase.monitoring.MonitoredTask: cleanup.
> {noformat}
> The cleanup sets the completion timestamp and causes the task to be purged 
> from the list. After that the UI for example does not show any further 
> running tasks, although from the logs I can see (with my log additions):
> {noformat}
> 2011-05-16 15:29:52,296 DEBUG 
> org.apache.hadoop.hbase.monitoring.MonitoredTask: setStatus: Compaction 
> complete: 103.1m in 18542ms
> 2011-05-16 15:29:52,296 DEBUG 
> org.apache.hadoop.hbase.monitoring.MonitoredTask: setStatus: Running 
> coprocessor post-compact hooks
> 2011-05-16 15:29:52,296 DEBUG 
> org.apache.hadoop.hbase.monitoring.MonitoredTask: setStatus: Compaction 
> complete
> 2011-05-16 15:29:52,297 DEBUG 
> org.apache.hadoop.hbase.monitoring.MonitoredTask: markComplete: Compaction 
> complete
> {noformat}
> They are silently ignored as the TaskMonitor has dropped their reference. We 
> need to figure out why a supposedly completed task monitor was reused.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-26 Thread Prakash Khemani (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071086#comment-13071086
 ] 

Prakash Khemani commented on HBASE-3845:


Patch deployed internally at Facebook: 
0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch

> data loss because lastSeqWritten can miss memstore edits
> 
>
> Key: HBASE-3845
> URL: https://issues.apache.org/jira/browse/HBASE-3845
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.3
>Reporter: Prakash Khemani
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch, 
> HBASE-3845_1.patch, HBASE-3845_2.patch, HBASE-3845_4.patch, 
> HBASE-3845_5.patch, HBASE-3845_6.patch, HBASE-3845__trunk.patch, 
> HBASE-3845_trunk_2.patch, HBASE-3845_trunk_3.patch
>
>
> (I don't have a test case to prove this yet but I have run it by Dhruba and 
> Kannan internally and wanted to put this up for some feedback.)
> In this discussion let us assume that the region has only one column family. 
> That way I can use region/memstore interchangeably.
> After a memstore flush it is possible for lastSeqWritten to have a 
> log-sequence-id for a region that is not the earliest log-sequence-id for 
> that region's memstore.
> HLog.append() does a putIfAbsent into lastSequenceWritten. This is to ensure 
> that we only keep track  of the earliest log-sequence-number that is present 
> in the memstore.
> Every time the memstore is flushed we remove the region's entry in 
> lastSequenceWritten and wait for the next append to populate this entry 
> again. This is where the problem happens.
> step 1:
> flusher.prepare() snapshots the memstore under 
> HRegion.updatesLock.writeLock().
> step 2 :
> as soon as the updatesLock.writeLock() is released new entries will be added 
> into the memstore.
> step 3 :
> wal.completeCacheFlush() is called. This method removes the region's entry 
> from lastSeqWritten.
> step 4:
> the next append will create a new entry for the region in lastSeqWritten(). 
> But this will be the log seq id of the current append. All the edits that 
> were added in step 2 are missing.
> ==
> as a temporary measure, instead of removing the region's entry in step 3 I 
> will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Updated] (HBASE-3845) data loss because lastSeqWritten can miss memstore edits

2011-07-26 Thread Prakash Khemani (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3845?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prakash Khemani updated HBASE-3845:
---

Attachment: 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch

Patch deployed internally at Facebook.

> data loss because lastSeqWritten can miss memstore edits
> 
>
> Key: HBASE-3845
> URL: https://issues.apache.org/jira/browse/HBASE-3845
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.90.3
>Reporter: Prakash Khemani
>Assignee: ramkrishna.s.vasudevan
>Priority: Critical
> Fix For: 0.90.5
>
> Attachments: 
> 0001-HBASE-3845-data-loss-because-lastSeqWritten-can-miss.patch, 
> HBASE-3845_1.patch, HBASE-3845_2.patch, HBASE-3845_4.patch, 
> HBASE-3845_5.patch, HBASE-3845_6.patch, HBASE-3845__trunk.patch, 
> HBASE-3845_trunk_2.patch, HBASE-3845_trunk_3.patch
>
>
> (I don't have a test case to prove this yet but I have run it by Dhruba and 
> Kannan internally and wanted to put this up for some feedback.)
> In this discussion let us assume that the region has only one column family. 
> That way I can use region/memstore interchangeably.
> After a memstore flush it is possible for lastSeqWritten to have a 
> log-sequence-id for a region that is not the earliest log-sequence-id for 
> that region's memstore.
> HLog.append() does a putIfAbsent into lastSeqWritten. This is to ensure 
> that we only keep track of the earliest log-sequence-number that is present 
> in the memstore.
> Every time the memstore is flushed we remove the region's entry in 
> lastSeqWritten and wait for the next append to populate this entry 
> again. This is where the problem happens.
> step 1:
> flusher.prepare() snapshots the memstore under 
> HRegion.updatesLock.writeLock().
> step 2:
> as soon as the updatesLock.writeLock() is released new entries will be added 
> into the memstore.
> step 3:
> wal.completeCacheFlush() is called. This method removes the region's entry 
> from lastSeqWritten.
> step 4:
> the next append will create a new entry for the region in lastSeqWritten(). 
> But this will be the log seq id of the current append. All the edits that 
> were added in step 2 are missing.
> ==
> as a temporary measure, instead of removing the region's entry in step 3 I 
> will replace it with the log-seq-id of the region-flush-event.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-2888) Review all our metrics

2011-07-26 Thread Lars George (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2888?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13071072#comment-13071072
 ] 

Lars George commented on HBASE-2888:


Another issue I found while using the metrics is that all the new client API 
calls lump everything under "multi" (for the RPC calls) for all batchable 
operations (which most are). This makes it difficult to track actual 
get/put/delete calls and their counts. Obviously this is OK in terms of RPC 
call counts, but these metrics were useful for seeing what the request pattern 
looks like. Trunk adds read/write counts to the RS metrics, which is good, but 
maybe we should also have a more detailed per-operation count? Only if needed.
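
To make the suggestion concrete, here is a hypothetical sketch of what a more detailed count could look like. None of these names exist in HBase; the idea is only that a multi handler bumps a per-operation counter for each action it executes, so the per-type rates stay visible even though the RPC layer reports a single "multi" call.

{code}
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch, not existing HBase metrics code.
public class PerOpCounterSketch {

  private final ConcurrentMap<String, AtomicLong> counts =
      new ConcurrentHashMap<String, AtomicLong>();

  // record one executed action of the given type, e.g. "get", "put", "delete"
  void record(String opType) {
    AtomicLong counter = counts.get(opType);
    if (counter == null) {
      AtomicLong created = new AtomicLong();
      AtomicLong existing = counts.putIfAbsent(opType, created);
      counter = existing == null ? created : existing;
    }
    counter.incrementAndGet();
  }

  // read back the count for one operation type
  long count(String opType) {
    AtomicLong counter = counts.get(opType);
    return counter == null ? 0L : counter.get();
  }
}
{code}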

> Review all our metrics
> --
>
> Key: HBASE-2888
> URL: https://issues.apache.org/jira/browse/HBASE-2888
> Project: HBase
>  Issue Type: Improvement
>  Components: master, metrics
>Reporter: Jean-Daniel Cryans
>
> HBase publishes a bunch of metrics, some useful, some wasteful, that should be 
> improved to deliver a better ops experience. Examples:
>  - Block cache hit ratio converges at some point and stops moving
>  - fsReadLatency goes down when compactions are running
>  - storefileIndexSizeMB is the exact same number once a system is serving 
> production load
> We could use new metrics too.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Created] (HBASE-4141) Fix LRU stats message

2011-07-26 Thread Lars George (JIRA)
Fix LRU stats message
-

 Key: HBASE-4141
 URL: https://issues.apache.org/jira/browse/HBASE-4141
 Project: HBase
  Issue Type: Bug
  Components: regionserver
Reporter: Lars George
Priority: Trivial


Currently the DEBUG message looks like this:

{noformat}
2011-07-26 04:21:52,344 DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: 
LRU Stats: total=3.24 MB, free=391.76 MB, max=395 MB, blocks=0, 
accesses=118458, hits=0, hitRatio=0.00%%, cachingAccesses=0, cachingHits=0, 
cachingHitsRatio=�%, evictions=0, evicted=0, evictedPerRun=NaN
{noformat}

Note the double percent on "hitRatio", and the stray character at 
"cachingHitsRatio".

The former is added by the code in LruBlockCache.java:

{code}
...
"hitRatio=" +
  (stats.getHitCount() == 0 ? "0" : 
(StringUtils.formatPercent(stats.getHitRatio(), 2) + "%, ")) +
...
{code}

StringUtils.formatPercent already adds a percent sign, so the trailing one here 
can be dropped.

The latter, I presume, is caused by the value not being between 0.0 and 1.0. This 
should be checked, and "NaN" or similar displayed instead, as is done for other 
values.
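
To make that concrete, here is a minimal sketch of the suggested formatting, assuming the fix keeps using StringUtils.formatPercent; the helper name formatRatio and the standalone class are made up for the example and are not the actual LruBlockCache code.

{code}
import org.apache.hadoop.util.StringUtils;

// Sketch only -- formatRatio is a hypothetical helper illustrating the fix
// suggested above, not the actual LruBlockCache code.
public class RatioFormatSketch {

  static String formatRatio(double ratio) {
    // show "NaN" when the ratio is undefined or outside [0.0, 1.0]
    if (Double.isNaN(ratio) || ratio < 0.0 || ratio > 1.0) {
      return "NaN";
    }
    // StringUtils.formatPercent already appends the '%' sign, so the
    // trailing "%" that produced "0.00%%" in the log line is dropped
    return StringUtils.formatPercent(ratio, 2);
  }

  public static void main(String[] args) {
    System.out.println("hitRatio=" + formatRatio(0.0));
    System.out.println("cachingHitsRatio=" + formatRatio(Double.NaN));
  }
}
{code}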

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira




[jira] [Commented] (HBASE-1938) Make in-memory table scanning faster

2011-07-26 Thread nkeywal (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-1938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13070987#comment-13070987
 ] 

nkeywal commented on HBASE-1938:


I will write the patch; that will be simpler :-).

What is still not clear to me is what we can expect when multiple threads use 
the same scanner, since the "readPoint" is thread-local (TLS). However, the patch 
will not change the current behavior.



> Make in-memory table scanning faster
> 
>
> Key: HBASE-1938
> URL: https://issues.apache.org/jira/browse/HBASE-1938
> Project: HBase
>  Issue Type: Improvement
>  Components: performance
>Reporter: stack
>Assignee: stack
>Priority: Blocker
> Attachments: MemStoreScanPerformance.java, 
> MemStoreScanPerformance.java, caching-keylength-in-kv.patch, test.patch
>
>
> This issue is about profiling HBase to see if I can make HBase scans run 
> faster when all is up in memory. Talking to some users, they are seeing 
> about 1/4 million rows a second. It should be able to go faster than this 
> (scanning an array of objects, they can do about 4-5x this).

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira