[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2

2012-11-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489278#comment-13489278
 ] 

Hadoop QA commented on HBASE-6410:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12551821/HBASE-6410-16.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 164 
new or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
87 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 7 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.util.TestMiniClusterLoadSequential
  org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3213//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3213//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3213//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3213//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3213//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3213//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3213//console

This message is automatically generated.

> Move RegionServer Metrics to metrics2
> -
>
> Key: HBASE-6410
> URL: https://issues.apache.org/jira/browse/HBASE-6410
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Blocker
> Attachments: HBASE-6410-13.patch, HBASE-6410-15.patch, 
> HBASE-6410-16.patch, HBASE-6410-1.patch, HBASE-6410-2.patch, 
> HBASE-6410-3.patch, HBASE-6410-4.patch, HBASE-6410-5.patch, 
> HBASE-6410-6.patch, HBASE-6410.patch
>
>
> Move RegionServer Metrics to metrics2

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6410) Move RegionServer Metrics to metrics2

2012-11-01 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-6410:
-

Attachment: HBASE-6410-16.patch

Fix for the test failure.

I changed one of the hadoop shims but forgot to change the other.  That will be 
something to watch out for on metrics changes from now on.

> Move RegionServer Metrics to metrics2
> -
>
> Key: HBASE-6410
> URL: https://issues.apache.org/jira/browse/HBASE-6410
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Blocker
> Attachments: HBASE-6410-13.patch, HBASE-6410-15.patch, 
> HBASE-6410-16.patch, HBASE-6410-1.patch, HBASE-6410-2.patch, 
> HBASE-6410-3.patch, HBASE-6410-4.patch, HBASE-6410-5.patch, 
> HBASE-6410-6.patch, HBASE-6410.patch
>
>
> Move RegionServer Metrics to metrics2



[jira] [Commented] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.

2012-11-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489256#comment-13489256
 ] 

Ted Yu commented on HBASE-2645:
---

Thanks for the verification, Stack. 

Appreciate it. 

> HLog writer can do 1-2 sync operations after lease has been recovered for 
> split process.
> 
>
> Key: HBASE-2645
> URL: https://issues.apache.org/jira/browse/HBASE-2645
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 0.90.4
>Reporter: Cosmin Lehene
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 2645.txt, 2645v2.txt, 2645v3.txt, 
> org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit-output.txt
>
>
> TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing. 
> This test starts a thread that writes one edit to the log, syncs and counts. 
> During this, a HLog.splitLog operation is started. splitLog recovers the log 
> lease before reading the log, so that the original regionserver cannot 
> wake up and write after the split process has started.  
> The test compares the number of edits reported by the split process and by 
> the writer thread. The writer thread (called zombie in the test) should report 
> no more edits than splitLog does (sync() might throw after the last edit gets 
> written, in which case the edit won't get counted by the zombie thread). 
> However, it appears that the zombie counts 1-2 more edits, so it looks like 
> it can sync without a lease.
> This might be an hdfs-0.20 related issue. 
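
The invariant described above can be modelled in a few lines of Java. This is only a sketch under stated assumptions: LeaseModel, appendAndSync, recoverLeaseAndSplit and race are hypothetical names, not HBase or HDFS APIs, and the lease is reduced to a boolean flag guarded by a lock. With a lease that is enforced this strictly, the zombie can never count more edits than the splitter sees; the report above says the real system lets 1-2 extra syncs through.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a log whose write lease can be taken away.
final class LeaseModel {
    private final List<String> edits = new ArrayList<>();
    private boolean leaseHeld = true;

    // Writer side: append one edit and "sync"; fails once the lease is gone.
    synchronized void appendAndSync(String edit) throws IOException {
        if (!leaseHeld) {
            throw new IOException("lease recovered by splitter");
        }
        edits.add(edit);
    }

    // Splitter side: revoke the lease first, then count what is in the log.
    synchronized int recoverLeaseAndSplit() {
        leaseHeld = false;
        return edits.size();
    }

    // Runs one zombie writer against one split; returns {zombieCount, splitCount}.
    static int[] race() {
        LeaseModel log = new LeaseModel();
        AtomicInteger zombieCount = new AtomicInteger();
        Thread zombie = new Thread(() -> {
            try {
                while (true) {
                    log.appendAndSync("edit");
                    zombieCount.incrementAndGet(); // counted only after a successful sync
                }
            } catch (IOException expected) {
                // Lease lost: the zombie stops writing.
            }
        });
        zombie.setDaemon(true);
        zombie.start();
        int splitCount;
        try {
            Thread.sleep(5); // give the zombie time to write some edits
            splitCount = log.recoverLeaseAndSplit();
            zombie.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] {zombieCount.get(), splitCount};
    }

    public static void main(String[] args) {
        int[] counts = race();
        System.out.println("zombie=" + counts[0] + " split=" + counts[1]);
    }
}
```

In this model the check "zombie <= split" always holds, because the lease check and the append happen under the same lock. A failure of the real test suggests that the check and the write are not atomic with respect to lease recovery somewhere in the stack.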



[jira] [Commented] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.

2012-11-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489252#comment-13489252
 ] 

stack commented on HBASE-2645:
--

It fails for me too locally if I keep running it.  If I do the below, it 
usually passes but it can also fail:

{code}
$ for i in `seq 0 9`; do
    echo "here=$i"
    mvn test -Dtest=TestHLogSplit#testLogCannotBeWrittenOnceParsed -PlocalTests
    mv hbase-server/target/surefire-reports hbase-server/target/$i
  done &> /tmp/test.txt &
{code}

I tried w/ 1.0.4 and it also fails.

Will revert for now.  If a bug in HDFS, will file an issue.

> HLog writer can do 1-2 sync operations after lease has been recovered for 
> split process.
> 
>
> Key: HBASE-2645
> URL: https://issues.apache.org/jira/browse/HBASE-2645
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 0.90.4
>Reporter: Cosmin Lehene
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 2645.txt, 2645v2.txt, 2645v3.txt, 
> org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit-output.txt
>
>
> TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing. 
> This test starts a thread that writes one edit to the log, syncs and counts. 
> During this, a HLog.splitLog operation is started. splitLog recovers the log 
> lease before reading the log, so that the original regionserver cannot 
> wake up and write after the split process has started.  
> The test compares the number of edits reported by the split process and by 
> the writer thread. The writer thread (called zombie in the test) should report 
> no more edits than splitLog does (sync() might throw after the last edit gets 
> written, in which case the edit won't get counted by the zombie thread). 
> However, it appears that the zombie counts 1-2 more edits, so it looks like 
> it can sync without a lease.
> This might be an hdfs-0.20 related issue. 



[jira] [Commented] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter

2012-11-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489247#comment-13489247
 ] 

stack commented on HBASE-7083:
--

bq. As to forcing reassign, it is needed since the region state in memory may 
show the missing daughter is open.

[~jxiang] When would the above case be true?  It was not in .META., so its edit 
failed?  How then could it be open in the master, given it's a new region?

> SSH#fixupDaughter should force re-assign missing daughter
> -
>
> Key: HBASE-7083
> URL: https://issues.apache.org/jira/browse/HBASE-7083
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: trunk-7083.patch, trunk-7083.patch
>
>
> In looking into flaky test 
> TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a 
> missing daughter is not assigned by SSH properly.  It could be open on the 
> dead server.  We need to force re-assign it.



[jira] [Commented] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter

2012-11-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489248#comment-13489248
 ] 

stack commented on HBASE-7083:
--

Also, if open already in master, could it be open on a regionserver?  So this 
assign will mean double assignment?

> SSH#fixupDaughter should force re-assign missing daughter
> -
>
> Key: HBASE-7083
> URL: https://issues.apache.org/jira/browse/HBASE-7083
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: trunk-7083.patch, trunk-7083.patch
>
>
> In looking into flaky test 
> TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a 
> missing daughter is not assigned by SSH properly.  It could be open on the 
> dead server.  We need to force re-assign it.



[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2

2012-11-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489245#comment-13489245
 ] 

stack commented on HBASE-6410:
--

The above failure looks like it's yours, Elliott.  If you fix it and check that 
my comments on your last post to RB are just RB messing up your patch, I'd be 
+1 on committing and then addressing teething issues in new JIRAs.

Anyone else want to have a look at this?  I think it should go in sooner 
rather than later, since it covers a lot of ground and so will rot quickly.

> Move RegionServer Metrics to metrics2
> -
>
> Key: HBASE-6410
> URL: https://issues.apache.org/jira/browse/HBASE-6410
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Blocker
> Attachments: HBASE-6410-13.patch, HBASE-6410-15.patch, 
> HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, 
> HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410-6.patch, HBASE-6410.patch
>
>
> Move RegionServer Metrics to metrics2



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489244#comment-13489244
 ] 

Hudson commented on HBASE-6925:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #247 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/247/])
HBASE-6925 Change socket write size from 8K to 64K for HBaseServer 
(Karthik) (Revision 1404777)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java


> Change socket write size from 8K to 64K for HBaseServer
> ---
>
> Key: HBASE-6925
> URL: https://issues.apache.org/jira/browse/HBASE-6925
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Karthik Ranganathan
>Assignee: Karthik Ranganathan
>Priority: Critical
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-6925.patch
>
>
> Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
> from 8K to 64K in HBaseServer. This seems to increase scan throughput.
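
To illustrate why the size of such a limit matters, here is a minimal Java sketch of a chunked channel write. It assumes that the limit caps how many bytes are handed to the channel per write call; only the name NIO_BUFFER_LIMIT comes from the issue text. ChunkedWriteSketch, chunkedWrite and writeCalls are hypothetical names, and this is not the HBaseServer code itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

final class ChunkedWriteSketch {
    // Value proposed in the issue; the constant here is illustrative only.
    static final int NIO_BUFFER_LIMIT = 64 * 1024;

    // Writes buf to the channel, handing over at most chunkLimit bytes per call.
    // Returns the number of write calls that were issued.
    static int chunkedWrite(WritableByteChannel channel, ByteBuffer buf, int chunkLimit)
            throws IOException {
        final int end = buf.limit();
        int calls = 0;
        try {
            while (buf.position() < end) {
                // Temporarily shrink the limit so only one chunk is visible.
                buf.limit(Math.min(buf.position() + chunkLimit, end));
                channel.write(buf);
                calls++;
            }
        } finally {
            buf.limit(end); // always restore the caller's limit
        }
        return calls;
    }

    // Helper: writes payloadBytes zero bytes to an in-memory sink and reports the call count.
    static int writeCalls(int payloadBytes, int chunkLimit) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (WritableByteChannel channel = Channels.newChannel(sink)) {
            int calls = chunkedWrite(channel, ByteBuffer.allocate(payloadBytes), chunkLimit);
            if (sink.size() != payloadBytes) {
                throw new IllegalStateException("short write");
            }
            return calls;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int payload = 256 * 1024;
        System.out.println("8K limit:  " + writeCalls(payload, 8 * 1024) + " write calls");
        System.out.println("64K limit: " + writeCalls(payload, NIO_BUFFER_LIMIT) + " write calls");
    }
}
```

Against a blocking in-memory channel, a 256K payload takes 32 write calls at 8K and 4 at 64K. Fewer calls per response is one plausible reason a larger limit helps scans, though the issue itself reports only the observed throughput gain.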



[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489243#comment-13489243
 ] 

Hudson commented on HBASE-6060:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #247 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/247/])
HBASE-6060 Regions's in OPENING state from failed regionservers takes a 
long time to recover (Revision 1404759)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionState.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java


> Regions's in OPENING state from failed regionservers takes a long time to 
> recover
> -
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Reporter: Enis Soztutar
>Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 
> 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 
> 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 
> 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 
> 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, 
> HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, 
> HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, 
> trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch
>
>
> We have seen a pattern in tests where regions are stuck in the OPENING state 
> for a very long time when the region server that is opening the region fails. 
> My understanding of the process: 
>  
>  - The master calls the rs to open the region. If the rs is offline, a new 
> plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN 
> (only in master memory; zk still shows OFFLINE). See 
> HRegionServer.openRegion(), HMaster.assign()
>  - The RegionServer starts opening the region and changes the state in the 
> znode. But that znode is not ephemeral. (see ZkAssign)
>  - The rs transitions the zk node from OFFLINE to OPENING. See 
> OpenRegionHandler.process()
>  - The rs then opens the region and changes the znode from OPENING to OPENED
>  - When the rs is killed between the OPENING and OPENED states, zk shows the 
> OPENING state, and the master just waits for the rs to change the region 
> state, but since the rs is down, that won't happen. 
>  - There is an AssignmentManager.TimeoutMonitor, which guards against exactly 
> this kind of condition. It periodically checks (every 10 sec by 
> default) the regions in transition to see whether they have timed out 
> (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 
> min, which explains what you and I are seeing. 
>  - ServerShutdownHandler in the Master does not reassign regions in the 
> OPENING state, although it handles other states. 
> Lowering that threshold in the configuration is one option, but I still 
> think we can do better. 
> Will investigate more. 
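
The state transitions listed above, and the gap in server-shutdown handling, can be sketched as a small Java model. All names here (OpeningRecoverySketch, onServerDead and so on) are hypothetical and chosen for illustration; this is one possible shape for "reassign in-flight regions when their server dies", not the code from the attached patches.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class OpeningRecoverySketch {
    // The states named in the description above.
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

    // What the master remembers about one region (illustrative only).
    private static final class Info {
        State state;
        String server;

        Info(State state, String server) {
            this.state = state;
            this.server = server;
        }
    }

    private final Map<String, Info> regions = new LinkedHashMap<>();

    void set(String region, State state, String server) {
        regions.put(region, new Info(state, server));
    }

    State stateOf(String region) {
        return regions.get(region).state;
    }

    // Called when a region server is known to be dead. Regions that were still
    // being opened on it are reset to OFFLINE and returned for reassignment,
    // instead of waiting for a timeout monitor to notice them. Regions already
    // OPENED are outside the scope of this sketch.
    List<String> onServerDead(String deadServer) {
        List<String> toReassign = new ArrayList<>();
        for (Map.Entry<String, Info> e : regions.entrySet()) {
            Info r = e.getValue();
            boolean inFlight = r.state == State.PENDING_OPEN || r.state == State.OPENING;
            if (inFlight && deadServer.equals(r.server)) {
                r.state = State.OFFLINE;
                r.server = null;
                toReassign.add(e.getKey());
            }
        }
        return toReassign;
    }

    public static void main(String[] args) {
        OpeningRecoverySketch master = new OpeningRecoverySketch();
        master.set("r1", State.OPENING, "rs1");
        master.set("r2", State.OPENED, "rs2");
        System.out.println("reassign after rs1 dies: " + master.onServerDead("rs1"));
    }
}
```

The point of the sketch is the trigger: recovery is driven by the server-death event rather than by the 30 minute timeout mentioned in the description.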



[jira] [Commented] (HBASE-7077) Test for: CheckAndPut should properly read MVCC

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489242#comment-13489242
 ] 

Hudson commented on HBASE-7077:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #247 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/247/])
HBASE-7077 ADDENDUM, add TestCategory (Revision 1404640)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHBase7051.java


> Test for: CheckAndPut should properly read MVCC
> ---
>
> Key: HBASE-7077
> URL: https://issues.apache.org/jira/browse/HBASE-7077
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Gregory Chanan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-7071.patch, HBASE-7071_testcasewithassert.patch
>
>
> checkAndPut should integrate with MVCC, similar to how HBASE-4583 fixed 
> appends and increments.
> Also need a test, here's one we could use (originally proposed in HBASE-7051):
> The current value of some cell is 10.
> I issue two concurrent requests:
> A) a check and put where check value = 10, put value = 11
> B) a put where put value = 50
> The only result at the end of these operations that seems reasonable to me is 
> the value of the cell being 50. If A occurred first (ACID wise), then our 
> values go 10->11->50. If B occurred first, then our values go 10->50 (and the 
> checkAndPut fails)
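
The two serial orders described above can be checked with a tiny Java model. This is an analogy only: the cell is modelled as a java.util.concurrent AtomicLong and checkAndPut as compareAndSet, which is an assumption made for illustration and says nothing about how HBase implements MVCC. CheckAndPutOrders and run are hypothetical names.

```java
import java.util.concurrent.atomic.AtomicLong;

final class CheckAndPutOrders {
    // Applies A (check 10, put 11) and B (put 50) in the given serial order.
    // Returns {finalValue, 1 if the check-and-put succeeded else 0}.
    static long[] run(boolean checkAndPutFirst) {
        AtomicLong cell = new AtomicLong(10);
        boolean casOk;
        if (checkAndPutFirst) {
            casOk = cell.compareAndSet(10, 11); // A: 10 -> 11
            cell.set(50);                       // B: 11 -> 50
        } else {
            cell.set(50);                       // B: 10 -> 50
            casOk = cell.compareAndSet(10, 11); // A: check fails, value stays 50
        }
        return new long[] {cell.get(), casOk ? 1 : 0};
    }

    public static void main(String[] args) {
        System.out.println("A then B: " + java.util.Arrays.toString(run(true)));
        System.out.println("B then A: " + java.util.Arrays.toString(run(false)));
    }
}
```

Both orders end with the cell at 50, so a final value of 11 can only come from an interleaving that is not equivalent to either serial order. That is the outcome the proposed test is meant to catch.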



[jira] [Commented] (HBASE-7082) TestHFileCleaner#testHFileCleaning fails due to cleaner is reset

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489241#comment-13489241
 ] 

Hudson commented on HBASE-7082:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #247 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/247/])
HBASE-7082 TestHFileCleaner#testHFileCleaning fails due to cleaner is reset 
(Revision 1404722)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestHFileCleaner.java


> TestHFileCleaner#testHFileCleaning fails due to cleaner is reset
> 
>
> Key: HBASE-7082
> URL: https://issues.apache.org/jira/browse/HBASE-7082
> Project: HBase
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Trivial
> Fix For: 0.96.0
>
> Attachments: trunk-7082.patch
>
>
> TestHFileCleaner#testHFileCleaning fails if it runs after 
> testRemovesEmptyDirectories which resets the cleaner.



[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter

2012-11-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7083:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Integrated into trunk.  Thanks Stack and Ted for the review.
The failed unit tests are flaky; they seem to be fine locally.

> SSH#fixupDaughter should force re-assign missing daughter
> -
>
> Key: HBASE-7083
> URL: https://issues.apache.org/jira/browse/HBASE-7083
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: trunk-7083.patch, trunk-7083.patch
>
>
> In looking into flaky test 
> TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a 
> missing daughter is not assigned by SSH properly.  It could be open on the 
> dead server.  We need to force re-assign it.



[jira] [Commented] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk

2012-11-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489238#comment-13489238
 ] 

stack commented on HBASE-7055:
--

The last posted patch is what was put on RB Sergey?  It does not include the 
rename of CompactionManager?  You want to add that?  I'd be +1 on committing 
after the rename.  You want to attach the patch to a different issue since this 
issue is about forward porting the 89fb tier compaction?

(Liyin and a few of the lads were by today and said they have a doc. that talks 
about how this new compaction works -- its benefits, etc. -- which they will 
post... IIRC, I think he said it is not yet in production).

> port HBASE-6371 tier-based compaction from 0.89-fb to trunk
> ---
>
> Key: HBASE-7055
> URL: https://issues.apache.org/jira/browse/HBASE-7055
> Project: HBase
>  Issue Type: Task
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
> HBASE-6371-v3-refactor-only-squashed.patch, 
> HBASE-6371-v4-refactor-only-squashed.patch
>
>
> There's divergence in the code :(
> See HBASE-6371 for details.



[jira] [Comment Edited] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489221#comment-13489221
 ] 

Ted Yu edited comment on HBASE-6796 at 11/2/12 3:37 AM:


I tried to reproduce the test failure in TestHFileCleaner#testHFileCleaning but 
was not successful.
I looked at the test output:
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.94/566/testReport/org.apache.hadoop.hbase.master.cleaner/TestHFileCleaner/testHFileCleaning/
{code}
2012-11-02 01:29:55,534 INFO  [pool-1-thread-1] hbase.ResourceChecker(145): before master.cleaner.TestHFileCleaner#testTTLCleaner: 43 threads, 127 file descriptors 0 connections, 
2012-11-02 01:29:55,898 DEBUG [pool-1-thread-1] cleaner.TimeToLiveHFileCleaner(68): Life:108, ttl:100, current:1351819795897, from: 1351819795789
2012-11-02 01:29:55,900 INFO  [pool-1-thread-1] hbase.ResourceChecker(145): after master.cleaner.TestHFileCleaner#testTTLCleaner: 44 threads (was 43), 127 file descriptors (was 127). 0 connections,  -thread leak?- 
{code}
We can see that one thread didn't exit.

Still, it is not very obvious which thread hung :-(

  was (Author: yuzhih...@gmail.com):
I tried to reproduce the test failure in TestHFileCleaner#testHFileCleaning 
but was not successful.
I looked at the test output:
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.94/566/testReport/org.apache.hadoop.hbase.master.cleaner/TestHFileCleaner/testHFileCleaning/
{code}
2012-11-02 01:29:57,917 INFO  [pool-1-thread-1] hbase.ResourceChecker(145): before master.cleaner.TestHFileCleaner#testRemovesEmptyDirectories: 45 threads, 127 file descriptors 0 connections, 
...
2012-11-02 01:29:58,730 INFO  [pool-1-thread-1] hbase.ResourceChecker(145): after master.cleaner.TestHFileCleaner#testRemovesEmptyDirectories: 44 threads (was 45), 127 file descriptors (was 127). 0 connections, 
{code}
We can see that one thread didn't exit.
I checked test output on MacBook. The thread count in both occasions was the 
same.

Still, it is not very obvious which thread hung :-(
  
> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489221#comment-13489221
 ] 

Ted Yu commented on HBASE-6796:
---

I tried to reproduce the test failure in TestHFileCleaner#testHFileCleaning but 
was not successful.
I looked at the test output:
https://builds.apache.org/view/G-L/view/HBase/job/HBase-0.94/566/testReport/org.apache.hadoop.hbase.master.cleaner/TestHFileCleaner/testHFileCleaning/
{code}
2012-11-02 01:29:57,917 INFO  [pool-1-thread-1] hbase.ResourceChecker(145): before master.cleaner.TestHFileCleaner#testRemovesEmptyDirectories: 45 threads, 127 file descriptors 0 connections, 
...
2012-11-02 01:29:58,730 INFO  [pool-1-thread-1] hbase.ResourceChecker(145): after master.cleaner.TestHFileCleaner#testRemovesEmptyDirectories: 44 threads (was 45), 127 file descriptors (was 127). 0 connections, 
{code}
We can see that one thread didn't exit.
I checked the test output on a MacBook. The thread count on both occasions was 
the same.

Still, it is not very obvious which thread hung :-(
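
One way to narrow down which thread is involved is to record thread names, not just counts, before and after a test and diff the two sets. The Java sketch below does this with Thread.getAllStackTraces(), a standard JDK call. ThreadLeakDiff and its methods are hypothetical helpers written for illustration; they are not part of hbase.ResourceChecker.

```java
import java.util.Set;
import java.util.TreeSet;

final class ThreadLeakDiff {
    // Names of all threads that are alive right now.
    static Set<String> liveThreadNames() {
        Set<String> names = new TreeSet<>();
        for (Thread t : Thread.getAllStackTraces().keySet()) {
            names.add(t.getName());
        }
        return names;
    }

    // Threads present in the "after" snapshot but not in the "before" one.
    static Set<String> leaked(Set<String> before, Set<String> after) {
        Set<String> diff = new TreeSet<>(after);
        diff.removeAll(before);
        return diff;
    }

    // Demo helper: starts a daemon thread that just sleeps, to stand in for a leak.
    static Thread startParkedThread(String name) {
        Thread t = new Thread(() -> {
            try {
                Thread.sleep(60_000);
            } catch (InterruptedException ignored) {
                // Interrupted by the demo: exit quietly.
            }
        }, name);
        t.setDaemon(true);
        t.start();
        return t;
    }

    public static void main(String[] args) {
        Set<String> before = liveThreadNames();
        Thread leak = startParkedThread("leaky-cleaner-thread");
        Set<String> after = liveThreadNames();
        System.out.println("new threads: " + leaked(before, after));
        leak.interrupt();
    }
}
```

The diff may also pick up unrelated JVM threads that happened to start between the two snapshots, so it narrows the search rather than proving a leak.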

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2

2012-11-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489208#comment-13489208
 ] 

Hadoop QA commented on HBASE-6410:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12551809/HBASE-6410-15.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 164 
new or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
87 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 7 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.regionserver.TestMetricsRegionServer

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3212//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3212//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3212//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3212//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3212//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3212//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3212//console

This message is automatically generated.

> Move RegionServer Metrics to metrics2
> -
>
> Key: HBASE-6410
> URL: https://issues.apache.org/jira/browse/HBASE-6410
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Blocker
> Attachments: HBASE-6410-13.patch, HBASE-6410-15.patch, 
> HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, 
> HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410-6.patch, HBASE-6410.patch
>
>
> Move RegionServer Metrics to metrics2



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489200#comment-13489200
 ] 

Hudson commented on HBASE-6925:
---

Integrated in HBase-0.94 #566 (See 
[https://builds.apache.org/job/HBase-0.94/566/])
HBASE-6925 Change socket write size from 8K to 64K for HBaseServer 
(Karthik) (Revision 1404776)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java


> Change socket write size from 8K to 64K for HBaseServer
> ---
>
> Key: HBASE-6925
> URL: https://issues.apache.org/jira/browse/HBASE-6925
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Karthik Ranganathan
>Assignee: Karthik Ranganathan
>Priority: Critical
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-6925.patch
>
>
> Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
> from 8K to 64K in HBaseServer. This seems to increase scan throughput.
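The effect of the limit described above can be pictured with a small sketch. It assumes the limit caps how many bytes are handed to a single channel write; the class and method names below are hypothetical and this is not the HBaseServer code. With a larger limit, the same response needs fewer write calls.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

// Hypothetical illustration of writing a buffer in bounded chunks.
class ChunkedWriteSketch {
    // Writes all remaining bytes of buf to ch, offering at most chunkLimit
    // bytes per write() call. Returns the number of write() calls issued.
    static int writeInChunks(WritableByteChannel ch, ByteBuffer buf, int chunkLimit) {
        int calls = 0;
        int originalLimit = buf.limit();
        try {
            while (buf.position() < originalLimit) {
                int chunk = Math.min(originalLimit - buf.position(), chunkLimit);
                // Temporarily shrink the limit so only one chunk is visible.
                buf.limit(buf.position() + chunk);
                // A channel may accept fewer bytes than offered; the loop
                // simply continues from the new position.
                ch.write(buf);
                calls++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            // Restore the caller's view of the buffer.
            buf.limit(originalLimit);
        }
        return calls;
    }
}
```

For a 200K buffer written to a channel that accepts every byte offered, an 8K limit results in 25 write calls and a 64K limit in 4.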



[jira] [Updated] (HBASE-6410) Move RegionServer Metrics to metrics2

2012-11-01 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-6410:
-

Attachment: HBASE-6410-15.patch

Added the comments that Stack asked for.
Added more tests that hit the region server.

> Move RegionServer Metrics to metrics2
> -
>
> Key: HBASE-6410
> URL: https://issues.apache.org/jira/browse/HBASE-6410
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Blocker
> Attachments: HBASE-6410-13.patch, HBASE-6410-15.patch, 
> HBASE-6410-1.patch, HBASE-6410-2.patch, HBASE-6410-3.patch, 
> HBASE-6410-4.patch, HBASE-6410-5.patch, HBASE-6410-6.patch, HBASE-6410.patch
>
>
> Move RegionServer Metrics to metrics2



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489176#comment-13489176
 ] 

Cheng Hao commented on HBASE-6852:
--

Thanks Lars and Ted. I will try to reproduce the failure locally first, and 
then check whether there is a logic bug in the schema metrics flushing.

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
> with all of its fields
> 
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.94.0
>Reporter: Cheng Hao
>Assignee: Cheng Hao
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.4
>
> Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
> onhitcache-trunk.patch
>
>
> SchemaMetrics.updateOnCacheHit costs too much during a full table scan.
> Here are the top 5 hotspots within the regionserver while full-scanning a table:
> (Sorry for the poor formatting)
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
> mask of 0x00 (No unit mask) count 500
> samples  %        image name   symbol name
> ---
> 98447 13.4324  14033.jo void 
> org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory,
>  boolean)
>   98447 100.000  14033.jo void 
> org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory,
>  boolean) [self]
> ---
> 45814 6.2510  14033.jo int 
> org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, 
> byte[], int, int)
>   45814 100.000  14033.jo int 
> org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, 
> byte[], int, int) [self]
> ---
> 43523 5.9384  14033.jo boolean 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523 100.000  14033.jo boolean 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>  [self]
> ---
> 42548 5.8054  14033.jo int 
> org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, 
> byte[], int, int)
>   42548 100.000  14033.jo int 
> org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, 
> byte[], int, int) [self]
> ---
> 40572 5.5358  14033.jo int 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[],
>  int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572 100.000  14033.jo int 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[],
>  int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]



[jira] [Updated] (HBASE-7055) port HBASE-6371 tier-based compaction from 0.89-fb to trunk

2012-11-01 Thread Sergey Shelukhin (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-7055:


Attachment: HBASE-6371-v4-refactor-only-squashed.patch

> port HBASE-6371 tier-based compaction from 0.89-fb to trunk
> ---
>
> Key: HBASE-7055
> URL: https://issues.apache.org/jira/browse/HBASE-7055
> Project: HBase
>  Issue Type: Task
>  Components: Compaction
>Affects Versions: 0.96.0
>Reporter: Sergey Shelukhin
>Assignee: Sergey Shelukhin
> Fix For: 0.96.0
>
> Attachments: HBASE-6371-squashed.patch, HBASE-6371-v2-squashed.patch, 
> HBASE-6371-v3-refactor-only-squashed.patch, 
> HBASE-6371-v4-refactor-only-squashed.patch
>
>
> There's divergence in the code :(
> See HBASE-6371 for details.



[jira] [Commented] (HBASE-6305) TestLocalHBaseCluster hangs with hadoop 2.0/0.23 builds.

2012-11-01 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489152#comment-13489152
 ] 

Jimmy Xiang commented on HBASE-6305:


How about where this conf is used?

> TestLocalHBaseCluster hangs with hadoop 2.0/0.23 builds.
> 
>
> Key: HBASE-6305
> URL: https://issues.apache.org/jira/browse/HBASE-6305
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.92.2, 0.94.1
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.94.3
>
> Attachments: hbase-6305-94.patch, HBASE-6305-94-v2.patch, 
> HBASE-6305-94-v2.patch, HBASE-6305-v1.patch
>
>
> trunk: mvn clean test -Dhadoop.profile=2.0 -Dtest=TestLocalHBaseCluster
> 0.94: mvn clean test -Dhadoop.profile=23 -Dtest=TestLocalHBaseCluster
> {code}
> testLocalHBaseCluster(org.apache.hadoop.hbase.TestLocalHBaseCluster)  Time 
> elapsed: 0.022 sec  <<< ERROR!
> java.lang.RuntimeException: Master not initialized after 200 seconds
> at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:208)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:424)
> at 
> org.apache.hadoop.hbase.TestLocalHBaseCluster.testLocalHBaseCluster(TestLocalHBaseCluster.java:66)
> ...
> {code}



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489148#comment-13489148
 ] 

Hudson commented on HBASE-6925:
---

Integrated in HBase-TRUNK #3507 (See 
[https://builds.apache.org/job/HBase-TRUNK/3507/])
HBASE-6925 Change socket write size from 8K to 64K for HBaseServer 
(Karthik) (Revision 1404777)

 Result = SUCCESS
larsh : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/HBaseServer.java


> Change socket write size from 8K to 64K for HBaseServer
> ---
>
> Key: HBASE-6925
> URL: https://issues.apache.org/jira/browse/HBASE-6925
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Karthik Ranganathan
>Assignee: Karthik Ranganathan
>Priority: Critical
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-6925.patch
>
>
> Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
> from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Commented] (HBASE-4583) Integrate RWCC with Append and Increment operations

2012-11-01 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489136#comment-13489136
 ] 

Jean-Daniel Cryans commented on HBASE-4583:
---

bq. You guys used this more extensively at SU than anybody else, I think. Would 
you stick to your statement that the upsert complexity and breaking of the 
general HBase VERSIONS contract is warranted for the performance gain?

We're kind of dependent on the current performance now; unless we start double 
counting or missing increments, I'd say the status quo is preferred. It's still 
a bit unclear to me what the implications of not having this jira are, and 
also what strict and non-strict mean.

Let me phrase it like this then: I think everyone can agree that we don't want 
data loss (e.g. no missed increments), so taking a performance hit is likely 
necessary. If it's only a case where a client reading a counter would miss the 
most recent values, then that's something at least I can live with.
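The distinction drawn above, between losing an increment and a reader briefly seeing an older value, can be illustrated with a toy counter. This is only a sketch under assumed names (CounterReadPointSketch, advanceReadPoint); it is not how HBase's RWCC is implemented. Every increment is applied, but readers only see the value as of the last published read point.

```java
// Hypothetical illustration: increments are never lost, reads may lag.
class CounterReadPointSketch {
    private long latest;   // includes every applied increment
    private long visible;  // value that readers are currently allowed to see

    // Applies the increment immediately and returns the new total.
    synchronized long increment(long delta) {
        latest += delta;
        return latest;
    }

    // Publishes all increments applied so far to readers.
    synchronized void advanceReadPoint() {
        visible = latest;
    }

    // May return a stale value, but never a value that skips an increment
    // once the read point has advanced past it.
    synchronized long read() {
        return visible;
    }
}
```

A reader that calls read() before advanceReadPoint() misses the most recent increments, yet the total is still correct once the read point moves forward.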

> Integrate RWCC with Append and Increment operations
> ---
>
> Key: HBASE-4583
> URL: https://issues.apache.org/jira/browse/HBASE-4583
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.96.0
>
> Attachments: 4583-trunk-less-radical.txt, 
> 4583-trunk-less-radical-v2.txt, 4583-trunk-less-radical-v3.txt, 
> 4583-trunk-less-radical-v4.txt, 4583-trunk-less-radical-v5.txt, 
> 4583-trunk-less-radical-v6.txt, 4583-trunk-radical.txt, 
> 4583-trunk-radical_v2.txt, 4583-trunk-v3.txt, 4583.txt, 4583-v2.txt, 
> 4583-v3.txt, 4583-v4.txt
>
>
> Currently Increment and Append operations do not work with RWCC and hence a 
> client could see the results of multiple such operations mixed in the same 
> Get/Scan.
> The semantics might be a bit more interesting here as upsert adds and removes 
> to and from the memstore.



[jira] [Commented] (HBASE-4583) Integrate RWCC with Append and Increment operations

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489119#comment-13489119
 ] 

Lars Hofhansl commented on HBASE-4583:
--

You guys used this more extensively at SU than anybody else, I think. Would you 
stick to your statement that the upsert complexity and breaking of the general 
HBase VERSIONS contract is warranted for the performance gain?

It would probably be an even bigger problem if there's other load (that fills 
the memstore) that now unnecessarily needs to be flushed because of many 
increments.


> Integrate RWCC with Append and Increment operations
> ---
>
> Key: HBASE-4583
> URL: https://issues.apache.org/jira/browse/HBASE-4583
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.96.0
>
> Attachments: 4583-trunk-less-radical.txt, 
> 4583-trunk-less-radical-v2.txt, 4583-trunk-less-radical-v3.txt, 
> 4583-trunk-less-radical-v4.txt, 4583-trunk-less-radical-v5.txt, 
> 4583-trunk-less-radical-v6.txt, 4583-trunk-radical.txt, 
> 4583-trunk-radical_v2.txt, 4583-trunk-v3.txt, 4583.txt, 4583-v2.txt, 
> 4583-v3.txt, 4583-v4.txt
>
>
> Currently Increment and Append operations do not work with RWCC and hence a 
> client could see the results of multiple such operations mixed in the same 
> Get/Scan.
> The semantics might be a bit more interesting here as upsert adds and removes 
> to and from the memstore.



[jira] [Commented] (HBASE-4583) Integrate RWCC with Append and Increment operations

2012-11-01 Thread Jean-Daniel Cryans (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4583?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489112#comment-13489112
 ] 

Jean-Daniel Cryans commented on HBASE-4583:
---

bq. Jean-Daniel Cryans I was wondering whether you could do your same 
performance test with the "radical" version of this patch (4583-trunk-v3.txt)?

What I saw is it's 10% slower, but also that it was flushing tiny HFiles.

> Integrate RWCC with Append and Increment operations
> ---
>
> Key: HBASE-4583
> URL: https://issues.apache.org/jira/browse/HBASE-4583
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Lars Hofhansl
> Fix For: 0.96.0
>
> Attachments: 4583-trunk-less-radical.txt, 
> 4583-trunk-less-radical-v2.txt, 4583-trunk-less-radical-v3.txt, 
> 4583-trunk-less-radical-v4.txt, 4583-trunk-less-radical-v5.txt, 
> 4583-trunk-less-radical-v6.txt, 4583-trunk-radical.txt, 
> 4583-trunk-radical_v2.txt, 4583-trunk-v3.txt, 4583.txt, 4583-v2.txt, 
> 4583-v3.txt, 4583-v4.txt
>
>
> Currently Increment and Append operations do not work with RWCC and hence a 
> client could see the results of multiple such operations mixed in the same 
> Get/Scan.
> The semantics might be a bit more interesting here as upsert adds and removes 
> to and from the memstore.



[jira] [Commented] (HBASE-6305) TestLocalHBaseCluster hangs with hadoop 2.0/0.23 builds.

2012-11-01 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6305?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489109#comment-13489109
 ] 

Himanshu Vashishtha commented on HBASE-6305:


[~lhofhansl] The reason I didn't touch the HBaseTestingUtility class is that the 
method is used by a number of other methods (36 references to getTestDataDir()). 
I think it is beyond the scope of this jira. Thoughts? 

> TestLocalHBaseCluster hangs with hadoop 2.0/0.23 builds.
> 
>
> Key: HBASE-6305
> URL: https://issues.apache.org/jira/browse/HBASE-6305
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Affects Versions: 0.92.2, 0.94.1
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.94.3
>
> Attachments: hbase-6305-94.patch, HBASE-6305-94-v2.patch, 
> HBASE-6305-94-v2.patch, HBASE-6305-v1.patch
>
>
> trunk: mvn clean test -Dhadoop.profile=2.0 -Dtest=TestLocalHBaseCluster
> 0.94: mvn clean test -Dhadoop.profile=23 -Dtest=TestLocalHBaseCluster
> {code}
> testLocalHBaseCluster(org.apache.hadoop.hbase.TestLocalHBaseCluster)  Time 
> elapsed: 0.022 sec  <<< ERROR!
> java.lang.RuntimeException: Master not initialized after 200 seconds
> at 
> org.apache.hadoop.hbase.util.JVMClusterUtil.startup(JVMClusterUtil.java:208)
> at 
> org.apache.hadoop.hbase.LocalHBaseCluster.startup(LocalHBaseCluster.java:424)
> at 
> org.apache.hadoop.hbase.TestLocalHBaseCluster.testLocalHBaseCluster(TestLocalHBaseCluster.java:66)
> ...
> {code}



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489105#comment-13489105
 ] 

Lars Hofhansl commented on HBASE-6796:
--

That test run didn't look so good.
[~jesse_yates] Mind having a look?

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489094#comment-13489094
 ] 

Hudson commented on HBASE-6796:
---

Integrated in HBase-0.94 #565 (See 
[https://builds.apache.org/job/HBase-0.94/565/])
HBASE-6796 Backport HBASE-5547, Don't delete HFiles in backup mode. (Jesse 
Yates) (Revision 1404762)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/BaseConfigurable.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/Chore.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HConstants.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/backup
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/backup/HFileArchiver.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/LogCleaner.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/LogCleanerDelegate.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/TimeToLiveLogCleaner.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/BaseHFileCleanerDelegate.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/BaseLogCleanerDelegate.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/FileCleanerDelegate.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileCleaner.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/LogCleaner.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/TimeToLiveHFileCleaner.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/TimeToLiveLogCleaner.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/HFileArchiveUtil.java
* /hbase/branches/0.94/src/main/resources/hbase-default.xml
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/backup
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/backup/TestHFileArchiving.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestLogsCleaner.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestCleanerChore.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestHFileCleaner.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestLogsCleaner.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/CheckedArchivingHFileCleaner.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/HFileArchiveTestingUtil.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestHFileArchiveUtil.java


> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-5547) Don't delete HFiles when in "backup mode"

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489093#comment-13489093
 ] 

Hudson commented on HBASE-5547:
---

Integrated in HBase-0.94 #565 (See 
[https://builds.apache.org/job/HBase-0.94/565/])
HBASE-6796 Backport HBASE-5547, Don't delete HFiles in backup mode. (Jesse 
Yates) (Revision 1404762)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/BaseConfigurable.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/Chore.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/HConstants.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/backup
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/backup/HFileArchiver.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/CatalogJanitor.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/LogCleaner.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/LogCleanerDelegate.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/MasterFileSystem.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/TimeToLiveLogCleaner.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/BaseHFileCleanerDelegate.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/BaseLogCleanerDelegate.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/CleanerChore.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/FileCleanerDelegate.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/HFileCleaner.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/LogCleaner.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/TimeToLiveHFileCleaner.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/cleaner/TimeToLiveLogCleaner.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/Store.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/StoreFile.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/master/ReplicationLogCleaner.java
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/FSUtils.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/util/HFileArchiveUtil.java
* /hbase/branches/0.94/src/main/resources/hbase-default.xml
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/HBaseTestingUtility.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/backup
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/backup/TestHFileArchiving.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/TestLogsCleaner.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestCleanerChore.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestHFileCleaner.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestLogsCleaner.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/CheckedArchivingHFileCleaner.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/HFileArchiveTestingUtil.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestFSTableDescriptors.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/util/TestHFileArchiveUtil.java


> Don't delete HFiles when in "backup mode"
> -
>
> Key: HBASE-5547
> URL: https://issues.apache.org/jira/browse/HBASE-5547
> Project: HBase
>  Issue Type: New Feature
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.96.0
>
> Attachments: 5547.addendum-v3, 5547-addendum-v4.txt, 5547-v12.txt, 
> 5547-v16.txt, hbase-5447-v8.patch, hbase-5447-v8.patch, 
> hbase-5547-0.94-backport-v0.patch, hbase-5547-v9.patch, 
> java_HBASE-5547.addendum, java_HBASE-5547.addendum-v1, 
> java_HBASE-5547.addendum-v2, java_HBASE-5547_v13.patch, 
> java_HBASE-5547_v14.patch, java_HBASE-5547_v15.patch, 
> java_HBASE-5547_v4.patch, java_HBASE-5547_v5.patch, java_HBASE-5547_v6.patch, 
> jav

[jira] [Commented] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489092#comment-13489092
 ] 

Hudson commented on HBASE-7017:
---

Integrated in HBase-0.94 #565 (See 
[https://builds.apache.org/job/HBase-0.94/565/])
HBASE-7017 Backport '[replication] The replication-executor should make 
sure the file that it is replicating is closed before declaring success on that 
file' to 0.94 (Devaraj Das) (Revision 1404764)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/Replication.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSourceManager.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/replication/regionserver/TestReplicationSourceManager.java


> Backport "[replication] The replication-executor should make sure the file 
> that it is replicating is closed before declaring success on that file" to 
> 0.94
> --
>
> Key: HBASE-7017
> URL: https://issues.apache.org/jira/browse/HBASE-7017
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.94.3
>
>




[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489080#comment-13489080
 ] 

Hudson commented on HBASE-6060:
---

Integrated in HBase-TRUNK #3506 (See 
[https://builds.apache.org/job/HBase-TRUNK/3506/])
HBASE-6060 Regions's in OPENING state from failed regionservers takes a 
long time to recover (Revision 1404759)

 Result = FAILURE
jxiang : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/AssignmentManager.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionState.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/handler/ServerShutdownHandler.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java


> Regions's in OPENING state from failed regionservers takes a long time to 
> recover
> -
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Reporter: Enis Soztutar
>Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 
> 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 
> 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 
> 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 
> 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, 
> HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, 
> HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, 
> trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch
>
>
> We have seen a pattern in tests where regions are stuck in OPENING state
> for a very long time when the region server that is opening the region fails.
> My understanding of the process:
>
>  - The master calls the rs to open the region. If the rs is offline, a new
> plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN
> (only in master memory, zk still shows OFFLINE). See
> HRegionServer.openRegion(), HMaster.assign()
>  - The RegionServer starts opening the region and changes the state in the
> znode. But that znode is not ephemeral. (see ZkAssign)
>  - The rs transitions the zk node from OFFLINE to OPENING. See
> OpenRegionHandler.process()
>  - The rs then opens the region, and changes the znode from OPENING to OPENED
>  - When the rs is killed between the OPENING and OPENED states, zk shows
> OPENING state, and the master just waits for the rs to change the region
> state, but since the rs is down, that won't happen.
>  - There is an AssignmentManager.TimeoutMonitor, which guards against exactly
> these kinds of conditions. It periodically checks (every 10 sec by default)
> the regions in transition to see whether they have timed out
> (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is
> 30 min, which explains what you and I are seeing.
>  - ServerShutdownHandler in Master does not reassign regions in OPENING
> state, although it handles other states.
> Lowering that threshold in the configuration is one option, but I still
> think we can do better.
> Will investigate more.
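
The gap described in the issue above can be sketched in a few lines of Java. This is a toy model under stated assumptions, not HBase code: the class `RegionTransitionSketch`, its `State` enum, and the `needsReassign` helper are hypothetical names chosen for illustration. Only the state names and the 30 min default timeout are taken from the description.

```java
import java.util.Collections;
import java.util.Set;

/** Toy model of the region-open handshake from the description; not HBase code. */
public class RegionTransitionSketch {

    /** Region states named in the description. */
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

    /** Default timeout quoted in the description (30 min), in milliseconds. */
    static final long DEFAULT_TIMEOUT_MS = 30L * 60 * 1000;

    /**
     * Hypothetical check: a region stuck in OPENING should be reassigned when
     * its hosting server is known to be dead (the fast path the issue asks
     * for) or when the transition has timed out (a TimeoutMonitor-style
     * fallback).
     */
    static boolean needsReassign(State state, String server, Set<String> deadServers,
                                 long millisInTransition, long timeoutMs) {
        if (state != State.OPENING) {
            return false; // other states are out of scope for this sketch
        }
        return deadServers.contains(server) || millisInTransition >= timeoutMs;
    }

    public static void main(String[] args) {
        Set<String> dead = Collections.singleton("rs1");
        // rs1 died 5 seconds into the open: no need to wait for the timeout.
        System.out.println(needsReassign(State.OPENING, "rs1", dead, 5000L, DEFAULT_TIMEOUT_MS));
        // rs2 is alive and still within the timeout: keep waiting.
        System.out.println(needsReassign(State.OPENING, "rs2", dead, 5000L, DEFAULT_TIMEOUT_MS));
    }
}
```

Without the dead-server branch, the first case would only be caught once the timeout elapses, which matches the long recovery time reported here.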

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter

2012-11-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489070#comment-13489070
 ] 

Hadoop QA commented on HBASE-7083:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12551760/trunk-7083.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:red}-1 tests included{color}.  The patch doesn't appear to include 
any new or modified tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
85 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit
  org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3211//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3211//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3211//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3211//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3211//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3211//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3211//console

This message is automatically generated.

> SSH#fixupDaughter should force re-assign missing daughter
> -
>
> Key: HBASE-7083
> URL: https://issues.apache.org/jira/browse/HBASE-7083
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: trunk-7083.patch, trunk-7083.patch
>
>
> In looking into flaky test 
> TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a 
> missing daughter is not assigned by SSH properly.  It could be open on the 
> dead server.  We need to force re-assign it.



[jira] [Updated] (HBASE-3996) Support multiple tables and scanners as input to the mapper in map/reduce jobs

2012-11-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-3996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-3996:
-

Fix Version/s: (was: 0.94.3)
   0.94.4

Oh well... Probably not getting to it.

> Support multiple tables and scanners as input to the mapper in map/reduce jobs
> --
>
> Key: HBASE-3996
> URL: https://issues.apache.org/jira/browse/HBASE-3996
> Project: HBase
>  Issue Type: Improvement
>  Components: mapreduce
>Reporter: Eran Kutner
>Assignee: Lars Hofhansl
> Fix For: 0.96.0, 0.94.4
>
> Attachments: 3996-v2.txt, 3996-v3.txt, 3996-v4.txt, 3996-v5.txt, 
> 3996-v6.txt, 3996-v7.txt, HBase-3996.patch
>
>
> It seems that in many cases feeding data from multiple tables or multiple 
> scanners on a single table can save a lot of time when running map/reduce 
> jobs.
> I propose a new MultiTableInputFormat class that would allow doing this.



[jira] [Resolved] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-6925.
--

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to 0.94 and 0.96.

Thanks Karthik.

> Change socket write size from 8K to 64K for HBaseServer
> ---
>
> Key: HBASE-6925
> URL: https://issues.apache.org/jira/browse/HBASE-6925
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Karthik Ranganathan
>Assignee: Karthik Ranganathan
>Priority: Critical
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-6925.patch
>
>
> Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
> from 8K to 64K in HBaseServer. This seems to increase scan throughput.
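
For context, a limit such as NIO_BUFFER_LIMIT caps how many bytes are handed to a channel in one write. The sketch below shows the general chunking technique with plain java.nio. It is not the HBaseServer code: `ChunkedWriteSketch`, `writeInChunks`, and the in-memory sink are assumptions made for illustration, and only the 8K and 64K values come from the issue.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.WritableByteChannel;

/** Illustrative only: caps the bytes exposed to a channel per chunk. */
public class ChunkedWriteSketch {

    static final int OLD_LIMIT = 8 * 1024;   // 8K, the old value named in the issue
    static final int NEW_LIMIT = 64 * 1024;  // 64K, the new value named in the issue

    /**
     * Writes all remaining bytes of buf to ch, exposing at most chunkLimit
     * bytes at a time. Returns the number of chunks offered to the channel.
     * IOExceptions are wrapped so the sketch can be called without
     * checked-exception plumbing.
     */
    static int writeInChunks(WritableByteChannel ch, ByteBuffer buf, int chunkLimit) {
        int chunks = 0;
        final int originalLimit = buf.limit();
        while (buf.position() < originalLimit) {
            // Narrow the buffer's window to the next chunk.
            buf.limit(Math.min(originalLimit, buf.position() + chunkLimit));
            try {
                while (buf.hasRemaining()) {
                    ch.write(buf);
                }
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } finally {
                buf.limit(originalLimit); // always restore the full window
            }
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        int payloadSize = 256 * 1024;
        for (int limit : new int[] { OLD_LIMIT, NEW_LIMIT }) {
            ByteArrayOutputStream sink = new ByteArrayOutputStream();
            int chunks = writeInChunks(Channels.newChannel(sink),
                    ByteBuffer.allocate(payloadSize), limit);
            System.out.println("limit=" + limit + " chunks=" + chunks
                    + " bytesWritten=" + sink.size());
        }
    }
}
```

A larger limit means fewer chunks for a large response. Whether that translates into higher throughput depends on the workload; the issue only reports that it seemed to help scans.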



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489051#comment-13489051
 ] 

Jesse Yates commented on HBASE-6796:


Thanks Lars!

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489042#comment-13489042
 ] 

Karthik Ranganathan commented on HBASE-6925:


Go for the commit Lars!

> Change socket write size from 8K to 64K for HBaseServer
> ---
>
> Key: HBASE-6925
> URL: https://issues.apache.org/jira/browse/HBASE-6925
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Karthik Ranganathan
>Assignee: Karthik Ranganathan
>Priority: Critical
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-6925.patch
>
>
> Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
> from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Commented] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.

2012-11-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489041#comment-13489041
 ] 

Ted Yu commented on HBASE-2645:
---

Here is information about my environment:
{code}
java version "1.6.0_35"
Java(TM) SE Runtime Environment (build 1.6.0_35-b10-428-11M3811)
Java HotSpot(TM) 64-Bit Server VM (build 20.10-b01-428, mixed mode)

Darwin T32 11.4.2 Darwin Kernel Version 11.4.2: Thu Aug 23 16:25:48 PDT 2012; 
root:xnu-1699.32.7~1/RELEASE_X86_64 x86_64
{code}
Initially I wasn't sure whether the test failure only happened with JDK 1.6.

Later I saw the test failure on Jenkins where JDK 1.7 was used.

> HLog writer can do 1-2 sync operations after lease has been recovered for 
> split process.
> 
>
> Key: HBASE-2645
> URL: https://issues.apache.org/jira/browse/HBASE-2645
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 0.90.4
>Reporter: Cosmin Lehene
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 2645.txt, 2645v2.txt, 2645v3.txt, 
> org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit-output.txt
>
>
> TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing.
> This test starts a thread that writes one edit to the log, syncs and counts.
> During this, an HLog.splitLog operation is started. splitLog recovers the log
> lease before reading the log, so that the original regionserver cannot
> wake up and write after the split process has started.
> The test compares the number of edits reported by the split process and by
> the writer thread. The writer thread (called zombie in the test) should
> report a count <= that of splitLog (sync() might raise after the last edit
> gets written, in which case the edit won't get counted by the zombie thread).
> However, it appears that the zombie counts 1-2 more edits. So it looks like
> it can sync without a lease.
> This might be an hdfs-0.20 related issue.
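
The invariant that the test checks can be illustrated with a small single-threaded model. This is a sketch under assumptions, not the TestHLogSplit code: `LeaseFencingSketch`, `FencedLog`, and their methods are hypothetical, and the "log" is just an in-memory list whose lease can be taken away.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of lease fencing between a log writer and a log splitter. */
public class LeaseFencingSketch {

    /** In-memory stand-in for a log file guarded by a lease. */
    static class FencedLog {
        private final List<String> synced = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();
        private boolean leaseHeldByWriter = true;

        void append(String edit) {
            pending.add(edit);
        }

        /** Makes pending edits durable; fails once the lease has been taken away. */
        void sync() {
            if (!leaseHeldByWriter) {
                throw new IllegalStateException("lease lost");
            }
            synced.addAll(pending);
            pending.clear();
        }

        /** Split side: take the lease first, then count what is durable. */
        int recoverLeaseAndCount() {
            leaseHeldByWriter = false;
            return synced.size();
        }
    }

    /**
     * The writer syncs `before` edits, the split recovers the lease, and the
     * writer then attempts `after` more edits. Returns {writerCount, splitCount}.
     */
    static int[] run(int before, int after) {
        FencedLog log = new FencedLog();
        int writerCount = 0;
        for (int i = 0; i < before; i++) {
            log.append("edit-" + i);
            log.sync();
            writerCount++; // counted only after a successful sync
        }
        int splitCount = log.recoverLeaseAndCount();
        for (int i = 0; i < after; i++) {
            try {
                log.append("late-" + i);
                log.sync();
                writerCount++;
            } catch (IllegalStateException leaseLost) {
                break; // a correctly fenced writer stops here
            }
        }
        return new int[] { writerCount, splitCount };
    }

    public static void main(String[] args) {
        int[] counts = run(5, 2);
        System.out.println("writer=" + counts[0] + " split=" + counts[1]);
    }
}
```

With fencing in place, the writer's count never exceeds the split count. The failure described above corresponds to `sync()` still succeeding once or twice after the lease has been recovered.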



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489033#comment-13489033
 ] 

Hudson commented on HBASE-6852:
---

Integrated in HBase-0.94 #564 (See 
[https://builds.apache.org/job/HBase-0.94/564/])
HBASE-6852, REVERT again, due to unexplained test failures that only occur 
on the jenkins machines (Revision 1404691)

 Result = FAILURE
larsh : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV1.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileReaderV2.java
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/metrics/SchemaMetrics.java
* 
/hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/metrics/TestSchemaMetrics.java


> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
> with all of its fields
> 
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.94.0
>Reporter: Cheng Hao
>Assignee: Cheng Hao
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.4
>
> Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
> onhitcache-trunk.patch
>
>
> SchemaMetrics.updateOnCacheHit costs too much while I am doing a full
> table scan.
> Here are the top 5 hotspots within the regionserver while full scanning a
> table: (Sorry for the poor formatting)
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
> mask of 0x00 (No unit mask) count 500
> samples  %        image name  symbol name
> ---
> 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447  100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> ---
> 45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> ---
> 43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523  100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> ---
> 42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> ---
> 40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572  100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]



[jira] [Commented] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.

2012-11-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489030#comment-13489030
 ] 

stack commented on HBASE-2645:
--

[~ted_yu], your repeat of info in here, info that is public and available to
all, is of no help; it's just spam in our mailboxes.  Neither is your cut-off
paste of a failure on your local machine w/ no context such as what your local
machine is, whether it always fails, etc.  I committed the patch because it
passed locally on repeated runs and passed hadoopqa.

> HLog writer can do 1-2 sync operations after lease has been recovered for 
> split process.
> 
>
> Key: HBASE-2645
> URL: https://issues.apache.org/jira/browse/HBASE-2645
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 0.90.4
>Reporter: Cosmin Lehene
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 2645.txt, 2645v2.txt, 2645v3.txt, 
> org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit-output.txt
>
>
> TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing.
> This test starts a thread that writes one edit to the log, syncs and counts.
> During this, an HLog.splitLog operation is started. splitLog recovers the log
> lease before reading the log, so that the original regionserver cannot
> wake up and write after the split process has started.
> The test compares the number of edits reported by the split process and by
> the writer thread. The writer thread (called zombie in the test) should
> report a count <= that of splitLog (sync() might raise after the last edit
> gets written, in which case the edit won't get counted by the zombie thread).
> However, it appears that the zombie counts 1-2 more edits. So it looks like
> it can sync without a lease.
> This might be an hdfs-0.20 related issue.



[jira] [Updated] (HBASE-2611) Handle RS that fails while processing the failure of another one

2012-11-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-2611:
-

Fix Version/s: (was: 0.94.3)
   0.94.4

Alas, looks like we won't get to this... again.

> Handle RS that fails while processing the failure of another one
> 
>
> Key: HBASE-2611
> URL: https://issues.apache.org/jira/browse/HBASE-2611
> Project: HBase
>  Issue Type: Sub-task
>  Components: Replication
>Reporter: Jean-Daniel Cryans
>Assignee: Chris Trezzo
> Fix For: 0.94.4
>
> Attachments: HBase-2611-upstream-v1.patch
>
>
> HBASE-2223 doesn't manage region servers that fail while doing the transfer 
> of HLogs queues from other region servers that failed. Devise a reliable way 
> to do it.



[jira] [Resolved] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94

2012-11-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-7017.
--

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to 0.94. Thanks for the patch (and backport).

> Backport "[replication] The replication-executor should make sure the file 
> that it is replicating is closed before declaring success on that file" to 
> 0.94
> --
>
> Key: HBASE-7017
> URL: https://issues.apache.org/jira/browse/HBASE-7017
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.94.3
>
>




[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter

2012-11-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7083:
---

Status: Patch Available  (was: Open)

Try hadoop-qa again.

> SSH#fixupDaughter should force re-assign missing daughter
> -
>
> Key: HBASE-7083
> URL: https://issues.apache.org/jira/browse/HBASE-7083
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: trunk-7083.patch, trunk-7083.patch
>
>
> In looking into flaky test 
> TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a 
> missing daughter is not assigned by SSH properly.  It could be open on the 
> dead server.  We need to force re-assign it.



[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter

2012-11-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7083:
---

Attachment: trunk-7083.patch

> SSH#fixupDaughter should force re-assign missing daughter
> -
>
> Key: HBASE-7083
> URL: https://issues.apache.org/jira/browse/HBASE-7083
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: trunk-7083.patch, trunk-7083.patch
>
>
> In looking into flaky test 
> TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a 
> missing daughter is not assigned by SSH properly.  It could be open on the 
> dead server.  We need to force re-assign it.



[jira] [Resolved] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl resolved HBASE-6796.
--

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to 0.94... Pfeeww.

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter

2012-11-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7083:
---

Status: Open  (was: Patch Available)

> SSH#fixupDaughter should force re-assign missing daughter
> -
>
> Key: HBASE-7083
> URL: https://issues.apache.org/jira/browse/HBASE-7083
> Project: HBase
>  Issue Type: Bug
>  Components: Region Assignment
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: trunk-7083.patch, trunk-7083.patch
>
>
> In looking into flaky test 
> TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a 
> missing daughter is not assigned by SSH properly.  It could be open on the 
> dead server.  We need to force re-assign it.



[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-11-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6060:
---

Resolution: Fixed
Status: Resolved  (was: Patch Available)

Integrated into trunk. Thanks all.

> Regions's in OPENING state from failed regionservers takes a long time to 
> recover
> -
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Reporter: Enis Soztutar
>Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 
> 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 
> 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 
> 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 
> 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, 
> HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, 
> HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, 
> trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch
>
>
> We have seen a pattern in tests where regions are stuck in OPENING state
> for a very long time when the region server that is opening the region fails.
> My understanding of the process:
>
>  - The master calls the rs to open the region. If the rs is offline, a new
> plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN
> (only in master memory, zk still shows OFFLINE). See
> HRegionServer.openRegion(), HMaster.assign()
>  - The RegionServer starts opening the region and changes the state in the
> znode. But that znode is not ephemeral. (see ZkAssign)
>  - The rs transitions the zk node from OFFLINE to OPENING. See
> OpenRegionHandler.process()
>  - The rs then opens the region, and changes the znode from OPENING to OPENED
>  - When the rs is killed between the OPENING and OPENED states, zk shows
> OPENING state, and the master just waits for the rs to change the region
> state, but since the rs is down, that won't happen.
>  - There is an AssignmentManager.TimeoutMonitor, which guards against exactly
> these kinds of conditions. It periodically checks (every 10 sec by default)
> the regions in transition to see whether they have timed out
> (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is
> 30 min, which explains what you and I are seeing.
>  - ServerShutdownHandler in Master does not reassign regions in OPENING
> state, although it handles other states.
> Lowering that threshold in the configuration is one option, but I still
> think we can do better.
> Will investigate more.



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489019#comment-13489019
 ] 

Lars Hofhansl commented on HBASE-6796:
--

According to HBASE-4737 Medium should be < 45s. So I'll make it a LargeTest and 
then commit... Finally.

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489017#comment-13489017
 ] 

Lars Hofhansl commented on HBASE-6796:
--

One last question: Should TestHFileArchiving be a LargeTest? It runs for 122s 
on my machine.

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489012#comment-13489012
 ] 

Lars Hofhansl commented on HBASE-6796:
--

Or even better, use a better testdir for that test.

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489011#comment-13489011
 ] 

Lars Hofhansl commented on HBASE-6796:
--

Test works if the datadir is cleaned first (either in an @After method or 
inline in this one test.)
I'll just paste that fix on this patch and commit.
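
Cleaning the data directory before the test could look roughly like the following. This is a sketch only: it uses plain java.nio.file on the local filesystem rather than the HBase test utilities or a JUnit @After hook, and `TestDirCleanerSketch`, `deleteRecursively`, and `demo` are hypothetical names. The actual fix that went into the patch is not shown in this thread.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;

/** Illustrative helper that wipes a local test directory; not the HBase fix. */
public class TestDirCleanerSketch {

    /** Deletes dir and everything below it; does nothing if dir does not exist. */
    static void deleteRecursively(Path dir) {
        if (!Files.exists(dir)) {
            return;
        }
        try {
            Files.walkFileTree(dir, new SimpleFileVisitor<Path>() {
                @Override
                public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                        throws IOException {
                    Files.delete(file);
                    return FileVisitResult.CONTINUE;
                }

                @Override
                public FileVisitResult postVisitDirectory(Path d, IOException exc)
                        throws IOException {
                    if (exc != null) {
                        throw exc;
                    }
                    Files.delete(d); // the directory is empty by now
                    return FileVisitResult.CONTINUE;
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Creates a temp dir with a leftover file, cleans it, and reports whether it still exists. */
    static boolean demo() {
        try {
            Path dataDir = Files.createTempDirectory("testdir-sketch");
            Path archive = Files.createDirectories(dataDir.resolve("archive"));
            Files.createFile(archive.resolve("leftover.hfile"));
            deleteRecursively(dataDir);
            return Files.exists(dataDir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("exists after cleanup: " + demo());
    }
}
```

In a JUnit test the call to `deleteRecursively` would sit in an @After method, or at the top of the one affected test, as suggested in the comment above.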

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489010#comment-13489010
 ] 

Jesse Yates commented on HBASE-6796:


Hmm, I added "testReadingArchiveDirectoryFromFS()". And that method literally 
has nothing to do with the added code. Looks like a flapper already in the 
codebase. 

On a related note, it looks like Jenkins isn't running all the tests for 0.94. 
See 
https://builds.apache.org/job/HBase-0.94/563/testReport/org.apache.hadoop.hbase.util/
 which should have run TestFSTableDescriptors...but shows neither hide nor hair 
of it (or many other expected tests).

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-7082) TestHFileCleaner#testHFileCleaning fails due to cleaner is reset

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489005#comment-13489005
 ] 

Hudson commented on HBASE-7082:
---

Integrated in HBase-TRUNK #3505 (See 
[https://builds.apache.org/job/HBase-TRUNK/3505/])
HBASE-7082 TestHFileCleaner#testHFileCleaning fails due to cleaner is reset 
(Revision 1404722)

 Result = SUCCESS
jxiang : 
Files : 
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/master/cleaner/TestHFileCleaner.java


> TestHFileCleaner#testHFileCleaning fails due to cleaner is reset
> 
>
> Key: HBASE-7082
> URL: https://issues.apache.org/jira/browse/HBASE-7082
> Project: HBase
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Trivial
> Fix For: 0.96.0
>
> Attachments: trunk-7082.patch
>
>
> TestHFileCleaner#testHFileCleaning fails if it runs after 
> testRemovesEmptyDirectories which resets the cleaner.



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489003#comment-13489003
 ] 

Lars Hofhansl commented on HBASE-6796:
--

And of course it only happens sometimes and never in the debugger. Sigh.

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94

2012-11-01 Thread Devaraj Das (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13489000#comment-13489000
 ] 

Devaraj Das commented on HBASE-7017:


[~lhofhansl] - Yes..

> Backport "[replication] The replication-executor should make sure the file 
> that it is replicating is closed before declaring success on that file" to 
> 0.94
> --
>
> Key: HBASE-7017
> URL: https://issues.apache.org/jira/browse/HBASE-7017
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.94.3
>
>




[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488997#comment-13488997
 ] 

stack commented on HBASE-6925:
--

Looks like HBaseClient in 89fb and in trunk or 0.94 are very different. That 
might explain it, Lars (looking at the 89fb client, I can see how brother 
Karthik's change could make such a big difference).

> Change socket write size from 8K to 64K for HBaseServer
> ---
>
> Key: HBASE-6925
> URL: https://issues.apache.org/jira/browse/HBASE-6925
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Karthik Ranganathan
>Assignee: Karthik Ranganathan
>Priority: Critical
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-6925.patch
>
>
> Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
> from 8K to 64K in HBaseServer. This seems to increase scan throughput.
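
For readers unfamiliar with what a per-write limit such as NIO_BUFFER_LIMIT controls, here is a minimal sketch, assuming a blocking channel. The class and method names are hypothetical and the code is not copied from HBaseServer; it only illustrates the general technique of handing a large buffer to a channel in slices of at most a fixed size.

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.WritableByteChannel;

/** Illustrative only: a hypothetical helper, not the HBaseServer implementation. */
class ChunkedWriteSketch {
    /**
     * Writes buf to ch in slices of at most chunkLimit bytes and returns
     * the number of write() calls that were issued.
     */
    static int writeInChunks(WritableByteChannel ch, ByteBuffer buf, int chunkLimit)
            throws IOException {
        if (chunkLimit <= 0) {
            throw new IllegalArgumentException("chunkLimit must be positive");
        }
        int originalLimit = buf.limit();
        int calls = 0;
        while (buf.hasRemaining()) {
            int ioSize = Math.min(buf.remaining(), chunkLimit);
            // Temporarily expose only one slice of the buffer to the channel.
            buf.limit(buf.position() + ioSize);
            try {
                int written = ch.write(buf);
                calls++;
                if (written < ioSize) {
                    // The channel did not accept the whole slice; stop and let the caller retry.
                    break;
                }
            } finally {
                // Restore the full view of the buffer before the next slice.
                buf.limit(originalLimit);
            }
        }
        return calls;
    }
}
```

With a 256 KB buffer and a channel that accepts every slice in full, a limit of 8 KB results in 32 write calls and a limit of 64 KB in 4. Whether fewer, larger writes actually improve throughput depends on the client and network path, which is what the comments on this issue are debating.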



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488995#comment-13488995
 ] 

Lars Hofhansl commented on HBASE-6925:
--

Thanks Karthik, I'm going to commit, unless you want to :)

> Change socket write size from 8K to 64K for HBaseServer
> ---
>
> Key: HBASE-6925
> URL: https://issues.apache.org/jira/browse/HBASE-6925
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Karthik Ranganathan
>Assignee: Karthik Ranganathan
>Priority: Critical
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-6925.patch
>
>
> Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
> from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488993#comment-13488993
 ] 

Lars Hofhansl commented on HBASE-6796:
--

I'm actually getting a failure with TestFSTableDescriptors :(
{code}
testCreateAndUpdate(org.apache.hadoop.hbase.util.TestFSTableDescriptors)  Time elapsed: 0.025 sec  <<< FAILURE!
java.lang.AssertionError: statuses.length=5
    at org.junit.Assert.fail(Assert.java:93)
    at org.junit.Assert.assertTrue(Assert.java:43)
    at org.apache.hadoop.hbase.util.TestFSTableDescriptors.testCreateAndUpdate(TestFSTableDescriptors.java:63)
{code}

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Karthik Ranganathan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488991#comment-13488991
 ] 

Karthik Ranganathan commented on HBASE-6925:


No, I don't think that would matter; this is more about the socket transfer size 
into an underlying buffer.

> Change socket write size from 8K to 64K for HBaseServer
> ---
>
> Key: HBASE-6925
> URL: https://issues.apache.org/jira/browse/HBASE-6925
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Karthik Ranganathan
>Assignee: Karthik Ranganathan
>Priority: Critical
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-6925.patch
>
>
> Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
> from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Commented] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488982#comment-13488982
 ] 

Lars Hofhansl commented on HBASE-7017:
--

I assume this was with 6758-0.94.txt from HBASE-6758?


> Backport "[replication] The replication-executor should make sure the file 
> that it is replicating is closed before declaring success on that file" to 
> 0.94
> --
>
> Key: HBASE-7017
> URL: https://issues.apache.org/jira/browse/HBASE-7017
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.94.3
>
>




[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488973#comment-13488973
 ] 

Lars Hofhansl commented on HBASE-6925:
--

I tried a local test (distributed HBase, but with only a single RegionServer and 
Datanode, all located on the same machine)... I could not discern any change in 
performance (as might be expected from a local-only test). I also changed the 
buffer size in Result.java, but again saw no difference.


> Change socket write size from 8K to 64K for HBaseServer
> ---
>
> Key: HBASE-6925
> URL: https://issues.apache.org/jira/browse/HBASE-6925
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Karthik Ranganathan
>Assignee: Karthik Ranganathan
>Priority: Critical
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-6925.patch
>
>
> Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
> from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Updated] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-6796:
---

Attachment: hbase-6796-v2.patch

Attaching patch on top of current 0.94 branch. Ran 
"mvn clean test -Dtest=TestLogsCleaner,TestHFileArchiving,TestCatalogJanitor,TestCleanerChore,TestHFileCleaner,TestFSTableDescriptors,TestHFileArchiveUtil -P localTests" 
with no test failures.

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch, hbase-6796-v2.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488939#comment-13488939
 ] 

Jesse Yates commented on HBASE-6796:


hmmm, I'll try a rebase and see.

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488936#comment-13488936
 ] 

Lars Hofhansl commented on HBASE-6796:
--

Hmm... Patch does not apply cleanly to 0.94

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488933#comment-13488933
 ] 

Jesse Yates commented on HBASE-6796:


[~lhofhansl] seemed like it would be less work for you this way.

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-11-01 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488930#comment-13488930
 ] 

Hadoop QA commented on HBASE-6060:
--

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12551735/trunk-6060_v3.3.patch
  against trunk revision .

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 3 new 
or modified tests.

{color:green}+1 hadoop2.0{color}.  The patch compiles against the hadoop 
2.0 profile.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
85 warning messages.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:red}-1 findbugs{color}.  The patch appears to introduce 3 new 
Findbugs (version 1.3.9) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3210//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3210//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3210//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3210//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3210//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3210//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/3210//console

This message is automatically generated.

> Regions's in OPENING state from failed regionservers takes a long time to 
> recover
> -
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Reporter: Enis Soztutar
>Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 
> 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 
> 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 
> 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 
> 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, 
> HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, 
> HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, 
> trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch
>
>
> We have seen a pattern in tests where regions are stuck in the OPENING state 
> for a very long time when the region server that is opening the region fails. 
> My understanding of the process: 
>  
>  - The master calls the rs to open the region. If the rs is offline, a new 
> plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN 
> (only in master memory; zk still shows OFFLINE). See 
> HRegionServer.openRegion(), HMaster.assign().
>  - The RegionServer starts opening the region and changes the state in the 
> znode. But that znode is not ephemeral (see ZkAssign).
>  - The rs transitions the zk node from OFFLINE to OPENING. See 
> OpenRegionHandler.process().
>  - The rs then opens the region and changes the znode from OPENING to OPENED.
>  - When the rs is killed between the OPENING and OPENED states, zk shows the 
> OPENING state and the master just waits for the rs to change the region 
> state, but since the rs is down, that won't happen. 
>  - There is an AssignmentManager.TimeoutMonitor, which guards against exactly 
> this kind of condition. It periodically checks (every 10 sec by 
> default) the regions in transition to see whether they have timed out 
> (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 
> min, which explains what you and I are seeing. 
>  - ServerShutdownHandler in the Master does not reassign regions in the 
> OPENING state, although it handles other states. 
> Lowering that threshold in the configuration is one option, but I still 
> think we can do better. 
> Will investigate more. 
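
The timeout check described in the issue can be pictured roughly as follows. This is a toy sketch under stated assumptions: the class, its fields, and its methods are hypothetical and do not mirror AssignmentManager.TimeoutMonitor; it only shows why a region whose server dies right after the OPENING transition is not noticed until the full timeout has elapsed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Illustrative only: a hypothetical stand-in for a regions-in-transition timeout check. */
class TransitionTimeoutSketch {
    // Region name -> time (ms) at which the region entered its current transition state.
    private final Map<String, Long> regionsInTransition = new LinkedHashMap<>();
    private final long timeoutMs;

    TransitionTimeoutSketch(long timeoutMs) {
        this.timeoutMs = timeoutMs;
    }

    void enterTransition(String region, long nowMs) {
        regionsInTransition.put(region, nowMs);
    }

    void leaveTransition(String region) {
        regionsInTransition.remove(region);
    }

    /** One pass of the periodic check: returns regions that have been in transition longer than the timeout. */
    List<String> findTimedOut(long nowMs) {
        List<String> timedOut = new ArrayList<>();
        for (Map.Entry<String, Long> entry : regionsInTransition.entrySet()) {
            if (nowMs - entry.getValue() > timeoutMs) {
                timedOut.add(entry.getKey());
            }
        }
        return timedOut;
    }
}
```

With the 30 min timeout quoted above, a check that runs every 10 sec still reports nothing for a stuck region until more than 30 min have passed since it entered OPENING, regardless of how early the region server died. That is the delay the issue proposes to avoid by reacting to the server failure directly.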


[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488931#comment-13488931
 ] 

Lars Hofhansl commented on HBASE-6796:
--

Oh I see, you did that with all the related changes. Fine then.

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-6925) Change socket write size from 8K to 64K for HBaseServer

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488928#comment-13488928
 ] 

Lars Hofhansl commented on HBASE-6925:
--

I'd definitely be interested in the various changes that went into this.

HBASE-4956 just fixes Result.readFields to read the result buffer in 8K chunks?
Would that defeat the purpose of this change?

I can do a quick test on a single host, but that won't be very 
representative.

> Change socket write size from 8K to 64K for HBaseServer
> ---
>
> Key: HBASE-6925
> URL: https://issues.apache.org/jira/browse/HBASE-6925
> Project: HBase
>  Issue Type: Sub-task
>  Components: Performance
>Reporter: Karthik Ranganathan
>Assignee: Karthik Ranganathan
>Priority: Critical
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-6925.patch
>
>
> Creating a JIRA for this, but the change is trivial: change NIO_BUFFER_LIMIT 
> from 8K to 64K in HBaseServer. This seems to increase scan throughput.



[jira] [Commented] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488929#comment-13488929
 ] 

Lars Hofhansl commented on HBASE-6796:
--

Wanna do it this way, or should I rather commit HBASE-7082 to 0.94? I'm fine 
either way.

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch
>
>
> See HBASE-5547



[jira] [Updated] (HBASE-7082) TestHFileCleaner#testHFileCleaning fails due to cleaner is reset

2012-11-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7082:
---

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Integrated into trunk. Thanks Himanshu and Jesse for reviewing it.

> TestHFileCleaner#testHFileCleaning fails due to cleaner is reset
> 
>
> Key: HBASE-7082
> URL: https://issues.apache.org/jira/browse/HBASE-7082
> Project: HBase
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Trivial
> Fix For: 0.96.0
>
> Attachments: trunk-7082.patch
>
>
> TestHFileCleaner#testHFileCleaning fails if it runs after 
> testRemovesEmptyDirectories which resets the cleaner.



[jira] [Updated] (HBASE-6796) Backport HBASE-5547, Don't delete HFiles in backup mode.

2012-11-01 Thread Jesse Yates (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jesse Yates updated HBASE-6796:
---

Attachment: hbase-6796-v1.patch

Updating patch to include fix from HBASE-7082.

> Backport HBASE-5547, Don't delete HFiles in backup mode.
> 
>
> Key: HBASE-6796
> URL: https://issues.apache.org/jira/browse/HBASE-6796
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: Jesse Yates
> Fix For: 0.94.3
>
> Attachments: hbase-5547-0.94-backport-v0.patch, hbase-6796-v0.patch, 
> hbase-6796-v1.patch
>
>
> See HBASE-5547



[jira] [Commented] (HBASE-7082) TestHFileCleaner#testHFileCleaning fails due to cleaner is reset

2012-11-01 Thread Jesse Yates (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488912#comment-13488912
 ] 

Jesse Yates commented on HBASE-7082:


+1 yup, that looks good.

> TestHFileCleaner#testHFileCleaning fails due to cleaner is reset
> 
>
> Key: HBASE-7082
> URL: https://issues.apache.org/jira/browse/HBASE-7082
> Project: HBase
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Trivial
> Attachments: trunk-7082.patch
>
>
> TestHFileCleaner#testHFileCleaning fails if it runs after 
> testRemovesEmptyDirectories which resets the cleaner.



[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-11-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6060:
---

Status: Patch Available  (was: Open)

> Regions's in OPENING state from failed regionservers takes a long time to 
> recover
> -
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Reporter: Enis Soztutar
>Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 
> 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 
> 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 
> 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 
> 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, 
> HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, 
> HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, 
> trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch
>
>
> We have seen a pattern in tests where regions are stuck in the OPENING state 
> for a very long time when the region server that is opening the region fails. 
> My understanding of the process: 
>  
>  - The master calls the rs to open the region. If the rs is offline, a new 
> plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN 
> (only in master memory; zk still shows OFFLINE). See 
> HRegionServer.openRegion(), HMaster.assign().
>  - The RegionServer starts opening the region and changes the state in the 
> znode. But that znode is not ephemeral (see ZkAssign).
>  - The rs transitions the zk node from OFFLINE to OPENING. See 
> OpenRegionHandler.process().
>  - The rs then opens the region and changes the znode from OPENING to OPENED.
>  - When the rs is killed between the OPENING and OPENED states, zk shows the 
> OPENING state and the master just waits for the rs to change the region 
> state, but since the rs is down, that won't happen. 
>  - There is an AssignmentManager.TimeoutMonitor, which guards against exactly 
> this kind of condition. It periodically checks (every 10 sec by 
> default) the regions in transition to see whether they have timed out 
> (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 
> min, which explains what you and I are seeing. 
>  - ServerShutdownHandler in the Master does not reassign regions in the 
> OPENING state, although it handles other states. 
> Lowering that threshold in the configuration is one option, but I still 
> think we can do better. 
> Will investigate more. 



[jira] [Commented] (HBASE-7082) TestHFileCleaner#testHFileCleaning fails due to cleaner is reset

2012-11-01 Thread Himanshu Vashishtha (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1348#comment-1348
 ] 

Himanshu Vashishtha commented on HBASE-7082:


+1. lgtm.

> TestHFileCleaner#testHFileCleaning fails due to cleaner is reset
> 
>
> Key: HBASE-7082
> URL: https://issues.apache.org/jira/browse/HBASE-7082
> Project: HBase
>  Issue Type: Test
>Reporter: Jimmy Xiang
>Assignee: Jimmy Xiang
>Priority: Trivial
> Attachments: trunk-7082.patch
>
>
> TestHFileCleaner#testHFileCleaning fails if it runs after 
> testRemovesEmptyDirectories which resets the cleaner.



[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-11-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6060:
---

Assignee: Jimmy Xiang  (was: rajeshbabu)
  Status: Open  (was: Patch Available)

> Regions's in OPENING state from failed regionservers takes a long time to 
> recover
> -
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Reporter: Enis Soztutar
>Assignee: Jimmy Xiang
> Fix For: 0.96.0
>
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 
> 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 
> 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 
> 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 
> 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, 
> HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, 
> HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, 
> trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch
>
>
> We have seen a pattern in tests where regions are stuck in the OPENING state 
> for a very long time when the region server that is opening the region fails. 
> My understanding of the process: 
>  
>  - The master calls the rs to open the region. If the rs is offline, a new 
> plan is generated (a new rs is chosen). RegionState is set to PENDING_OPEN 
> (only in master memory; zk still shows OFFLINE). See 
> HRegionServer.openRegion(), HMaster.assign().
>  - The RegionServer starts opening the region and changes the state in the 
> znode, but that znode is not ephemeral (see ZkAssign).
>  - The rs transitions the zk node from OFFLINE to OPENING. See 
> OpenRegionHandler.process().
>  - The rs then opens the region and changes the znode from OPENING to OPENED.
>  - When the rs is killed between the OPENING and OPENED states, zk shows the 
> OPENING state, and the master just waits for the rs to change the region 
> state, but since the rs is down, that won't happen. 
>  - There is an AssignmentManager.TimeoutMonitor, which guards against exactly 
> this kind of condition. It periodically checks (every 10 sec by default) the 
> regions in transition to see whether they have timed out 
> (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 
> 30 min, which explains what you and I are seeing. 
>  - ServerShutdownHandler in the Master does not reassign regions in the 
> OPENING state, although it handles other states. 
> Lowering that threshold in the configuration is one option, but I still 
> think we can do better. 
> Will investigate more. 
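
The two recovery paths in the description (waiting for the timeout monitor versus reacting to the server's death) can be contrasted in a small sketch. This is not HBase code: RegionInTransition, timedOut and openingOnDeadServer are hypothetical names invented for illustration, and only the state names and the 30 min default are taken from the description.

```java
import java.util.ArrayList;
import java.util.List;

public class OpeningTimeoutSketch {
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

    /** Hypothetical stand-in for one region-in-transition entry held by the master. */
    static final class RegionInTransition {
        final String region;
        final String server;
        final State state;
        final long stampMillis; // time of the last state change

        RegionInTransition(String region, String server, State state, long stampMillis) {
            this.region = region;
            this.server = server;
            this.state = state;
            this.stampMillis = stampMillis;
        }
    }

    /** Timeout-based recovery: a region is picked up only after timeoutMillis has elapsed. */
    static List<String> timedOut(List<RegionInTransition> rit, long nowMillis, long timeoutMillis) {
        List<String> out = new ArrayList<>();
        for (RegionInTransition r : rit) {
            if (nowMillis - r.stampMillis > timeoutMillis) {
                out.add(r.region);
            }
        }
        return out;
    }

    /** Shutdown-based recovery: regions OPENING on the dead server are picked up at once. */
    static List<String> openingOnDeadServer(List<RegionInTransition> rit, String deadServer) {
        List<String> out = new ArrayList<>();
        for (RegionInTransition r : rit) {
            if (r.state == State.OPENING && r.server.equals(deadServer)) {
                out.add(r.region);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        long timeoutMillis = 30L * 60 * 1000; // the 30 min default mentioned in the description
        List<RegionInTransition> rit = new ArrayList<>();
        rit.add(new RegionInTransition("r1", "rs1", State.OPENING, 0L));

        long oneMinuteLater = 60_000L;
        System.out.println("timed out after 1 min: " + timedOut(rit, oneMinuteLater, timeoutMillis));
        System.out.println("OPENING on dead rs1:   " + openingOnDeadServer(rit, "rs1"));
    }
}
```

One minute after rs1 dies, the timeout check still finds nothing to reassign, while a check keyed on the dead server finds r1 immediately. That gap is what the issue is about.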

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-11-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-6060:
---

Attachment: trunk-6060_v3.3.patch

> Regions's in OPENING state from failed regionservers takes a long time to 
> recover
> -
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Reporter: Enis Soztutar
>Assignee: rajeshbabu
> Fix For: 0.96.0
>
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 
> 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 
> 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 
> 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 
> 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, 
> HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, 
> HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, 
> trunk-6060.patch, trunk-6060_v2.patch, trunk-6060_v3.3.patch
>
>



[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6852:
-

Fix Version/s: (was: 0.94.3)
   0.94.4

Moving to 0.94.4.

Cheng, looking at the patch, do you think this is possibly due to the way we 
flush the metrics for all the schema metrics?

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
> with all of its fields
> 
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.94.0
>Reporter: Cheng Hao
>Assignee: Cheng Hao
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.4
>
> Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
> onhitcache-trunk.patch
>
>
> SchemaMetrics.updateOnCacheHit costs too much during a full table scan.
> Here are the top 5 hotspots within the regionserver while full-scanning a 
> table (sorry for the poor formatting):
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
> mask of 0x00 (No unit mask) count 500
> samples  %        image name  symbol name
> ---
> 98447    13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447  100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> ---
> 45814    6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> ---
> 43523    5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523  100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> ---
> 42548    5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548  100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> ---
> 40572    5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572  100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]



[jira] [Reopened] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl reopened HBASE-6852:
--


Reverted a 2nd time.

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
> with all of its fields
> 
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.94.0
>Reporter: Cheng Hao
>Assignee: Cheng Hao
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.4
>
> Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
> onhitcache-trunk.patch
>
>



[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-11-01 Thread rajeshbabu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488858#comment-13488858
 ] 

rajeshbabu commented on HBASE-6060:
---

@Jimmy
It's good. +1

> Regions's in OPENING state from failed regionservers takes a long time to 
> recover
> -
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Reporter: Enis Soztutar
>Assignee: rajeshbabu
> Fix For: 0.96.0
>
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 
> 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 
> 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 
> 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 
> 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, 
> HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, 
> HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, 
> trunk-6060.patch, trunk-6060_v2.patch
>
>



[jira] [Commented] (HBASE-7042) Master Coprocessor Endpoint

2012-11-01 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488849#comment-13488849
 ] 

Francis Liu commented on HBASE-7042:


{quote}
It is possible to reuse Exec and ExecResult. Just ignore / don't set the row 
fields etc. Having separate classes for master exec only is not a clean API in 
my opinion. 
{quote}
Apart from reuse being a bit clunky, I made them separate so that each class can 
evolve without the master and region usages stepping on or confusing each other. 
If you think this is OK, I'm fine with reusing Exec and ExecResult.

> Master Coprocessor Endpoint
> ---
>
> Key: HBASE-7042
> URL: https://issues.apache.org/jira/browse/HBASE-7042
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Francis Liu
>Assignee: Francis Liu
> Fix For: 0.96.0
>
> Attachments: HBASE-7042_94.patch
>
>
> Having support for a master coprocessor endpoint would enable developers to 
> easily extend HMaster functionality/features, as is the case for region 
> server grouping.



[jira] [Commented] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488846#comment-13488846
 ] 

Lars Hofhansl commented on HBASE-7017:
--

Thanks DD. I will commit this today, unless I hear any objections.

> Backport "[replication] The replication-executor should make sure the file 
> that it is replicating is closed before declaring success on that file" to 
> 0.94
> --
>
> Key: HBASE-7017
> URL: https://issues.apache.org/jira/browse/HBASE-7017
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.94.3
>
>




[jira] [Commented] (HBASE-7042) Master Coprocessor Endpoint

2012-11-01 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488845#comment-13488845
 ] 

Francis Liu commented on HBASE-7042:


That said, I believe your concern can be addressed by making system coprocessors 
reloadable, which I think we should do for both master and region coprocessors 
anyway. Can we address this in a separate JIRA?

> Master Coprocessor Endpoint
> ---
>
> Key: HBASE-7042
> URL: https://issues.apache.org/jira/browse/HBASE-7042
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Francis Liu
>Assignee: Francis Liu
> Fix For: 0.96.0
>
> Attachments: HBASE-7042_94.patch
>
>
> Having support for a master coprocessor endpoint would enable developers to 
> easily extend HMaster functionality/features, as is the case for region 
> server grouping.



[jira] [Updated] (HBASE-7017) Backport "[replication] The replication-executor should make sure the file that it is replicating is closed before declaring success on that file" to 0.94

2012-11-01 Thread Devaraj Das (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated HBASE-7017:
---

Assignee: Devaraj Das

I was able to run a decently large test, and the patch showed no problems. In 
particular, I ran the LoadTestTool with 100 keys and set 
hbase.regionserver.hlog.blocksize to a low value (1 MB). The test led to 
~650 log rolls on one of the region servers I was tracking. I verified that the 
data was successfully replicated to the remote cluster.
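
For a rough sense of scale, the figures above can be turned into a back-of-the-envelope estimate of how much WAL data ~650 rolls at a 1 MB block size represents. This is only a sketch: it assumes the log is rolled once it reaches a fixed fraction of the block size, and the 0.95 multiplier used here is an assumption rather than something stated in this thread. LogRollEstimate and its methods are hypothetical names.

```java
public class LogRollEstimate {
    /** Size at which a log would be rolled, assuming roll size = block size * multiplier. */
    static long rollSizeBytes(long blockSizeBytes, double multiplier) {
        return (long) (blockSizeBytes * multiplier);
    }

    /** Approximate WAL volume implied by a number of rolls of a given roll size. */
    static long approxWalBytes(long rolls, long rollSizeBytes) {
        return rolls * rollSizeBytes;
    }

    public static void main(String[] args) {
        long blockSize = 1024L * 1024L; // 1 MB, as set in the test
        double multiplier = 0.95;       // assumed roll threshold, not taken from the thread
        long rollSize = rollSizeBytes(blockSize, multiplier);
        long total = approxWalBytes(650, rollSize); // ~650 rolls observed
        System.out.println("roll size (bytes): " + rollSize);
        System.out.println("approx WAL volume (bytes): " + total);
    }
}
```

Under these assumptions the test pushed on the order of 600 MB through the WAL of the tracked region server, which gives a rough idea of how many file boundaries the replication path had to cross.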

> Backport "[replication] The replication-executor should make sure the file 
> that it is replicating is closed before declaring success on that file" to 
> 0.94
> --
>
> Key: HBASE-7017
> URL: https://issues.apache.org/jira/browse/HBASE-7017
> Project: HBase
>  Issue Type: Bug
>  Components: Replication
>Reporter: Devaraj Das
>Assignee: Devaraj Das
> Fix For: 0.94.3
>
>




[jira] [Commented] (HBASE-7042) Master Coprocessor Endpoint

2012-11-01 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7042?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488838#comment-13488838
 ] 

Francis Liu commented on HBASE-7042:


{quote}
The issue I see with supporting only system coprocessors is it requires 
deployment by way of XML config update as the sole option, which requires a 
process restart. Table CPs can be deployed to HDFS and then loaded dynamically 
with a schema update without needing to stop any process. Not supporting this 
latter option on the master seems a shortcut and limiting. 
{quote}
Don't we already have a precedent for this with MasterObserver? From a use-case 
perspective, it seems to me that master coprocessors are cluster-owner 
decisions and are probably fewer and slower-moving than region coprocessors, 
which are owned by table owners. In the former case, updating and restarting 
the master should be acceptable?

> Master Coprocessor Endpoint
> ---
>
> Key: HBASE-7042
> URL: https://issues.apache.org/jira/browse/HBASE-7042
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Francis Liu
>Assignee: Francis Liu
> Fix For: 0.96.0
>
> Attachments: HBASE-7042_94.patch
>
>
> Having support for a master coprocessor endpoint would enable developers to 
> easily extend HMaster functionality/features, as is the case for region 
> server grouping.



[jira] [Updated] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-11-01 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6060:
-

Fix Version/s: (was: 0.94.3)
   (was: 0.92.3)

Removed from 0.92 and 0.94.

> Regions's in OPENING state from failed regionservers takes a long time to 
> recover
> -
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Reporter: Enis Soztutar
>Assignee: rajeshbabu
> Fix For: 0.96.0
>
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 
> 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 
> 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 
> 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 
> 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, 
> HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, 
> HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, 
> trunk-6060.patch, trunk-6060_v2.patch
>
>



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488824#comment-13488824
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Sorry, Cheng, I am probably going to have to roll this back again.

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
> with all of its fields
> 
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.94.0
>Reporter: Cheng Hao
>Assignee: Cheng Hao
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.3
>
> Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
> onhitcache-trunk.patch
>
>



[jira] [Commented] (HBASE-7066) Some HMaster coprocessor exceptions are being swallowed in try catch blocks

2012-11-01 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488822#comment-13488822
 ] 

Francis Liu commented on HBASE-7066:


Lars, if you agree that this fix is the right way to go, can we get this into 
0.94.3?

> Some HMaster coprocessor exceptions are being swallowed in try catch blocks
> ---
>
> Key: HBASE-7066
> URL: https://issues.apache.org/jira/browse/HBASE-7066
> Project: HBase
>  Issue Type: Bug
>  Components: Coprocessors, security
>Affects Versions: 0.94.2, 0.96.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-7066_94.patch, HBASE-7066_trunk.patch, 
> HBASE-7066_trunk.patch
>
>
> This is causing HMaster.shutdown() and HMaster.stopMaster() to succeed even 
> when an AccessDeniedException is thrown.



[jira] [Reopened] (HBASE-2645) HLog writer can do 1-2 sync operations after lease has been recovered for split process.

2012-11-01 Thread Ted Yu (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu reopened HBASE-2645:
---


The test failed in HBase-TRUNK build #3504

> HLog writer can do 1-2 sync operations after lease has been recovered for 
> split process.
> 
>
> Key: HBASE-2645
> URL: https://issues.apache.org/jira/browse/HBASE-2645
> Project: HBase
>  Issue Type: Bug
>  Components: Filters
>Affects Versions: 0.90.4
>Reporter: Cosmin Lehene
>Assignee: Todd Lipcon
>Priority: Blocker
> Fix For: 0.96.0
>
> Attachments: 2645.txt, 2645v2.txt, 2645v3.txt, 
> org.apache.hadoop.hbase.regionserver.wal.TestHLogSplit-output.txt
>
>
> TestHLogSplit.testLogCannotBeWrittenOnceParsed is failing. 
> This test starts a thread that writes one edit to the log, syncs, and counts. 
> During this, an HLog.splitLog operation is started. splitLog recovers the log 
> lease before reading the log, so that the original regionserver cannot 
> wake up and write after the split process has started.  
> The test compares the number of edits reported by the split process with the 
> number reported by the writer thread. The writer thread (called zombie in the 
> test) should report a count <= that of splitLog (sync() might throw after the 
> last edit gets written, in which case the edit won't be counted by the zombie 
> thread). However, it appears that the zombie counts 1-2 more edits, so it 
> looks like it can sync without a lease.
> This might be an hdfs-0.20 related issue. 



[jira] [Commented] (HBASE-7066) Some HMaster coprocessor exceptions are being swallowed in try catch blocks

2012-11-01 Thread Francis Liu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488818#comment-13488818
 ] 

Francis Liu commented on HBASE-7066:


More specifically, if AccessController.preShutdown() throws an IOException, 
shutdown should fail: we cannot truly determine whether the shutdown request 
is authorized, so we should err on the side of caution and fail the request. 

There may be cases where preShutdown() hook exceptions can be ignored, but 
that is not the case for AccessController, which comes packaged with HBase, so 
it makes sense to guarantee that it works as advertised. We can address the 
former scenario by fixing the coprocessor framework.
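
The difference between swallowing and propagating a hook exception can be sketched as follows. This is an illustration only, not the HMaster code or the attached patch: PreShutdownHook and Master are hypothetical stand-ins, and the real AccessDeniedException is replaced here by a plain IOException.

```java
import java.io.IOException;

public class ShutdownHookSketch {
    /** Hypothetical stand-in for a master coprocessor hook such as preShutdown(). */
    interface PreShutdownHook {
        void preShutdown() throws IOException;
    }

    /** Hypothetical stand-in for the master; only tracks whether it was stopped. */
    static final class Master {
        private final PreShutdownHook hook;
        private boolean stopped = false;

        Master(PreShutdownHook hook) {
            this.hook = hook;
        }

        boolean isStopped() {
            return stopped;
        }

        /** Swallowing variant: the hook failure is only logged, so shutdown proceeds. */
        void shutdownSwallowing() {
            try {
                hook.preShutdown();
            } catch (IOException e) {
                System.err.println("hook failed, ignoring: " + e.getMessage());
            }
            stopped = true;
        }

        /** Propagating variant: the hook failure reaches the caller and shutdown is skipped. */
        void shutdownPropagating() throws IOException {
            hook.preShutdown();
            stopped = true;
        }
    }

    public static void main(String[] args) {
        PreShutdownHook denying = () -> { throw new IOException("access denied"); };

        Master a = new Master(denying);
        a.shutdownSwallowing();
        System.out.println("swallowing variant stopped: " + a.isStopped());

        Master b = new Master(denying);
        try {
            b.shutdownPropagating();
        } catch (IOException e) {
            System.out.println("propagating variant rejected: " + e.getMessage());
        }
        System.out.println("propagating variant stopped: " + b.isStopped());
    }
}
```

With the swallowing variant an unauthorized request still stops the master, which is the behavior reported in this issue; with the propagating variant the caller sees the failure and the master keeps running.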

> Some HMaster coprocessor exceptions are being swallowed in try catch blocks
> ---
>
> Key: HBASE-7066
> URL: https://issues.apache.org/jira/browse/HBASE-7066
> Project: HBase
>  Issue Type: Bug
>  Components: Coprocessors, security
>Affects Versions: 0.94.2, 0.96.0
>Reporter: Francis Liu
>Assignee: Francis Liu
>Priority: Critical
> Attachments: HBASE-7066_94.patch, HBASE-7066_trunk.patch, 
> HBASE-7066_trunk.patch
>
>
> This is causing HMaster.shutdown() and HMaster.stopMaster() to succeed even 
> when an AccessDeniedException is thrown.



[jira] [Commented] (HBASE-6060) Regions's in OPENING state from failed regionservers takes a long time to recover

2012-11-01 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6060?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488817#comment-13488817
 ] 

Jimmy Xiang commented on HBASE-6060:


@Rajesh, are you ok with the latest patch on RB for trunk?

@Lars, I am fine with letting it go with 0.92 and 0.94. My patch won't work 
well in 0.92/0.94 since it depends on other patches which are too big to 
backport.

> Regions's in OPENING state from failed regionservers takes a long time to 
> recover
> -
>
> Key: HBASE-6060
> URL: https://issues.apache.org/jira/browse/HBASE-6060
> Project: HBase
>  Issue Type: Bug
>  Components: master, regionserver
>Reporter: Enis Soztutar
>Assignee: rajeshbabu
> Fix For: 0.92.3, 0.94.3, 0.96.0
>
> Attachments: 6060-94-v3.patch, 6060-94-v4_1.patch, 
> 6060-94-v4_1.patch, 6060-94-v4.patch, 6060_alternative_suggestion.txt, 
> 6060_suggestion2_based_off_v3.patch, 6060_suggestion_based_off_v3.patch, 
> 6060_suggestion_toassign_rs_wentdown_beforerequest.patch, 6060-trunk_2.patch, 
> 6060-trunk_3.patch, 6060-trunk.patch, 6060-trunk.patch, HBASE-6060-92.patch, 
> HBASE-6060-94.patch, HBASE-6060_latest.patch, HBASE-6060_latest.patch, 
> HBASE-6060_latest.patch, HBASE-6060-trunk_4.patch, HBASE-6060_trunk_5.patch, 
> trunk-6060.patch, trunk-6060_v2.patch
>
>
> We have seen a pattern in tests where regions are stuck in OPENING state
> for a very long time when the region server that is opening the region fails.
> My understanding of the process:
>
>  - The master calls the rs to open the region. If the rs is offline, a new plan is
> generated (a new rs is chosen). RegionState is set to PENDING_OPEN (only in
> master memory; zk still shows OFFLINE). See HRegionServer.openRegion(),
> HMaster.assign()
>  - The RegionServer starts opening the region and changes the state in the znode. But
> that znode is not ephemeral. (See ZkAssign)
>  - The rs transitions the zk node from OFFLINE to OPENING. See
> OpenRegionHandler.process()
>  - The rs then opens the region and changes the znode from OPENING to OPENED.
>  - When the rs is killed between the OPENING and OPENED states, zk shows the OPENING
> state and the master just waits for the rs to change the region state, but since
> the rs is down, that won't happen.
>  - There is an AssignmentManager.TimeoutMonitor, which guards against exactly
> this kind of condition. It periodically checks (every 10 sec by
> default) the regions in transition to see whether they have timed out
> (hbase.master.assignment.timeoutmonitor.timeout). The default timeout is 30 min,
> which explains what you and I are seeing.
>  - ServerShutdownHandler in the master does not reassign regions in OPENING
> state, although it handles other states.
> Lowering that threshold in the configuration is one option, but I still
> think we can do better.
> Will investigate more.
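
The gap described in the list above can be sketched in a few lines of Java. This is only an illustrative model under stated assumptions: OpeningRecoverySketch, Transition, timedOut and openingOn are hypothetical names, not HBase classes or methods, and the 30 min timeout is the default quoted in the description. It contrasts the two recovery paths: a timeout monitor that only notices a stuck region after the timeout elapses, and a check keyed on the dead server that finds it immediately.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of regions in transition; none of these names are HBase classes. */
final class OpeningRecoverySketch {
    enum State { OFFLINE, PENDING_OPEN, OPENING, OPENED }

    /** State of one region, the server opening it, and when the state last changed. */
    static final class Transition {
        final State state;
        final String server;
        final long stampMillis;

        Transition(State state, String server, long stampMillis) {
            this.state = state;
            this.server = server;
            this.stampMillis = stampMillis;
        }
    }

    private final Map<String, Transition> inTransition = new LinkedHashMap<>();

    void update(String region, State state, String server, long nowMillis) {
        inTransition.put(region, new Transition(state, server, nowMillis));
    }

    /** Timeout-monitor path: regions not yet OPENED whose state is older than timeoutMillis. */
    List<String> timedOut(long nowMillis, long timeoutMillis) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Transition> e : inTransition.entrySet()) {
            Transition t = e.getValue();
            if (t.state != State.OPENED && nowMillis - t.stampMillis > timeoutMillis) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    /** Shutdown-handler path: regions left in OPENING by a server known to be dead. */
    List<String> openingOn(String deadServer) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Transition> e : inTransition.entrySet()) {
            Transition t = e.getValue();
            if (t.state == State.OPENING && t.server.equals(deadServer)) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long timeout = 30L * 60 * 1000; // 30 min default quoted in the description
        OpeningRecoverySketch master = new OpeningRecoverySketch();
        master.update("region-a", State.OPENING, "rs1", 0L);
        // One minute after rs1 dies, the timeout monitor still reports nothing...
        System.out.println(master.timedOut(60_000L, timeout));
        // ...while a check keyed on the dead server finds the region at once.
        System.out.println(master.openingOn("rs1"));
    }
}
```

In this model, lowering the timeout only shortens the wait, whereas the dead-server check removes it; that is one way to read the "we can do better" remark, not necessarily how the eventual patch works.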



[jira] [Commented] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter

2012-11-01 Thread Jimmy Xiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488816#comment-13488816
 ] 

Jimmy Xiang commented on HBASE-7083:


Tried to submit to hadoop qa again.

As for forcing the reassign, it is needed because the region state in memory
may show that the missing daughter is open.
Since it is not in meta, SSH will assign it only in fixupDaughters. Without
forcing it, assign won't do anything, since the region already appears to be open.
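
A minimal Java sketch of why the force flag matters, assuming a simplified assignment manager. ForceAssignSketch, markOpen and assign(region, force) are hypothetical names used for illustration only; they are not the actual AssignmentManager API.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an assignment manager; not HBase's AssignmentManager. */
final class ForceAssignSketch {
    enum State { OFFLINE, OPEN }

    private final Map<String, State> inMemoryState = new HashMap<>();
    private int assignmentsIssued = 0;

    /** Records that the master's in-memory view believes the region is open. */
    void markOpen(String region) {
        inMemoryState.put(region, State.OPEN);
    }

    /** Returns true only if an assignment was actually issued. */
    boolean assign(String region, boolean force) {
        if (!force && inMemoryState.get(region) == State.OPEN) {
            // The in-memory state says "open", so a plain assign is a no-op,
            // even if the region is really only open on a dead server.
            return false;
        }
        inMemoryState.put(region, State.OFFLINE);
        assignmentsIssued++;
        return true;
    }

    int assignmentsIssued() {
        return assignmentsIssued;
    }

    public static void main(String[] args) {
        ForceAssignSketch am = new ForceAssignSketch();
        am.markOpen("daughter-a");
        System.out.println(am.assign("daughter-a", false)); // skipped: looks open
        System.out.println(am.assign("daughter-a", true));  // forced: reassigned
    }
}
```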

> SSH#fixupDaughter should force re-assign missing daughter
> -
>
> Key: HBASE-7083
> URL: https://issues.apache.org/jira/browse/HBASE-7083
> Project: HBase
> Issue Type: Bug
> Components: Region Assignment
> Reporter: Jimmy Xiang
> Assignee: Jimmy Xiang
> Priority: Minor
> Fix For: 0.96.0
>
> Attachments: trunk-7083.patch
>
>
> In looking into flaky test 
> TestSplitTransactionOnCluster#testShutdownSimpleFixup, I found out that a 
> missing daughter is not assigned by SSH properly.  It could be open on the 
> dead server.  We need to force re-assign it.



[jira] [Commented] (HBASE-7077) Test for: CheckAndPut should properly read MVCC

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488814#comment-13488814
 ] 

Hudson commented on HBASE-7077:
---

Integrated in HBase-TRUNK #3504 (See [https://builds.apache.org/job/HBase-TRUNK/3504/])
HBASE-7077 ADDENDUM, add TestCategory (Revision 1404640)

Result = FAILURE
larsh : 
Files : 
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestHBase7051.java


> Test for: CheckAndPut should properly read MVCC
> ---
>
> Key: HBASE-7077
> URL: https://issues.apache.org/jira/browse/HBASE-7077
> Project: HBase
> Issue Type: Sub-task
> Reporter: Gregory Chanan
> Assignee: ramkrishna.s.vasudevan
> Fix For: 0.94.3, 0.96.0
>
> Attachments: HBASE-7071.patch, HBASE-7071_testcasewithassert.patch
>
>
> checkAndPut should integrate with MVCC, similar to how HBASE-4583 fixed 
> appends and increments.
> Also need a test, here's one we could use (originally proposed in HBASE-7051):
> The current value of some cell is 10.
> I issue two concurrent requests:
> A) a check and put where check value = 10, put value = 11
> B) a put where put value = 50
> The only result at the end of these operations that seems reasonable to me is 
> the value of the cell being 50. If A occurred first (ACID wise), then our 
> values go 10->11->50. If B occurred first, then our values go 10->50 (and the 
> checkAndPut fails)
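
The scenario quoted above can be replayed with a small Java sketch. It assumes a java.util.concurrent.atomic.AtomicLong standing in for the cell, so that the check and the write of a checkAndPut form one atomic step; CheckAndPutSketch and its methods are hypothetical and are not the HBase client API. Both serial orders end with the cell at 50. The anomaly the issue wants to rule out is a final value of 11, which could only happen if A's check read 10, B then wrote 50, and A's put landed afterwards.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical single-cell model; AtomicLong stands in for an HBase cell. */
final class CheckAndPutSketch {
    /** Atomic check-and-put: writes newValue only if the cell currently holds expected. */
    static boolean checkAndPut(AtomicLong cell, long expected, long newValue) {
        return cell.compareAndSet(expected, newValue);
    }

    /** Unconditional put. */
    static void put(AtomicLong cell, long value) {
        cell.set(value);
    }

    /** Replays the two requests in one of the two possible serial orders. */
    static long run(boolean checkAndPutFirst) {
        AtomicLong cell = new AtomicLong(10);
        if (checkAndPutFirst) {
            checkAndPut(cell, 10, 11); // A succeeds: 10 -> 11
            put(cell, 50);             // B: 11 -> 50
        } else {
            put(cell, 50);             // B: 10 -> 50
            checkAndPut(cell, 10, 11); // A fails: the cell no longer holds 10
        }
        return cell.get();
    }

    public static void main(String[] args) {
        System.out.println(run(true));
        System.out.println(run(false));
    }
}
```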



[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter

2012-11-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7083:
---

Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488808#comment-13488808
 ] 

Lars Hofhansl commented on HBASE-6852:
--

I also tried on some other machines (JDK7 and JDK6); it passes every time.
This is extremely disconcerting.


> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
> with all of its fields
> 
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
> Issue Type: Improvement
> Components: metrics
> Affects Versions: 0.94.0
> Reporter: Cheng Hao
> Assignee: Cheng Hao
> Priority: Minor
> Labels: performance
> Fix For: 0.94.3
>
> Attachments: 6852-0.94_2.patch, 6852-0.94.txt, metrics_hotspots.png, 
> onhitcache-trunk.patch
>
>
> SchemaMetrics.updateOnCacheHit costs too much while I am doing a full
> table scan.
> Here are the top 5 hotspots within the regionserver while full scanning a table
> (sorry for the poor formatting):
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit
> mask of 0x00 (No unit mask) count 500
> samples  %         image name  symbol name
> ---
> 98447    13.4324   14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447  100.000   14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> ---
> 45814    6.2510    14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814  100.000   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> ---
> 43523    5.9384    14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523  100.000   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> ---
> 42548    5.8054    14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548  100.000   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> ---
> 40572    5.5358    14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572  100.000   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]



[jira] [Updated] (HBASE-7083) SSH#fixupDaughter should force re-assign missing daughter

2012-11-01 Thread Jimmy Xiang (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-7083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jimmy Xiang updated HBASE-7083:
---

Status: Open  (was: Patch Available)



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488795#comment-13488795
 ] 

Ted Yu commented on HBASE-6852:
---

I wasn't able to reproduce the test failure on a MacBook.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488794#comment-13488794
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Does it fail locally on your machine, Ted?
I'm going to run the test on a different machine so that I can debug.
If that is not fruitful, I'll revert the change again... Sigh :(



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Ted Yu (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488791#comment-13488791
 ] 

Ted Yu commented on HBASE-6852:
---

The new build failed again.
After reverting patch v2, TestScannerSelectionUsingTTL passed on the
above-mentioned platform.



[jira] [Commented] (HBASE-7077) Test for: CheckAndPut should properly read MVCC

2012-11-01 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488784#comment-13488784
 ] 

Hudson commented on HBASE-7077:
---

Integrated in HBase-0.94 #563 (See [https://builds.apache.org/job/HBase-0.94/563/])
HBASE-7077 ADDENDUM, add TestCategory (Revision 1404641)

Result = FAILURE
larsh : 
Files : 
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/regionserver/TestHBase7051.java




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488763#comment-13488763
 ] 

Lars Hofhansl commented on HBASE-6852:
--

I triggered a new build. If that fails again, I am not sure what to do.
I ran the test a lot of times locally and it always passes.



[jira] [Commented] (HBASE-7077) Test for: CheckAndPut should properly read MVCC

2012-11-01 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-7077?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13488757#comment-13488757
 ] 

Lars Hofhansl commented on HBASE-7077:
--

fixed


