[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460280#comment-13460280
 ] 

Cheng Hao commented on HBASE-6852:
--

Lars, the only place SchemaMetrics uses a ConcurrentMap is 
tableAndFamilyToMetrics. In this patch, I pre-create an array of AtomicLong for 
all of the possible on-cache-hit metrics items, which avoids the concurrency 
issue and makes them easy to index on access.
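
A minimal sketch of the approach described above, assuming a fixed set of block categories; the class name, category count, and method signatures here are illustrative, not the actual patch:

```java
import java.util.concurrent.atomic.AtomicLong;

class OnCacheHitMetrics {
    // Assumed number of block categories (DATA, INDEX, BLOOM, META, ...);
    // the real enum size may differ.
    private static final int NUM_BLOCK_CATEGORIES = 8;
    // One slot per (blockCategory, isCompaction) pair, created up front so the
    // hot path is a plain array index instead of a ConcurrentMap lookup.
    private static final AtomicLong[] COUNTERS =
        new AtomicLong[NUM_BLOCK_CATEGORIES * 2];

    static {
        for (int i = 0; i < COUNTERS.length; i++) {
            COUNTERS[i] = new AtomicLong();
        }
    }

    private static int index(int blockCategory, boolean isCompaction) {
        return blockCategory * 2 + (isCompaction ? 1 : 0);
    }

    static void updateOnCacheHit(int blockCategory, boolean isCompaction) {
        COUNTERS[index(blockCategory, isCompaction)].incrementAndGet();
    }

    static long get(int blockCategory, boolean isCompaction) {
        return COUNTERS[index(blockCategory, isCompaction)].get();
    }
}
```

Because every counter exists before the first update, readers and writers never race on map insertion.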

Thanks stack and Lars for the suggestions; I will create another patch file 
instead.

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
> with all of its fields
> 
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.94.0
>Reporter: Cheng Hao
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.2, 0.96.0
>
> Attachments: onhitcache-trunk.patch
>
>
> The SchemaMetrics.updateOnCacheHit costs too much while I am doing a full 
> table scan.
> Here are the top 5 hotspots within the regionserver while full-scanning a 
> table (sorry for the poor formatting):
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
> mask of 0x00 (No unit mask) count 500
> samples   %        image name  symbol name
> -------------------------------------------------------------------------------
> 98447     13.4324  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
>   98447   100.000  14033.jo    void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
> -------------------------------------------------------------------------------
> 45814     6.2510   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
>   45814   100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 43523     5.9384   14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523   100.000  14033.jo    boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
> -------------------------------------------------------------------------------
> 42548     5.8054   14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
>   42548   100.000  14033.jo    int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
> -------------------------------------------------------------------------------
> 40572     5.5358   14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572   100.000  14033.jo    int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread liang xie (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460275#comment-13460275
 ] 

liang xie commented on HBASE-6852:
--

Hi Cheng, for running time, could you rule out system resource effects? E.g., 
the original version may have run with many physical IOs, while the patched 
rerun avoided similar physical IO requests by hitting the OS page cache. 
In other words, can the reduced running time be reproduced consistently, 
even if you run the patched version first and then rerun the original? It would 
be better if you issue "echo 1 > /proc/sys/vm/drop_caches" to free the page 
cache between tests.



[jira] [Commented] (HBASE-6806) HBASE-4658 breaks backward compatibility / example scripts

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460271#comment-13460271
 ] 

stack commented on HBASE-6806:
--

Hmm... it posts the commit to all issues referenced by the commit message, here 
and in HBASE-4658.

> HBASE-4658 breaks backward compatibility / example scripts
> --
>
> Key: HBASE-6806
> URL: https://issues.apache.org/jira/browse/HBASE-6806
> Project: HBase
>  Issue Type: Bug
>  Components: Thrift
>Affects Versions: 0.94.0
>Reporter: Lukas
> Fix For: 0.96.0
>
> Attachments: HBASE-6806-fix-examples.diff
>
>
> HBASE-4658 introduces the new 'attributes' argument as a non optional 
> parameter. This is not backward compatible and also breaks the code in the 
> example section. Resolution: Mark as 'optional'



[jira] [Commented] (HBASE-4658) Put attributes are not exposed via the ThriftServer

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460269#comment-13460269
 ] 

stack commented on HBASE-4658:
--

The above comment from Hudson is in the wrong place.  The parser found the 
second HBase JIRA referenced, which is this one, rather than HBASE-6806.

> Put attributes are not exposed via the ThriftServer
> ---
>
> Key: HBASE-4658
> URL: https://issues.apache.org/jira/browse/HBASE-4658
> Project: HBase
>  Issue Type: Bug
>  Components: Thrift
>Reporter: dhruba borthakur
>Assignee: dhruba borthakur
> Fix For: 0.94.0
>
> Attachments: ASF.LICENSE.NOT.GRANTED--D1563.1.patch, 
> ASF.LICENSE.NOT.GRANTED--D1563.1.patch, 
> ASF.LICENSE.NOT.GRANTED--D1563.1.patch, 
> ASF.LICENSE.NOT.GRANTED--D1563.2.patch, 
> ASF.LICENSE.NOT.GRANTED--D1563.2.patch, 
> ASF.LICENSE.NOT.GRANTED--D1563.2.patch, 
> ASF.LICENSE.NOT.GRANTED--D1563.3.patch, 
> ASF.LICENSE.NOT.GRANTED--D1563.3.patch, 
> ASF.LICENSE.NOT.GRANTED--D1563.3.patch, ThriftPutAttributes1.txt
>
>
> The Put api also takes in a bunch of arbitrary attributes that an application 
> can use to associate metadata with each put operation. This is not exposed 
> via Thrift.



[jira] [Commented] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.

2012-09-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460266#comment-13460266
 ] 

Hadoop QA commented on HBASE-6299:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12546000/6299v4.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 139 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 7 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2912//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2912//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2912//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2912//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2912//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2912//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2912//console

This message is automatically generated.

> RS starts region open while fails ack to HMaster.sendRegionOpen() causes 
> inconsistency in HMaster's region state and a series of successive problems.
> -
>
> Key: HBASE-6299
> URL: https://issues.apache.org/jira/browse/HBASE-6299
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.90.6, 0.94.0
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Critical
> Fix For: 0.92.3, 0.94.3, 0.96.0
>
> Attachments: 6299v4.txt, HBASE-6299.patch, HBASE-6299-v2.patch, 
> HBASE-6299-v3.patch
>
>
> 1. HMaster tries to assign a region to an RS.
> 2. HMaster creates a RegionState for this region and puts it into 
> regionsInTransition.
> 3. In the first assign attempt, HMaster calls RS.openRegion(). The RS 
> receives the open region request and starts to proceed, with success 
> eventually. However, due to network problems, HMaster fails to receive the 
> response for the openRegion() call, and the call times out.
> 4. HMaster attemps to assign for a second time, choosing another RS. 
> 5. But since the HMaster's OpenedRegionHandler has been triggered by the 
> region open of the previous RS, and the RegionState has already been removed 
> from regionsInTransition, HMaster finds invalid and ignores the unassigned ZK 
> node "RS_ZK_REGION_OPENING" updated by the second attempt.
> 6. The unassigned ZK node stays and a later unassign fails coz 
> RS_ZK_REGION_CLOSING cannot be created.
> {code}
> 2012-06-29 07:03:38,870 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
> region 
> CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.;
>  
> plan=hri=CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.,
>  src=swbss-hadoop-004,60020,1340890123243, 
> dest=swbss-hadoop-006,60020,1340890678078
> 2012-06-29 07:03:38,870 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
>  to swbss-hadoop-006,60020,1340890678078
> 2012-06-29 07:03:38,870 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=M_ZK_REGION_OFFLINE, server=swbss-hadoop-002:6, 
> region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:28,882 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,13408906

[jira] [Commented] (HBASE-6524) Hooks for hbase tracing

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460265#comment-13460265
 ] 

stack commented on HBASE-6524:
--

Committed the doc. as appendix I in the manual.  Will show next time I push the 
doc.  Thanks Jonathan.

> Hooks for hbase tracing
> ---
>
> Key: HBASE-6524
> URL: https://issues.apache.org/jira/browse/HBASE-6524
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Leavitt
>Assignee: Jonathan Leavitt
> Fix For: 0.96.0
>
> Attachments: 6524.addendum, 6524-v2.txt, 6524v3.txt, 
> createTableTrace.png, hbase-6524.diff
>
>
> Includes the hooks that use [htrace|http://www.github.com/cloudera/htrace] 
> library to add dapper-like tracing to hbase.



[jira] [Commented] (HBASE-4658) Put attributes are not exposed via the ThriftServer

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460259#comment-13460259
 ] 

Hudson commented on HBASE-4658:
---

Integrated in HBase-TRUNK #3363 (See 
[https://builds.apache.org/job/HBase-TRUNK/3363/])
HBASE-6806 HBASE-4658 breaks backward compatibility / example scripts 
(Revision 1388318)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/examples/thrift/DemoClient.cpp
* /hbase/trunk/examples/thrift/DemoClient.java
* /hbase/trunk/examples/thrift/DemoClient.php
* /hbase/trunk/examples/thrift/DemoClient.pl
* /hbase/trunk/examples/thrift/DemoClient.py
* /hbase/trunk/examples/thrift/DemoClient.rb
* /hbase/trunk/examples/thrift/Makefile




[jira] [Commented] (HBASE-6806) HBASE-4658 breaks backward compatibility / example scripts

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460258#comment-13460258
 ] 

Hudson commented on HBASE-6806:
---

Integrated in HBase-TRUNK #3363 (See 
[https://builds.apache.org/job/HBase-TRUNK/3363/])
HBASE-6806 HBASE-4658 breaks backward compatibility / example scripts 
(Revision 1388318)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/examples/thrift/DemoClient.cpp
* /hbase/trunk/examples/thrift/DemoClient.java
* /hbase/trunk/examples/thrift/DemoClient.php
* /hbase/trunk/examples/thrift/DemoClient.pl
* /hbase/trunk/examples/thrift/DemoClient.py
* /hbase/trunk/examples/thrift/DemoClient.rb
* /hbase/trunk/examples/thrift/Makefile




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460253#comment-13460253
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Interesting. Thanks Cheng. I wonder what causes the performance problem then. 
Is it the get/putIfAbsent of the ConcurrentMap we store the metrics in?

I'd probably feel better if you set the threshold to 100 (instead of 2000) - 
you'd still reduce the time used there by 99%.
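
For reference, a minimal sketch of the threshold-flush idea under discussion: accumulate hits in a cheap per-thread counter and only touch the shared AtomicLong every `threshold` hits, trading a bounded undercount for speed. Names and structure are assumptions, not the patch's actual code.

```java
import java.util.concurrent.atomic.AtomicLong;

class ThresholdFlushCounter {
    private final AtomicLong shared = new AtomicLong();
    private final int threshold;
    // Single-element array used as a cheap mutable per-thread cell.
    private final ThreadLocal<long[]> local =
        ThreadLocal.withInitial(() -> new long[1]);

    ThresholdFlushCounter(int threshold) {
        this.threshold = threshold;
    }

    void increment() {
        long[] pending = local.get();
        if (++pending[0] >= threshold) {
            // One contended update per `threshold` hits instead of every hit.
            shared.addAndGet(pending[0]);
            pending[0] = 0;
        }
    }

    /** Value visible to metrics readers; may lag by up to threshold-1 per thread. */
    long get() {
        return shared.get();
    }
}
```

A threshold of 100 keeps the snapshot at most 99 hits per thread behind, which is the staleness/overhead trade-off being weighed above.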

Also, looking at the places where updateOnCacheHit is called... We also 
increment an AtomicLong (cacheHits), which is never read (WTF). We should 
remove that counter while we're at it (even if AtomicLongs are not the 
problem).




[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460254#comment-13460254
 ] 

Lars Hofhansl commented on HBASE-6841:
--

Haven't been able to track down that test failure yet. It shouldn't happen, 
but somehow it does.
@J-D: Since this is (presumably) a long-standing condition, how do you feel 
about moving this to 0.94.3?

> Meta prefetching is slower than doing multiple meta lookups
> ---
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.94.2
>
> Attachments: 6841-0.94.txt, 6841-0.96.txt
>
>
> I got myself into a situation where I needed to truncate a massive table 
> while it was getting hits, and surprisingly the clients were not recovering. 
> What I see in the logs is that every time we prefetch .META. we set up a new 
> HConnection, because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.



[jira] [Commented] (HBASE-6524) Hooks for hbase tracing

2012-09-20 Thread Jonathan Leavitt (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460247#comment-13460247
 ] 

Jonathan Leavitt commented on HBASE-6524:
-

Sounds good. :)



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460244#comment-13460244
 ] 

stack commented on HBASE-6852:
--

bq. Do we have to think about this generally? How perfect do these metrics have 
to be?

In 0.94 we started recording way more than we did previously.

I like your question on how perfect they need to be.  For metrics that are 
frequently incremented by 1, my guess is we could afford to miss a few.

Why are we using AtomicLongs anyway and not Cliff Click's high-scale lib?  It's 
in our CLASSPATH...
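
A hedged sketch of the striped-counter idea behind that high-scale library (the same principle as java.util.concurrent.atomic.LongAdder): concurrent writers hash to different slots so they rarely contend on the same cache line, and reads sum all stripes. This is illustrative only, not the library's actual implementation.

```java
import java.util.concurrent.atomic.AtomicLongArray;

class StripedCounter {
    private final AtomicLongArray stripes;

    StripedCounter(int nStripes) {
        stripes = new AtomicLongArray(nStripes);
    }

    void increment() {
        // Hash the current thread to a stripe; different threads usually
        // land on different slots, so CAS contention drops sharply.
        int i = (int) (Thread.currentThread().getId() % stripes.length());
        stripes.incrementAndGet(i);
    }

    long sum() {
        // Reads pay the cost: sum every stripe. Fine for metrics, which are
        // written constantly but read only periodically.
        long total = 0;
        for (int i = 0; i < stripes.length(); i++) {
            total += stripes.get(i);
        }
        return total;
    }
}
```

The design choice fits metrics well: updates are the hot path, and a slightly more expensive, weakly consistent read is acceptable.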



[jira] [Commented] (HBASE-5937) Refactor HLog into an interface.

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460232#comment-13460232
 ] 

stack commented on HBASE-5937:
--

[~fpj] Sorry for not getting to your log.  What have you been having to do to 
get tests to pass?  How did you fix TestMultiParallel?  Is it stuff to do w/ 
this refactoring?

On your question{quote}I have also looked at making getReader/getWriter 
part of HLog{quote}

What are you thinking?  Currently Reader and Writer are interfaces defined 
inside HLog.  You get one by calling a static method on HLog.  You'd like to 
make getReader non-static, an invocation on a particular instance of HLog.

That seems fine by me. It makes sense given what you are trying to do. It is 
less flexible than what we currently have, because it presumes a particular 
implementation of HLog.
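
A minimal sketch of the instance-based shape being discussed, with a toy in-memory implementation just to show the wiring; all names, signatures, and the Entry payload here are illustrative, not HBase's actual API.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Reader/Writer stay as nested interfaces, but getReader/getWriter become
// instance methods, so the concrete class (e.g. FSHLog) decides how they are
// constructed.
interface WAL {
    interface Reader { String next(); }          // stand-in for real WAL entries
    interface Writer { void append(String entry); }

    Reader getReader();
    Writer getWriter();

    // Tiny in-memory implementation for illustration only.
    class InMemoryWAL implements WAL {
        private final Queue<String> entries = new ArrayDeque<>();
        public Reader getReader() { return entries::poll; }
        public Writer getWriter() { return entries::add; }
    }
}
```

With this shape, callers like ReplicationSource could be handed a WAL instance and tail it, instead of constructing readers through static methods on a concrete class.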

{quote}HLogInputFormat: Not clear how to instantiate HLog{quote}

This is a facility little used, if ever.  I'm surprised it's not used more 
often.  It's a repair facility.  You'd use it when you started a cluster 
somehow w/o replaying WALs.  You could use this class in a mapreduce job to 
quickly add the edits from the WALs back into the cluster.

I took a look.  What are you thinking constructors will look like for HLogs?  
There'll be a factory?  What will the factory take for arguments?

{quote}HLogPrettyPrinter: Executed through main calls in FSHLog and 
HLogPrettyPrinter, so maybe we could pass necessary parameters{quote}

This is a tool for humans to look at contents of HLogs.

{quote}HLogSplitter: Have all parameters{quote}

This is the important one (smile)

{quote}HRegion: Have HLog object{quote}

Good... Its passed the HLog, right?

{quote}ReplicationSource: Not clear how to instantiate HLog{quote}

You know what this is about, right?  This is how we do replication.  We tail 
the WALs and as the edits come in, we send them off to other clusters.  We'll 
need to be able to tail logs.  Could we pass Replication an HLog instance?

I hope you call your HLog interface WAL!

{quote}I was also wondering if there are important side-effects in the case we 
use the factory to get an HLog object just to get a reader or a writer{quote}

We'd have to change the current HLog constructor.  It does a bunch of work when 
created -- sets a sync'ing thread running (this syncing thread though is in 
need of some cleanup), creates dirs and sets up first WAL.  We wouldn't want it 
doing this stuff if we wanted the instance just to do getReader/getWriter on it.


{quote}I have looked into the main constructor of FSHLog and I haven't been 
able to convince myself that there is a way of executing it without throwing an 
exception unnecessarily or having side-effects.{quote}

As it is currently written, yes.

I think this work trying to make an interface for the WAL is kinda important.  
There is the BookKeeper project, but the multi-WAL dev -- i.e. making the 
regionserver write more than one WAL at a time (into HDFS) -- could use the 
result of this effort too.

Thanks Flavio.



> Refactor HLog into an interface.
> 
>
> Key: HBASE-5937
> URL: https://issues.apache.org/jira/browse/HBASE-5937
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Li Pi
>Assignee: Flavio Junqueira
>Priority: Minor
> Attachments: 
> org.apache.hadoop.hbase.client.TestMultiParallel-output.txt
>
>
> What the summary says. Create HLog interface. Make current implementation use 
> it.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460224#comment-13460224
 ] 

Cheng Hao commented on HBASE-6852:
--

@stack: it might make more sense to put the close() into AbstractHFileReader, 
but I'm not sure whether there are other concerns, since AbstractHFileReader 
doesn't currently have it.

As for THRESHOLD_METRICS_FLUSH = 2k, which I used during my testing: I hope it 
is big enough to reduce the overhead while still letting us take metrics 
snapshots in a timely way. Sorry, I may not be able to give a well-grounded 
empirical number for it.

@Lars: Yes, that's right, we're still updating an AtomicLong each time, but 
the profiling results didn't show the AtomicLong becoming a new hotspot, and 
the test still saved >10% of running time, which may mean the overhead of the 
AtomicLong can be ignored.
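The pre-created counter-array idea discussed in this thread can be sketched as below. Names are hypothetical, not the actual SchemaMetrics API: the point is that the hot path becomes an array index plus a single `incrementAndGet()`, with no ConcurrentMap lookup.

```java
import java.util.concurrent.atomic.AtomicLong;

// Minimal sketch of pre-creating one AtomicLong per (category, hit/miss)
// slot, so updates need no ConcurrentMap access and indexing is trivial.
// Names are hypothetical, not the actual SchemaMetrics API.
public class CacheHitCounters {
    // Stand-in for BlockType.BlockCategory: a small, fixed set of categories.
    static final int NUM_CATEGORIES = 8;

    // Two slots per category: even index = miss, odd index = hit.
    private final AtomicLong[] counters = new AtomicLong[NUM_CATEGORIES * 2];

    public CacheHitCounters() {
        for (int i = 0; i < counters.length; i++) {
            counters[i] = new AtomicLong();
        }
    }

    // Hot path: an index computation and one atomic increment, nothing else.
    public void updateOnCacheHit(int category, boolean hit) {
        counters[category * 2 + (hit ? 1 : 0)].incrementAndGet();
    }

    public long get(int category, boolean hit) {
        return counters[category * 2 + (hit ? 1 : 0)].get();
    }
}
```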

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
> with all of its fields
> 
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.94.0
>Reporter: Cheng Hao
>Priority: Minor
>  Labels: performance
> Fix For: 0.94.2, 0.96.0
>
> Attachments: onhitcache-trunk.patch
>
>
> The SchemaMetrics.updateOnCacheHit costs too much while I am doing the full 
> table scanning.
> Here is the top 5 hotspots within regionserver while full scanning a table: 
> (Sorry for the less-well-format)
> CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
> Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit 
> mask of 0x00 (No unit mask) count 500
> samples  %image name   symbol name
> ---
> 98447  13.4324  14033.jo void 
> org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory,
>  boolean)
>   98447  100.000  14033.jo void 
> org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory,
>  boolean) [self]
> ---
> 45814 6.2510  14033.jo int 
> org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, 
> byte[], int, int)
>   45814  100.000  14033.jo int 
> org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, 
> byte[], int, int) [self]
> ---
> 43523 5.9384  14033.jo boolean 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>   43523  100.000  14033.jo boolean 
> org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
>  [self]
> ---
> 42548 5.8054  14033.jo int 
> org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, 
> byte[], int, int)
>   42548  100.000  14033.jo int 
> org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, 
> byte[], int, int) [self]
> ---
> 40572 5.5358  14033.jo int 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[],
>  int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
>   40572  100.000  14033.jo int 
> org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[],
>  int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]



[jira] [Updated] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.

2012-09-20 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6299:
-

Status: Patch Available  (was: Open)

> RS starts region open while fails ack to HMaster.sendRegionOpen() causes 
> inconsistency in HMaster's region state and a series of successive problems.
> -
>
> Key: HBASE-6299
> URL: https://issues.apache.org/jira/browse/HBASE-6299
> Project: HBase
>  Issue Type: Bug
>  Components: master
>Affects Versions: 0.94.0, 0.90.6
>Reporter: Maryann Xue
>Assignee: Maryann Xue
>Priority: Critical
> Fix For: 0.92.3, 0.94.3, 0.96.0
>
> Attachments: 6299v4.txt, HBASE-6299.patch, HBASE-6299-v2.patch, 
> HBASE-6299-v3.patch
>
>
> 1. HMaster tries to assign a region to an RS.
> 2. HMaster creates a RegionState for this region and puts it into 
> regionsInTransition.
> 3. In the first assign attempt, HMaster calls RS.openRegion(). The RS 
> receives the open region request and starts to proceed, with success 
> eventually. However, due to network problems, HMaster fails to receive the 
> response for the openRegion() call, and the call times out.
> 4. HMaster attempts to assign for a second time, choosing another RS. 
> 5. But since the HMaster's OpenedRegionHandler has been triggered by the 
> region open of the previous RS, and the RegionState has already been removed 
> from regionsInTransition, HMaster considers the unassigned ZK 
> node "RS_ZK_REGION_OPENING" updated by the second attempt invalid and ignores it.
> 6. The unassigned ZK node stays and a later unassign fails because 
> RS_ZK_REGION_CLOSING cannot be created.
> {code}
> 2012-06-29 07:03:38,870 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for 
> region 
> CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.;
>  
> plan=hri=CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.,
>  src=swbss-hadoop-004,60020,1340890123243, 
> dest=swbss-hadoop-006,60020,1340890678078
> 2012-06-29 07:03:38,870 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Assigning region 
> CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
>  to swbss-hadoop-006,60020,1340890678078
> 2012-06-29 07:03:38,870 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=M_ZK_REGION_OFFLINE, server=swbss-hadoop-002:6, 
> region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:28,882 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, 
> region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:32,291 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, 
> region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:32,299 DEBUG 
> org.apache.hadoop.hbase.master.AssignmentManager: Handling 
> transition=RS_ZK_REGION_OPENED, server=swbss-hadoop-006,60020,1340890678078, 
> region=b713fd655fa02395496c5a6e39ddf568
> 2012-06-29 07:06:32,299 DEBUG 
> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED 
> event for 
> CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
>  from serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=518945, 
> regions=575, usedHeap=15282, maxHeap=31301); deleting unassigned node
> 2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:6-0x2377fee2ae80007 Deleting existing unassigned node for 
> b713fd655fa02395496c5a6e39ddf568 that is in expected state RS_ZK_REGION_OPENED
> 2012-06-29 07:06:32,301 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: 
> master:6-0x2377fee2ae80007 Successfully deleted unassigned node for 
> region b713fd655fa02395496c5a6e39ddf568 in expected state RS_ZK_REGION_OPENED
> 2012-06-29 07:06:32,301 DEBUG 
> org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: The master has 
> opened the region 
> CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
>  that was online on serverName=swbss-hadoop-006,60020,1340890678078, 
> load=(requests=518945, regions=575, usedHeap=15282, maxHeap=31301)
> 2012-06-29 07:07:41,140 WARN 
> org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of 
> CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.
>  to serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=0, 
> regions=575, usedHeap=0, maxHeap=0), t

[jira] [Updated] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.

2012-09-20 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6299:
-

Status: Open  (was: Patch Available)


[jira] [Updated] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.

2012-09-20 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6299:
-

Attachment: 6299v4.txt

v3 rotted. Here is v4, which applies to trunk.  Is this in the right place, 
MaryAnn?  Thanks.


[jira] [Commented] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.

2012-09-20 Thread ramkrishna.s.vasudevan (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460202#comment-13460202
 ] 

ramkrishna.s.vasudevan commented on HBASE-6299:
---

Yes Stack.  +1 on this.


[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460199#comment-13460199
 ] 

Lars Hofhansl commented on HBASE-6852:
--

@Cheng: Even with this patch we're still updating an AtomicLong each time we 
get a cache hit, right? I had assumed that that was the slow part. Is it not?




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460198#comment-13460198
 ] 

Lars Hofhansl commented on HBASE-6852:
--

This is the third time that metrics have come up as a performance issue.
Do we need to think about this more generally? How accurate do these metrics 
have to be?

(Assuming a 64-bit architecture) we *could* just use plain (not even volatile) 
longs and accept that we'll miss some updates or overwrite others; the values 
would still be in the right ballpark.
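The trade-off Lars describes can be sketched as a plain, non-volatile long counter. Under concurrent increments some updates can be lost, since `count++` is a racy read-modify-write, but on a 64-bit JVM each individual long read and write is in practice not torn, so the value stays a plausible ballpark rather than garbage. Illustrative only.

```java
// Sketch of a deliberately lossy counter: no volatile, no AtomicLong.
// Concurrent increments may be lost, but on 64-bit JVMs individual long
// reads/writes are in practice atomic, so reads see a ballpark value.
public class LossyCounter {
    private long count; // deliberately neither volatile nor AtomicLong

    public void increment() {
        count++; // racy under contention: increments may be lost
    }

    public long get() {
        return count; // may undercount, but is a usable approximation
    }
}
```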




[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460193#comment-13460193
 ] 

Lars Hofhansl commented on HBASE-6852:
--

Wait. This is the cache-hit path we're talking about. It didn't come up in my 
profiling at all.



[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6852:
-

Fix Version/s: 0.94.2

Since 0.94.2 got delayed, pulling this in.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460189#comment-13460189
 ] 

Lars Hofhansl commented on HBASE-6852:
--

@Stack: No, this is a different issue. Didn't come up in my profiling since I 
only did cache path (so far).

Good one Cheng.



[jira] [Updated] (HBASE-6806) HBASE-4658 breaks backward compatibility / example scripts

2012-09-20 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6806:
-

   Resolution: Fixed
Fix Version/s: 0.96.0
 Hadoop Flags: Reviewed
   Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks for the fixup Lukas.  Nice.

> HBASE-4658 breaks backward compatibility / example scripts
> --
>
> Key: HBASE-6806
> URL: https://issues.apache.org/jira/browse/HBASE-6806
> Project: HBase
>  Issue Type: Bug
>  Components: Thrift
>Affects Versions: 0.94.0
>Reporter: Lukas
> Fix For: 0.96.0
>
> Attachments: HBASE-6806-fix-examples.diff
>
>
> HBASE-4658 introduces the new 'attributes' argument as a non-optional 
> parameter. This is not backward compatible and also breaks the code in the 
> example section. Resolution: mark it as 'optional'.



[jira] [Commented] (HBASE-6798) HDFS always read checksum form meta file

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460185#comment-13460185
 ] 

stack commented on HBASE-6798:
--

[~liulei.cn] So you mean we should add a new setSkipChecksum(boolean) method 
in FileSystem, per file? Or do you mean in HFileSystem? Pardon my not 
understanding.  Thanks.

> HDFS always read checksum form meta file
> 
>
> Key: HBASE-6798
> URL: https://issues.apache.org/jira/browse/HBASE-6798
> Project: HBase
>  Issue Type: Bug
>  Components: Performance
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Priority: Blocker
> Attachments: 6798.txt
>
>
> I use HBase 0.94.1 and hadoop-0.20.2-cdh3u5.
> HBase added support for checksums in the HBase block cache in HBASE-5074.
> HBase supports checksums to decrease the IOPS of HDFS, so that HDFS
> doesn't need to read the checksum from the block file's meta file.
> But in hadoop-0.20.2-cdh3u5, BlockSender still reads the metadata file 
> even if the
>  hbase.regionserver.checksum.verify property is true.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460182#comment-13460182
 ] 

stack commented on HBASE-6852:
--

Patch looks good as does the change in the character of the pasted oprofile 
output.

Did you look at adding a close to AbstractHFileReader that hfile v1 and v2 
reader close could share?  Would that make sense here?

The THRESHOLD_METRICS_FLUSH = 2k seems arbitrary.  Any reason why this number 
in particular?

Nit is that the param name isCompaction is the name of a method that returns a 
boolean result.

+1 on patch.

[~eclark] Mr. Metrics, want to take a look see at this one?
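The counter layout Cheng Hao describes (pre-creating an AtomicLong for every possible on-cache-hit metric so the hot path is a plain array index instead of a ConcurrentMap lookup) can be sketched roughly as follows. This is a hypothetical, simplified version: the class name, the two-slot-per-category layout, and the local BlockCategory enum are illustrative, not the actual HBase source.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the patch's approach: pre-create one AtomicLong per
// (block category, isCompaction) combination up front, then index into
// the array directly on every cache hit. No map lookup on the hot path.
public class CacheHitCounters {
    // Mirrors the spirit of BlockType.BlockCategory in HBase (illustrative).
    enum BlockCategory { DATA, META, INDEX, BLOOM, ALL_CATEGORIES }

    // Two slots per category: [non-compaction, compaction].
    private final AtomicLong[] hits =
        new AtomicLong[BlockCategory.values().length * 2];

    public CacheHitCounters() {
        for (int i = 0; i < hits.length; i++) {
            hits[i] = new AtomicLong();
        }
    }

    // O(1) array index replaces the ConcurrentMap lookup on the hot path.
    public void updateOnCacheHit(BlockCategory category, boolean isCompaction) {
        int idx = category.ordinal() * 2 + (isCompaction ? 1 : 0);
        hits[idx].incrementAndGet();
    }

    public long get(BlockCategory category, boolean isCompaction) {
        return hits[category.ordinal() * 2 + (isCompaction ? 1 : 0)].get();
    }

    public static void main(String[] args) {
        CacheHitCounters c = new CacheHitCounters();
        for (int i = 0; i < 1000; i++) {
            c.updateOnCacheHit(BlockCategory.DATA, false);
        }
        c.updateOnCacheHit(BlockCategory.DATA, true);
        System.out.println(c.get(BlockCategory.DATA, false)); // 1000
        System.out.println(c.get(BlockCategory.DATA, true));  // 1
    }
}
```

Since the array is fully populated before any reader thread touches it, there is no concurrent-initialization race to worry about, which is the point Cheng Hao makes about tableAndFamilyToMetrics.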








[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460176#comment-13460176
 ] 

Cheng Hao commented on HBASE-6852:
--

stack, do you mean I should submit the patch for 0.94 as well?



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460174#comment-13460174
 ] 

Cheng Hao commented on HBASE-6852:
--

It's quite similar to https://issues.apache.org/jira/browse/HBASE-6603, but 
per my testing HBASE-6603 doesn't improve things much in my case (a full 
table scan), while this fix does improve performance a lot (total time is 
about 10% shorter).
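The gain is plausible given how the two hot-path shapes differ. A rough, hypothetical micro-benchmark (the metric key string and iteration count are illustrative, and absolute timings depend on the JVM and hardware) contrasting a per-hit map lookup with a direct array increment:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Compares the two counter-update shapes discussed in this issue:
// a ConcurrentMap keyed by metric name vs. a pre-created AtomicLong
// reached by array index. Both count the same events; only the
// per-increment overhead differs.
public class HotPathCompare {
    public static void main(String[] args) {
        final int iterations = 1_000_000;

        ConcurrentHashMap<String, AtomicLong> map = new ConcurrentHashMap<>();
        map.put("cf.data.blockCacheHitCount", new AtomicLong()); // illustrative key

        AtomicLong[] array = { new AtomicLong() };

        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            // Hash + bucket probe on every single cache hit.
            map.get("cf.data.blockCacheHitCount").incrementAndGet();
        }
        long mapNanos = System.nanoTime() - t0;

        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            // Plain array index, no hashing.
            array[0].incrementAndGet();
        }
        long arrayNanos = System.nanoTime() - t1;

        // Both paths must have counted every event.
        System.out.println(map.get("cf.data.blockCacheHitCount").get());
        System.out.println(array[0].get());
        System.out.println("map ns: " + mapNanos + ", array ns: " + arrayNanos);
    }
}
```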



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460172#comment-13460172
 ] 

Cheng Hao commented on HBASE-6852:
--

Yes, I ran the profiling on 0.94.0, but the patch is based on trunk. It 
should also work for the later 0.94s.



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460171#comment-13460171
 ] 

stack commented on HBASE-6852:
--

It doesn't look like it (after taking a look).



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460169#comment-13460169
 ] 

stack commented on HBASE-6852:
--

[~chenghao_sh] Is it 0.94.0 that you are running?

[~lhofhansl] Did we fix these in later 0.94s?



[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460161#comment-13460161
 ] 

Hadoop QA commented on HBASE-6852:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12545995/onhitcache-trunk.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

+1 tests included.  The patch appears to include 3 new or modified tests.

-1 patch.  The patch command could not apply the patch.

Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2911//console

This message is automatically generated.



[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Hao updated HBASE-6852:
-

Fix Version/s: 0.96.0
   Status: Patch Available  (was: Open)

After applying the fix, oprofile shows the top 8 hotspots as:

samples  %        image name  app name  symbol name
---
59829  7.9422   17779.jo  java  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
  59829  100.000  17779.jo  java  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
---
28571  3.7927   17779.jo  java  int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)
  28571  100.000  17779.jo  java  int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator) [self]
---
19331  2.5662   17779.jo  java  org.apache.hadoop.hbase.regionserver.ScanQueryMatcher$MatchCode org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(org.apache.hadoop.hbase.KeyValue)
  19331  100.000  17779.jo  java  org.apache.hadoop.hbase.regionserver.ScanQueryMatcher$MatchCode org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(org.apache.hadoop.hbase.KeyValue) [self]
---
19063  2.5306   17779.jo  java  void org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek()
  19063  100.000  17779.jo  java  void org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek() [self]
---
  1      0.0054   libjvm.so  java  Monitor::ILock(Thread*)
  1      0.0054   libjvm.so  java  ObjectMonitor::enter(Thread*)
  2      0.0107   libjvm.so  java  VMThread::loop()
  18642  99.9785  libjvm.so  java  StealTask::do_it(GCTaskManager*, unsigned int)
18646  2.4752   libjvm.so  java  SpinPause
  18646  100.000  libjvm.so  java  SpinPause [self]
---
15860  2.1054   17779.jo  java  byte[] org.apache.hadoop.hbase.KeyValue.createByteArray(byte[], int, int, byte[], int, int, byte[], int, int, long, org.apache.hadoop.hbase.KeyValue$Type, byte[], int, int)
  15860  100.000  17779.jo  java  byte[] org.apache.hadoop.hbase.KeyValue.createByteArray(byte[], int, int, byte[], int, int, byte[], int, int, long, org.apache.hadoop.hbase.KeyValue$Type, byte[], int, int) [self]
---
14754  1.9586   17779.jo  java  org.apache.hadoop.hbase.io.hfile.Cacheable org.apache.hadoop.hbase.io.hfile.LruBlockCache.getBlock(org.apache.hadoop.hbase.io.hfile.BlockCacheKey, boolean)
  14754  100.000  17779.jo  java  org.apache.hadoop.hbase.io.hfile.Cacheable org.apache.hadoop.hbase.io.hfile.LruBlockCache.getBlock(org.apache.hadoop.hbase.io.hfile.BlockCacheKey, boolean) [self]
---
13068  1.7348   17779.jo  java  org.apache.hadoop.hbase.io.hfile.HFileBlock org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(byte[], int, int, org.apache.hadoop.hbase.io.hfile.HFileBlock, boolean, boolean, boolean)~2
  13068  100.000  17779.jo  java  org.apache.hadoop.hbase.io.hfile.HFileBlock org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(byte[], int, int, org.apache.hadoop.hbase.io.hfile.HFileBlock, boolean, boolean, boolean)~2 [self]
---


> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
> with all of its fields
> --

[jira] [Commented] (HBASE-4191) hbase load balancer needs locality awareness

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460158#comment-13460158
 ] 

stack commented on HBASE-4191:
--

What do you think of Liyin's costing vs. what you have in the Stochastic balancer, 
Elliott? (Do you think the HRegion#computeHDFSBlocksDistribution call will 
happen often? Seems like its value is cached for a period of time.)

> hbase load balancer needs locality awareness
> 
>
> Key: HBASE-4191
> URL: https://issues.apache.org/jira/browse/HBASE-4191
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Reporter: Ted Yu
>Assignee: Liyin Tang
>
> Previously, HBASE-4114 implemented the metrics for HFile HDFS block locality, 
> which provide HFile-level locality information.
> But in order to work with the load balancer and region assignment, we need 
> region-level locality information.
> Let's define the region locality information first; it is almost the same 
> as the HFile locality index.
> HRegion locality index (HRegion A, RegionServer B) = 
> (Total number of HDFS blocks that can be retrieved locally by 
> RegionServer B for HRegion A) / (Total number of HDFS blocks for 
> Region A)
> So the HRegion locality index tells us how much locality we can get if 
> the HMaster assigns HRegion A to RegionServer B.
> There are 2 steps involved in assigning regions based on locality.
> 1) During cluster startup, the master scans hdfs to 
> calculate the "HRegion locality index" for each pair of HRegion and Region 
> Server. It is pretty expensive to scan the dfs, so we only need to do this 
> once, at startup.
> 2) During cluster runtime, each region server updates the "HRegion 
> locality index" as metrics periodically, as HBASE-4114 did. The Region Server 
> can expose them to the Master through ZK, the meta table, or just RPC messages. 
> Based on the "HRegion locality index", the assignment manager in the master 
> has global knowledge of the region locality distribution and can 
> run a MIN COST MAXIMUM FLOW solver to reach the global optimum.
> Let's construct the graph first:
> [Graph]
> Imagine a bipartite graph whose left side is the set of regions 
> and whose right side is the set of region servers.
> A source node links to each node in the region set, 
> and a sink node is linked from each node in the region server set.
> [Capacity]
> The capacity between the source node and each region node is 1, 
> and the capacity between region nodes and region server nodes is also 1.
> (The purpose is that each region can ONLY be assigned to one region server at a 
> time.)
> The capacity between each region server node and the sink node is the average 
> number of regions that should be assigned to each region server.
> (The purpose is to balance the load across region servers.)
> [Cost]
> The cost between each region and region server is the opposite of the locality 
> index: the higher the locality of region A on region 
> server B, the lower the cost of assigning A to B.
> The cost function could be made more sophisticated as we take more metrics 
> into account.
> After running the min-cost max-flow solver, the master can assign 
> regions based on the global locality optimum.
> The master should also share this global view with the secondary master in case 
> a master failover happens.
> In addition, HBASE-4491 (Locality Checker) is a tool, based on 
> the same metrics, that proactively scans the dfs to calculate the global 
> locality information in the cluster. It will help us verify data locality 
> information at runtime.
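
The locality index and edge-cost definitions above can be sketched as follows (class and method names are illustrative, not the actual balancer code):

```java
// Sketch of the "HRegion locality index" and min-cost-flow edge cost above.
// LocalityCost and its methods are hypothetical names for illustration only.
public class LocalityCost {
    /** HRegion locality index (region A, server B) = local blocks / total blocks. */
    public static double localityIndex(long localBlocks, long totalBlocks) {
        if (totalBlocks == 0) {
            return 0.0; // no HDFS blocks yet: treat as no locality
        }
        return (double) localBlocks / totalBlocks;
    }

    /** Edge cost between a region node and a server node: the opposite of
     *  locality, so a higher-locality assignment is cheaper for the solver. */
    public static double assignmentCost(long localBlocks, long totalBlocks) {
        return 1.0 - localityIndex(localBlocks, totalBlocks);
    }

    public static void main(String[] args) {
        // A region with 9 of 10 blocks local to a server is cheap to assign there.
        System.out.println(assignmentCost(9, 10)); // low cost
        System.out.println(assignmentCost(1, 10)); // high cost
    }
}
```

These per-edge costs would feed the capacities and edges of the bipartite graph described above before running the min-cost max-flow solver.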



[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Cheng Hao updated HBASE-6852:
-

Attachment: onhitcache-trunk.patch

The fix caches the metrics locally and flushes them every 2000 calls, or when the 
HFileReader is closed.
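
A minimal sketch of that batching idea (names here are illustrative; the actual patch works against SchemaMetrics and the HFile reader internals):

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of the batching idea: count cache hits in a cheap local field and
// only touch the shared AtomicLong every FLUSH_INTERVAL hits or on close.
// CachedHitMetric is a hypothetical name, not a class in the patch.
public class CachedHitMetric {
    private static final long FLUSH_INTERVAL = 2000;
    private final AtomicLong sharedCounter; // the contended shared metric
    private long localHits;                 // per-reader, single-threaded cache

    public CachedHitMetric(AtomicLong sharedCounter) {
        this.sharedCounter = sharedCounter;
    }

    public void onCacheHit() {
        if (++localHits >= FLUSH_INTERVAL) {
            flush();
        }
    }

    /** Called on every FLUSH_INTERVAL-th hit, and when the reader is closed. */
    public void flush() {
        sharedCounter.addAndGet(localHits);
        localHits = 0;
    }
}
```

The shared counter is then updated once per 2000 hits instead of once per hit, which is what removes updateOnCacheHit from the top of the profile.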

> SchemaMetrics.updateOnCacheHit costs too much while full scanning a table 
> with all of its fields
> 
>
> Key: HBASE-6852
> URL: https://issues.apache.org/jira/browse/HBASE-6852
> Project: HBase
>  Issue Type: Improvement
>  Components: metrics
>Affects Versions: 0.94.0
>Reporter: Cheng Hao
>Priority: Minor
>  Labels: performance
> Attachments: onhitcache-trunk.patch
>
>



[jira] [Created] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields

2012-09-20 Thread Cheng Hao (JIRA)
Cheng Hao created HBASE-6852:


 Summary: SchemaMetrics.updateOnCacheHit costs too much while full 
scanning a table with all of its fields
 Key: HBASE-6852
 URL: https://issues.apache.org/jira/browse/HBASE-6852
 Project: HBase
  Issue Type: Improvement
  Components: metrics
Affects Versions: 0.94.0
Reporter: Cheng Hao
Priority: Minor


The SchemaMetrics.updateOnCacheHit costs too much while I am doing a full 
table scan.
Here are the top 5 hotspots within the regionserver during a full table scan 
(sorry for the poor formatting):

CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated)
Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask 
of 0x00 (No unit mask) count 500
samples  %        image name  symbol name
---
98447  13.4324  14033.jo  void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean)
  98447  100.000  14033.jo  void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self]
---
45814  6.2510  14033.jo  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int)
  45814  100.000  14033.jo  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self]
---
43523  5.9384  14033.jo  boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue)
  43523  100.000  14033.jo  boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self]
---
42548  5.8054  14033.jo  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int)
  42548  100.000  14033.jo  int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self]
---
40572  5.5358  14033.jo  int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1
  40572  100.000  14033.jo  int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self]




[jira] [Updated] (HBASE-5959) Add other load balancers

2012-09-20 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-5959:
-

Release Note: 
Added a new StochasticLoadBalancer that when enabled will perform a randomized 
search for the optimal cluster balance.  The new balancer takes into account 
data locality, storefile size, memstore size, and the evenness of tables over 
region servers when trying potential new cluster states.

To enable the new balancer set hbase.master.loadbalancer.class to 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer . It is also 
recommended to set hbase.master.loadbalance.bytable to false .  Lots of 
different configuration options can be tuned to prioritize costs differently.  
Explanations of all of the configuration options are available  on the JavaDoc 
for StochasticLoadBalancer.

StochasticLoadBalancer is the default in 0.96.0

  was:
Added a new StochasticLoadBalancer that when enabled will perform a randomized 
search for the optimal cluster balance.  The new balancer takes into account 
data locality, storefile size, memstore size, and the evenness of tables over 
region servers when trying potential new cluster states.

To enable the new balancer set hbase.master.loadbalancer.class to 
org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer . It is also 
recommended to set hbase.master.loadbalance.bytable to false .  Lots of 
different configuration options can be tuned to prioritize costs differently.  
Explanations of all of the configuration options are available  on the JavaDoc 
for StochasticLoadBalancer.


> Add other load balancers
> 
>
> Key: HBASE-5959
> URL: https://issues.apache.org/jira/browse/HBASE-5959
> Project: HBase
>  Issue Type: New Feature
>  Components: master
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Attachments: ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.1.patch, 
> ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.2.patch, 
> ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.3.patch, 
> ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.4.patch, 
> ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.5.patch, 
> ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.6.patch, 
> ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.7.patch, HBASE-5959-0.patch, 
> HBASE-5959-11.patch, HBASE-5959-12.patch, HBASE-5959-13.patch, 
> HBASE-5959-14.patch, HBASE-5959-1.patch, HBASE-5959-2.patch, 
> HBASE-5959-3.patch, HBASE-5959-6.patch, HBASE-5959-7.patch, 
> HBASE-5959-8.patch, HBASE-5959-9.patch
>
>
> Now that balancers are pluggable we should give some options.



[jira] [Commented] (HBASE-3663) The starvation problem in current load balance algorithm

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460153#comment-13460153
 ] 

stack commented on HBASE-3663:
--

[~liyin] Was this patch committed to 89fb?  If so, can we close this?  If not, 
can we close this because recent versions of hbase don't have this issue? 
Thanks.

> The starvation problem in current load balance algorithm
> 
>
> Key: HBASE-3663
> URL: https://issues.apache.org/jira/browse/HBASE-3663
> Project: HBase
>  Issue Type: Bug
>Reporter: Liyin Tang
> Attachments: HBASE_3665[0.89].patch, result_new_load_balance.txt, 
> result_old_load_balance.txt
>
>
> This is an interesting starvation case. There are 2 conditions that trigger 
> this problem.
> Condition 1: r/s - r/(s+1) << 1, 
> where r is the number of regions
> and s is the number of servers.
> Condition 2: for each server, the load is less than or equal to the 
> ceiling of the average load.
> Here is a unit test that verifies the problem: 
> for example, there are 16 servers and 62 regions, so the average load is 
> 3.875. Setting the slop to 0 keeps the load of each server at either 3 or 
> 4. 
> When a new server comes up, no server hands any regions to the new 
> server, since none is loaded above the ceiling of the average.
> (Setting the slop to 0 makes this situation easy to trigger; otherwise it needs 
> much larger numbers.)
> The solution is straightforward: compare against the floor of the average 
> instead of the ceiling. This evenly rebalances load away from servers that 
> are slightly more loaded than the others. 
> I also attached the comparison results for the case above between the old 
> and the new balance algorithm (with slop = 0 during 
> testing).
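
The arithmetic above can be checked with a small sketch (illustrative code, not the balancer itself): with 16 servers holding 62 regions, every load is at most ceil(62/17) = 4, so a ceiling threshold finds no donor for a new 17th server, while a floor threshold does.

```java
// Demonstrates the starvation case: loads of 3 or 4 across 16 servers,
// 62 regions total, and a new empty 17th server joining the cluster.
// StarvationDemo and its methods are illustrative names only.
public class StarvationDemo {
    /** Count servers whose load exceeds the threshold, i.e. potential donors. */
    public static int countDonors(int[] loads, int threshold) {
        int donors = 0;
        for (int load : loads) {
            if (load > threshold) {
                donors++;
            }
        }
        return donors;
    }

    public static int[] exampleLoads() {
        // 16 servers, 62 regions: 14 servers carry 4 regions, 2 carry 3.
        int[] loads = new int[16];
        for (int i = 0; i < 16; i++) {
            loads[i] = (i < 14) ? 4 : 3;
        }
        return loads;
    }

    public static void main(String[] args) {
        int[] loads = exampleLoads();
        double avg = 62.0 / 17;           // a 17th (empty) server has joined
        int ceil = (int) Math.ceil(avg);  // 4
        int floor = (int) Math.floor(avg);// 3
        System.out.println(countDonors(loads, ceil));  // ceiling: no donors, the new server starves
        System.out.println(countDonors(loads, floor)); // floor: donors exist, regions can move
    }
}
```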



[jira] [Commented] (HBASE-6798) HDFS always reads checksum from meta file

2012-09-20 Thread LiuLei (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460151#comment-13460151
 ] 

LiuLei commented on HBASE-6798:
---

Hi all, if HDFS doesn't read the checksum from the meta file, that can decrease 
IOPS for HFile reads. But the HLog file in HBase doesn't contain checksums, so 
when HBase reads the HLog it must use the HDFS checksum. We should therefore add 
a new setSkipChecksum(boolean) method to FileSystem, letting HBase decide whether 
or not to read the checksum from the meta file.
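
The proposed API could look roughly like this (purely a sketch of the suggestion above; setSkipChecksum does not exist in Hadoop's FileSystem today):

```java
// Sketch of the proposal only: a checksum toggle so HBase could skip HDFS
// checksums for HFiles (which carry their own checksums, per HBASE-5074)
// but keep them when reading the HLog. SkipChecksumSketch and its methods
// are hypothetical; none of this is in Hadoop's actual FileSystem API.
public class SkipChecksumSketch {
    private boolean skipChecksum; // default false: verify checksums as HDFS does today

    /** HBase would call this with true before reading HFiles,
     *  and leave it false when reading the HLog. */
    public void setSkipChecksum(boolean skip) {
        this.skipChecksum = skip;
    }

    public boolean shouldReadMetaFile() {
        // The block reader would consult this before opening the block's meta file.
        return !skipChecksum;
    }
}
```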

> HDFS always reads checksum from meta file
> 
>
> Key: HBASE-6798
> URL: https://issues.apache.org/jira/browse/HBASE-6798
> Project: HBase
>  Issue Type: Bug
>  Components: performance
>Affects Versions: 0.94.0, 0.94.1
>Reporter: LiuLei
>Priority: Blocker
> Attachments: 6798.txt
>
>
> I use hbase 0.94.1 and hadoop-0.20.2-cdh3u5.
> HBase supports checksums in the HBase block cache as of HBASE-5074.
> HBase supports checksums to decrease the IOPS of HDFS, so that HDFS
> doesn't need to read the checksum from the meta file of the block file.
> But in hadoop-0.20.2-cdh3u5, BlockSender still reads the metadata file 
> even if the
>  hbase.regionserver.checksum.verify property is true.



[jira] [Commented] (HBASE-4191) hbase load balancer needs locality awareness

2012-09-20 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460131#comment-13460131
 ] 

Elliott Clark commented on HBASE-4191:
--

It seems like the stochastic load balancer gives HBase the locality awareness 
when balancing.  

> hbase load balancer needs locality awareness
> 
>
> Key: HBASE-4191
> URL: https://issues.apache.org/jira/browse/HBASE-4191
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Reporter: Ted Yu
>Assignee: Liyin Tang
>



[jira] [Commented] (HBASE-6491) add limit function at ClientScanner

2012-09-20 Thread Jieshan Bean (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460126#comment-13460126
 ] 

Jieshan Bean commented on HBASE-6491:
-

@ronghai: Why not use PageFilter instead of adding this new method?
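
For comparison, the LIMIT behavior can also be approximated client-side; here is a plain-Java sketch of stopping a scan after N rows (illustrative only — PageFilter limits rows per region server, so a client-side cut like this can still be needed on top of it):

```java
import java.util.ArrayList;
import java.util.List;

// Client-side LIMIT sketch: stop consuming scanner results after `limit` rows.
// This is roughly what a ClientScanner limit (or a final cut after PageFilter)
// amounts to; LimitScan is a hypothetical name for illustration.
public class LimitScan {
    public static <T> List<T> limit(Iterable<T> results, int limit) {
        List<T> out = new ArrayList<>();
        for (T row : results) {
            if (out.size() >= limit) {
                break; // stop early instead of draining the whole scan
            }
            out.add(row);
        }
        return out;
    }
}
```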


> add limit function at ClientScanner
> ---
>
> Key: HBASE-6491
> URL: https://issues.apache.org/jira/browse/HBASE-6491
> Project: HBase
>  Issue Type: New Feature
>  Components: Client
>Affects Versions: 0.96.0
>Reporter: ronghai.ma
>Assignee: ronghai.ma
>  Labels: patch
> Fix For: 0.96.0
>
> Attachments: ClientScanner.java, HBASE-6491.patch
>
>
> Add a new method in ClientScanner to implement a function like LIMIT in MySQL.



[jira] [Updated] (HBASE-4191) hbase load balancer needs locality awareness

2012-09-20 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-4191:
-

Component/s: Balancer

> hbase load balancer needs locality awareness
> 
>
> Key: HBASE-4191
> URL: https://issues.apache.org/jira/browse/HBASE-4191
> Project: HBase
>  Issue Type: New Feature
>  Components: Balancer
>Reporter: Ted Yu
>Assignee: Liyin Tang
>



[jira] [Commented] (HBASE-6806) HBASE-4658 breaks backward compatibility / example scripts

2012-09-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460096#comment-13460096
 ] 

Hadoop QA commented on HBASE-6806:
--

-1 overall.  Here are the results of testing the latest attachment 
  
http://issues.apache.org/jira/secure/attachment/12545720/HBASE-6806-fix-examples.diff
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 139 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 7 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.master.TestSplitLogManager

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2910//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2910//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2910//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2910//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2910//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2910//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2910//console

This message is automatically generated.

> HBASE-4658 breaks backward compatibility / example scripts
> --
>
> Key: HBASE-6806
> URL: https://issues.apache.org/jira/browse/HBASE-6806
> Project: HBase
>  Issue Type: Bug
>  Components: thrift
>Affects Versions: 0.94.0
>Reporter: Lukas
> Attachments: HBASE-6806-fix-examples.diff
>
>
> HBASE-4658 introduces the new 'attributes' argument as a non-optional 
> parameter. This is not backward compatible and also breaks the code in the 
> example section. Resolution: Mark as 'optional'

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6848) Make hbase-hadoop-compat findbugs clean

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460090#comment-13460090
 ] 

Hudson commented on HBASE-6848:
---

Integrated in HBase-TRUNK #3362 (See 
[https://builds.apache.org/job/HBase-TRUNK/3362/])
HBASE-6848 Make hbase-hadoop-compat findbugs clean (Revision 1388252)

 Result = SUCCESS
stack : 
Files : 
* /hbase/trunk/dev-support/findbugs-exclude.xml
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/CompatibilityFactory.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/CompatibilitySingletonFactory.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSource.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceFactory.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsWrapper.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSource.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/metrics/MBeanSource.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/metrics/ReplicationMetricsSource.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/thrift/metrics/ThriftServerMetricsSource.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/thrift/metrics/ThriftServerMetricsSourceFactory.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/metrics/MetricHistogram.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/metrics/MetricsExecutor.java
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/metrics2/util/MetricSampleQuantiles.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/util/MetricSampleQuantiles.java


> Make hbase-hadoop-compat findbugs clean
> ---
>
> Key: HBASE-6848
> URL: https://issues.apache.org/jira/browse/HBASE-6848
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-6848-0.patch
>
>
> There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, 
> and hbase-hadoop2-compat.  Let's fix these up; since these are new modules it 
> would be nice to keep them with 0 findbugs errors.



[jira] [Commented] (HBASE-6849) Make StochasticLoadBalancer the default

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460089#comment-13460089
 ] 

Hudson commented on HBASE-6849:
---

Integrated in HBase-TRUNK #3362 (See 
[https://builds.apache.org/job/HBase-TRUNK/3362/])
HBASE-6849 Make StochasticLoadBalancer the default (Revision 1388267)

 Result = SUCCESS
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/LoadBalancerFactory.java


> Make StochasticLoadBalancer the default
> ---
>
> Key: HBASE-6849
> URL: https://issues.apache.org/jira/browse/HBASE-6849
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 0.96.0
>
> Attachments: HBASE-6849-0.patch
>
>




[jira] [Commented] (HBASE-6848) Make hbase-hadoop-compat findbugs clean

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460084#comment-13460084
 ] 

Hudson commented on HBASE-6848:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/])
HBASE-6848 Make hbase-hadoop-compat findbugs clean (Revision 1388252)

 Result = FAILURE
stack : 
Files : 
* /hbase/trunk/dev-support/findbugs-exclude.xml
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/CompatibilityFactory.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/CompatibilitySingletonFactory.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSource.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceFactory.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsWrapper.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSource.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/metrics/MBeanSource.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/metrics/ReplicationMetricsSource.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/thrift/metrics/ThriftServerMetricsSource.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/thrift/metrics/ThriftServerMetricsSourceFactory.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/metrics/MetricHistogram.java
* 
/hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/metrics/MetricsExecutor.java
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/metrics2/util/MetricSampleQuantiles.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java
* 
/hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/util/MetricSampleQuantiles.java


> Make hbase-hadoop-compat findbugs clean
> ---
>
> Key: HBASE-6848
> URL: https://issues.apache.org/jira/browse/HBASE-6848
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-6848-0.patch
>
>
> There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, 
> and hbase-hadoop2-compat.  Let's fix these up; since these are new modules it 
> would be nice to keep them with 0 findbugs errors.



[jira] [Commented] (HBASE-6847) HBASE-6649 broke replication

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460082#comment-13460082
 ] 

Hudson commented on HBASE-6847:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/])
HBASE-6847  HBASE-6649 broke replication (Devaraj Das via JD) (Revision 
1388161)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java


> HBASE-6649 broke replication
> 
>
> Key: HBASE-6847
> URL: https://issues.apache.org/jira/browse/HBASE-6847
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Devaraj Das
>Priority: Blocker
> Fix For: 0.92.3, 0.94.2, 0.96.0
>
> Attachments: HBASE-6847-0.94.patch, HBASE-6847.patch
>
>
> After running with HBASE-6646 and replication enabled I encountered this:
> {noformat}
> 2012-09-17 20:04:08,111 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening 
> log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78617132
> 2012-09-17 20:04:08,120 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Break on 
> IOE: 
> hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318,
>  entryStart=78641557, pos=78771200, end=78771200, edit=84
> 2012-09-17 20:04:08,120 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> currentNbOperations:164529 and seenEntries:84 and size: 154068
> 2012-09-17 20:04:08,120 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> Replicating 84
> 2012-09-17 20:04:08,146 INFO 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: 
> Going to report log #va1r3s24%2C10304%2C1347911704238.1347911706318 for 
> position 78771200 in 
> hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318
> 2012-09-17 20:04:08,158 INFO 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: 
> Removing 0 logs in the list: []
> 2012-09-17 20:04:08,158 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> Replicated in total: 93234
> 2012-09-17 20:04:08,158 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening 
> log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78771200
> 2012-09-17 20:04:08,163 ERROR 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> Unexpected exception in ReplicationSource, 
> currentPath=hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318
> java.lang.IndexOutOfBoundsException
> at java.io.DataInputStream.readFully(DataInputStream.java:175)
> at 
> org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
> at 
> org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2001)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1901)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947)
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:235)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:394)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:307)
> {noformat}
> There's something weird at the end of the file and it's killing replication. 
> We used to just retry.



[jira] [Commented] (HBASE-6677) Random ZooKeeper port in test can overrun max port

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460081#comment-13460081
 ] 

Hudson commented on HBASE-6677:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/])
HBASE-6677 Random ZooKeeper port in test can overrun max port (Revision 
1388125)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java


> Random ZooKeeper port in test can overrun max port
> --
>
> Key: HBASE-6677
> URL: https://issues.apache.org/jira/browse/HBASE-6677
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.96.0
>Reporter: Gregory Chanan
>Assignee: liang xie
>Priority: Trivial
>  Labels: noob
> Fix For: 0.96.0
>
> Attachments: HBASE-6677.patch
>
>
> {code} 
>  while (true) {
> try {
>   standaloneServerFactory = new NIOServerCnxnFactory();
>   standaloneServerFactory.configure(
> new InetSocketAddress(tentativePort),
> configuration.getInt(HConstants.ZOOKEEPER_MAX_CLIENT_CNXNS,
>   1000));
> } catch (BindException e) {
>   LOG.debug("Failed binding ZK Server to client port: " +
>   tentativePort);
>   // This port is already in use, try to use another.
>   tentativePort++;
>   continue;
> }
> break;
>   }
> {code}
> In the case of failure where all of the above ports have already been bound, 
> the loop can run past the max port.  It needs to check against a maximum value.
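The bound the reporter suggests can be sketched as follows. This is a minimal illustration, not the actual MiniZooKeeperCluster code: `PortProbe`, `MAX_PORT`, and the `isBindable` predicate (standing in for the real `NIOServerCnxnFactory.configure()` bind attempt) are hypothetical names.

```java
import java.util.function.IntPredicate;

// Hypothetical sketch: probe ports upward, but stop at MAX_PORT instead of
// incrementing forever.
public class PortProbe {
    static final int MAX_PORT = 65535; // highest legal TCP port

    // Returns the first port at or above startPort that isBindable accepts,
    // or -1 once the probe would pass MAX_PORT. The isBindable predicate
    // stands in for the real bind attempt that may throw BindException.
    static int findFreePort(int startPort, IntPredicate isBindable) {
        for (int port = startPort; port <= MAX_PORT; port++) {
            if (isBindable.test(port)) {
                return port;
            }
        }
        return -1; // every candidate port up to MAX_PORT was in use
    }

    public static void main(String[] args) {
        // Simulate ports below 50000 already being bound.
        System.out.println(findFreePort(49998, p -> p >= 50000)); // prints 50000
        // Nothing bindable at all: the probe now fails instead of overrunning.
        System.out.println(findFreePort(65530, p -> false)); // prints -1
    }
}
```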



[jira] [Commented] (HBASE-6849) Make StochasticLoadBalancer the default

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460083#comment-13460083
 ] 

Hudson commented on HBASE-6849:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/])
HBASE-6849 Make StochasticLoadBalancer the default (Revision 1388267)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/LoadBalancerFactory.java


> Make StochasticLoadBalancer the default
> ---
>
> Key: HBASE-6849
> URL: https://issues.apache.org/jira/browse/HBASE-6849
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 0.96.0
>
> Attachments: HBASE-6849-0.patch
>
>




[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460080#comment-13460080
 ] 

Hudson commented on HBASE-6649:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/])
HBASE-6847  HBASE-6649 broke replication (Devaraj Das via JD) (Revision 
1388161)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java


> [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
> ---
>
> Key: HBASE-6649
> URL: https://issues.apache.org/jira/browse/HBASE-6649
> Project: HBase
>  Issue Type: Bug
>Reporter: Devaraj Das
>Assignee: Devaraj Das
>Priority: Blocker
> Fix For: 0.92.3, 0.94.2, 0.96.0
>
> Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 
> 6649-fix-io-exception-handling-1.patch, 
> 6649-fix-io-exception-handling-1-trunk.patch, 
> 6649-fix-io-exception-handling.patch, 6649-trunk.patch, 6649-trunk.patch, 
> 6649.txt, HBase-0.92 #495 test - queueFailover [Jenkins].html, HBase-0.92 
> #502 test - queueFailover [Jenkins].html
>
>
> Have seen it twice in the recent past: http://bit.ly/MPCykB & 
> http://bit.ly/O79Dq7 .. 
> Looking briefly at the logs hints at a pattern - in both the failed test 
> instances, there was an RS crash while the test was running.



[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460079#comment-13460079
 ] 

Hudson commented on HBASE-6698:
---

Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See 
[https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/])
HBASE-6698 Refactor checkAndPut and checkAndDelete to use 
doMiniBatchMutation
(Priya)

Submitted by: Priya
Reviewed by: Ram, Stack, Ted, Lars (Revision 1388141)

 Result = FAILURE
ramkrishna : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java


> Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
> --
>
> Key: HBASE-6698
> URL: https://issues.apache.org/jira/browse/HBASE-6698
> Project: HBase
>  Issue Type: Improvement
>Reporter: ramkrishna.s.vasudevan
> Fix For: 0.96.0
>
> Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, 
> HBASE-6698_3.patch, HBASE-6698_5.patch, HBASE-6698_6.patch, 
> HBASE-6698_6.patch, HBASE-6698_6.patch, HBASE-6698_6.patch, 
> HBASE-6698_7.patch, HBASE-6698_8.patch, HBASE-6698_8.patch, 
> HBASE-6698_8.patch, HBASE-6698.patch
>
>
> Currently the checkAndPut and checkAndDelete APIs internally call
> internalPut and internalDelete.  Maybe we can just call doMiniBatchMutation
> instead.  This will help in the future: if we add hooks and a coprocessor
> handles certain cases in doMiniBatchMutation, the same handling applies when
> doing a put through checkAndPut or a delete through checkAndDelete.



[jira] [Updated] (HBASE-6850) REST implementation internals conflict with clients that use Jersey

2012-09-20 Thread Andrew Purtell (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-6850:
--

Summary: REST implementation internals conflict with clients that use 
Jersey  (was: PlainTextMessageBodyProducer is dangerous)

> REST implementation internals conflict with clients that use Jersey
> ---
>
> Key: HBASE-6850
> URL: https://issues.apache.org/jira/browse/HBASE-6850
> Project: HBase
>  Issue Type: Bug
>  Components: Client, REST
>Affects Versions: 0.94.1
>Reporter: Jonathan Leech
>
> - It is my understanding that there is one and only one hbase jar, which 
> includes 
> org.apache.hadoop.hbase.rest.provider.producer.PlainTextMessageBodyProducer, 
> which is only used in the REST / jersey server-side implementation.
> - PlainTextMessageBodyProducer claims to provide a text/plain output for 
> absolutely any input by calling .toString() on it.
> - If I am a client to HBase, and I do my own REST / jersey, including my own 
> custom text/plain writing, by default the jersey stack finds 
> PlainTextMessageBodyProducer and uses it instead of mine.
> I could be off base here; so please feel free to change this from a Bug to a 
> Feature Request or close it, especially if my assumptions are wrong.
> Workaround: set init-param of com.sun.jersey.config.property.packages to 
> limit it to my own packages.
> Recommended fix: 
> - provide a client jar and / or a maven pom for hbase-client which doesn't 
> include server-side hbase code or dependencies.
> and / or 
> - don't return true from isWriteable() for every possible input, or create a 
> different custom mime type that other users of the API might be also using, 
> and if possible map text/plain to that type in the server.
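The "don't return true from isWriteable() for every possible input" suggestion can be sketched as below. This is a standalone illustration only (`RestrictedProducer` and `KNOWN_TYPES` are invented names, and the JAX-RS plumbing is omitted); the idea is simply to claim only the types the producer genuinely serializes.

```java
import java.util.Set;

// Hypothetical sketch: instead of answering true for any type, the producer
// advertises only the types it actually knows how to render as text/plain,
// so a client application's own writers are not shadowed.
public class RestrictedProducer {
    private static final Set<Class<?>> KNOWN_TYPES =
        Set.of(String.class, byte[].class);

    // In a real MessageBodyWriter this method also receives genericType,
    // annotations, and mediaType; they are omitted here for brevity.
    public boolean isWriteable(Class<?> type) {
        return KNOWN_TYPES.contains(type);
    }

    public static void main(String[] args) {
        RestrictedProducer producer = new RestrictedProducer();
        System.out.println(producer.isWriteable(String.class)); // prints true
        System.out.println(producer.isWriteable(Object.class)); // prints false
    }
}
```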



[jira] [Commented] (HBASE-6850) PlainTextMessageBodyProducer is dangerous

2012-09-20 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460074#comment-13460074
 ] 

Andrew Purtell commented on HBASE-6850:
---

IMO, this is not REST-specific, but the larger issue of us packaging non-client 
classes into a fat jar along with the client. 

> PlainTextMessageBodyProducer is dangerous
> -
>
> Key: HBASE-6850
> URL: https://issues.apache.org/jira/browse/HBASE-6850
> Project: HBase
>  Issue Type: Bug
>  Components: Client, REST
>Affects Versions: 0.94.1
>Reporter: Jonathan Leech
>
> - It is my understanding that there is one and only one hbase jar, which 
> includes 
> org.apache.hadoop.hbase.rest.provider.producer.PlainTextMessageBodyProducer, 
> which is only used in the REST / jersey server-side implementation.
> - PlainTextMessageBodyProducer claims to provide a text/plain output for 
> absolutely any input by calling .toString() on it.
> - If I am a client to HBase, and I do my own REST / jersey, including my own 
> custom text/plain writing, by default the jersey stack finds 
> PlainTextMessageBodyProducer and uses it instead of mine.
> I could be off base here; so please feel free to change this from a Bug to a 
> Feature Request or close it, especially if my assumptions are wrong.
> Workaround: set init-param of com.sun.jersey.config.property.packages to 
> limit it to my own packages.
> Recommended fix: 
> - provide a client jar and / or a maven pom for hbase-client which doesn't 
> include server-side hbase code or dependencies.
> and / or 
> - don't return true from isWriteable() for every possible input, or create a 
> different custom mime type that other users of the API might be also using, 
> and if possible map text/plain to that type in the server.



[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460069#comment-13460069
 ] 

Lars Hofhansl commented on HBASE-6841:
--

Not sure what the test issue is, yet.

Also, looking at the code again, I notice the prefetchRegionLimit already 
defaults to 10. 


> Meta prefetching is slower than doing multiple meta lookups
> ---
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.94.2
>
> Attachments: 6841-0.94.txt, 6841-0.96.txt
>
>
> I got myself into a situation where I needed to truncate a massive table 
> while it was getting hits and surprisingly the clients were not recovering. 
> What I see in the logs is that every time we prefetch .META. we set up a new 
> HConnection because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.



[jira] [Resolved] (HBASE-6849) Make StochasticLoadBalancer the default

2012-09-20 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-6849.
--

  Resolution: Fixed
Release Note: Makes the StochasticLoadBalancer the default.
Hadoop Flags: Reviewed

Committed to trunk.  Let's try it.  We can revert if it's a mess before we
release 0.96.  (It's weird that you had to disable by-table balancing
explicitly, apart from setting the default balancer -- it looks broken to me
that we're doing by-table balancing outside of the balancer.)

Thanks Elliott.

> Make StochasticLoadBalancer the default
> ---
>
> Key: HBASE-6849
> URL: https://issues.apache.org/jira/browse/HBASE-6849
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 0.96.0
>
> Attachments: HBASE-6849-0.patch
>
>




[jira] [Updated] (HBASE-6806) HBASE-4658 breaks backward compatibility / example scripts

2012-09-20 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6806:
-

Status: Patch Available  (was: Open)

Passing it by hadoopqa.

Thanks Lukas.  Let the PHP, Perl, and Ruby folks file an issue if those
bindings are broken.  We'll take your fixes for the rest.

> HBASE-4658 breaks backward compatibility / example scripts
> --
>
> Key: HBASE-6806
> URL: https://issues.apache.org/jira/browse/HBASE-6806
> Project: HBase
>  Issue Type: Bug
>  Components: thrift
>Affects Versions: 0.94.0
>Reporter: Lukas
> Attachments: HBASE-6806-fix-examples.diff
>
>
> HBASE-4658 introduces the new 'attributes' argument as a non-optional 
> parameter. This is not backward compatible and also breaks the code in the 
> example section. Resolution: Mark as 'optional'



[jira] [Created] (HBASE-6851) Race condition in TableAuthManager.updateGlobalCache()

2012-09-20 Thread Gary Helmling (JIRA)
Gary Helmling created HBASE-6851:


 Summary: Race condition in TableAuthManager.updateGlobalCache()
 Key: HBASE-6851
 URL: https://issues.apache.org/jira/browse/HBASE-6851
 Project: HBase
  Issue Type: Bug
  Components: security
Affects Versions: 0.94.1, 0.96.0
Reporter: Gary Helmling
Priority: Critical


When new global permissions are assigned, there is a race condition, during 
which further authorization checks relying on global permissions may fail.

In TableAuthManager.updateGlobalCache(), we have:
{code:java}
USER_CACHE.clear();
GROUP_CACHE.clear();
try {
  initGlobal(conf);
} catch (IOException e) {
  // Never happens
  LOG.error("Error occured while updating the user cache", e);
}
for (Map.Entry entry : userPerms.entries()) {
  if (AccessControlLists.isGroupPrincipal(entry.getKey())) {
GROUP_CACHE.put(AccessControlLists.getGroupName(entry.getKey()),
new Permission(entry.getValue().getActions()));
  } else {
USER_CACHE.put(entry.getKey(), new 
Permission(entry.getValue().getActions()));
  }
}
{code}

If authorization checks come in following the .clear() but before repopulating, 
they will fail.

We should have some synchronization here to serialize multiple updates and use 
a COW type rebuild and reassign of the new maps.

This particular issue crept in with the fix in HBASE-6157, so I'm flagging for 
0.94 and 0.96.
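The copy-on-write swap described above can be sketched like this. It is an illustration only, with made-up names (`CowCache`, `rebuild`); the real TableAuthManager caches are keyed and typed differently.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the suggested fix: build the replacement cache on
// the side, then publish it with one volatile write, so concurrent readers
// always see either the old permissions or the new ones, never a cleared map.
public class CowCache {
    private volatile Map<String, String> userCache = new HashMap<>();

    // Reads are lock-free; the volatile field guarantees visibility.
    public String get(String user) {
        return userCache.get(user);
    }

    // synchronized serializes concurrent rebuilds against each other.
    public synchronized void rebuild(Map<String, String> newPerms) {
        Map<String, String> fresh = new HashMap<>(newPerms); // populate aside
        userCache = fresh; // single atomic reference swap, no clear() window
    }

    public static void main(String[] args) {
        CowCache cache = new CowCache();
        Map<String, String> perms = new HashMap<>();
        perms.put("alice", "RW");
        cache.rebuild(perms);
        System.out.println(cache.get("alice")); // prints RW
    }
}
```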



[jira] [Updated] (HBASE-6848) Make hbase-hadoop-compat findbugs clean

2012-09-20 Thread stack (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-6848:
-

  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Committed to trunk.  Thanks Elliott

> Make hbase-hadoop-compat findbugs clean
> ---
>
> Key: HBASE-6848
> URL: https://issues.apache.org/jira/browse/HBASE-6848
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-6848-0.patch
>
>
> There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, 
> and hbase-hadoop2-compat.  Let's fix these up; since these are new modules it 
> would be nice to keep them with 0 findbugs errors.



[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460024#comment-13460024
 ] 

Lars Hofhansl commented on HBASE-6841:
--

Heh. TestHCM.testRegionCaching  looks relevant :) Looking.

> Meta prefetching is slower than doing multiple meta lookups
> ---
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.94.2
>
> Attachments: 6841-0.94.txt, 6841-0.96.txt
>
>
> I got myself into a situation where I needed to truncate a massive table 
> while it was getting hits and surprisingly the clients were not recovering. 
> What I see in the logs is that every time we prefetch .META. we set up a new 
> HConnection because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.



[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460023#comment-13460023
 ] 

stack commented on HBASE-6841:
--

+1 on the patch, but what's that TestHCM failure about?

> Meta prefetching is slower than doing multiple meta lookups
> ---
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.94.2
>
> Attachments: 6841-0.94.txt, 6841-0.96.txt
>
>
> I got myself into a situation where I needed to truncate a massive table 
> while it was getting hits and surprisingly the clients were not recovering. 
> What I see in the logs is that every time we prefetch .META. we set up a new 
> HConnection because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460021#comment-13460021
 ] 

Hadoop QA commented on HBASE-6841:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12545967/6841-0.96.txt
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 139 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 14 new Findbugs (version 
1.3.9) warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
   org.apache.hadoop.hbase.client.TestHCM

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2909//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2909//console

This message is automatically generated.

> Meta prefetching is slower than doing multiple meta lookups
> ---
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.94.2
>
> Attachments: 6841-0.94.txt, 6841-0.96.txt
>
>
> I got myself into a situation where I needed to truncate a massive table 
> while it was getting hits and surprisingly the clients were not recovering. 
> What I see in the logs is that every time we prefetch .META. we set up a new 
> HConnection because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6669) Add BigDecimalColumnInterpreter for doing aggregations using AggregationClient

2012-09-20 Thread Julian Wissmann (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Wissmann updated HBASE-6669:
---

Attachment: TestBDAggregateProtocol.patch

> Add BigDecimalColumnInterpreter for doing aggregations using AggregationClient
> --
>
> Key: HBASE-6669
> URL: https://issues.apache.org/jira/browse/HBASE-6669
> Project: HBase
>  Issue Type: New Feature
>  Components: client, coprocessors
>Reporter: Anil Gupta
>Priority: Minor
>  Labels: client, coprocessors
> Attachments: BigDecimalColumnInterpreter.java, 
> BigDecimalColumnInterpreter.patch, BigDecimalColumnInterpreter.patch, 
> TestBDAggregateProtocol.patch
>
>
> I recently created a Class for doing aggregations(sum,min,max,std) on values 
> stored as BigDecimal in HBase. I would like to commit the 
> BigDecimalColumnInterpreter into HBase. In my opinion this class can be used 
> by a wide variety of users. Please let me know if its not appropriate to add 
> this class in HBase.
> Thanks,
> Anil Gupta
> Software Engineer II, Intuit, Inc 

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Created] (HBASE-6850) PlainTextMessageBodyProducer is dangerous

2012-09-20 Thread Jonathan Leech (JIRA)
Jonathan Leech created HBASE-6850:
-

 Summary: PlainTextMessageBodyProducer is dangerous
 Key: HBASE-6850
 URL: https://issues.apache.org/jira/browse/HBASE-6850
 Project: HBase
  Issue Type: Bug
  Components: client, REST
Affects Versions: 0.94.1
Reporter: Jonathan Leech


- It is my understanding that there is one and only one hbase jar, which 
includes 
org.apache.hadoop.hbase.rest.provider.producer.PlainTextMessageBodyProducer, 
which is only used in the REST / jersey server-side implementation.

- PlainTextMessageBodyProducer claims to provide a text/plain output for 
absolutely any input by calling .toString() on it.

- If I am a client to HBase, and I do my own REST / jersey, including my own 
custom text/plain writing, by default the jersey stack finds 
PlainTextMessageBodyProducer and uses it instead of mine.

I could be off base here, so please feel free to change this from a Bug to a 
Feature Request or close it, especially if my assumptions are wrong.

Workaround: set init-param of com.sun.jersey.config.property.packages to limit 
it to my own packages.

Recommended fix: 
- provide a client jar and / or a maven pom for hbase-client which doesn't 
include server-side hbase code or dependencies.

and / or 

- don't return true from isWriteable() for every possible input, or create a 
different custom MIME type that other users of the API might also be using, and 
if possible map text/plain to that type in the server.
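The core of the complaint is an `isWriteable()` that claims every type. A self-contained sketch of the distinction (here `TextWriter` is a stand-in for JAX-RS's `MessageBodyWriter`, whose `isWriteable()` governs provider selection; class names are illustrative):

```java
// TextWriter stands in for javax.ws.rs.ext.MessageBodyWriter's selection hook.
interface TextWriter {
  boolean isWriteable(Class<?> type);
}

// The problematic behavior: claims every type, so it shadows user-supplied
// text/plain providers found on the classpath.
class GreedyPlainTextProducer implements TextWriter {
  public boolean isWriteable(Class<?> type) { return true; }
}

// The suggested direction: only claim types this producer actually handles,
// letting the framework fall through to other providers for everything else.
class RestrictedPlainTextProducer implements TextWriter {
  public boolean isWriteable(Class<?> type) {
    return CharSequence.class.isAssignableFrom(type);
  }
}

public class ProducerSelectionSketch {
  public static void main(String[] args) {
    TextWriter greedy = new GreedyPlainTextProducer();
    TextWriter restricted = new RestrictedPlainTextProducer();
    System.out.println(greedy.isWriteable(byte[].class));     // true: shadows others
    System.out.println(restricted.isWriteable(byte[].class)); // false: defers
    System.out.println(restricted.isWriteable(String.class)); // true
  }
}
```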





--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6839) Operations may be executed without holding rowLock

2012-09-20 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans updated HBASE-6839:
--

Fix Version/s: 0.92.3

Adding 0.92.3 as a target since Ted committed it there.

> Operations may be executed without holding rowLock
> --
>
> Key: HBASE-6839
> URL: https://issues.apache.org/jira/browse/HBASE-6839
> Project: HBase
>  Issue Type: Bug
>  Components: regionserver
>Reporter: chunhui shen
>Assignee: chunhui shen
>Priority: Critical
> Fix For: 0.96.0, 0.92.3, 0.94.2
>
> Attachments: HBASE-6839.patch
>
>
> HRegion#internalObtainRowLock will return null if it times out,
> but many places which call this method don't handle this case.
> The bad result is that the operation will be executed even if it hasn't 
> obtained the row lock, e.g. put, delete, increment...
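The missing null check described above can be sketched as follows (a minimal stand-in using `ReentrantLock`; the method and class names are hypothetical, not HRegion's actual internals):

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the fix: treat a null (timed-out) lock acquisition as an error
// instead of proceeding with the mutation unlocked.
public class RowLockSketch {
  static final ReentrantLock rowLock = new ReentrantLock();

  // Returns null on timeout, mirroring internalObtainRowLock's contract.
  static ReentrantLock tryObtainRowLock(long timeoutMs) throws InterruptedException {
    return rowLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS) ? rowLock : null;
  }

  static void put(String row) throws IOException, InterruptedException {
    ReentrantLock lock = tryObtainRowLock(10);
    if (lock == null) {
      // The bug: callers skipped this check and mutated without the lock.
      throw new IOException("Timed out waiting for row lock on " + row);
    }
    try {
      // ... apply the mutation while holding the row lock ...
    } finally {
      lock.unlock();
    }
  }

  public static void main(String[] args) throws Exception {
    put("row1");
    System.out.println("put succeeded with lock held");
  }
}
```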

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect

2012-09-20 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-5974:
-

Fix Version/s: (was: 0.94.3)

Let's do this correctly in 0.96 (where it is OK to break wire compatibility).
Removing this from 0.94.

This means we can pass the seqno as a proper field in the Scan object.

> Scanner retry behavior with RPC timeout on next() seems incorrect
> -
>
> Key: HBASE-5974
> URL: https://issues.apache.org/jira/browse/HBASE-5974
> Project: HBase
>  Issue Type: Bug
>  Components: client, regionserver
>Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0
>Reporter: Todd Lipcon
>Assignee: Anoop Sam John
>Priority: Critical
> Fix For: 0.96.0
>
> Attachments: 5974_94-V4.patch, 5974_trunk.patch, 5974_trunk-V2.patch, 
> HBASE-5974_0.94.patch, HBASE-5974_94-V2.patch, HBASE-5974_94-V3.patch
>
>
> I'm seeing the following behavior:
> - set RPC timeout to a short value
> - call next() for some batch of rows, big enough so the client times out 
> before the result is returned
> - the HConnectionManager stuff will retry the next() call to the same server. 
> At this point, one of two things can happen: 1) the previous next() call will 
> still be processing, in which case you get a LeaseException, because it was 
> removed from the map during the processing, or 2) the next() call will 
> succeed but skip the prior batch of rows.
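The "pass the seqno as a proper field" idea above can be sketched like this (a toy model, not the actual scanner RPC; all names are illustrative): the client tags each next() with a call sequence number, so a retried call is rejected rather than silently handed the following batch.

```java
import java.util.Arrays;
import java.util.List;

// Sketch of the proposed 0.96 direction: the server tracks the next expected
// call sequence number per scanner, so a client retry after an RPC timeout
// fails loudly instead of skipping the batch the first call already consumed.
public class ScannerSeqSketch {
  static long serverNextCallSeq = 0;
  static int cursor = 0;
  static final List<String> rows = Arrays.asList("r1", "r2", "r3", "r4");

  static List<String> next(long callSeq, int batch) {
    if (callSeq != serverNextCallSeq) {
      // A timed-out client retrying the same seq lands here instead of
      // receiving the *following* batch.
      throw new IllegalStateException("stale call seq " + callSeq);
    }
    serverNextCallSeq++;
    List<String> out = rows.subList(cursor, Math.min(cursor + batch, rows.size()));
    cursor += out.size();
    return out;
  }

  public static void main(String[] args) {
    System.out.println(next(0, 2)); // first batch
    try {
      next(0, 2);                   // client retry after an RPC timeout
    } catch (IllegalStateException e) {
      System.out.println("retry rejected: " + e.getMessage());
    }
    System.out.println(next(1, 2)); // resumes without skipping rows
  }
}
```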

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6504) Adding GC details prevents HBase from starting in non-distributed mode

2012-09-20 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6504:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

> Adding GC details prevents HBase from starting in non-distributed mode
> --
>
> Key: HBASE-6504
> URL: https://issues.apache.org/jira/browse/HBASE-6504
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 0.94.0
>Reporter: Benoit Sigoure
>Assignee: Michael Drzal
>Priority: Trivial
>  Labels: noob
> Fix For: 0.96.0, 0.94.2
>
> Attachments: HBASE-6504-output.txt, HBASE-6504.patch, 
> HBASE-6504-v2.patch
>
>
> The {{conf/hbase-env.sh}} that ships with HBase contains a few commented out 
> examples of variables that could be useful, such as adding 
> {{-XX:+PrintGCDetails -XX:+PrintGCDateStamps}} to {{HBASE_OPTS}}.  This has 
> the annoying side effect that the JVM prints a summary of memory usage when 
> it exits, and it does so on stdout:
> {code}
> $ ./bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool 
> hbase.cluster.distributed
> false
> Heap
>  par new generation   total 19136K, used 4908K [0x00073a20, 
> 0x00073b6c, 0x00075186)
>   eden space 17024K,  28% used [0x00073a20, 0x00073a6cb0a8, 
> 0x00073b2a)
>   from space 2112K,   0% used [0x00073b2a, 0x00073b2a, 
> 0x00073b4b)
>   to   space 2112K,   0% used [0x00073b4b, 0x00073b4b, 
> 0x00073b6c)
>  concurrent mark-sweep generation total 63872K, used 0K [0x00075186, 
> 0x0007556c, 0x0007f5a0)
>  concurrent-mark-sweep perm gen total 21248K, used 6994K [0x0007f5a0, 
> 0x0007f6ec, 0x0008)
> $ ./bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool 
> hbase.cluster.distributed >/dev/null
> (nothing printed)
> {code}
> And this confuses {{bin/start-hbase.sh}} when it does
> {{distMode=`$bin/hbase --config "$HBASE_CONF_DIR" 
> org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed`}}, 
> because then the {{distMode}} variable is not just set to {{false}}, it also 
> contains all this JVM spam.
> If you don't pay enough attention and realize that 3 processes are getting 
> started (ZK, HM, RS) instead of just one (HM), then you end up with this 
> confusing error message:
> {{Could not start ZK at requested port of 2181.  ZK was started at port: 
> 2182.  Aborting as clients (e.g. shell) will not be able to find this ZK 
> quorum.}}, which is even more puzzling because when you run {{netstat}} to 
> see who owns that port, then you won't find any rogue process other than the 
> one you just started.
> I'm wondering if the fix is not to just change the {{if [ "$distMode" == 
> 'false' ]}} to a {{switch $distMode case (false*)}} type of test, to work 
> around this annoying JVM misfeature that pollutes stdout.
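The prefix-match fix suggested at the end of the report can be sketched as below. The heap-summary text is simulated; `mode` is an illustrative variable, not part of start-hbase.sh:

```shell
# The JVM's -XX:+PrintGCDetails heap summary is appended to stdout, so an
# exact comparison like [ "$distMode" == 'false' ] breaks. Matching on the
# prefix with a case statement tolerates the extra output.
distMode='false
Heap
 par new generation   total 19136K, used 4908K'   # simulated polluted output

case "$distMode" in
  false*) mode=local ;;
  true*)  mode=distributed ;;
  *)      mode=unknown ;;
esac
echo "$mode"
```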

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies

2012-09-20 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6438:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

> RegionAlreadyInTransitionException needs to give more info to avoid 
> assignment inconsistencies
> --
>
> Key: HBASE-6438
> URL: https://issues.apache.org/jira/browse/HBASE-6438
> Project: HBase
>  Issue Type: Bug
>Reporter: ramkrishna.s.vasudevan
>Assignee: rajeshbabu
> Fix For: 0.96.0, 0.92.3, 0.94.2
>
> Attachments: 6438-0.92.txt, 6438.addendum, 6438-addendum.94, 
> 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch, 
> HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch, 
> HBASE-6438_trunk.patch
>
>
> Seeing some of the recent issues in region assignment, 
> RegionAlreadyInTransitionException is one reason after which the region 
> assignment may or may not happen (in the sense that we need to wait for the 
> TM to assign).
> In HBASE-6317 we hit one problem due to RegionAlreadyInTransitionException on 
> master restart.
> Consider the following case: due to some reason like a master restart or an 
> external assign call, we try to assign a region that is already being opened 
> on an RS.
> The next call to assign has already changed the state of the znode, so the 
> current open going on in the RS is affected and fails.  The second assignment 
> that started also fails with a RAITE exception.  In the end, neither 
> assignment completes.  The idea is to find out whether any such RAITE 
> exception can be retried or not.
> Here again we have the following cases:
> -> The znode is yet to be transitioned from OFFLINE to OPENING on the RS.
> -> The RS may be in the middle of openRegion().
> -> The RS may be trying to transition OPENING to OPENED.
> -> The region is yet to be added to the online regions on the RS side.
> On any failure in openRegion() and updateMeta() we move the znode to 
> FAILED_OPEN, so in those cases getting a RAITE should be OK.  But in the 
> other cases the assignment is stopped.
> The idea is to add the current state of the region assignment to the RIT map 
> on the RS side, and use that info to determine whether the assignment can be 
> retried on getting a RAITE.
> Considering the current work going on in the AM, please do share whether this 
> is needed at least in the 0.92/0.94 versions.
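The retry decision proposed above could be sketched as a lookup on the open stage the RS reports (the enum values and class names here are hypothetical, chosen to mirror the stages listed in the description):

```java
// Sketch of the proposal: the RS exposes the current stage of an in-flight
// open, and the master uses it to decide whether a
// RegionAlreadyInTransitionException (RAITE) is safe to retry.
enum OpenStage { PRE_TRANSITION, OPENING_REGION, UPDATING_META, POST_OPEN }

class RaiteRetryPolicy {
  // Failures in openRegion()/updateMeta() move the znode to FAILED_OPEN, so a
  // RAITE seen during those stages can be retried; in the other stages the
  // assignment is stopped and a retry would be unsafe.
  static boolean isRetryable(OpenStage stage) {
    return stage == OpenStage.OPENING_REGION || stage == OpenStage.UPDATING_META;
  }
}

public class RaiteSketch {
  public static void main(String[] args) {
    System.out.println(RaiteRetryPolicy.isRetryable(OpenStage.OPENING_REGION)); // true
    System.out.println(RaiteRetryPolicy.isRetryable(OpenStage.POST_OPEN));      // false
  }
}
```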

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6792) Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516

2012-09-20 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6792:
-

Fix Version/s: (was: 0.94.3)
   0.94.2

> Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516
> ---
>
> Key: HBASE-6792
> URL: https://issues.apache.org/jira/browse/HBASE-6792
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Hsieh
>Assignee: Jonathan Hsieh
> Fix For: 0.92.3, 0.94.2
>
> Attachments: hbase-6792.patch
>
>
> bq. An InterfaceAudience slipped into 0.94 here. It breaks 0.94 for older 
> versions of hadoop.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6841:
-

Assignee: Lars Hofhansl
  Status: Patch Available  (was: Open)

> Meta prefetching is slower than doing multiple meta lookups
> ---
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Assignee: Lars Hofhansl
>Priority: Critical
> Fix For: 0.94.2
>
> Attachments: 6841-0.94.txt, 6841-0.96.txt
>
>
> I got myself into a situation where I needed to truncate a massive table 
> while it was getting hits and surprisingly the clients were not recovering. 
> What I see in the logs is that every time we prefetch .META. we set up a new 
> HConnection because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6841:
-

Attachment: 6841-0.96.txt

0.96 patch for Hadoop QA

> Meta prefetching is slower than doing multiple meta lookups
> ---
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Priority: Critical
> Fix For: 0.94.2
>
> Attachments: 6841-0.94.txt, 6841-0.96.txt
>
>
> I got myself into a situation where I needed to truncate a massive table 
> while it was getting hits and surprisingly the clients were not recovering. 
> What I see in the logs is that every time we prefetch .META. we set up a new 
> HConnection because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459977#comment-13459977
 ] 

Hudson commented on HBASE-6649:
---

Integrated in HBase-0.92 #583 (See 
[https://builds.apache.org/job/HBase-0.92/583/])
HBASE-6847  HBASE-6649 broke replication (Devaraj Das via JD) (Revision 
1388159)
Fixing the CHANGES.txt after 0.92.2's release and adding HBASE-6649 (Revision 
1388157)

 Result = SUCCESS
jdcryans : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java

jdcryans : 
Files : 
* /hbase/branches/0.92/CHANGES.txt


> [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
> ---
>
> Key: HBASE-6649
> URL: https://issues.apache.org/jira/browse/HBASE-6649
> Project: HBase
>  Issue Type: Bug
>Reporter: Devaraj Das
>Assignee: Devaraj Das
>Priority: Blocker
> Fix For: 0.96.0, 0.92.3, 0.94.2
>
> Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 
> 6649-fix-io-exception-handling-1.patch, 
> 6649-fix-io-exception-handling-1-trunk.patch, 
> 6649-fix-io-exception-handling.patch, 6649-trunk.patch, 6649-trunk.patch, 
> 6649.txt, HBase-0.92 #495 test - queueFailover [Jenkins].html, HBase-0.92 
> #502 test - queueFailover [Jenkins].html
>
>
> Have seen it twice in the recent past: http://bit.ly/MPCykB & 
> http://bit.ly/O79Dq7 .. 
> Looking briefly at the logs hints at a pattern - in both the failed test 
> instances, there was an RS crash while the test was running.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6847) HBASE-6649 broke replication

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459978#comment-13459978
 ] 

Hudson commented on HBASE-6847:
---

Integrated in HBase-0.92 #583 (See 
[https://builds.apache.org/job/HBase-0.92/583/])
HBASE-6847  HBASE-6649 broke replication (Devaraj Das via JD) (Revision 
1388159)

 Result = SUCCESS
jdcryans : 
Files : 
* /hbase/branches/0.92/CHANGES.txt
* 
/hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java


> HBASE-6649 broke replication
> 
>
> Key: HBASE-6847
> URL: https://issues.apache.org/jira/browse/HBASE-6847
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Devaraj Das
>Priority: Blocker
> Fix For: 0.96.0, 0.92.3, 0.94.2
>
> Attachments: HBASE-6847-0.94.patch, HBASE-6847.patch
>
>
> After running with HBASE-6646 and replication enabled I encountered this:
> {noformat}
> 2012-09-17 20:04:08,111 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening 
> log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78617132
> 2012-09-17 20:04:08,120 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Break on 
> IOE: 
> hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318,
>  entryStart=78641557, pos=78771200, end=78771200, edit=84
> 2012-09-17 20:04:08,120 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> currentNbOperations:164529 and seenEntries:84 and size: 154068
> 2012-09-17 20:04:08,120 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> Replicating 84
> 2012-09-17 20:04:08,146 INFO 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: 
> Going to report log #va1r3s24%2C10304%2C1347911704238.1347911706318 for 
> position 78771200 in 
> hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318
> 2012-09-17 20:04:08,158 INFO 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: 
> Removing 0 logs in the list: []
> 2012-09-17 20:04:08,158 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> Replicated in total: 93234
> 2012-09-17 20:04:08,158 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening 
> log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78771200
> 2012-09-17 20:04:08,163 ERROR 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> Unexpected exception in ReplicationSource, 
> currentPath=hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318
> java.lang.IndexOutOfBoundsException
> at java.io.DataInputStream.readFully(DataInputStream.java:175)
> at 
> org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
> at 
> org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2001)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1901)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947)
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:235)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:394)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:307)
> {noformat}
> There's something weird at the end of the file and it's killing replication. 
> We used to just retry.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459962#comment-13459962
 ] 

Lars Hofhansl commented on HBASE-6841:
--

Yeah, the stuff we do in HTable is a disaster (if you ask me... probably has to 
do with multiple threads using HTables that share the same HConnection, not 
sure), but I think that's for another patch.

The meaning of the flag is not changed, just the default (unless I made a 
mistake in the patch).


> Meta prefetching is slower than doing multiple meta lookups
> ---
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Priority: Critical
> Fix For: 0.94.2
>
> Attachments: 6841-0.94.txt
>
>
> I got myself into a situation where I needed to truncate a massive table 
> while it was getting hits and surprisingly the clients were not recovering. 
> What I see in the logs is that every time we prefetch .META. we set up a new 
> HConnection because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459952#comment-13459952
 ] 

stack commented on HBASE-6841:
--

It's kinda ugly that we have setRegionCachePrefetch up in HTable...  Is the 
meaning of the enable flag in this public API changed by this patch?  Patch 
looks good otherwise.



> Meta prefetching is slower than doing multiple meta lookups
> ---
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Priority: Critical
> Fix For: 0.94.2
>
> Attachments: 6841-0.94.txt
>
>
> I got myself into a situation where I needed to truncate a massive table 
> while it was getting hits and surprisingly the clients were not recovering. 
> What I see in the logs is that every time we prefetch .META. we set up a new 
> HConnection because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread Lars Hofhansl (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Lars Hofhansl updated HBASE-6841:
-

Attachment: 6841-0.94.txt

Trivial patch.
For 0.94. Looks like trunk has the same problem.

> Meta prefetching is slower than doing multiple meta lookups
> ---
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Priority: Critical
> Fix For: 0.94.2
>
> Attachments: 6841-0.94.txt
>
>
> I got myself into a situation where I needed to truncate a massive table 
> while it was getting hits and surprisingly the clients were not recovering. 
> What I see in the logs is that every time we prefetch .META. we set up a new 
> HConnection because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6847) HBASE-6649 broke replication

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459924#comment-13459924
 ] 

Hudson commented on HBASE-6847:
---

Integrated in HBase-0.94 #476 (See 
[https://builds.apache.org/job/HBase-0.94/476/])
HBASE-6847  HBASE-6649 broke replication (Devaraj Das via JD) (Revision 
1388160)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java


> HBASE-6649 broke replication
> 
>
> Key: HBASE-6847
> URL: https://issues.apache.org/jira/browse/HBASE-6847
> Project: HBase
>  Issue Type: Bug
>Reporter: Jean-Daniel Cryans
>Assignee: Devaraj Das
>Priority: Blocker
> Fix For: 0.96.0, 0.92.3, 0.94.2
>
> Attachments: HBASE-6847-0.94.patch, HBASE-6847.patch
>
>
> After running with HBASE-6646 and replication enabled I encountered this:
> {noformat}
> 2012-09-17 20:04:08,111 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening 
> log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78617132
> 2012-09-17 20:04:08,120 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Break on 
> IOE: 
> hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318,
>  entryStart=78641557, pos=78771200, end=78771200, edit=84
> 2012-09-17 20:04:08,120 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> currentNbOperations:164529 and seenEntries:84 and size: 154068
> 2012-09-17 20:04:08,120 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> Replicating 84
> 2012-09-17 20:04:08,146 INFO 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: 
> Going to report log #va1r3s24%2C10304%2C1347911704238.1347911706318 for 
> position 78771200 in 
> hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318
> 2012-09-17 20:04:08,158 INFO 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: 
> Removing 0 logs in the list: []
> 2012-09-17 20:04:08,158 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> Replicated in total: 93234
> 2012-09-17 20:04:08,158 DEBUG 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening 
> log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78771200
> 2012-09-17 20:04:08,163 ERROR 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: 
> Unexpected exception in ReplicationSource, 
> currentPath=hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318
> java.lang.IndexOutOfBoundsException
> at java.io.DataInputStream.readFully(DataInputStream.java:175)
> at 
> org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
> at 
> org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2001)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1901)
> at 
> org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947)
> at 
> org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:235)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:394)
> at 
> org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:307)
> {noformat}
> There's something weird at the end of the file and it's killing replication. 
> We used to just retry.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459923#comment-13459923
 ] 

Hudson commented on HBASE-6649:
---

Integrated in HBase-0.94 #476 (See 
[https://builds.apache.org/job/HBase-0.94/476/])
HBASE-6847  HBASE-6649 broke replication (Devaraj Das via JD) (Revision 
1388160)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java


> [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
> ---
>
> Key: HBASE-6649
> URL: https://issues.apache.org/jira/browse/HBASE-6649
> Project: HBase
>  Issue Type: Bug
>Reporter: Devaraj Das
>Assignee: Devaraj Das
>Priority: Blocker
> Fix For: 0.96.0, 0.92.3, 0.94.2
>
> Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, 
> 6649-fix-io-exception-handling-1.patch, 
> 6649-fix-io-exception-handling-1-trunk.patch, 
> 6649-fix-io-exception-handling.patch, 6649-trunk.patch, 6649-trunk.patch, 
> 6649.txt, HBase-0.92 #495 test - queueFailover [Jenkins].html, HBase-0.92 
> #502 test - queueFailover [Jenkins].html
>
>
> Have seen it twice in the recent past: http://bit.ly/MPCykB & 
> http://bit.ly/O79Dq7 .. 
> Looking briefly at the logs hints at a pattern - in both the failed test 
> instances, there was an RS crash while the test was running.



[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459917#comment-13459917
 ] 

Hudson commented on HBASE-6649:
---

Integrated in HBase-TRUNK #3360 (See 
[https://builds.apache.org/job/HBase-TRUNK/3360/])
HBASE-6847  HBASE-6649 broke replication (Devaraj Das via JD) (Revision 
1388161)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java





[jira] [Commented] (HBASE-6847) HBASE-6649 broke replication

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459918#comment-13459918
 ] 

Hudson commented on HBASE-6847:
---

Integrated in HBase-TRUNK #3360 (See 
[https://builds.apache.org/job/HBase-TRUNK/3360/])
HBASE-6847  HBASE-6649 broke replication (Devaraj Das via JD) (Revision 
1388161)

 Result = FAILURE
jdcryans : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java





[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread Lars Hofhansl (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459910#comment-13459910
 ] 

Lars Hofhansl commented on HBASE-6841:
--

I'd be OK with that. I'd also be worried that this is just a symptom. 
HConnectionManager.getConnection(...) and HConnection.close() should just do 
some reference counting rather than actually creating/destroying connections, 
which means sometimes we're coming in there with a new Configuration every 
time...? And even that should be handled by the Configuration equivalence code 
we're using now.
So if, in this case, we'd remove prefetching, we'd still have the expensive 
connection setup every time.

Then again, and just to state the obvious, prefetching is only useful for 
long-lived connections, and then only if those connections actually use a 
largish portion of the prefetched entries (otherwise we're doing a lot of 
unnecessary caching work, and wasting memory).

Let's just disable it by default. I guess we'd do that by reversing the meaning 
(and name) of regionCachePrefetchDisabledTables. Happy to make a patch if you 
folks agree.
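The reference-counting scheme described above can be sketched as follows. This is a minimal illustration only: `RefCountedPool`, the string "Configuration keys", and the string connections are hypothetical stand-ins, not HBase classes or HBase's actual caching logic.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative reference-counted connection cache: getConnection bumps a
// count for an equivalent-Configuration key, and close only tears the
// connection down when the last reference goes away. Hypothetical sketch,
// not HConnectionManager's real implementation.
public class RefCountedPool {
  static final class Entry {
    int refs;                 // how many callers currently hold this connection
    final String conn;        // stand-in for a real connection object
    Entry(String c) { conn = c; }
  }

  private final Map<String, Entry> cache = new HashMap<>();

  synchronized String getConnection(String confKey) {
    // Reuse the cached connection for an equivalent Configuration; create once.
    Entry e = cache.computeIfAbsent(confKey, k -> new Entry("conn-to-" + k));
    e.refs++;
    return e.conn;
  }

  // Returns true only when the connection was actually destroyed.
  synchronized boolean close(String confKey) {
    Entry e = cache.get(confKey);
    if (e == null) return false;
    if (--e.refs == 0) {      // last reference: really tear it down
      cache.remove(confKey);
      return true;
    }
    return false;             // still shared by other callers
  }

  public static void main(String[] args) {
    RefCountedPool pool = new RefCountedPool();
    pool.getConnection("zk-quorum-A");
    pool.getConnection("zk-quorum-A");              // same config: reuse, bump count
    System.out.println(pool.close("zk-quorum-A"));  // false: still referenced
    System.out.println(pool.close("zk-quorum-A"));  // true: last ref, torn down
  }
}
```

With this shape, a client that opens and closes a connection per meta lookup would keep hitting the cached entry instead of paying full connection setup each time.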


> Meta prefetching is slower than doing multiple meta lookups
> ---
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
>  Issue Type: Improvement
>Reporter: Jean-Daniel Cryans
>Priority: Critical
> Fix For: 0.94.2
>
>
> I got myself into a situation where I needed to truncate a massive table 
> while it was getting hits and surprisingly the clients were not recovering. 
> What I see in the logs is that every time we prefetch .META. we setup a new 
> HConnection because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.



[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups

2012-09-20 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459893#comment-13459893
 ] 

Andrew Purtell commented on HBASE-6841:
---

Given J-D's observations, perhaps we should default meta prefetch to off, like 
Stack suggests. 

Some time ago I patched our private Frankenbase to allow disabling meta 
prefetch on a per-table basis. This was because meta prefetch was causing 
heap-limited MR clients to OOME, and for that particular application table 
prefetch wasn't helpful.





[jira] [Commented] (HBASE-6848) Make hbase-hadoop-compat findbugs clean

2012-09-20 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459891#comment-13459891
 ] 

Hadoop QA commented on HBASE-6848:
--

-1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12545942/HBASE-6848-0.patch
  against trunk revision .

+1 @author.  The patch does not contain any @author tags.

-1 tests included.  The patch doesn't appear to include any new or modified 
tests.
Please justify why no new tests are needed for this 
patch.
Also please list what manual steps were performed to 
verify this patch.

+1 hadoop2.0.  The patch compiles against the hadoop 2.0 profile.

-1 javadoc.  The javadoc tool appears to have generated 139 warning 
messages.

+1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

-1 findbugs.  The patch appears to introduce 7 new Findbugs (version 1.3.9) 
warnings.

+1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

 -1 core tests.  The patch failed these unit tests:
 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2908//testReport/
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2908//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2908//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2908//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2908//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2908//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/2908//console

This message is automatically generated.

> Make hbase-hadoop-compat findbugs clean
> ---
>
> Key: HBASE-6848
> URL: https://issues.apache.org/jira/browse/HBASE-6848
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-6848-0.patch
>
>
> There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, 
> and hbase-hadoop2-compat.  Let's fix these up; since these are new modules it 
> would be nice to keep them at 0 findbugs errors.



[jira] [Commented] (HBASE-6677) Random ZooKeeper port in test can overrun max port

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459878#comment-13459878
 ] 

Hudson commented on HBASE-6677:
---

Integrated in HBase-TRUNK #3359 (See 
[https://builds.apache.org/job/HBase-TRUNK/3359/])
HBASE-6677 Random ZooKeeper port in test can overrun max port (Revision 
1388125)

 Result = FAILURE
stack : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java


> Random ZooKeeper port in test can overrun max port
> --
>
> Key: HBASE-6677
> URL: https://issues.apache.org/jira/browse/HBASE-6677
> Project: HBase
>  Issue Type: Bug
>  Components: test
>Affects Versions: 0.96.0
>Reporter: Gregory Chanan
>Assignee: liang xie
>Priority: Trivial
>  Labels: noob
> Fix For: 0.96.0
>
> Attachments: HBASE-6677.patch
>
>
> {code} 
>  while (true) {
> try {
>   standaloneServerFactory = new NIOServerCnxnFactory();
>   standaloneServerFactory.configure(
> new InetSocketAddress(tentativePort),
> configuration.getInt(HConstants.ZOOKEEPER_MAX_CLIENT_CNXNS,
>   1000));
> } catch (BindException e) {
>   LOG.debug("Failed binding ZK Server to client port: " +
>   tentativePort);
>   // This port is already in use, try to use another.
>   tentativePort++;
>   continue;
> }
> break;
>   }
> {code}
> In the case of failure, when all the above ports have already been bound, you 
> can extend past the max port.  Need to check against a max value.
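A hedged sketch of the bounded retry the report asks for: instead of incrementing the tentative port forever, stop at a maximum and fail explicitly. The `findFreePort` helper and the occupied-port simulation below are illustrative only, not the actual HBASE-6677 patch (which changes MiniZooKeeperCluster).

```java
import java.util.Set;

public class PortProbe {
  // Bounded version of the retry loop: advance past occupied ports, but stop
  // at maxPort instead of overrunning the valid port range. The occupied set
  // stands in for BindException from NIOServerCnxnFactory.configure().
  static int findFreePort(int start, int maxPort, Set<Integer> occupied) {
    for (int port = start; port <= maxPort; port++) {
      if (!occupied.contains(port)) {
        return port;  // where configure() would succeed
      }
      // port already in use: try the next one, but only up to maxPort
    }
    return -1;  // every candidate taken: caller must fail instead of looping on
  }

  public static void main(String[] args) {
    System.out.println(findFreePort(100, 105, Set.of(100, 101)));  // 102
    System.out.println(findFreePort(100, 101, Set.of(100, 101)));  // -1
  }
}
```

The caller would translate the `-1` into an exception (or a test failure) rather than silently probing ports past 65535.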



[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation

2012-09-20 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459877#comment-13459877
 ] 

Hudson commented on HBASE-6698:
---

Integrated in HBase-TRUNK #3359 (See 
[https://builds.apache.org/job/HBase-TRUNK/3359/])
HBASE-6698 Refactor checkAndPut and checkAndDelete to use 
doMiniBatchMutation
(Priya)

Submitted by: Priya
Reviewed by: Ram, Stack, Ted, Lars (Revision 1388141)

 Result = FAILURE
ramkrishna : 
Files : 
* 
/hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* 
/hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java


> Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
> --
>
> Key: HBASE-6698
> URL: https://issues.apache.org/jira/browse/HBASE-6698
> Project: HBase
>  Issue Type: Improvement
>Reporter: ramkrishna.s.vasudevan
> Fix For: 0.96.0
>
> Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, 
> HBASE-6698_3.patch, HBASE-6698_5.patch, HBASE-6698_6.patch, 
> HBASE-6698_6.patch, HBASE-6698_6.patch, HBASE-6698_6.patch, 
> HBASE-6698_7.patch, HBASE-6698_8.patch, HBASE-6698_8.patch, 
> HBASE-6698_8.patch, HBASE-6698.patch
>
>
> Currently the checkAndPut and checkAndDelete APIs internally call 
> internalPut and internalDelete.  Maybe we can just call doMiniBatchMutation
> only.  This will help in the future: if we have some hooks and the CP
> handles certain cases in doMiniBatchMutation, the same can be done while
> doing a put through checkAndPut or a delete through checkAndDelete.



[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2

2012-09-20 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459866#comment-13459866
 ] 

Elliott Clark commented on HBASE-6410:
--

Since this will probably end up being a larger patch, I'm going to try to keep 
all the work on GitHub: 
https://github.com/elliottneilclark/hbase/tree/HBASE-6410

> Move RegionServer Metrics to metrics2
> -
>
> Key: HBASE-6410
> URL: https://issues.apache.org/jira/browse/HBASE-6410
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Blocker
> Attachments: HBASE-6410-1.patch, HBASE-6410.patch
>
>
> Move RegionServer Metrics to metrics2



[jira] [Updated] (HBASE-6410) Move RegionServer Metrics to metrics2

2012-09-20 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-6410:
-

Assignee: Elliott Clark  (was: Alex Baranau)




[jira] [Resolved] (HBASE-6847) HBASE-6649 broke replication

2012-09-20 Thread Jean-Daniel Cryans (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jean-Daniel Cryans resolved HBASE-6847.
---

  Resolution: Fixed
Hadoop Flags: Reviewed

Committed to 0.92, 0.94 and trunk. Thanks Devaraj!




[jira] [Updated] (HBASE-6849) Make StochasticLoadBalancer the default

2012-09-20 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-6849:
-

Attachment: HBASE-6849-0.patch

* Made the StochasticLoadBalancer the default.
* Turned off by-table load balancing since that messes up the 
StochasticLoadBalancer.

> Make StochasticLoadBalancer the default
> ---
>
> Key: HBASE-6849
> URL: https://issues.apache.org/jira/browse/HBASE-6849
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 0.96.0
>
> Attachments: HBASE-6849-0.patch
>
>




[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2

2012-09-20 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459851#comment-13459851
 ] 

Elliott Clark commented on HBASE-6410:
--

Yep, I should have some time for this.  Thanks for all of the work; it's really 
in a good place.


> Move RegionServer Metrics to metrics2
> -
>
> Key: HBASE-6410
> URL: https://issues.apache.org/jira/browse/HBASE-6410
> Project: HBase
>  Issue Type: Sub-task
>  Components: metrics
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Alex Baranau
>Priority: Blocker
> Attachments: HBASE-6410-1.patch, HBASE-6410.patch
>
>
> Move RegionServer Metrics to metrics2



[jira] [Commented] (HBASE-6848) Make hbase-hadoop-compat findbugs clean

2012-09-20 Thread Elliott Clark (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459849#comment-13459849
 ] 

Elliott Clark commented on HBASE-6848:
--

Yes that's a bug fix that I found through findbugs.  Findbugs was alerting that 
ritOldestAgeGauge wasn't being used.




[jira] [Updated] (HBASE-6410) Move RegionServer Metrics to metrics2

2012-09-20 Thread Alex Baranau (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alex Baranau updated HBASE-6410:


Attachment: HBASE-6410-1.patch

Updated patch with respect to the latest changes in common classes (and the 
fixed HBASE-6501), which look good. With factories for metrics sources this 
looks closer to what I suggested during the initial discussion.

Anyhow, this is what's left:
* replacing (i.e. removing) the old RS Metrics classes
* adding more metrics in the new RS MetricsSource

Elliott, if you have time for the above and can complete it as part of this 
JIRA issue, feel free to take it from here. I definitely don't want to be a 
stopper.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Commented] (HBASE-6848) Make hbase-hadoop-compat findbugs clean

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459846#comment-13459846
 ] 

stack commented on HBASE-6848:
--

This intended:

{code}
-ritCountOverThresholdGauge.set(ritCount);
+ritOldestAgeGauge.set(ritCount);
{code}

Here too...

{code}
-ritCountOverThresholdGauge.set(ritCount);
+ritOldestAgeGauge.set(ritCount);
{code}

Looks like bug fix?

Else patch looks good.




[jira] [Updated] (HBASE-6849) Make StochasticLoadBalancer the default

2012-09-20 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-6849:
-

Component/s: master





[jira] [Updated] (HBASE-6849) Make StochasticLoadBalancer the default

2012-09-20 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-6849:
-

Fix Version/s: 0.96.0

> Make StochasticLoadBalancer the default
> ---
>
> Key: HBASE-6849
> URL: https://issues.apache.org/jira/browse/HBASE-6849
> Project: HBase
>  Issue Type: Improvement
>  Components: master
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 0.96.0
>
>




[jira] [Commented] (HBASE-6789) Convert test CoprocessorProtocol implementations to protocol buffer services

2012-09-20 Thread Andrew Purtell (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459842#comment-13459842
 ] 

Andrew Purtell commented on HBASE-6789:
---

I'd be +1 on dropping CoprocessorProtocol from 0.96 and up, given all of the 
other (deliberate) incompatibilities already introduced in RPC going from 0.94 
to 0.96 and up.

> Convert test CoprocessorProtocol implementations to protocol buffer services
> 
>
> Key: HBASE-6789
> URL: https://issues.apache.org/jira/browse/HBASE-6789
> Project: HBase
>  Issue Type: Sub-task
>  Components: coprocessors
>Reporter: Gary Helmling
> Fix For: 0.96.0
>
>
> With coprocessor endpoints now exposed as protobuf defined services, we 
> should convert over all of our built-in endpoints to PB services.
> Several CoprocessorProtocol implementations are defined for tests:
> * ColumnAggregationProtocol
> * GenericProtocol
> * TestServerCustomProtocol.PingProtocol
> These should either be converted to PB services or removed if they duplicate 
> other tests/are no longer necessary.



[jira] [Commented] (HBASE-6524) Hooks for hbase tracing

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459838#comment-13459838
 ] 

stack commented on HBASE-6524:
--

OK if I integrate your doc into the book?

> Hooks for hbase tracing
> ---
>
> Key: HBASE-6524
> URL: https://issues.apache.org/jira/browse/HBASE-6524
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Jonathan Leavitt
>Assignee: Jonathan Leavitt
> Fix For: 0.96.0
>
> Attachments: 6524.addendum, 6524-v2.txt, 6524v3.txt, 
> createTableTrace.png, hbase-6524.diff
>
>
> Includes the hooks that use [htrace|http://www.github.com/cloudera/htrace] 
> library to add dapper-like tracing to hbase.



[jira] [Updated] (HBASE-6848) Make hbase-hadoop-compat findbugs clean

2012-09-20 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-6848:
-

Status: Patch Available  (was: Open)

> Make hbase-hadoop-compat findbugs clean
> ---
>
> Key: HBASE-6848
> URL: https://issues.apache.org/jira/browse/HBASE-6848
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-6848-0.patch
>
>
> There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, 
> and hbase-hadoop2-compat.  Let's fix these up; since these are new modules it 
> would be nice to keep them at 0 findbugs errors.



[jira] [Created] (HBASE-6849) Make StochasticLoadBalancer the default

2012-09-20 Thread Elliott Clark (JIRA)
Elliott Clark created HBASE-6849:


 Summary: Make StochasticLoadBalancer the default
 Key: HBASE-6849
 URL: https://issues.apache.org/jira/browse/HBASE-6849
 Project: HBase
  Issue Type: Improvement
Reporter: Elliott Clark
Assignee: Elliott Clark






[jira] [Commented] (HBASE-6702) ResourceChecker refinement

2012-09-20 Thread stack (JIRA)

[ 
https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459841#comment-13459841
 ] 

stack commented on HBASE-6702:
--

Thanks [~nkeywal]

> ResourceChecker refinement
> --
>
> Key: HBASE-6702
> URL: https://issues.apache.org/jira/browse/HBASE-6702
> Project: HBase
>  Issue Type: Improvement
>  Components: test
>Affects Versions: 0.96.0
>Reporter: Jesse Yates
>Priority: Critical
> Fix For: 0.96.0
>
>
> This was based on some discussion from HBASE-6234.
> The ResourceChecker was added by N. Keywal to help resolve some Hadoop QA 
> issues, but has since not been widely used. Further, with modularization we 
> have had to drop the ResourceChecker from the tests that were moved into the 
> hbase-common module, because bringing the ResourceChecker up to hbase-common 
> would involve bringing in all of its dependencies (which are quite far-reaching).
> The question then is: what should we do with it? Get rid of it? Refactor and 
> reuse? 



[jira] [Updated] (HBASE-6848) Make hbase-hadoop-compat findbugs clean

2012-09-20 Thread Elliott Clark (JIRA)

 [ 
https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Elliott Clark updated HBASE-6848:
-

Attachment: HBASE-6848-0.patch

I don't think that any of the issues were very big, but it's always nice to 
keep things clean.

> Make hbase-hadoop-compat findbugs clean
> ---
>
> Key: HBASE-6848
> URL: https://issues.apache.org/jira/browse/HBASE-6848
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 0.96.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
>Priority: Minor
> Fix For: 0.96.0
>
> Attachments: HBASE-6848-0.patch
>
>
> There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, 
> and hbase-hadoop2-compat.  Lets fix these up; since these are new modules it 
> would be nice to keep them with 0 findbugs errors.


