[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460280#comment-13460280 ] Cheng Hao commented on HBASE-6852: -- Lars, the only place SchemaMetrics uses the ConcurrentMap is tableAndFamilyToMetrics. In this patch, I pre-create an array of AtomicLong for all of the possible on-cache-hit metrics items, which avoids the concurrency issue and is easy to index on access. Thanks stack and Lars for the suggestions; I will create another patch file instead. > SchemaMetrics.updateOnCacheHit costs too much while full scanning a table > with all of its fields > > > Key: HBASE-6852 > URL: https://issues.apache.org/jira/browse/HBASE-6852 > Project: HBase > Issue Type: Improvement > Components: metrics > Affects Versions: 0.94.0 > Reporter: Cheng Hao > Priority: Minor > Labels: performance > Fix For: 0.94.2, 0.96.0 > > Attachments: onhitcache-trunk.patch > > > The SchemaMetrics.updateOnCacheHit costs too much while I am doing the full > table scanning. 
> Here is the top 5 hotspots within regionserver while full scanning a table: > (Sorry for the less-well-format) > CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit > mask of 0x00 (No unit mask) count 500 > samples % image name symbol name > --- > 98447 13.4324 14033.jo void > org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, > boolean) > 98447 100.000 14033.jo void > org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, > boolean) [self] > --- > 45814 6.2510 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, > byte[], int, int) > 45814 100.000 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, > byte[], int, int) [self] > --- > 43523 5.9384 14033.jo boolean > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) > 43523 100.000 14033.jo boolean > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) > [self] > --- > 42548 5.8054 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, > byte[], int, int) > 42548 100.000 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, > byte[], int, int) [self] > --- > 40572 5.5358 14033.jo int > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], > int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 > 40572 100.000 14033.jo int > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], > int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self] -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
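The array-of-counters approach Cheng Hao describes above (pre-creating an AtomicLong per possible on-cache-hit metric so the hot path indexes a plain array instead of doing a ConcurrentMap lookup) can be sketched roughly as follows. The class and method names and the number of block categories here are assumptions for illustration, not the actual patch:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch: one pre-created AtomicLong per
// (blockCategory, isCompaction) combination, indexed by simple
// arithmetic on the hot path. No map lookup, no putIfAbsent.
class CacheHitCounters {
    private static final int NUM_BLOCK_CATEGORIES = 8; // assumed count

    private final AtomicLong[] counters =
        new AtomicLong[NUM_BLOCK_CATEGORIES * 2];

    CacheHitCounters() {
        // Pre-create every slot up front so the hot path never allocates.
        for (int i = 0; i < counters.length; i++) {
            counters[i] = new AtomicLong();
        }
    }

    /** Hot path: a single array index plus one atomic increment. */
    void updateOnCacheHit(int blockCategoryOrdinal, boolean isCompaction) {
        int idx = blockCategoryOrdinal * 2 + (isCompaction ? 1 : 0);
        counters[idx].incrementAndGet();
    }

    long get(int blockCategoryOrdinal, boolean isCompaction) {
        return counters[blockCategoryOrdinal * 2 + (isCompaction ? 1 : 0)].get();
    }
}
```

Because every slot exists before any reader or writer touches the array, there is no publication race to worry about, which is the "avoids the concurrency issue" part of the comment.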
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460275#comment-13460275 ] liang xie commented on HBASE-6852: -- Hi Cheng, for the running time, could you exclude the system resource factor? E.g. you ran the original version with many physical IOs, but reran the patched version without similar physical IO requests because it hit the OS page cache. In other words, can the reduced running time always be reproduced, even if you run the patched version first and then rerun the original version? It'd be better if you issue "echo 1 > /proc/sys/vm/drop_caches" to free the page cache between each test.
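liang xie's A/B methodology above can be scripted. This is a hedged sketch: run_scan is a placeholder for the actual full-table scan benchmark, and the page-cache drop is skipped gracefully when not running as root (the real test would run it as root every time):

```shell
#!/bin/sh
# Placeholder for the real full-table-scan benchmark command.
run_scan() {
    echo "scanning with $1 build"
}

for build in patched original; do
    sync                                   # flush dirty pages to disk first
    # Freeing the OS page cache requires root; skip if not writable.
    if [ -w /proc/sys/vm/drop_caches ]; then
        echo 1 > /proc/sys/vm/drop_caches
    fi
    run_scan "$build"
done
```

Running both orderings (patched first, then original first) with a cold cache each time rules out the warm-cache effect liang xie is worried about.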
[jira] [Commented] (HBASE-6806) HBASE-4658 breaks backward compatibility / example scripts
[ https://issues.apache.org/jira/browse/HBASE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460271#comment-13460271 ] stack commented on HBASE-6806: -- Hmm... it puts the commit in all issues referenced by the commit message, here and HBASE-4658 > HBASE-4658 breaks backward compatibility / example scripts > -- > > Key: HBASE-6806 > URL: https://issues.apache.org/jira/browse/HBASE-6806 > Project: HBase > Issue Type: Bug > Components: Thrift >Affects Versions: 0.94.0 >Reporter: Lukas > Fix For: 0.96.0 > > Attachments: HBASE-6806-fix-examples.diff > > > HBASE-4658 introduces the new 'attributes' argument as a non optional > parameter. This is not backward compatible and also breaks the code in the > example section. Resolution: Mark as 'optional'
[jira] [Commented] (HBASE-4658) Put attributes are not exposed via the ThriftServer
[ https://issues.apache.org/jira/browse/HBASE-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460269#comment-13460269 ] stack commented on HBASE-4658: -- The above comment from Hudson is in the wrong place. The parser found the second HBase JIRA referenced in the commit message, which is this one rather than HBASE-6806. > Put attributes are not exposed via the ThriftServer > --- > > Key: HBASE-4658 > URL: https://issues.apache.org/jira/browse/HBASE-4658 > Project: HBase > Issue Type: Bug > Components: Thrift >Reporter: dhruba borthakur >Assignee: dhruba borthakur > Fix For: 0.94.0 > > Attachments: ASF.LICENSE.NOT.GRANTED--D1563.1.patch, > ASF.LICENSE.NOT.GRANTED--D1563.1.patch, > ASF.LICENSE.NOT.GRANTED--D1563.1.patch, > ASF.LICENSE.NOT.GRANTED--D1563.2.patch, > ASF.LICENSE.NOT.GRANTED--D1563.2.patch, > ASF.LICENSE.NOT.GRANTED--D1563.2.patch, > ASF.LICENSE.NOT.GRANTED--D1563.3.patch, > ASF.LICENSE.NOT.GRANTED--D1563.3.patch, > ASF.LICENSE.NOT.GRANTED--D1563.3.patch, ThriftPutAttributes1.txt > > > The Put api also takes in a bunch of arbitrary attributes that an application > can use to associate metadata with each put operation. This is not exposed > via Thrift.
[jira] [Commented] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
[ https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460266#comment-13460266 ] Hadoop QA commented on HBASE-6299: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12546000/6299v4.txt against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. -1 javadoc. The javadoc tool appears to have generated 139 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2912//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2912//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2912//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2912//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2912//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2912//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2912//console This message is automatically generated. > RS starts region open while fails ack to HMaster.sendRegionOpen() causes > inconsistency in HMaster's region state and a series of successive problems. > - > > Key: HBASE-6299 > URL: https://issues.apache.org/jira/browse/HBASE-6299 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.90.6, 0.94.0 >Reporter: Maryann Xue >Assignee: Maryann Xue >Priority: Critical > Fix For: 0.92.3, 0.94.3, 0.96.0 > > Attachments: 6299v4.txt, HBASE-6299.patch, HBASE-6299-v2.patch, > HBASE-6299-v3.patch > > > 1. HMaster tries to assign a region to an RS. > 2. HMaster creates a RegionState for this region and puts it into > regionsInTransition. > 3. In the first assign attempt, HMaster calls RS.openRegion(). The RS > receives the open region request and starts to proceed, with success > eventually. However, due to network problems, HMaster fails to receive the > response for the openRegion() call, and the call times out. > 4. 
HMaster attempts to assign a second time, choosing another RS. > 5. But since the HMaster's OpenedRegionHandler has been triggered by the > region open of the previous RS, and the RegionState has already been removed > from regionsInTransition, HMaster finds the unassigned ZK > node "RS_ZK_REGION_OPENING" updated by the second attempt invalid and ignores it. > 6. The unassigned ZK node stays and a later unassign fails because > RS_ZK_REGION_CLOSING cannot be created. > {code} > 2012-06-29 07:03:38,870 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for > region > CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.; > > plan=hri=CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568., > src=swbss-hadoop-004,60020,1340890123243, > dest=swbss-hadoop-006,60020,1340890678078 > 2012-06-29 07:03:38,870 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Assigning region > CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. > to swbss-hadoop-006,60020,1340890678078 > 2012-06-29 07:03:38,870 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Handling > transition=M_ZK_REGION_OFFLINE, server=swbss-hadoop-002:6, > region=b713fd655fa02395496c5a6e39ddf568 > 2012-06-29 07:06:28,882 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Handling > transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,13408906
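The race in steps 1-6 of the issue description can be condensed into a toy model: once the first (timed-out but actually successful) open removes the region from regionsInTransition, the master has no RegionState left to match against, so it ignores the ZK transition from the second assignment attempt. This is purely illustrative and not HBase's real AssignmentManager; all names are made up:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the reported inconsistency, not real HBase code.
class ToyAssignmentManager {
    private final Map<String, String> regionsInTransition = new HashMap<>();

    /** Step 2: master records the region as in transition. */
    void startAssign(String region) {
        regionsInTransition.put(region, "PENDING_OPEN");
    }

    /** Step 3/5: first RS finished opening despite the RPC timeout,
     *  so OpenedRegionHandler removes the region from RIT. */
    void onRegionOpened(String region) {
        regionsInTransition.remove(region);
    }

    /** Returns false when the transition is ignored (region not in RIT),
     *  which is what happens to the second attempt's RS_ZK_REGION_OPENING. */
    boolean handleOpeningTransition(String region) {
        if (!regionsInTransition.containsKey(region)) {
            return false; // ignored: ZK node is left dangling (step 6)
        }
        regionsInTransition.put(region, "OPENING");
        return true;
    }
}
```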
[jira] [Commented] (HBASE-6524) Hooks for hbase tracing
[ https://issues.apache.org/jira/browse/HBASE-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460265#comment-13460265 ] stack commented on HBASE-6524: -- Committed the doc. as appendix I in the manual. Will show next time I push the doc. Thanks Jonathan. > Hooks for hbase tracing > --- > > Key: HBASE-6524 > URL: https://issues.apache.org/jira/browse/HBASE-6524 > Project: HBase > Issue Type: Sub-task >Reporter: Jonathan Leavitt >Assignee: Jonathan Leavitt > Fix For: 0.96.0 > > Attachments: 6524.addendum, 6524-v2.txt, 6524v3.txt, > createTableTrace.png, hbase-6524.diff > > > Includes the hooks that use [htrace|http://www.github.com/cloudera/htrace] > library to add dapper-like tracing to hbase.
[jira] [Commented] (HBASE-4658) Put attributes are not exposed via the ThriftServer
[ https://issues.apache.org/jira/browse/HBASE-4658?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460259#comment-13460259 ] Hudson commented on HBASE-4658: --- Integrated in HBase-TRUNK #3363 (See [https://builds.apache.org/job/HBase-TRUNK/3363/]) HBASE-6806 HBASE-4658 breaks backward compatibility / example scripts (Revision 1388318) Result = FAILURE stack : Files : * /hbase/trunk/examples/thrift/DemoClient.cpp * /hbase/trunk/examples/thrift/DemoClient.java * /hbase/trunk/examples/thrift/DemoClient.php * /hbase/trunk/examples/thrift/DemoClient.pl * /hbase/trunk/examples/thrift/DemoClient.py * /hbase/trunk/examples/thrift/DemoClient.rb * /hbase/trunk/examples/thrift/Makefile
[jira] [Commented] (HBASE-6806) HBASE-4658 breaks backward compatibility / example scripts
[ https://issues.apache.org/jira/browse/HBASE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460258#comment-13460258 ] Hudson commented on HBASE-6806: --- Integrated in HBase-TRUNK #3363 (See [https://builds.apache.org/job/HBase-TRUNK/3363/]) HBASE-6806 HBASE-4658 breaks backward compatibility / example scripts (Revision 1388318) Result = FAILURE stack : Files : * /hbase/trunk/examples/thrift/DemoClient.cpp * /hbase/trunk/examples/thrift/DemoClient.java * /hbase/trunk/examples/thrift/DemoClient.php * /hbase/trunk/examples/thrift/DemoClient.pl * /hbase/trunk/examples/thrift/DemoClient.py * /hbase/trunk/examples/thrift/DemoClient.rb * /hbase/trunk/examples/thrift/Makefile
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460253#comment-13460253 ] Lars Hofhansl commented on HBASE-6852: -- Interesting. Thanks Cheng. I wonder what causes the performance problem then. Is it the get/putIfAbsent of the ConcurrentMap we store the metrics in? I'd probably feel better if you set the threshold to 100 (instead of 2000) - you'd still reduce the time used there by 99%. Also looking at the places where updateOnCacheHit is called... We also increment an AtomicLong (cacheHits), which is never read (WTF). We should remove that counter while we're at it (even when AtomicLongs are not the problem).
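Lars's threshold suggestion above (flush to the shared counter every 100 hits instead of every 2000) amounts to buffering increments cheaply and only paying for the atomic operation occasionally. A minimal sketch, with illustrative names and a thread-local buffer as one possible implementation:

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of a thresholded counter: each thread buffers hits in a plain
// int and only touches the shared AtomicLong every FLUSH_THRESHOLD hits.
// With a threshold of 100 the shared counter is updated 1% as often, at
// the cost of up to (FLUSH_THRESHOLD - 1) unreported hits per thread.
class ThresholdedCounter {
    static final int FLUSH_THRESHOLD = 100;

    private final AtomicLong shared = new AtomicLong();
    private final ThreadLocal<int[]> pending =
        ThreadLocal.withInitial(() -> new int[1]);

    void increment() {
        int[] buf = pending.get();
        if (++buf[0] >= FLUSH_THRESHOLD) {
            shared.addAndGet(buf[0]); // one atomic op per 100 hits
            buf[0] = 0;
        }
    }

    /** Value seen by metrics readers; may lag by < FLUSH_THRESHOLD per thread. */
    long get() {
        return shared.get();
    }
}
```

This is the trade-off the thread keeps circling: the metric becomes slightly stale, which matters only if the counters must be exact at read time.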
[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460254#comment-13460254 ] Lars Hofhansl commented on HBASE-6841: -- Haven't been able to track down that test failure yet. It shouldn't happen, but somehow it does. @J-D: Since this is (presumably) a long-standing condition, how do you feel about moving this to 0.94.3? > Meta prefetching is slower than doing multiple meta lookups > --- > > Key: HBASE-6841 > URL: https://issues.apache.org/jira/browse/HBASE-6841 > Project: HBase > Issue Type: Improvement >Reporter: Jean-Daniel Cryans >Assignee: Lars Hofhansl >Priority: Critical > Fix For: 0.94.2 > > Attachments: 6841-0.94.txt, 6841-0.96.txt > > > I got myself into a situation where I needed to truncate a massive table > while it was getting hits and surprisingly the clients were not recovering. > What I see in the logs is that every time we prefetch .META. we set up a new > HConnection because we close it on the way out. It's awfully slow. > We should just turn it off or make it useful. jstacks coming up.
[jira] [Commented] (HBASE-6524) Hooks for hbase tracing
[ https://issues.apache.org/jira/browse/HBASE-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460247#comment-13460247 ] Jonathan Leavitt commented on HBASE-6524: - Sounds good. :)
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460244#comment-13460244 ] stack commented on HBASE-6852: -- bq. Do we have to think about this generally? How perfect do these metrics have to be? In 0.94 we started recording way more than previously. I like your question on how perfect they need to be. For metrics frequently updated by 1, my guess is we could miss a few. Why are we using AtomicLongs anyway and not cliffclick's high-scale lib... it's in our CLASSPATH...
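The "high scale lib" stack mentions is Cliff Click's high-scale-lib, whose counters stripe increments across multiple cells to avoid the CAS contention of a single AtomicLong under many threads. The JDK's LongAdder (Java 8+) implements the same idea and is used here as a stand-in sketch; the class and method names are illustrative:

```java
import java.util.concurrent.atomic.LongAdder;

// Striped counter sketch: contended threads land on different internal
// cells, so the hot increment path rarely retries a CAS. Reads sum the
// cells, which is fine for periodic metrics snapshots.
class StripedCacheHitCounter {
    private final LongAdder hits = new LongAdder();

    void onCacheHit() {
        hits.increment();
    }

    long snapshot() {
        return hits.sum();
    }
}
```

The trade-off matches stack's point about how perfect the metrics need to be: sum() is not an atomic snapshot under concurrent updates, which is acceptable for monitoring counters.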
[jira] [Commented] (HBASE-5937) Refactor HLog into an interface.
[ https://issues.apache.org/jira/browse/HBASE-5937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460232#comment-13460232 ] stack commented on HBASE-5937: -- [~fpj] Sorry for not getting to your log. What have you been having to do to get tests to pass? How did you fix TestMultiParallel? Is it stuff to do w/ this refactoring? On your question {quote}I have also looked at making getReader/getWriter part of HLog{quote} What are you thinking? Currently Reader and Writer are interfaces defined inside HLog. You get one by calling a static method on HLog. You'd like to make getReader non-static, an invocation on a particular instance of HLog. That seems fine by me. It makes sense given what you are trying to do. It is less flexible than what we currently have because it presumes a particular implementation of HLog. {quote}HLogInputFormat: Not clear how to instantiate HLog{quote} This is a facility little used, if ever. I'm surprised it's not used more often. It's a repair facility. You'd use it when you started a cluster somehow w/o replaying WALs. You could use this class in a mapreduce job to quickly add the edits from the WAL back up into the cluster. I took a look. What are you thinking constructors will look like for HLogs? There'll be a factory? What will the factory take for arguments? {quote}HLogPrettyPrinter: Executed through main calls in FSHLog and HLogPrettyPrinter, so maybe we could pass necessary parameters{quote} This is a tool for humans to look at the contents of HLogs. {quote}HLogSplitter: Have all parameters{quote} This is the important one (smile) {quote}HRegion: Have HLog object{quote} Good... It's passed the HLog, right? {quote}ReplicationSource: Not clear how to instantiate HLog{quote} You know what this is about, right? This is how we do replication. We tail the WALs and as the edits come in, we send them off to other clusters. We'll need to be able to tail logs. Could we pass Replication an HLog instance? 
I hope you call your HLog Interface WAL! {quote}I was also wondering if there are important side-effects in the case we use the factory to get an HLog object just to get a reader or a writer{quote} We'd have to change the current HLog constructor. It does a bunch of work when created -- sets a sync'ing thread running (this syncing thread though is in need of some cleanup), creates dirs and sets up the first WAL. We wouldn't want it doing this stuff if we wanted the instance just to do getReader/getWriter on it. {quote}I have looked into the main constructor of FSHLog and I haven't been able to convince myself that there is a way of executing it without throwing an exception unnecessarily or having side-effects.{quote} As it is currently written, yes. I think this work trying to make an Interface for the WAL is kinda important. There is the BookKeeper project, but the multi-WAL dev -- i.e. making the regionserver write more than one WAL at a time (into HDFS) -- could use the result of this effort too. Thanks Flavio. > Refactor HLog into an interface. > > > Key: HBASE-5937 > URL: https://issues.apache.org/jira/browse/HBASE-5937 > Project: HBase > Issue Type: Sub-task >Reporter: Li Pi >Assignee: Flavio Junqueira >Priority: Minor > Attachments: > org.apache.hadoop.hbase.client.TestMultiParallel-output.txt > > > What the summary says. Create HLog interface. Make current implementation use > it.
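The direction discussed above (instance-level getReader/getWriter on a WAL interface rather than static methods on HLog, so alternative backends like a BookKeeper-based or multi-WAL implementation can supply their own) can be sketched like this. All names are illustrative, not the actual HBase refactoring; the in-memory implementation exists only to show the instance-scoped wiring:

```java
// Sketch: Reader/Writer acquisition as instance methods of a WAL
// interface, so each implementation hands out its own reader/writer.
import java.util.ArrayList;
import java.util.List;

interface WAL {
    interface Reader {
        Object next(); // next WAL entry, or null at end of log
    }
    interface Writer {
        void append(Object entry);
    }

    // Non-static: bound to a particular WAL instance/implementation.
    Reader getReader();
    Writer getWriter();
}

// Toy in-memory implementation, purely to demonstrate the shape.
class InMemoryWAL implements WAL {
    private final List<Object> entries = new ArrayList<>();

    public Writer getWriter() {
        return entries::add;
    }

    public Reader getReader() {
        final int[] pos = {0};
        return () -> pos[0] < entries.size() ? entries.get(pos[0]++) : null;
    }
}
```

A caller such as ReplicationSource would then tail a WAL instance it is handed, instead of knowing how to construct one, which is the crux of the instantiation questions in the comment.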
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460224#comment-13460224 ] Cheng Hao commented on HBASE-6852: -- @stack: it should make more sense if we put the close() into the AbastractHFileReader, but not sure if there any other concern, since the AbstractHFileReader doesn't have it. And for the THRESHOLD_METRICS_FLUSH = 2k, which I used during my testing, hope it's big enough for reducing the overhead, and less impact for getting the metrics snapshot timely. sorry, I may not able to give a good experiential number for it. @Lars: Yes, that's right, we're still updating an AtomicLong each time, but from profiling result, I didn't see the AtomicLong became the new hotspots, and the testing also did >10% saved in running time, which may means the overhead of AtomicLong could be ignored. > SchemaMetrics.updateOnCacheHit costs too much while full scanning a table > with all of its fields > > > Key: HBASE-6852 > URL: https://issues.apache.org/jira/browse/HBASE-6852 > Project: HBase > Issue Type: Improvement > Components: metrics >Affects Versions: 0.94.0 >Reporter: Cheng Hao >Priority: Minor > Labels: performance > Fix For: 0.94.2, 0.96.0 > > Attachments: onhitcache-trunk.patch > > > The SchemaMetrics.updateOnCacheHit costs too much while I am doing the full > table scanning. 
[jira] [Updated] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
[ https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6299: - Status: Patch Available (was: Open) > RS starts region open while fails ack to HMaster.sendRegionOpen() causes > inconsistency in HMaster's region state and a series of successive problems. > - > > Key: HBASE-6299 > URL: https://issues.apache.org/jira/browse/HBASE-6299 > Project: HBase > Issue Type: Bug > Components: master >Affects Versions: 0.94.0, 0.90.6 >Reporter: Maryann Xue >Assignee: Maryann Xue >Priority: Critical > Fix For: 0.92.3, 0.94.3, 0.96.0 > > Attachments: 6299v4.txt, HBASE-6299.patch, HBASE-6299-v2.patch, > HBASE-6299-v3.patch > > > 1. HMaster tries to assign a region to an RS. > 2. HMaster creates a RegionState for this region and puts it into > regionsInTransition. > 3. In the first assign attempt, HMaster calls RS.openRegion(). The RS > receives the open region request and starts to proceed, eventually with > success. However, due to network problems, HMaster fails to receive the > response for the openRegion() call, and the call times out. > 4. HMaster attempts to assign a second time, choosing another RS. > 5. But since HMaster's OpenedRegionHandler has already been triggered by the > region open on the previous RS, and the RegionState has already been removed > from regionsInTransition, HMaster considers the unassigned ZK > node "RS_ZK_REGION_OPENING" updated by the second attempt invalid and ignores it. > 6. The unassigned ZK node stays, and a later unassign fails because > RS_ZK_REGION_CLOSING cannot be created. 
> {code} > 2012-06-29 07:03:38,870 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Using pre-existing plan for > region > CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568.; > > plan=hri=CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568., > src=swbss-hadoop-004,60020,1340890123243, > dest=swbss-hadoop-006,60020,1340890678078 > 2012-06-29 07:03:38,870 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Assigning region > CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. > to swbss-hadoop-006,60020,1340890678078 > 2012-06-29 07:03:38,870 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Handling > transition=M_ZK_REGION_OFFLINE, server=swbss-hadoop-002:6, > region=b713fd655fa02395496c5a6e39ddf568 > 2012-06-29 07:06:28,882 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Handling > transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, > region=b713fd655fa02395496c5a6e39ddf568 > 2012-06-29 07:06:32,291 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Handling > transition=RS_ZK_REGION_OPENING, server=swbss-hadoop-006,60020,1340890678078, > region=b713fd655fa02395496c5a6e39ddf568 > 2012-06-29 07:06:32,299 DEBUG > org.apache.hadoop.hbase.master.AssignmentManager: Handling > transition=RS_ZK_REGION_OPENED, server=swbss-hadoop-006,60020,1340890678078, > region=b713fd655fa02395496c5a6e39ddf568 > 2012-06-29 07:06:32,299 DEBUG > org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: Handling OPENED > event for > CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. 
> from serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=518945, > regions=575, usedHeap=15282, maxHeap=31301); deleting unassigned node > 2012-06-29 07:06:32,299 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: > master:6-0x2377fee2ae80007 Deleting existing unassigned node for > b713fd655fa02395496c5a6e39ddf568 that is in expected state RS_ZK_REGION_OPENED > 2012-06-29 07:06:32,301 DEBUG org.apache.hadoop.hbase.zookeeper.ZKAssign: > master:6-0x2377fee2ae80007 Successfully deleted unassigned node for > region b713fd655fa02395496c5a6e39ddf568 in expected state RS_ZK_REGION_OPENED > 2012-06-29 07:06:32,301 DEBUG > org.apache.hadoop.hbase.master.handler.OpenedRegionHandler: The master has > opened the region > CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. > that was online on serverName=swbss-hadoop-006,60020,1340890678078, > load=(requests=518945, regions=575, usedHeap=15282, maxHeap=31301) > 2012-06-29 07:07:41,140 WARN > org.apache.hadoop.hbase.master.AssignmentManager: Failed assignment of > CDR_STATS_TRAFFIC,13184390567|20120508|17||2|3|913,1337256975556.b713fd655fa02395496c5a6e39ddf568. > to serverName=swbss-hadoop-006,60020,1340890678078, load=(requests=0, > regions=575, usedHeap=0, maxHeap=0), t
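The core ambiguity in steps 3-5 above is that an RPC timeout tells the master nothing about whether the open actually failed, so a blind retry can race with the first attempt completing in the background. The sketch below is a hypothetical illustration of that race and of the idempotency guard a retry needs; it is not HBase code, and all names are made up.

```java
import java.util.HashMap;
import java.util.Map;

class AssignmentSketch {
    final Map<String, String> online = new HashMap<>();       // region -> serving RS
    final Map<String, String> inTransition = new HashMap<>(); // regionsInTransition analogue

    // Master sends openRegion(); from its point of view the call may time out.
    void sendOpen(String region, String server) {
        inTransition.put(region, server);
    }

    // The RS side eventually succeeds even though the master saw a timeout,
    // and the OpenedRegionHandler removes the region from regionsInTransition.
    void rsCompletesOpen(String region, String server) {
        online.put(region, server);
        inTransition.remove(region);
    }

    // A safe retry must notice that the region is no longer in transition
    // instead of blindly re-assigning it to a second server.
    boolean retryAssign(String region, String newServer) {
        if (!inTransition.containsKey(region)) {
            return false; // the first attempt already completed; do not double-assign
        }
        inTransition.put(region, newServer);
        return true;
    }
}
```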
[jira] [Updated] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
[ https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6299: - Status: Open (was: Patch Available)
[jira] [Updated] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
[ https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6299: - Attachment: 6299v4.txt v3 rotted. Here is v4 which applies to trunk. Is this in the right place MaryAnn? Thanks.
[jira] [Commented] (HBASE-6299) RS starts region open while fails ack to HMaster.sendRegionOpen() causes inconsistency in HMaster's region state and a series of successive problems.
[ https://issues.apache.org/jira/browse/HBASE-6299?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460202#comment-13460202 ] ramkrishna.s.vasudevan commented on HBASE-6299: --- Yes Stack. +1 on this.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460199#comment-13460199 ] Lars Hofhansl commented on HBASE-6852: -- @Cheng: Even with this patch we're still updating an AtomicLong each time we get a cache hit, right? I had assumed that that was the slow part. Is it not?
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460198#comment-13460198 ] Lars Hofhansl commented on HBASE-6852: -- This is the third time that metrics have come up as a performance issue. Do we have to think about this more generally? How perfect do these metrics have to be? (Assuming a 64-bit architecture) we *could* just use plain (not even volatile) longs and accept the fact that we'll miss some updates or overwrite others; the values would still be in the right ballpark.
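To illustrate the tradeoff suggested above (a sketch, not a proposal for actual HBase code): a plain long increment is an unguarded read-modify-write, so concurrent increments can be lost, but on a 64-bit JVM each individual write lands whole and the running total stays in the right ballpark. AtomicLong guarantees no lost updates at the cost of a CAS (and cache-line contention) on every cache hit.

```java
import java.util.concurrent.atomic.AtomicLong;

class Counters {
    long approximate;                       // plain field: increments may be lost under contention
    final AtomicLong exact = new AtomicLong();

    void hit() {
        approximate++;                      // no barrier, no CAS; cheap but racy
        exact.incrementAndGet();            // CAS per call; never loses an update
    }
}
```

Single-threaded the two counters agree exactly; under heavy multi-threaded increments, `approximate` trails `exact` by however many read-modify-write races occurred, which is the "right ballpark" behavior being proposed.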
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460193#comment-13460193 ] Lars Hofhansl commented on HBASE-6852: -- Wait. This is the cache hit path we're talking about. Didn't come up in my profiling at all.
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6852: - Fix Version/s: 0.94.2 Since 0.94.2 got delayed, pulling this in.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460189#comment-13460189 ] Lars Hofhansl commented on HBASE-6852: -- @Stack: No, this is a different issue. Didn't come up in my profiling since I only did the cache path (so far). Good one Cheng.
[jira] [Updated] (HBASE-6806) HBASE-4658 breaks backward compatibility / example scripts
[ https://issues.apache.org/jira/browse/HBASE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6806: - Resolution: Fixed Fix Version/s: 0.96.0 Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Thanks for the fixup Lukas. Nice. > HBASE-4658 breaks backward compatibility / example scripts > -- > > Key: HBASE-6806 > URL: https://issues.apache.org/jira/browse/HBASE-6806 > Project: HBase > Issue Type: Bug > Components: Thrift >Affects Versions: 0.94.0 >Reporter: Lukas > Fix For: 0.96.0 > > Attachments: HBASE-6806-fix-examples.diff > > > HBASE-4658 introduces the new 'attributes' argument as a non-optional > parameter. This is not backward compatible and also breaks the code in the > example section. Resolution: Mark as 'optional' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6798) HDFS always read checksum form meta file
[ https://issues.apache.org/jira/browse/HBASE-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460185#comment-13460185 ] stack commented on HBASE-6798: -- [~liulei.cn] so we should add a new setSkipChecksum(boolean) method in FileSystem you mean per file? You mean to HFileSystem? Pardon my not understanding. Thanks. > HDFS always read checksum form meta file > > > Key: HBASE-6798 > URL: https://issues.apache.org/jira/browse/HBASE-6798 > Project: HBase > Issue Type: Bug > Components: Performance >Affects Versions: 0.94.0, 0.94.1 >Reporter: LiuLei >Priority: Blocker > Attachments: 6798.txt > > > I use hbase 0.94.1 and the hadoop-0.20.2-cdh3u5 version. > HBase added support for checksums in the HBase block cache in the HBASE-5074 jira. > HBase supports checksums to decrease the IOPS of HDFS, so that HDFS > doesn't need to read the checksum from the meta file of the block file. > But in the hadoop-0.20.2-cdh3u5 version, BlockSender still reads the metadata file > even if the > hbase.regionserver.checksum.verify property is true. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460182#comment-13460182 ] stack commented on HBASE-6852: -- Patch looks good as does the change in the character of the pasted oprofile output. Did you look at adding a close to AbstractHFileReader that hfile v1 and v2 reader close could share? Would that make sense here? The THRESHOLD_METRICS_FLUSH = 2k seems arbitrary. Any reason why this number in particular? Nit is that the param name isCompaction is the name of a method that returns a boolean result. +1 on patch. [~eclark] Mr. Metrics, want to take a look see at this one?
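For context on the approach under review: per Cheng Hao's description, the patch pre-creates an array of AtomicLong counters for all possible on-cache-hit metric items, so the hot path does an array index instead of a ConcurrentMap lookup. A minimal sketch of that idea follows; the class name, enum values, and method shapes here are illustrative assumptions, not the actual identifiers in onhitcache-trunk.patch:

```java
import java.util.concurrent.atomic.AtomicLong;

// Illustrative categories; the real set lives in BlockType.BlockCategory.
enum BlockCategory { DATA, INDEX, BLOOM, META }

// Sketch: one pre-allocated AtomicLong per (category, isCompaction) pair,
// so updateOnCacheHit is a plain array index plus an atomic increment --
// no ConcurrentMap lookup on the cache-hit hot path.
final class CacheHitCounters {
    private final AtomicLong[] counters;

    CacheHitCounters() {
        // Two slots per category: normal reads and compaction reads.
        counters = new AtomicLong[BlockCategory.values().length * 2];
        for (int i = 0; i < counters.length; i++) {
            counters[i] = new AtomicLong();
        }
    }

    private static int slot(BlockCategory category, boolean isCompaction) {
        return category.ordinal() * 2 + (isCompaction ? 1 : 0);
    }

    void updateOnCacheHit(BlockCategory category, boolean isCompaction) {
        counters[slot(category, isCompaction)].incrementAndGet();
    }

    long get(BlockCategory category, boolean isCompaction) {
        return counters[slot(category, isCompaction)].get();
    }
}
```

The THRESHOLD_METRICS_FLUSH = 2k questioned above would presumably batch local deltas before folding them into shared counters like these; that batching is not shown here.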
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460176#comment-13460176 ] Cheng Hao commented on HBASE-6852: -- stack, do you mean I should submit the patch for 0.94 as well?
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460174#comment-13460174 ] Cheng Hao commented on HBASE-6852: -- It's quite similar to https://issues.apache.org/jira/browse/HBASE-6603, but per my testing, 6603 doesn't improve things much in my case (a full table scan), while this fix did improve performance a lot (about 10% shorter total time).
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460172#comment-13460172 ] Cheng Hao commented on HBASE-6852: -- Yes, I ran the profiling on 0.94.0, but the patch is based on trunk. It should also work for the later 0.94s.
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460171#comment-13460171 ] stack commented on HBASE-6852: -- It doesn't look like it (after taking a look).
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460169#comment-13460169 ] stack commented on HBASE-6852: -- [~chenghao_sh] Is it 0.94.0 that you are running? [~lhofhansl] Did we fix these in later 0.94s?
[jira] [Commented] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460161#comment-13460161 ] Hadoop QA commented on HBASE-6852: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12545995/onhitcache-trunk.patch against trunk revision . +1 @author. The patch does not contain any @author tags. +1 tests included. The patch appears to include 3 new or modified tests. -1 patch. The patch command could not apply the patch. Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2911//console This message is automatically generated.
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6852: - Fix Version/s: 0.96.0 Status: Patch Available (was: Open) After applying the fix, oprofile shows the top 8 hotspots as: samples % image name app name symbol name --- 59829 7.9422 17779.jo java int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) 59829 100.000 17779.jo java int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self] --- 28571 3.7927 17779.jo java int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator) 28571 100.000 17779.jo java int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator) [self] --- 19331 2.5662 17779.jo java org.apache.hadoop.hbase.regionserver.ScanQueryMatcher$MatchCode org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(org.apache.hadoop.hbase.KeyValue) 19331 100.000 17779.jo java org.apache.hadoop.hbase.regionserver.ScanQueryMatcher$MatchCode org.apache.hadoop.hbase.regionserver.ScanQueryMatcher.match(org.apache.hadoop.hbase.KeyValue) [self] --- 19063 2.5306 17779.jo java void org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek() 19063 100.000 17779.jo java void org.apache.hadoop.hbase.regionserver.StoreFileScanner.enforceSeek() [self] --- 1 0.0054 libjvm.so java Monitor::ILock(Thread*) 1 0.0054 libjvm.so java ObjectMonitor::enter(Thread*) 2 0.0107 libjvm.so java VMThread::loop() 18642 99.9785 libjvm.so java StealTask::do_it(GCTaskManager*, unsigned int) 18646 2.4752 libjvm.so java SpinPause 18646 100.000 libjvm.so java SpinPause [self] --- 15860 2.1054 17779.jo java byte[] org.apache.hadoop.hbase.KeyValue.createByteArray(byte[], int, int, byte[], int, int,
byte[], int, int, long, org.apache.hadoop.hbase.KeyValue$Type, byte[], int, int) 15860 100.000 17779.jo java byte[] org.apache.hadoop.hbase.KeyValue.createByteArray(byte[], int, int, byte[], int, int, byte[], int, int, long, org.apache.hadoop.hbase.KeyValue$Type, byte[], int, int) [self] --- 14754 1.9586 17779.jo java org.apache.hadoop.hbase.io.hfile.Cacheable org.apache.hadoop.hbase.io.hfile.LruBlockCache.getBlock(org.apache.hadoop.hbase.io.hfile.BlockCacheKey, boolean) 14754 100.000 17779.jo java org.apache.hadoop.hbase.io.hfile.Cacheable org.apache.hadoop.hbase.io.hfile.LruBlockCache.getBlock(org.apache.hadoop.hbase.io.hfile.BlockCacheKey, boolean) [self] --- 13068 1.7348 17779.jo java org.apache.hadoop.hbase.io.hfile.HFileBlock org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(byte[], int, int, org.apache.hadoop.hbase.io.hfile.HFileBlock, boolean, boolean, boolean)~2 13068 100.000 17779.jo java org.apache.hadoop.hbase.io.hfile.HFileBlock org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.seekToDataBlock(byte[], int, int, org.apache.hadoop.hbase.io.hfile.HFileBlock, boolean, boolean, boolean)~2 [self] ---
[jira] [Commented] (HBASE-4191) hbase load balancer needs locality awareness
[ https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460158#comment-13460158 ] stack commented on HBASE-4191: -- What do you think of Liyin's costing vs what you have in the Stochastic balancer, Elliott? (Do you think the HRegion#computeHDFSBlocksDistribution call will happen often? Seems like its value is cached for a period of time). > hbase load balancer needs locality awareness > > > Key: HBASE-4191 > URL: https://issues.apache.org/jira/browse/HBASE-4191 > Project: HBase > Issue Type: New Feature > Components: Balancer >Reporter: Ted Yu >Assignee: Liyin Tang > > Previously, HBASE-4114 implemented the metrics for HFile HDFS block locality, > which provide HFile-level locality information. > But in order to work with the load balancer and region assignment, we need > region-level locality information. > Let's define the region locality information first, which is almost the same > as the HFile locality index. > HRegion locality index (HRegion A, RegionServer B) = > (Total number of HDFS blocks that can be retrieved locally by the > RegionServer B for the HRegion A) / ( Total number of the HDFS blocks for the > Region A) > So the HRegion locality index tells us how much locality we can get if > the HMaster assigns HRegion A to RegionServer B. > There will be 2 steps involved in assigning regions based on locality. > 1) During cluster start up, the master will scan hdfs to > calculate the "HRegion locality index" for each pair of HRegion and Region > Server. It is pretty expensive to scan the dfs, so we only need to do this > once during start up. > 2) During cluster run time, each region server will update the "HRegion > locality index" as metrics periodically as HBASE-4114 did. The Region Server > can expose them to the Master through ZK, the meta table, or just RPC messages.
> Based on the "HRegion locality index", the assignment manager in the master > would have a global knowledge about the region locality distribution and can > run the MIN COST MAXIMUM FLOW solver to reach the global optimization. > Let's construct the graph first: > [Graph] > Imaging there is a bipartite graph and the left side is the set of regions > and the right side is the set of region servers. > There is a source node which links itself to each node in the region set. > There is a sink node which is linked from each node in the region server set. > [Capacity] > The capacity between the source node and region nodes is 1. > And the capacity between the region nodes and region server nodes is also 1. > (The purpose is each region can ONLY be assigned to one region server at one > time) > The capacity between the region server nodes and sink node are the avg number > of regions which should be assigned each region server. > (The purpose is balance the load for each region server) > [Cost] > The cost between each region and region server is the opposite of locality > index, which means the higher locality is, if region A is assigned to region > server B, the lower cost it is. > The cost function could be more sophisticated when we put more metrics into > account. > So after running the min-cost max flow solver, the master could assign the > regions based on the global locality optimization. > Also the master should share this global view to secondary master in case the > master fail over happens. > In addition, the HBASE-4491 (Locality Checker) is the tool, which is based on > the same metrics, to proactively to scan dfs to calculate the global locality > information in the cluster. It will help us to verify data locality > information during the run time. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
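The locality-index and cost definitions above can be sketched as a small standalone helper. This is illustrative only (hypothetical class and method names, not actual HBase code):

```java
// Sketch of the HRegion locality index and the min-cost max-flow edge cost
// described in the issue. Hypothetical standalone helper, not HBase code.
public class LocalityIndex {

    // locality index (HRegion A, RegionServer B) =
    //   localBlocks (blocks of A readable locally on B) / totalBlocks (all blocks of A)
    public static double locality(long localBlocks, long totalBlocks) {
        if (totalBlocks == 0) {
            return 0.0; // no blocks yet; define locality as 0 (an assumption here)
        }
        return (double) localBlocks / totalBlocks;
    }

    // Edge cost for the flow formulation: the opposite of locality,
    // so higher locality means a lower assignment cost.
    public static double cost(long localBlocks, long totalBlocks) {
        return 1.0 - locality(localBlocks, totalBlocks);
    }
}
```

With these definitions, an assignment of a fully local region has cost 0, and a fully remote one has cost 1, which is what the min-cost solver minimizes globally.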
[jira] [Updated] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
[ https://issues.apache.org/jira/browse/HBASE-6852?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated HBASE-6852: - Attachment: onhitcache-trunk.patch The fix caches the metrics and flushes every 2000 calls, or when the HFileReader is closed. > SchemaMetrics.updateOnCacheHit costs too much while full scanning a table > with all of its fields > > > Key: HBASE-6852 > URL: https://issues.apache.org/jira/browse/HBASE-6852 > Project: HBase > Issue Type: Improvement > Components: metrics > Affects Versions: 0.94.0 > Reporter: Cheng Hao > Priority: Minor > Labels: performance > Attachments: onhitcache-trunk.patch > > > SchemaMetrics.updateOnCacheHit costs too much while I am doing a full table scan. > Here are the top 5 hotspots within the regionserver while full scanning a table: > (Sorry for the poor formatting) > CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated) > Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit > mask of 0x00 (No unit mask) count 500 > samples % image name symbol name > --- > 98447 13.4324 14033.jo void > org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, > boolean) > 98447 100.000 14033.jo void > org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, > boolean) [self] > --- > 45814 6.2510 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, > byte[], int, int) > 45814 100.000 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, > byte[], int, int) [self] > --- > 43523 5.9384 14033.jo boolean > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) > 43523 100.000 14033.jo boolean > org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) > [self] > --- > 42548 5.8054 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, > byte[], int, int) > 42548 100.000 14033.jo int > org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, > byte[], int, int) [self] > --- > 40572 5.5358 14033.jo int > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], > int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 > 40572 100.000 14033.jo int > org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], > int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
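The batching idea described in the attachment note above can be sketched roughly as follows. Class and field names here are illustrative, not the actual onhitcache-trunk.patch code: the point is that a per-reader local counter replaces a contended shared update on every cache hit.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch: accumulate cache-hit counts in a plain local field and flush
// them into the shared metric every N calls (and when the HFile reader
// closes), instead of touching a contended counter per block read.
public class BatchedCacheHitCounter {
    private static final int FLUSH_INTERVAL = 2000;

    private final AtomicLong sharedMetric; // the global metrics counter
    private long localHits = 0;            // per-reader, accessed single-threaded

    public BatchedCacheHitCounter(AtomicLong sharedMetric) {
        this.sharedMetric = sharedMetric;
    }

    // Called on every block-cache hit; cheap in the common case.
    public void onCacheHit() {
        if (++localHits >= FLUSH_INTERVAL) {
            flush();
        }
    }

    // Also called when the reader is closed, so no hits are lost.
    public void flush() {
        sharedMetric.addAndGet(localHits);
        localHits = 0;
    }
}
```

The trade-off is that the shared metric lags reality by up to FLUSH_INTERVAL hits per open reader, which is acceptable for monitoring counters.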
[jira] [Created] (HBASE-6852) SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields
Cheng Hao created HBASE-6852: Summary: SchemaMetrics.updateOnCacheHit costs too much while full scanning a table with all of its fields Key: HBASE-6852 URL: https://issues.apache.org/jira/browse/HBASE-6852 Project: HBase Issue Type: Improvement Components: metrics Affects Versions: 0.94.0 Reporter: Cheng Hao Priority: Minor SchemaMetrics.updateOnCacheHit costs too much while I am doing a full table scan. Here are the top 5 hotspots within the regionserver while full scanning a table: (Sorry for the poor formatting) CPU: Intel Westmere microarchitecture, speed 2.262e+06 MHz (estimated) Counted CPU_CLK_UNHALTED events (Clock cycles when not halted) with a unit mask of 0x00 (No unit mask) count 500 samples % image name symbol name --- 98447 13.4324 14033.jo void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) 98447 100.000 14033.jo void org.apache.hadoop.hbase.regionserver.metrics.SchemaMetrics.updateOnCacheHit(org.apache.hadoop.hbase.io.hfile.BlockType$BlockCategory, boolean) [self] --- 45814 6.2510 14033.jo int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) 45814 100.000 14033.jo int org.apache.hadoop.hbase.KeyValue$KeyComparator.compareRows(byte[], int, int, byte[], int, int) [self] --- 43523 5.9384 14033.jo boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) 43523 100.000 14033.jo boolean org.apache.hadoop.hbase.regionserver.StoreFileScanner.reseek(org.apache.hadoop.hbase.KeyValue) [self] --- 42548 5.8054 14033.jo int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) 42548 100.000 14033.jo int org.apache.hadoop.hbase.KeyValue$KeyComparator.compare(byte[], int, int, byte[], int, int) [self] --- 40572 5.5358 14033.jo int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 40572 100.000 14033.jo int org.apache.hadoop.hbase.io.hfile.HFileBlockIndex$BlockIndexReader.binarySearchNonRootIndex(byte[], int, int, java.nio.ByteBuffer, org.apache.hadoop.io.RawComparator)~1 [self] -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-5959) Add other load balancers
[ https://issues.apache.org/jira/browse/HBASE-5959?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-5959: - Release Note: Added a new StochasticLoadBalancer that when enabled will perform a randomized search for the optimal cluster balance. The new balancer takes into account data locality, storefile size, memstore size, and the evenness of tables over region servers when trying potential new cluster states. To enable the new balancer set hbase.master.loadbalancer.class to org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer . It is also recommended to set hbase.master.loadbalance.bytable to false . Lots of different configuration options can be tuned to prioritize costs differently. Explanations of all of the configuration options are available on the JavaDoc for StochasticLoadBalancer. StochasticLoadBalancer is the default in 0.96.0 was: Added a new StochasticLoadBalancer that when enabled will perform a randomized search for the optimal cluster balance. The new balancer takes into account data locality, storefile size, memstore size, and the evenness of tables over region servers when trying potential new cluster states. To enable the new balancer set hbase.master.loadbalancer.class to org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer . It is also recommended to set hbase.master.loadbalance.bytable to false . Lots of different configuration options can be tuned to prioritize costs differently. Explanations of all of the configuration options are available on the JavaDoc for StochasticLoadBalancer. 
> Add other load balancers > > > Key: HBASE-5959 > URL: https://issues.apache.org/jira/browse/HBASE-5959 > Project: HBase > Issue Type: New Feature > Components: master >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark > Attachments: ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.1.patch, > ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.2.patch, > ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.3.patch, > ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.4.patch, > ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.5.patch, > ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.6.patch, > ASF.LICENSE.NOT.GRANTED--HBASE-5959.D3189.7.patch, HBASE-5959-0.patch, > HBASE-5959-11.patch, HBASE-5959-12.patch, HBASE-5959-13.patch, > HBASE-5959-14.patch, HBASE-5959-1.patch, HBASE-5959-2.patch, > HBASE-5959-3.patch, HBASE-5959-6.patch, HBASE-5959-7.patch, > HBASE-5959-8.patch, HBASE-5959-9.patch > > > Now that balancers are pluggable we should give some options. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
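Per the release note above, enabling the new balancer amounts to two hbase-site.xml properties. A minimal sketch, with the property names and value taken verbatim from the note:

```xml
<!-- hbase-site.xml: enable the StochasticLoadBalancer as described
     in the HBASE-5959 release note above. -->
<property>
  <name>hbase.master.loadbalancer.class</name>
  <value>org.apache.hadoop.hbase.master.balancer.StochasticLoadBalancer</value>
</property>
<!-- Recommended by the release note when using this balancer. -->
<property>
  <name>hbase.master.loadbalance.bytable</name>
  <value>false</value>
</property>
```

Per the note, the cost-weighting knobs are documented in the StochasticLoadBalancer JavaDoc, and in 0.96.0 this balancer is the default, so the class property is only needed on earlier versions.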
[jira] [Commented] (HBASE-3663) The starvation problem in current load balance algorithm
[ https://issues.apache.org/jira/browse/HBASE-3663?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460153#comment-13460153 ] stack commented on HBASE-3663: -- [~liyin] Was this patch committed to 89fb? If so, can we close this? If not, can we close this because recent versions of hbase don't have this issue? Thanks. > The starvation problem in current load balance algorithm > > > Key: HBASE-3663 > URL: https://issues.apache.org/jira/browse/HBASE-3663 > Project: HBase > Issue Type: Bug > Reporter: Liyin Tang > Attachments: HBASE_3665[0.89].patch, result_new_load_balance.txt, > result_old_load_balance.txt > > > This is an interesting starvation case. There are 2 conditions that trigger this problem. > Condition 1: r/s - r/(s+1) << 1 > Let r: the number of regions > Let s: the number of servers > Condition 2: for each server, the load is less than or equal to the ceil of the avg load. > Here is the unit test to verify this problem: > For example, there are 16 servers and 62 regions. The avg load is 3.875, and setting the slop to 0 keeps the load of each server at either 3 or 4. > When a new server comes up, no server needs to give regions to this new server, since no server's load is larger than the ceil of the avg. > (Setting the slop to 0 makes it easy to trigger this situation; otherwise it needs much larger numbers.) > The solution is pretty straightforward: just compare against the floor of the avg instead of the ceil. This evenly balances the load away from the servers that are a little more loaded than the others. > I also attached the comparison results for the case mentioned above between the old balance algorithm and the new balance algorithm. (I set the slop = 0 when testing.) -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
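The starvation case above can be checked with a few lines (illustrative helper, not balancer code): with 62 regions on 16 servers the avg is 3.875, so every server holds 3 or 4 regions; no load exceeds ceil(avg) = 4, but a load of 4 does exceed floor(avg) = 3, so comparing against the floor finds donors for the new empty server.

```java
// Sketch of the ceil-vs-floor threshold from the starvation report.
// Hypothetical helper, not the actual HBase balancer code.
public class BalancerStarvation {

    public static int ceilAvg(int regions, int servers) {
        return (int) Math.ceil((double) regions / servers);
    }

    public static int floorAvg(int regions, int servers) {
        return regions / servers; // integer division == floor for positives
    }

    // A server may donate a region only if its load exceeds the threshold.
    public static boolean canDonate(int load, int threshold) {
        return load > threshold;
    }
}
```

With regions = 62 and servers = 16: no server with load 3 or 4 can donate against the ceil threshold (4), which starves a newly added server; against the floor threshold (3), the servers holding 4 regions become donors.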
[jira] [Commented] (HBASE-6798) HDFS always read checksum form meta file
[ https://issues.apache.org/jira/browse/HBASE-6798?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460151#comment-13460151 ] LiuLei commented on HBASE-6798: --- Hi all, if HDFS doesn't read the checksum from the meta file, that can decrease the IOPS for HFiles. But the HBase HLog file doesn't contain checksums, so when HBase reads the HLog it must use the HDFS checksum. So we should add a new setSkipChecksum(boolean) method in FileSystem to let HBase decide whether or not to read the checksum from the meta file. > HDFS always read checksum form meta file > > > Key: HBASE-6798 > URL: https://issues.apache.org/jira/browse/HBASE-6798 > Project: HBase > Issue Type: Bug > Components: performance > Affects Versions: 0.94.0, 0.94.1 > Reporter: LiuLei > Priority: Blocker > Attachments: 6798.txt > > > I use the hbase 0.94.1 and hadoop-0.20.2-cdh3u5 versions. > HBase supports checksums in the HBase block cache as of HBASE-5074. > HBase supports checksums to decrease the IOPS of HDFS, so that HDFS doesn't need to read the checksum from the meta file of the block file. > But in the hadoop-0.20.2-cdh3u5 version, BlockSender still reads the metadata file even if the > hbase.regionserver.checksum.verify property is true. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-4191) hbase load balancer needs locality awareness
[ https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460131#comment-13460131 ] Elliott Clark commented on HBASE-4191: -- It seems like the stochastic load balancer gives HBase the locality awareness when balancing. > hbase load balancer needs locality awareness > > > Key: HBASE-4191 > URL: https://issues.apache.org/jira/browse/HBASE-4191 > Project: HBase > Issue Type: New Feature > Components: Balancer > Reporter: Ted Yu > Assignee: Liyin Tang -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6491) add limit function at ClientScanner
[ https://issues.apache.org/jira/browse/HBASE-6491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460126#comment-13460126 ] Jieshan Bean commented on HBASE-6491: - @ronghai: Why not use PageFilter instead of adding this new method? > add limit function at ClientScanner > --- > > Key: HBASE-6491 > URL: https://issues.apache.org/jira/browse/HBASE-6491 > Project: HBase > Issue Type: New Feature > Components: Client >Affects Versions: 0.96.0 >Reporter: ronghai.ma >Assignee: ronghai.ma > Labels: patch > Fix For: 0.96.0 > > Attachments: ClientScanner.java, HBASE-6491.patch > > > Add a new method in ClientScanner to implement a function like LIMIT in MySQL. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
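For context on the PageFilter suggestion above: a PageFilter is evaluated independently on each region server, so a scan spanning several regions can still return more rows than the page size; an exact LIMIT therefore also needs a client-side cap. A generic sketch of that cap (illustrative only, not the HBASE-6491 patch):

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Sketch of client-side LIMIT semantics: consume at most `limit` results
// regardless of how many the servers return. With HBase one would combine
// this with a PageFilter so each region server also stops early.
public class ClientSideLimit {
    public static <T> List<T> limit(Iterator<T> results, int limit) {
        List<T> out = new ArrayList<>();
        while (out.size() < limit && results.hasNext()) {
            out.add(results.next());
        }
        return out;
    }
}
```

The design question in the thread is essentially whether this cap belongs inside ClientScanner as a first-class method or stays the caller's responsibility on top of PageFilter.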
[jira] [Updated] (HBASE-4191) hbase load balancer needs locality awareness
[ https://issues.apache.org/jira/browse/HBASE-4191?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-4191: - Component/s: Balancer > hbase load balancer needs locality awareness > > > Key: HBASE-4191 > URL: https://issues.apache.org/jira/browse/HBASE-4191 > Project: HBase > Issue Type: New Feature > Components: Balancer > Reporter: Ted Yu > Assignee: Liyin Tang -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6806) HBASE-4658 breaks backward compatibility / example scripts
[ https://issues.apache.org/jira/browse/HBASE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460096#comment-13460096 ] Hadoop QA commented on HBASE-6806: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12545720/HBASE-6806-fix-examples.diff against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. -1 javadoc. The javadoc tool appears to have generated 139 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. 
The patch failed these unit tests: org.apache.hadoop.hbase.master.TestSplitLogManager Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2910//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2910//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2910//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2910//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2910//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2910//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2910//console This message is automatically generated. > HBASE-4658 breaks backward compatibility / example scripts > -- > > Key: HBASE-6806 > URL: https://issues.apache.org/jira/browse/HBASE-6806 > Project: HBase > Issue Type: Bug > Components: thrift >Affects Versions: 0.94.0 >Reporter: Lukas > Attachments: HBASE-6806-fix-examples.diff > > > HBASE-4658 introduces the new 'attributes' argument as a non optional > parameter. This is not backward compatible and also breaks the code in the > example section. Resolution: Mark as 'optional' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6848) Make hbase-hadoop-compat findbugs clean
[ https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460090#comment-13460090 ] Hudson commented on HBASE-6848: --- Integrated in HBase-TRUNK #3362 (See [https://builds.apache.org/job/HBase-TRUNK/3362/]) HBASE-6848 Make hbase-hadoop-compat findbugs clean (Revision 1388252) Result = SUCCESS stack : Files : * /hbase/trunk/dev-support/findbugs-exclude.xml * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/CompatibilityFactory.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/CompatibilitySingletonFactory.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSource.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceFactory.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsWrapper.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSource.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/metrics/MBeanSource.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/metrics/ReplicationMetricsSource.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/thrift/metrics/ThriftServerMetricsSource.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/thrift/metrics/ThriftServerMetricsSourceFactory.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/metrics/MetricHistogram.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/metrics/MetricsExecutor.java * /hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java * /hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java * 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/metrics2/util/MetricSampleQuantiles.java * /hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java * /hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java * /hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/util/MetricSampleQuantiles.java > Make hbase-hadoop-compat findbugs clean > --- > > Key: HBASE-6848 > URL: https://issues.apache.org/jira/browse/HBASE-6848 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Minor > Fix For: 0.96.0 > > Attachments: HBASE-6848-0.patch > > > There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, > and hbase-hadoop2-compat. Lets fix these up; since these are new modules it > would be nice to keep them with 0 findbugs errors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6849) Make StochasticLoadBalancer the default
[ https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460089#comment-13460089 ] Hudson commented on HBASE-6849: --- Integrated in HBase-TRUNK #3362 (See [https://builds.apache.org/job/HBase-TRUNK/3362/]) HBASE-6849 Make StochasticLoadBalancer the default (Revision 1388267) Result = SUCCESS stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/LoadBalancerFactory.java > Make StochasticLoadBalancer the default > --- > > Key: HBASE-6849 > URL: https://issues.apache.org/jira/browse/HBASE-6849 > Project: HBase > Issue Type: Improvement > Components: master >Reporter: Elliott Clark >Assignee: Elliott Clark > Fix For: 0.96.0 > > Attachments: HBASE-6849-0.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6848) Make hbase-hadoop-compat findbugs clean
[ https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460084#comment-13460084 ] Hudson commented on HBASE-6848: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/]) HBASE-6848 Make hbase-hadoop-compat findbugs clean (Revision 1388252) Result = FAILURE stack : Files : * /hbase/trunk/dev-support/findbugs-exclude.xml * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/CompatibilityFactory.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/CompatibilitySingletonFactory.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSource.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceFactory.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsWrapper.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSource.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/metrics/MBeanSource.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/replication/regionserver/metrics/ReplicationMetricsSource.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/thrift/metrics/ThriftServerMetricsSource.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/hbase/thrift/metrics/ThriftServerMetricsSourceFactory.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/metrics/MetricHistogram.java * /hbase/trunk/hbase-hadoop-compat/src/main/java/org/apache/hadoop/metrics/MetricsExecutor.java * /hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java * 
/hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java * /hbase/trunk/hbase-hadoop1-compat/src/main/java/org/apache/hadoop/metrics2/util/MetricSampleQuantiles.java * /hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/master/metrics/MasterMetricsSourceImpl.java * /hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/hbase/metrics/BaseMetricsSourceImpl.java * /hbase/trunk/hbase-hadoop2-compat/src/main/java/org/apache/hadoop/metrics2/util/MetricSampleQuantiles.java > Make hbase-hadoop-compat findbugs clean > --- > > Key: HBASE-6848 > URL: https://issues.apache.org/jira/browse/HBASE-6848 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Minor > Fix For: 0.96.0 > > Attachments: HBASE-6848-0.patch > > > There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, > and hbase-hadoop2-compat. Lets fix these up; since these are new modules it > would be nice to keep them with 0 findbugs errors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6847) HBASE-6649 broke replication
[ https://issues.apache.org/jira/browse/HBASE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460082#comment-13460082 ] Hudson commented on HBASE-6847:
-------------------------------
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/])
HBASE-6847 HBASE-6649 broke replication (Devaraj Das via JD) (Revision 1388161)
Result = FAILURE
jdcryans :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java

> HBASE-6649 broke replication
> ----------------------------
>
> Key: HBASE-6847
> URL: https://issues.apache.org/jira/browse/HBASE-6847
> Project: HBase
> Issue Type: Bug
> Reporter: Jean-Daniel Cryans
> Assignee: Devaraj Das
> Priority: Blocker
> Fix For: 0.92.3, 0.94.2, 0.96.0
>
> Attachments: HBASE-6847-0.94.patch, HBASE-6847.patch
>
> After running with HBASE-6646 and replication enabled I encountered this:
> {noformat}
> 2012-09-17 20:04:08,111 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78617132
> 2012-09-17 20:04:08,120 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Break on IOE: hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318, entryStart=78641557, pos=78771200, end=78771200, edit=84
> 2012-09-17 20:04:08,120 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: currentNbOperations:164529 and seenEntries:84 and size: 154068
> 2012-09-17 20:04:08,120 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Replicating 84
> 2012-09-17 20:04:08,146 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: Going to report log #va1r3s24%2C10304%2C1347911704238.1347911706318 for position 78771200 in hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318
> 2012-09-17 20:04:08,158 INFO org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: Removing 0 logs in the list: []
> 2012-09-17 20:04:08,158 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Replicated in total: 93234
> 2012-09-17 20:04:08,158 DEBUG org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78771200
> 2012-09-17 20:04:08,163 ERROR org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Unexpected exception in ReplicationSource, currentPath=hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318
> java.lang.IndexOutOfBoundsException
>         at java.io.DataInputStream.readFully(DataInputStream.java:175)
>         at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63)
>         at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2001)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1901)
>         at org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947)
>         at org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:235)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:394)
>         at org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:307)
> {noformat}
> There's something weird at the end of the file and it's killing replication.
> We used to just retry.

-- This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6677) Random ZooKeeper port in test can overrun max port
[ https://issues.apache.org/jira/browse/HBASE-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460081#comment-13460081 ] Hudson commented on HBASE-6677:
-------------------------------
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/])
HBASE-6677 Random ZooKeeper port in test can overrun max port (Revision 1388125)
Result = FAILURE
stack :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java

> Random ZooKeeper port in test can overrun max port
> --------------------------------------------------
>
> Key: HBASE-6677
> URL: https://issues.apache.org/jira/browse/HBASE-6677
> Project: HBase
> Issue Type: Bug
> Components: test
> Affects Versions: 0.96.0
> Reporter: Gregory Chanan
> Assignee: liang xie
> Priority: Trivial
> Labels: noob
> Fix For: 0.96.0
>
> Attachments: HBASE-6677.patch
>
> {code}
> while (true) {
>   try {
>     standaloneServerFactory = new NIOServerCnxnFactory();
>     standaloneServerFactory.configure(
>         new InetSocketAddress(tentativePort),
>         configuration.getInt(HConstants.ZOOKEEPER_MAX_CLIENT_CNXNS, 1000));
>   } catch (BindException e) {
>     LOG.debug("Failed binding ZK Server to client port: " + tentativePort);
>     // This port is already in use, try to use another.
>     tentativePort++;
>     continue;
>   }
>   break;
> }
> {code}
> If binding fails and all the preceding ports are already bound, tentativePort
> can run past the maximum valid port number. We need to check against a max value.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
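[Editor's note] The bounded-retry fix suggested in the quoted description could be sketched roughly as follows. This is an illustrative sketch only, not the HBASE-6677 patch: `java.net.ServerSocket` stands in for ZooKeeper's `NIOServerCnxnFactory`, and the class and method names are invented for the example.

```java
import java.io.IOException;
import java.net.BindException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Illustrative sketch of a bounded port-retry loop (not the actual HBase patch).
// Instead of incrementing tentativePort forever, the loop stops at maxPort and
// fails loudly when the whole range is exhausted.
class BoundedPortPicker {
    static ServerSocket bindWithin(int startPort, int maxPort) throws IOException {
        for (int tentativePort = startPort; tentativePort <= maxPort; tentativePort++) {
            ServerSocket socket = new ServerSocket();
            try {
                socket.bind(new InetSocketAddress(tentativePort));
                return socket; // bound successfully
            } catch (BindException e) {
                // Port already in use; close and try the next one, never past maxPort.
                socket.close();
            }
        }
        throw new IOException(
            "Could not bind any port in [" + startPort + ", " + maxPort + "]");
    }
}
```

The key difference from the quoted loop is the loop condition: the retry can never probe beyond `maxPort`, and exhausting the range raises an error instead of trying invalid port numbers.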
[jira] [Commented] (HBASE-6849) Make StochasticLoadBalancer the default
[ https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460083#comment-13460083 ] Hudson commented on HBASE-6849: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/]) HBASE-6849 Make StochasticLoadBalancer the default (Revision 1388267) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/RegionStates.java * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/LoadBalancerFactory.java > Make StochasticLoadBalancer the default > --- > > Key: HBASE-6849 > URL: https://issues.apache.org/jira/browse/HBASE-6849 > Project: HBase > Issue Type: Improvement > Components: master >Reporter: Elliott Clark >Assignee: Elliott Clark > Fix For: 0.96.0 > > Attachments: HBASE-6849-0.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
[ https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460080#comment-13460080 ] Hudson commented on HBASE-6649: --- Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/]) HBASE-6847 HBASE-6649 broke replication (Devaraj Das via JD) (Revision 1388161) Result = FAILURE jdcryans : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java > [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1] > --- > > Key: HBASE-6649 > URL: https://issues.apache.org/jira/browse/HBASE-6649 > Project: HBase > Issue Type: Bug >Reporter: Devaraj Das >Assignee: Devaraj Das >Priority: Blocker > Fix For: 0.92.3, 0.94.2, 0.96.0 > > Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, > 6649-fix-io-exception-handling-1.patch, > 6649-fix-io-exception-handling-1-trunk.patch, > 6649-fix-io-exception-handling.patch, 6649-trunk.patch, 6649-trunk.patch, > 6649.txt, HBase-0.92 #495 test - queueFailover [Jenkins].html, HBase-0.92 > #502 test - queueFailover [Jenkins].html > > > Have seen it twice in the recent past: http://bit.ly/MPCykB & > http://bit.ly/O79Dq7 .. > Looking briefly at the logs hints at a pattern - in both the failed test > instances, there was an RS crash while the test was running. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460079#comment-13460079 ] Hudson commented on HBASE-6698:
-------------------------------
Integrated in HBase-TRUNK-on-Hadoop-2.0.0 #184 (See [https://builds.apache.org/job/HBase-TRUNK-on-Hadoop-2.0.0/184/])
HBASE-6698 Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation (Priya)
Submitted by: Priya
Reviewed by: Ram, Stack, Ted, Lars
(Revision 1388141)
Result = FAILURE
ramkrishna :
Files :
* /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java
* /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java

> Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
> ------------------------------------------------------------------
>
> Key: HBASE-6698
> URL: https://issues.apache.org/jira/browse/HBASE-6698
> Project: HBase
> Issue Type: Improvement
> Reporter: ramkrishna.s.vasudevan
> Fix For: 0.96.0
>
> Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, HBASE-6698_3.patch,
> HBASE-6698_5.patch, HBASE-6698_6.patch, HBASE-6698_6.patch, HBASE-6698_6.patch,
> HBASE-6698_6.patch, HBASE-6698_7.patch, HBASE-6698_8.patch, HBASE-6698_8.patch,
> HBASE-6698_8.patch, HBASE-6698.patch
>
> Currently the checkAndPut and checkAndDelete APIs internally call internalPut
> and internalDelete. Maybe we can just call doMiniBatchMutation instead. This
> will help in the future: if we add hooks and a coprocessor handles certain
> cases in doMiniBatchMutation, the same handling will apply to a put done
> through checkAndPut or a delete done through checkAndDelete.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6850) REST implementation internals conflict with clients that use Jersey
[ https://issues.apache.org/jira/browse/HBASE-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Purtell updated HBASE-6850: -- Summary: REST implementation internals conflict with clients that use Jersey (was: PlainTextMessageBodyProducer is dangerous) > REST implementation internals conflict with clients that use Jersey > --- > > Key: HBASE-6850 > URL: https://issues.apache.org/jira/browse/HBASE-6850 > Project: HBase > Issue Type: Bug > Components: Client, REST >Affects Versions: 0.94.1 >Reporter: Jonathan Leech > > - It is my understanding that there is one and only one hbase jar, which > includes > org.apache.hadoop.hbase.rest.provider.producer.PlainTextMessageBodyProducer, > which is only used in the REST / jersey server-side implementation. > - PlainTextMessageBodyProducer claims to provide a text/plain output for > absolutely any input by calling .toString() on it. > - If I am a client to HBase, and I do my own REST / jersey, including my own > custom text/plain writing, by default the jersey stack finds > PlainTextMessageBodyProducer and uses it instead of mine. > I could be off base here; so please feel free to change this from a Bug to a > Feature Request or close it, especially if my assumptions are wrong. > Workaround: set init-param of com.sun.jersey.config.property.packages to > limit it to my own packages. > Recommended fix: > - provide a client jar and / or a maven pom for hbase-client which doesn't > include server-side hbase code or dependencies. > and / or > - don't return true from isWriteable() for every possible input, or create a > different custom mime type that other users of the API might be also using, > and if possible map text/plain to that type in the server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6850) PlainTextMessageBodyProducer is dangerous
[ https://issues.apache.org/jira/browse/HBASE-6850?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460074#comment-13460074 ] Andrew Purtell commented on HBASE-6850: --- IMO, this is not REST specific, but the larger issue of us packaging non client classes into a fatjar along with the client. > PlainTextMessageBodyProducer is dangerous > - > > Key: HBASE-6850 > URL: https://issues.apache.org/jira/browse/HBASE-6850 > Project: HBase > Issue Type: Bug > Components: Client, REST >Affects Versions: 0.94.1 >Reporter: Jonathan Leech > > - It is my understanding that there is one and only one hbase jar, which > includes > org.apache.hadoop.hbase.rest.provider.producer.PlainTextMessageBodyProducer, > which is only used in the REST / jersey server-side implementation. > - PlainTextMessageBodyProducer claims to provide a text/plain output for > absolutely any input by calling .toString() on it. > - If I am a client to HBase, and I do my own REST / jersey, including my own > custom text/plain writing, by default the jersey stack finds > PlainTextMessageBodyProducer and uses it instead of mine. > I could be off base here; so please feel free to change this from a Bug to a > Feature Request or close it, especially if my assumptions are wrong. > Workaround: set init-param of com.sun.jersey.config.property.packages to > limit it to my own packages. > Recommended fix: > - provide a client jar and / or a maven pom for hbase-client which doesn't > include server-side hbase code or dependencies. > and / or > - don't return true from isWriteable() for every possible input, or create a > different custom mime type that other users of the API might be also using, > and if possible map text/plain to that type in the server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
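[Editor's note] The workaround Jonathan describes (setting the `com.sun.jersey.config.property.packages` init-param so Jersey only scans the application's own packages) might look like the fragment below in a client's `web.xml`. This is a hypothetical configuration sketch: the servlet name and `com.example.myapp.rest` package are placeholders, and only the init-param name comes from the report above.

```xml
<!-- Hypothetical servlet config for a client's own Jersey app. Restricting
     provider scanning to the application's packages keeps HBase's
     PlainTextMessageBodyProducer from being picked up off the classpath. -->
<servlet>
  <servlet-name>my-rest-app</servlet-name>
  <servlet-class>com.sun.jersey.spi.container.servlet.ServletContainer</servlet-class>
  <init-param>
    <param-name>com.sun.jersey.config.property.packages</param-name>
    <param-value>com.example.myapp.rest</param-value>
  </init-param>
</servlet>
```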
[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460069#comment-13460069 ] Lars Hofhansl commented on HBASE-6841: -- Not sure what the test issue is, yet. Also looking at the code again I notice the prefetchRegionLimit is already defaulted to 10. > Meta prefetching is slower than doing multiple meta lookups > --- > > Key: HBASE-6841 > URL: https://issues.apache.org/jira/browse/HBASE-6841 > Project: HBase > Issue Type: Improvement >Reporter: Jean-Daniel Cryans >Assignee: Lars Hofhansl >Priority: Critical > Fix For: 0.94.2 > > Attachments: 6841-0.94.txt, 6841-0.96.txt > > > I got myself into a situation where I needed to truncate a massive table > while it was getting hits and surprisingly the clients were not recovering. > What I see in the logs is that every time we prefetch .META. we setup a new > HConnection because we close it on the way out. It's awfully slow. > We should just turn it off or make it useful. jstacks coming up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-6849) Make StochasticLoadBalancer the default
[ https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-6849. -- Resolution: Fixed Release Note: Makes the StochasticLoadBalancer the default. Hadoop Flags: Reviewed Committed to trunk. Lets try it. Can revert if its a mess before we release 0.96 (Weird you had to disable by table explicitly, apart from setting default balancer -- that looks broke to me that we're doing by table outside of the balancer). Thanks Elliott. > Make StochasticLoadBalancer the default > --- > > Key: HBASE-6849 > URL: https://issues.apache.org/jira/browse/HBASE-6849 > Project: HBase > Issue Type: Improvement > Components: master >Reporter: Elliott Clark >Assignee: Elliott Clark > Fix For: 0.96.0 > > Attachments: HBASE-6849-0.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6806) HBASE-4658 breaks backward compatibility / example scripts
[ https://issues.apache.org/jira/browse/HBASE-6806?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6806: - Status: Patch Available (was: Open) Passing by hadoopqa. Thanks Lukas. Let the php, perl, and ruby heads file an issue if broke. We'll take your fixes for the rest. > HBASE-4658 breaks backward compatibility / example scripts > -- > > Key: HBASE-6806 > URL: https://issues.apache.org/jira/browse/HBASE-6806 > Project: HBase > Issue Type: Bug > Components: thrift >Affects Versions: 0.94.0 >Reporter: Lukas > Attachments: HBASE-6806-fix-examples.diff > > > HBASE-4658 introduces the new 'attributes' argument as a non optional > parameter. This is not backward compatible and also breaks the code in the > example section. Resolution: Mark as 'optional' -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6851) Race condition in TableAuthManager.updateGlobalCache()
Gary Helmling created HBASE-6851:
------------------------------------

Summary: Race condition in TableAuthManager.updateGlobalCache()
Key: HBASE-6851
URL: https://issues.apache.org/jira/browse/HBASE-6851
Project: HBase
Issue Type: Bug
Components: security
Affects Versions: 0.94.1, 0.96.0
Reporter: Gary Helmling
Priority: Critical

When new global permissions are assigned, there is a race condition, during which further authorization checks relying on global permissions may fail. In TableAuthManager.updateGlobalCache(), we have:

{code:java}
USER_CACHE.clear();
GROUP_CACHE.clear();

try {
  initGlobal(conf);
} catch (IOException e) {
  // Never happens
  LOG.error("Error occured while updating the user cache", e);
}

for (Map.Entry entry : userPerms.entries()) {
  if (AccessControlLists.isGroupPrincipal(entry.getKey())) {
    GROUP_CACHE.put(AccessControlLists.getGroupName(entry.getKey()),
        new Permission(entry.getValue().getActions()));
  } else {
    USER_CACHE.put(entry.getKey(),
        new Permission(entry.getValue().getActions()));
  }
}
{code}

If authorization checks come in following the .clear() but before repopulating, they will fail. We should have some synchronization here to serialize multiple updates and use a COW type rebuild and reassign of the new maps. This particular issue crept in with the fix in HBASE-6157, so I'm flagging for 0.94 and 0.96.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
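[Editor's note] The copy-on-write rebuild-and-reassign Gary suggests could be sketched as follows. This is a simplified illustration, not the eventual HBase fix: the class name is invented, plain `String` values stand in for HBase's `Permission` objects, and the point is only that readers never observe the cleared-but-not-yet-repopulated state because the new map is built aside and published with a single reference assignment.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Sketch of a copy-on-write cache update: build the replacement map off to the
// side, then publish it with one volatile write. Readers always see either the
// old complete snapshot or the new one, never a half-built cache.
class CowPermissionCache {
    private volatile Map<String, String> userCache = Collections.emptyMap();

    void updateGlobalCache(Map<String, String> userPerms) {
        Map<String, String> fresh = new HashMap<>(userPerms); // rebuild aside
        userCache = Collections.unmodifiableMap(fresh);       // atomic publish
    }

    String permissionsFor(String user) {
        return userCache.get(user); // always a complete snapshot
    }
}
```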
[jira] [Updated] (HBASE-6848) Make hbase-hadoop-compat findbugs clean
[ https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-6848: - Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Committed to trunk. Thanks Elliott > Make hbase-hadoop-compat findbugs clean > --- > > Key: HBASE-6848 > URL: https://issues.apache.org/jira/browse/HBASE-6848 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Minor > Fix For: 0.96.0 > > Attachments: HBASE-6848-0.patch > > > There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, > and hbase-hadoop2-compat. Lets fix these up; since these are new modules it > would be nice to keep them with 0 findbugs errors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460024#comment-13460024 ] Lars Hofhansl commented on HBASE-6841: -- Heh. TestHCM.testRegionCaching looks relevant :) Looking. > Meta prefetching is slower than doing multiple meta lookups > --- > > Key: HBASE-6841 > URL: https://issues.apache.org/jira/browse/HBASE-6841 > Project: HBase > Issue Type: Improvement >Reporter: Jean-Daniel Cryans >Assignee: Lars Hofhansl >Priority: Critical > Fix For: 0.94.2 > > Attachments: 6841-0.94.txt, 6841-0.96.txt > > > I got myself into a situation where I needed to truncate a massive table > while it was getting hits and surprisingly the clients were not recovering. > What I see in the logs is that every time we prefetch .META. we setup a new > HConnection because we close it on the way out. It's awfully slow. > We should just turn it off or make it useful. jstacks coming up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460023#comment-13460023 ] stack commented on HBASE-6841: -- +1 on patch but whats that TestHCM fail about? > Meta prefetching is slower than doing multiple meta lookups > --- > > Key: HBASE-6841 > URL: https://issues.apache.org/jira/browse/HBASE-6841 > Project: HBase > Issue Type: Improvement >Reporter: Jean-Daniel Cryans >Assignee: Lars Hofhansl >Priority: Critical > Fix For: 0.94.2 > > Attachments: 6841-0.94.txt, 6841-0.96.txt > > > I got myself into a situation where I needed to truncate a massive table > while it was getting hits and surprisingly the clients were not recovering. > What I see in the logs is that every time we prefetch .META. we setup a new > HConnection because we close it on the way out. It's awfully slow. > We should just turn it off or make it useful. jstacks coming up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13460021#comment-13460021 ] Hadoop QA commented on HBASE-6841:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12545967/6841-0.96.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 hadoop2.0. The patch compiles against the hadoop 2.0 profile.
-1 javadoc. The javadoc tool appears to have generated 139 warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 14 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests: org.apache.hadoop.hbase.client.TestHCM

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2909//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2909//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2909//console

This message is automatically generated.

> Meta prefetching is slower than doing multiple meta lookups
> -----------------------------------------------------------
>
> Key: HBASE-6841
> URL: https://issues.apache.org/jira/browse/HBASE-6841
> Project: HBase
> Issue Type: Improvement
> Reporter: Jean-Daniel Cryans
> Assignee: Lars Hofhansl
> Priority: Critical
> Fix For: 0.94.2
>
> Attachments: 6841-0.94.txt, 6841-0.96.txt
>
> I got myself into a situation where I needed to truncate a massive table
> while it was getting hits and surprisingly the clients were not recovering.
> What I see in the logs is that every time we prefetch .META. we setup a new
> HConnection because we close it on the way out. It's awfully slow.
> We should just turn it off or make it useful. jstacks coming up.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6669) Add BigDecimalColumnInterpreter for doing aggregations using AggregationClient
[ https://issues.apache.org/jira/browse/HBASE-6669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Wissmann updated HBASE-6669: --- Attachment: TestBDAggregateProtocol.patch > Add BigDecimalColumnInterpreter for doing aggregations using AggregationClient > -- > > Key: HBASE-6669 > URL: https://issues.apache.org/jira/browse/HBASE-6669 > Project: HBase > Issue Type: New Feature > Components: client, coprocessors >Reporter: Anil Gupta >Priority: Minor > Labels: client, coprocessors > Attachments: BigDecimalColumnInterpreter.java, > BigDecimalColumnInterpreter.patch, BigDecimalColumnInterpreter.patch, > TestBDAggregateProtocol.patch > > > I recently created a Class for doing aggregations(sum,min,max,std) on values > stored as BigDecimal in HBase. I would like to commit the > BigDecimalColumnInterpreter into HBase. In my opinion this class can be used > by a wide variety of users. Please let me know if its not appropriate to add > this class in HBase. > Thanks, > Anil Gupta > Software Engineer II, Intuit, Inc -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6850) PlainTextMessageBodyProducer is dangerous
Jonathan Leech created HBASE-6850: - Summary: PlainTextMessageBodyProducer is dangerous Key: HBASE-6850 URL: https://issues.apache.org/jira/browse/HBASE-6850 Project: HBase Issue Type: Bug Components: client, REST Affects Versions: 0.94.1 Reporter: Jonathan Leech - It is my understanding that there is one and only one hbase jar, which includes org.apache.hadoop.hbase.rest.provider.producer.PlainTextMessageBodyProducer, which is only used in the REST / jersey server-side implementation. - PlainTextMessageBodyProducer claims to provide a text/plain output for absolutely any input by calling .toString() on it. - If I am a client to HBase, and I do my own REST / jersey, including my own custom text/plain writing, by default the jersey stack finds PlainTextMessageBodyProducer and uses it instead of mine. I could be off base here; so please feel free to change this from a Bug to a Feature Request or close it, especially if my assumptions are wrong. Workaround: set init-param of com.sun.jersey.config.property.packages to limit it to my own packages. Recommended fix: - provide a client jar and / or a maven pom for hbase-client which doesn't include server-side hbase code or dependencies. and / or - don't return true from isWriteable() for every possible input, or create a different custom mime type that other users of the API might be also using, and if possible map text/plain to that type in the server. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6839) Operations may be executed without holding rowLock
[ https://issues.apache.org/jira/browse/HBASE-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans updated HBASE-6839:
--------------------------------------

Fix Version/s: 0.92.3

Adding 0.92.3 as a target since Ted committed it there.

> Operations may be executed without holding rowLock
> --------------------------------------------------
>
> Key: HBASE-6839
> URL: https://issues.apache.org/jira/browse/HBASE-6839
> Project: HBase
> Issue Type: Bug
> Components: regionserver
> Reporter: chunhui shen
> Assignee: chunhui shen
> Priority: Critical
> Fix For: 0.96.0, 0.92.3, 0.94.2
>
> Attachments: HBASE-6839.patch
>
> HRegion#internalObtainRowLock will return null if it times out, but many
> places that call this method don't handle that case. As a result, an
> operation such as put, delete, or increment may be executed even though it
> hasn't obtained the row lock.

-- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
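[Editor's note] The failure mode in HBASE-6839 — a lock acquisition that can time out but whose failure callers ignore — can be illustrated with a small sketch. This is not the HBase patch: `ReentrantLock` stands in for HRegion's internal row-lock bookkeeping, and the class and method names are invented. The point is that a timed-out acquire must surface an error instead of letting the mutation run unlocked.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

// Sketch: a row mutation that refuses to run unless the row lock was actually
// acquired. The buggy pattern in the report is the opposite: ignoring a
// null/failed lock result and mutating anyway.
class RowLockExample {
    final ReentrantLock rowLock = new ReentrantLock(); // stand-in for the row lock

    void mutateRow(Runnable mutation, long timeoutMs) throws IOException {
        boolean locked;
        try {
            locked = rowLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IOException("Interrupted waiting for row lock", e);
        }
        if (!locked) {
            // Surface the timeout instead of silently mutating without the lock.
            throw new IOException("Timed out waiting for row lock");
        }
        try {
            mutation.run();
        } finally {
            rowLock.unlock();
        }
    }
}
```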
[jira] [Updated] (HBASE-5974) Scanner retry behavior with RPC timeout on next() seems incorrect
[ https://issues.apache.org/jira/browse/HBASE-5974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-5974: - Fix Version/s: (was: 0.94.3) Let's do this correctly in 0.96 (where it is OK to break wire compatibility). Removing this from 0.94. This means we can pass the seqno as a proper field in the Scan object. > Scanner retry behavior with RPC timeout on next() seems incorrect > - > > Key: HBASE-5974 > URL: https://issues.apache.org/jira/browse/HBASE-5974 > Project: HBase > Issue Type: Bug > Components: client, regionserver >Affects Versions: 0.90.7, 0.92.1, 0.94.0, 0.96.0 >Reporter: Todd Lipcon >Assignee: Anoop Sam John >Priority: Critical > Fix For: 0.96.0 > > Attachments: 5974_94-V4.patch, 5974_trunk.patch, 5974_trunk-V2.patch, > HBASE-5974_0.94.patch, HBASE-5974_94-V2.patch, HBASE-5974_94-V3.patch > > > I'm seeing the following behavior: > - set RPC timeout to a short value > - call next() for some batch of rows, big enough so the client times out > before the result is returned > - the HConnectionManager stuff will retry the next() call to the same server. > At this point, one of two things can happen: 1) the previous next() call will > still be processing, in which case you get a LeaseException, because it was > removed from the map during the processing, or 2) the next() call will > succeed but skip the prior batch of rows. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
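[Editor's note] The seqno idea Lars mentions — letting the client tag each next() call so the server can replay a lost batch instead of silently skipping it — can be sketched in miniature. This is a hypothetical illustration, not HBase's scanner protocol: the class, the `String` rows, and the replay rule are all invented for the example.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical sketch: each next() call carries a client-side call sequence id.
// A retry of the previous call replays the cached batch; any other out-of-sync
// id is rejected, so a timed-out client can never silently skip a batch.
class SequencedScanner {
    private final Iterator<String> rows;
    private long expectedSeq = 0;          // next call id the server expects
    private List<String> lastBatch = null; // cached so a retry can be replayed

    SequencedScanner(List<String> data) { this.rows = data.iterator(); }

    List<String> next(long callSeq, int batchSize) throws IOException {
        if (callSeq == expectedSeq - 1) {
            return lastBatch;              // retry of the previous call: replay it
        }
        if (callSeq != expectedSeq) {
            throw new IOException("Out-of-sync scanner call id: " + callSeq);
        }
        List<String> batch = new ArrayList<>();
        while (batch.size() < batchSize && rows.hasNext()) {
            batch.add(rows.next());
        }
        lastBatch = batch;
        expectedSeq++;
        return batch;
    }
}
```

With this shape, the RPC-timeout retry described above returns the same rows the client never received, rather than the following batch.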
[jira] [Updated] (HBASE-6504) Adding GC details prevents HBase from starting in non-distributed mode
[ https://issues.apache.org/jira/browse/HBASE-6504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6504: - Fix Version/s: (was: 0.94.3) 0.94.2 > Adding GC details prevents HBase from starting in non-distributed mode > -- > > Key: HBASE-6504 > URL: https://issues.apache.org/jira/browse/HBASE-6504 > Project: HBase > Issue Type: Bug >Affects Versions: 0.94.0 >Reporter: Benoit Sigoure >Assignee: Michael Drzal >Priority: Trivial > Labels: noob > Fix For: 0.96.0, 0.94.2 > > Attachments: HBASE-6504-output.txt, HBASE-6504.patch, > HBASE-6504-v2.patch > > > The {{conf/hbase-env.sh}} that ships with HBase contains a few commented out > examples of variables that could be useful, such as adding > {{-XX:+PrintGCDetails -XX:+PrintGCDateStamps}} to {{HBASE_OPTS}}. This has > the annoying side effect that the JVM prints a summary of memory usage when > it exits, and it does so on stdout: > {code} > $ ./bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool > hbase.cluster.distributed > false > Heap > par new generation total 19136K, used 4908K [0x00073a20, > 0x00073b6c, 0x00075186) > eden space 17024K, 28% used [0x00073a20, 0x00073a6cb0a8, > 0x00073b2a) > from space 2112K, 0% used [0x00073b2a, 0x00073b2a, > 0x00073b4b) > to space 2112K, 0% used [0x00073b4b, 0x00073b4b, > 0x00073b6c) > concurrent mark-sweep generation total 63872K, used 0K [0x00075186, > 0x0007556c, 0x0007f5a0) > concurrent-mark-sweep perm gen total 21248K, used 6994K [0x0007f5a0, > 0x0007f6ec, 0x0008) > $ ./bin/hbase org.apache.hadoop.hbase.util.HBaseConfTool > hbase.cluster.distributed >/dev/null > (nothing printed) > {code} > And this confuses {{bin/start-hbase.sh}} when it does > {{distMode=`$bin/hbase --config "$HBASE_CONF_DIR" > org.apache.hadoop.hbase.util.HBaseConfTool hbase.cluster.distributed`}}, > because then the {{distMode}} variable is not just set to {{false}}, it also > contains all this JVM spam. 
> If you don't pay enough attention and realize that 3 processes are getting > started (ZK, HM, RS) instead of just one (HM), then you end up with this > confusing error message: > {{Could not start ZK at requested port of 2181. ZK was started at port: > 2182. Aborting as clients (e.g. shell) will not be able to find this ZK > quorum.}}, which is even more puzzling because when you run {{netstat}} to > see who owns that port, then you won't find any rogue process other than the > one you just started. > I'm wondering if the fix is not to just change the {{if [ "$distMode" == > 'false' ]}} to a {{switch $distMode case (false*)}} type of test, to work > around this annoying JVM misfeature that pollutes stdout. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
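The prefix-match workaround Benoit proposes (a `case` on `false*` instead of an exact string compare) can be illustrated outside the shell script as well. A hedged Java sketch of the same idea, with hypothetical names: only the first line of HBaseConfTool's output is the real value; any heap summary the JVM appended to stdout is ignored.

```java
public class DistModeCheck {
    /**
     * Equivalent of matching the output against "false*": take only the first
     * line and compare that, so trailing JVM heap-summary spam is harmless.
     */
    static boolean isLocalMode(String confToolOutput) {
        String firstLine = confToolOutput.trim().split("\\R", 2)[0].trim();
        return firstLine.equals("false");
    }

    public static void main(String[] args) {
        // Output polluted by -XX:+PrintGCDetails on JVM exit.
        String spammed = "false\nHeap\n par new generation total 19136K, used 4908K";
        assert isLocalMode(spammed);   // GC spam after the value no longer confuses us
        assert !isLocalMode("true");
    }
}
```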
[jira] [Updated] (HBASE-6438) RegionAlreadyInTransitionException needs to give more info to avoid assignment inconsistencies
[ https://issues.apache.org/jira/browse/HBASE-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6438: - Fix Version/s: (was: 0.94.3) 0.94.2 > RegionAlreadyInTransitionException needs to give more info to avoid > assignment inconsistencies > -- > > Key: HBASE-6438 > URL: https://issues.apache.org/jira/browse/HBASE-6438 > Project: HBase > Issue Type: Bug >Reporter: ramkrishna.s.vasudevan >Assignee: rajeshbabu > Fix For: 0.96.0, 0.92.3, 0.94.2 > > Attachments: 6438-0.92.txt, 6438.addendum, 6438-addendum.94, > 6438-trunk_2.patch, HBASE-6438_2.patch, HBASE-6438_94_3.patch, > HBASE-6438_94_4.patch, HBASE-6438_94.patch, HBASE-6438-trunk_2.patch, > HBASE-6438_trunk.patch > > > Seeing some of the recent issues in region assignment, > RegionAlreadyInTransitionException is one reason after which the region > assignment may or may not happen (in the sense we need to wait for the TM to > assign). > In HBASE-6317 we got one problem due to RegionAlreadyInTransitionException on > master restart. > Consider the following case: due to some reason like master restart or > external assign call, we try to assign a region that is already getting > opened in a RS. > Now the next call to assign has already changed the state of the znode, and so > the current assign that is going on in the RS is affected and it fails. The > second assignment that started also fails with a RAITE exception. In the end > neither assignment carries on. The idea is to find whether any such RAITE > exception can be retried or not. > Here again we have the following cases: > -> The znode is yet to be transitioned from OFFLINE to OPENING in the RS > -> RS may be in the step of openRegion. > -> RS may be trying to transition OPENING to OPENED. > -> RS is yet to add the region to online regions on the RS side. > Here, on any failures in openRegion() and updateMeta() we move the znode to > FAILED_OPEN. So in these cases getting a RAITE should be ok. 
But in other > cases the assignment is stopped. > The idea is to just add the current state of the region assignment in the RIT > map on the RS side, and using that info we can determine whether the > assignment can be retried or not on getting a RAITE. > Considering the current work going on in AM, please do share whether this is needed > at least in the 0.92/0.94 versions. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6792) Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516
[ https://issues.apache.org/jira/browse/HBASE-6792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6792: - Fix Version/s: (was: 0.94.3) 0.94.2 > Remove interface audience annotations in 0.94/0.92 introduced by HBASE-6516 > --- > > Key: HBASE-6792 > URL: https://issues.apache.org/jira/browse/HBASE-6792 > Project: HBase > Issue Type: Sub-task >Reporter: Jonathan Hsieh >Assignee: Jonathan Hsieh > Fix For: 0.92.3, 0.94.2 > > Attachments: hbase-6792.patch > > > bq. An InterfaceAudience slipped into 0.94 here. It breaks 0.94 for older > versions of hadoop. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6841: - Assignee: Lars Hofhansl Status: Patch Available (was: Open) > Meta prefetching is slower than doing multiple meta lookups > --- > > Key: HBASE-6841 > URL: https://issues.apache.org/jira/browse/HBASE-6841 > Project: HBase > Issue Type: Improvement >Reporter: Jean-Daniel Cryans >Assignee: Lars Hofhansl >Priority: Critical > Fix For: 0.94.2 > > Attachments: 6841-0.94.txt, 6841-0.96.txt > > > I got myself into a situation where I needed to truncate a massive table > while it was getting hits and surprisingly the clients were not recovering. > What I see in the logs is that every time we prefetch .META. we setup a new > HConnection because we close it on the way out. It's awfully slow. > We should just turn it off or make it useful. jstacks coming up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6841: - Attachment: 6841-0.96.txt 0.96 patch for Hadoop QA > Meta prefetching is slower than doing multiple meta lookups > --- > > Key: HBASE-6841 > URL: https://issues.apache.org/jira/browse/HBASE-6841 > Project: HBase > Issue Type: Improvement >Reporter: Jean-Daniel Cryans >Priority: Critical > Fix For: 0.94.2 > > Attachments: 6841-0.94.txt, 6841-0.96.txt > > > I got myself into a situation where I needed to truncate a massive table > while it was getting hits and surprisingly the clients were not recovering. > What I see in the logs is that every time we prefetch .META. we setup a new > HConnection because we close it on the way out. It's awfully slow. > We should just turn it off or make it useful. jstacks coming up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
[ https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459977#comment-13459977 ] Hudson commented on HBASE-6649: --- Integrated in HBase-0.92 #583 (See [https://builds.apache.org/job/HBase-0.92/583/]) HBASE-6847 HBASE-6649 broke replication (Devaraj Das via JD) (Revision 1388159) Fixing the CHANGES.txt after 0.92.2's release and adding HBASE-6649 (Revision 1388157) Result = SUCCESS jdcryans : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java jdcryans : Files : * /hbase/branches/0.92/CHANGES.txt > [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1] > --- > > Key: HBASE-6649 > URL: https://issues.apache.org/jira/browse/HBASE-6649 > Project: HBase > Issue Type: Bug >Reporter: Devaraj Das >Assignee: Devaraj Das >Priority: Blocker > Fix For: 0.96.0, 0.92.3, 0.94.2 > > Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, > 6649-fix-io-exception-handling-1.patch, > 6649-fix-io-exception-handling-1-trunk.patch, > 6649-fix-io-exception-handling.patch, 6649-trunk.patch, 6649-trunk.patch, > 6649.txt, HBase-0.92 #495 test - queueFailover [Jenkins].html, HBase-0.92 > #502 test - queueFailover [Jenkins].html > > > Have seen it twice in the recent past: http://bit.ly/MPCykB & > http://bit.ly/O79Dq7 .. > Looking briefly at the logs hints at a pattern - in both the failed test > instances, there was an RS crash while the test was running. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6847) HBASE-6649 broke replication
[ https://issues.apache.org/jira/browse/HBASE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459978#comment-13459978 ] Hudson commented on HBASE-6847: --- Integrated in HBase-0.92 #583 (See [https://builds.apache.org/job/HBase-0.92/583/]) HBASE-6847 HBASE-6649 broke replication (Devaraj Das via JD) (Revision 1388159) Result = SUCCESS jdcryans : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java > HBASE-6649 broke replication > > > Key: HBASE-6847 > URL: https://issues.apache.org/jira/browse/HBASE-6847 > Project: HBase > Issue Type: Bug >Reporter: Jean-Daniel Cryans >Assignee: Devaraj Das >Priority: Blocker > Fix For: 0.96.0, 0.92.3, 0.94.2 > > Attachments: HBASE-6847-0.94.patch, HBASE-6847.patch > > > After running with HBASE-6646 and replication enabled I encountered this: > {noformat} > 2012-09-17 20:04:08,111 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening > log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78617132 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Break on > IOE: > hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318, > entryStart=78641557, pos=78771200, end=78771200, edit=84 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > currentNbOperations:164529 and seenEntries:84 and size: 154068 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > Replicating 84 > 2012-09-17 20:04:08,146 INFO > org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: > Going to report log #va1r3s24%2C10304%2C1347911704238.1347911706318 for > position 78771200 in > 
hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318 > 2012-09-17 20:04:08,158 INFO > org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: > Removing 0 logs in the list: [] > 2012-09-17 20:04:08,158 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > Replicated in total: 93234 > 2012-09-17 20:04:08,158 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening > log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78771200 > 2012-09-17 20:04:08,163 ERROR > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > Unexpected exception in ReplicationSource, > currentPath=hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318 > java.lang.IndexOutOfBoundsException > at java.io.DataInputStream.readFully(DataInputStream.java:175) > at > org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63) > at > org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2001) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1901) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947) > at > org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:235) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:394) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:307) > {noformat} > There's something weird at the end of the file and it's killing replication. > We used to just retry. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459962#comment-13459962 ] Lars Hofhansl commented on HBASE-6841: -- Yeah, the stuff we do in HTable is a disaster (if you ask me... probably has to do with multiple threads using HTables that share the same HConnection, not sure), but I think that's for another patch. The meaning of the flag is not changed, just the default (unless I made a mistake in the patch). > Meta prefetching is slower than doing multiple meta lookups > --- > > Key: HBASE-6841 > URL: https://issues.apache.org/jira/browse/HBASE-6841 > Project: HBase > Issue Type: Improvement >Reporter: Jean-Daniel Cryans >Priority: Critical > Fix For: 0.94.2 > > Attachments: 6841-0.94.txt > > > I got myself into a situation where I needed to truncate a massive table > while it was getting hits and surprisingly the clients were not recovering. > What I see in the logs is that every time we prefetch .META. we setup a new > HConnection because we close it on the way out. It's awfully slow. > We should just turn it off or make it useful. jstacks coming up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459952#comment-13459952 ] stack commented on HBASE-6841: -- Its kinda ugly we have setRegionCachePrefetch up in HTable... Is the meaning of the enable flag up in this public API changed by this patch? Patch looks good otherwise. > Meta prefetching is slower than doing multiple meta lookups > --- > > Key: HBASE-6841 > URL: https://issues.apache.org/jira/browse/HBASE-6841 > Project: HBase > Issue Type: Improvement >Reporter: Jean-Daniel Cryans >Priority: Critical > Fix For: 0.94.2 > > Attachments: 6841-0.94.txt > > > I got myself into a situation where I needed to truncate a massive table > while it was getting hits and surprisingly the clients were not recovering. > What I see in the logs is that every time we prefetch .META. we setup a new > HConnection because we close it on the way out. It's awfully slow. > We should just turn it off or make it useful. jstacks coming up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lars Hofhansl updated HBASE-6841: - Attachment: 6841-0.94.txt Trivial patch. For 0.94. Looks like trunk has the same problem. > Meta prefetching is slower than doing multiple meta lookups > --- > > Key: HBASE-6841 > URL: https://issues.apache.org/jira/browse/HBASE-6841 > Project: HBase > Issue Type: Improvement >Reporter: Jean-Daniel Cryans >Priority: Critical > Fix For: 0.94.2 > > Attachments: 6841-0.94.txt > > > I got myself into a situation where I needed to truncate a massive table > while it was getting hits and surprisingly the clients were not recovering. > What I see in the logs is that every time we prefetch .META. we setup a new > HConnection because we close it on the way out. It's awfully slow. > We should just turn it off or make it useful. jstacks coming up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6847) HBASE-6649 broke replication
[ https://issues.apache.org/jira/browse/HBASE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459924#comment-13459924 ] Hudson commented on HBASE-6847: --- Integrated in HBase-0.94 #476 (See [https://builds.apache.org/job/HBase-0.94/476/]) HBASE-6847 HBASE-6649 broke replication (Devaraj Das via JD) (Revision 1388160) Result = FAILURE jdcryans : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java > HBASE-6649 broke replication > > > Key: HBASE-6847 > URL: https://issues.apache.org/jira/browse/HBASE-6847 > Project: HBase > Issue Type: Bug >Reporter: Jean-Daniel Cryans >Assignee: Devaraj Das >Priority: Blocker > Fix For: 0.96.0, 0.92.3, 0.94.2 > > Attachments: HBASE-6847-0.94.patch, HBASE-6847.patch > > > After running with HBASE-6646 and replication enabled I encountered this: > {noformat} > 2012-09-17 20:04:08,111 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening > log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78617132 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Break on > IOE: > hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318, > entryStart=78641557, pos=78771200, end=78771200, edit=84 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > currentNbOperations:164529 and seenEntries:84 and size: 154068 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > Replicating 84 > 2012-09-17 20:04:08,146 INFO > org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: > Going to report log #va1r3s24%2C10304%2C1347911704238.1347911706318 for > position 78771200 in > 
hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318 > 2012-09-17 20:04:08,158 INFO > org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: > Removing 0 logs in the list: [] > 2012-09-17 20:04:08,158 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > Replicated in total: 93234 > 2012-09-17 20:04:08,158 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening > log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78771200 > 2012-09-17 20:04:08,163 ERROR > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > Unexpected exception in ReplicationSource, > currentPath=hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318 > java.lang.IndexOutOfBoundsException > at java.io.DataInputStream.readFully(DataInputStream.java:175) > at > org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63) > at > org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2001) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1901) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947) > at > org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:235) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:394) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:307) > {noformat} > There's something weird at the end of the file and it's killing replication. > We used to just retry. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
[ https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459923#comment-13459923 ] Hudson commented on HBASE-6649: --- Integrated in HBase-0.94 #476 (See [https://builds.apache.org/job/HBase-0.94/476/]) HBASE-6847 HBASE-6649 broke replication (Devaraj Das via JD) (Revision 1388160) Result = FAILURE jdcryans : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java > [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1] > --- > > Key: HBASE-6649 > URL: https://issues.apache.org/jira/browse/HBASE-6649 > Project: HBase > Issue Type: Bug >Reporter: Devaraj Das >Assignee: Devaraj Das >Priority: Blocker > Fix For: 0.96.0, 0.92.3, 0.94.2 > > Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, > 6649-fix-io-exception-handling-1.patch, > 6649-fix-io-exception-handling-1-trunk.patch, > 6649-fix-io-exception-handling.patch, 6649-trunk.patch, 6649-trunk.patch, > 6649.txt, HBase-0.92 #495 test - queueFailover [Jenkins].html, HBase-0.92 > #502 test - queueFailover [Jenkins].html > > > Have seen it twice in the recent past: http://bit.ly/MPCykB & > http://bit.ly/O79Dq7 .. > Looking briefly at the logs hints at a pattern - in both the failed test > instances, there was an RS crash while the test was running. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6649) [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1]
[ https://issues.apache.org/jira/browse/HBASE-6649?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459917#comment-13459917 ] Hudson commented on HBASE-6649: --- Integrated in HBase-TRUNK #3360 (See [https://builds.apache.org/job/HBase-TRUNK/3360/]) HBASE-6847 HBASE-6649 broke replication (Devaraj Das via JD) (Revision 1388161) Result = FAILURE jdcryans : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java > [0.92 UNIT TESTS] TestReplication.queueFailover occasionally fails [Part-1] > --- > > Key: HBASE-6649 > URL: https://issues.apache.org/jira/browse/HBASE-6649 > Project: HBase > Issue Type: Bug >Reporter: Devaraj Das >Assignee: Devaraj Das >Priority: Blocker > Fix For: 0.96.0, 0.92.3, 0.94.2 > > Attachments: 6649-0.92.patch, 6649-1.patch, 6649-2.txt, > 6649-fix-io-exception-handling-1.patch, > 6649-fix-io-exception-handling-1-trunk.patch, > 6649-fix-io-exception-handling.patch, 6649-trunk.patch, 6649-trunk.patch, > 6649.txt, HBase-0.92 #495 test - queueFailover [Jenkins].html, HBase-0.92 > #502 test - queueFailover [Jenkins].html > > > Have seen it twice in the recent past: http://bit.ly/MPCykB & > http://bit.ly/O79Dq7 .. > Looking briefly at the logs hints at a pattern - in both the failed test > instances, there was an RS crash while the test was running. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6847) HBASE-6649 broke replication
[ https://issues.apache.org/jira/browse/HBASE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459918#comment-13459918 ] Hudson commented on HBASE-6847: --- Integrated in HBase-TRUNK #3360 (See [https://builds.apache.org/job/HBase-TRUNK/3360/]) HBASE-6847 HBASE-6649 broke replication (Devaraj Das via JD) (Revision 1388161) Result = FAILURE jdcryans : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/replication/regionserver/ReplicationSource.java > HBASE-6649 broke replication > > > Key: HBASE-6847 > URL: https://issues.apache.org/jira/browse/HBASE-6847 > Project: HBase > Issue Type: Bug >Reporter: Jean-Daniel Cryans >Assignee: Devaraj Das >Priority: Blocker > Fix For: 0.96.0, 0.92.3, 0.94.2 > > Attachments: HBASE-6847-0.94.patch, HBASE-6847.patch > > > After running with HBASE-6646 and replication enabled I encountered this: > {noformat} > 2012-09-17 20:04:08,111 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening > log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78617132 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Break on > IOE: > hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318, > entryStart=78641557, pos=78771200, end=78771200, edit=84 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > currentNbOperations:164529 and seenEntries:84 and size: 154068 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > Replicating 84 > 2012-09-17 20:04:08,146 INFO > org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: > Going to report log #va1r3s24%2C10304%2C1347911704238.1347911706318 for > position 78771200 in > 
hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318 > 2012-09-17 20:04:08,158 INFO > org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: > Removing 0 logs in the list: [] > 2012-09-17 20:04:08,158 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > Replicated in total: 93234 > 2012-09-17 20:04:08,158 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening > log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78771200 > 2012-09-17 20:04:08,163 ERROR > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > Unexpected exception in ReplicationSource, > currentPath=hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318 > java.lang.IndexOutOfBoundsException > at java.io.DataInputStream.readFully(DataInputStream.java:175) > at > org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63) > at > org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2001) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1901) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947) > at > org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:235) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:394) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:307) > {noformat} > There's something weird at the end of the file and it's killing replication. > We used to just retry. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459910#comment-13459910 ] Lars Hofhansl commented on HBASE-6841: -- I'd be OK with that. I'd also be worried that this is just a symptom. HConnectionManager.getConnection(...) and HConnection.close() should just do some reference counting rather than actually creating/destroying the connection; which suggests we're coming in there with a new Configuration every time...? And even that should be handled by the Configuration equivalence code we're using now. So if, in this case, we'd remove prefetching, we'd still have the expensive Connection setup every time. Then again, and just to state the obvious, the prefetching is only useful for long-lived connections, and then only if these connections actually use a larg'ish portion of the prefetched entries (otherwise we're doing a lot of unnecessary cache work, and wasting memory). Let's just disable it by default. I guess we'd do that by reversing the meaning (and name) of regionCachePrefetchDisabledTables. Happy to make a patch if you folks agree. > Meta prefetching is slower than doing multiple meta lookups > --- > > Key: HBASE-6841 > URL: https://issues.apache.org/jira/browse/HBASE-6841 > Project: HBase > Issue Type: Improvement >Reporter: Jean-Daniel Cryans >Priority: Critical > Fix For: 0.94.2 > > > I got myself into a situation where I needed to truncate a massive table > while it was getting hits and surprisingly the clients were not recovering. > What I see in the logs is that every time we prefetch .META. we setup a new > HConnection because we close it on the way out. It's awfully slow. > We should just turn it off or make it useful. jstacks coming up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
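The reference-counting scheme Lars describes could look roughly like this. All names here are hypothetical; HConnectionManager's real bookkeeping is keyed on Configuration equivalence and tears down ZK/RPC state, but the counting idea is the same: getConnection() only bumps a count, and only the last close() actually destroys the connection.

```java
import java.util.HashMap;
import java.util.Map;

public class RefCountedConnectionSketch {
    static final class Conn {
        int refCount;
        boolean torndown;
        void teardown() { torndown = true; }   // stand-in for closing real resources
    }

    // Keyed by a stand-in for Configuration identity/equivalence.
    private static final Map<String, Conn> POOL = new HashMap<>();

    static synchronized Conn getConnection(String confKey) {
        Conn c = POOL.computeIfAbsent(confKey, k -> new Conn());
        c.refCount++;                          // cheap: no new connection per caller
        return c;
    }

    static synchronized void close(String confKey) {
        Conn c = POOL.get(confKey);
        if (c != null && --c.refCount == 0) {  // only the last close really tears down
            c.teardown();
            POOL.remove(confKey);
        }
    }

    public static void main(String[] args) {
        Conn c = getConnection("conf1");
        getConnection("conf1");                // second user of the same config
        close("conf1");
        assert !c.torndown;                    // still held by the first user
        close("conf1");
        assert c.torndown;                     // now actually destroyed
    }
}
```

With this in place, a prefetch path that opens and closes "its" connection would merely touch the counter instead of paying full connection setup each time.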
[jira] [Commented] (HBASE-6841) Meta prefetching is slower than doing multiple meta lookups
[ https://issues.apache.org/jira/browse/HBASE-6841?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459893#comment-13459893 ] Andrew Purtell commented on HBASE-6841: --- Given J-D's observations, perhaps we should default meta prefetch to off, like Stack suggests. Some time ago I patched our private Frankenbase to allow disabling meta prefetch on a per-table basis. This was because meta prefetch was causing heap-limited MR clients to OOME, and for that particular application table prefetch wasn't helpful. > Meta prefetching is slower than doing multiple meta lookups > --- > > Key: HBASE-6841 > URL: https://issues.apache.org/jira/browse/HBASE-6841 > Project: HBase > Issue Type: Improvement >Reporter: Jean-Daniel Cryans >Priority: Critical > Fix For: 0.94.2 > > > I got myself into a situation where I needed to truncate a massive table > while it was getting hits and surprisingly the clients were not recovering. > What I see in the logs is that every time we prefetch .META. we set up a new > HConnection because we close it on the way out. It's awfully slow. > We should just turn it off or make it useful. jstacks coming up. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6848) Make hbase-hadoop-compat findbugs clean
[ https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459891#comment-13459891 ] Hadoop QA commented on HBASE-6848: -- -1 overall. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12545942/HBASE-6848-0.patch against trunk revision . +1 @author. The patch does not contain any @author tags. -1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. +1 hadoop2.0. The patch compiles against the hadoop 2.0 profile. -1 javadoc. The javadoc tool appears to have generated 139 warning messages. +1 javac. The applied patch does not increase the total number of javac compiler warnings. -1 findbugs. The patch appears to introduce 7 new Findbugs (version 1.3.9) warnings. +1 release audit. The applied patch does not increase the total number of release audit warnings. -1 core tests. 
The patch failed these unit tests: Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/2908//testReport/ Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2908//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop2-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2908//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-server.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2908//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop1-compat.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2908//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-common.html Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/2908//artifact/trunk/patchprocess/newPatchFindbugsWarningshbase-hadoop-compat.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/2908//console This message is automatically generated. > Make hbase-hadoop-compat findbugs clean > --- > > Key: HBASE-6848 > URL: https://issues.apache.org/jira/browse/HBASE-6848 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Minor > Fix For: 0.96.0 > > Attachments: HBASE-6848-0.patch > > > There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, > and hbase-hadoop2-compat. Lets fix these up; since these are new modules it > would be nice to keep them with 0 findbugs errors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6677) Random ZooKeeper port in test can overrun max port
[ https://issues.apache.org/jira/browse/HBASE-6677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459878#comment-13459878 ] Hudson commented on HBASE-6677: --- Integrated in HBase-TRUNK #3359 (See [https://builds.apache.org/job/HBase-TRUNK/3359/]) HBASE-6677 Random ZooKeeper port in test can overrun max port (Revision 1388125) Result = FAILURE stack : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/zookeeper/MiniZooKeeperCluster.java > Random ZooKeeper port in test can overrun max port > -- > > Key: HBASE-6677 > URL: https://issues.apache.org/jira/browse/HBASE-6677 > Project: HBase > Issue Type: Bug > Components: test >Affects Versions: 0.96.0 >Reporter: Gregory Chanan >Assignee: liang xie >Priority: Trivial > Labels: noob > Fix For: 0.96.0 > > Attachments: HBASE-6677.patch > > > {code} > while (true) { > try { > standaloneServerFactory = new NIOServerCnxnFactory(); > standaloneServerFactory.configure( > new InetSocketAddress(tentativePort), > configuration.getInt(HConstants.ZOOKEEPER_MAX_CLIENT_CNXNS, > 1000)); > } catch (BindException e) { > LOG.debug("Failed binding ZK Server to client port: " + > tentativePort); > // This port is already in use, try to use another. > tentativePort++; > continue; > } > break; > } > {code} > In the case of failure and all the above ports have already been binded, you > can extend past the max port. Need to check against a max value. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6698) Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation
[ https://issues.apache.org/jira/browse/HBASE-6698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459877#comment-13459877 ] Hudson commented on HBASE-6698: --- Integrated in HBase-TRUNK #3359 (See [https://builds.apache.org/job/HBase-TRUNK/3359/]) HBASE-6698 Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation (Priya) Submitted by: Priya Reviewed by: Ram, Stack, Ted, Lars (Revision 1388141) Result = FAILURE ramkrishna : Files : * /hbase/trunk/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/HRegion.java * /hbase/trunk/hbase-server/src/test/java/org/apache/hadoop/hbase/regionserver/TestRegionServerMetrics.java > Refactor checkAndPut and checkAndDelete to use doMiniBatchMutation > -- > > Key: HBASE-6698 > URL: https://issues.apache.org/jira/browse/HBASE-6698 > Project: HBase > Issue Type: Improvement >Reporter: ramkrishna.s.vasudevan > Fix For: 0.96.0 > > Attachments: HBASE-6698_1.patch, HBASE-6698_2.patch, > HBASE-6698_3.patch, HBASE-6698_5.patch, HBASE-6698_6.patch, > HBASE-6698_6.patch, HBASE-6698_6.patch, HBASE-6698_6.patch, > HBASE-6698_7.patch, HBASE-6698_8.patch, HBASE-6698_8.patch, > HBASE-6698_8.patch, HBASE-6698.patch > > > Currently the checkAndPut and checkAndDelete apis internally call > internalPut and internalDelete. Maybe we can just call doMiniBatchMutation > only. This will help in the future: if we have some hooks and the CP > handles certain cases in doMiniBatchMutation, the same can be done while > doing a put through checkAndPut or a delete through checkAndDelete. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459866#comment-13459866 ] Elliott Clark commented on HBASE-6410: -- Since this will probably end up being a larger patch I'm going to try and keep all the work on github. https://github.com/elliottneilclark/hbase/tree/HBASE-6410 > Move RegionServer Metrics to metrics2 > - > > Key: HBASE-6410 > URL: https://issues.apache.org/jira/browse/HBASE-6410 > Project: HBase > Issue Type: Sub-task > Components: metrics >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Blocker > Attachments: HBASE-6410-1.patch, HBASE-6410.patch > > > Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-6410: - Assignee: Elliott Clark (was: Alex Baranau) > Move RegionServer Metrics to metrics2 > - > > Key: HBASE-6410 > URL: https://issues.apache.org/jira/browse/HBASE-6410 > Project: HBase > Issue Type: Sub-task > Components: metrics >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Blocker > Attachments: HBASE-6410-1.patch, HBASE-6410.patch > > > Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Resolved] (HBASE-6847) HBASE-6649 broke replication
[ https://issues.apache.org/jira/browse/HBASE-6847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jean-Daniel Cryans resolved HBASE-6847. --- Resolution: Fixed Hadoop Flags: Reviewed Committed to 0.92, 0.94 and trunk. Thanks Devaraj! > HBASE-6649 broke replication > > > Key: HBASE-6847 > URL: https://issues.apache.org/jira/browse/HBASE-6847 > Project: HBase > Issue Type: Bug >Reporter: Jean-Daniel Cryans >Assignee: Devaraj Das >Priority: Blocker > Fix For: 0.96.0, 0.92.3, 0.94.2 > > Attachments: HBASE-6847-0.94.patch, HBASE-6847.patch > > > After running with HBASE-6646 and replication enabled I encountered this: > {noformat} > 2012-09-17 20:04:08,111 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening > log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78617132 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Break on > IOE: > hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318, > entryStart=78641557, pos=78771200, end=78771200, edit=84 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > currentNbOperations:164529 and seenEntries:84 and size: 154068 > 2012-09-17 20:04:08,120 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > Replicating 84 > 2012-09-17 20:04:08,146 INFO > org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: > Going to report log #va1r3s24%2C10304%2C1347911704238.1347911706318 for > position 78771200 in > hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318 > 2012-09-17 20:04:08,158 INFO > org.apache.hadoop.hbase.replication.regionserver.ReplicationSourceManager: > Removing 0 logs in the list: [] > 2012-09-17 20:04:08,158 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > 
Replicated in total: 93234 > 2012-09-17 20:04:08,158 DEBUG > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: Opening > log for replication va1r3s24%2C10304%2C1347911704238.1347911706318 at 78771200 > 2012-09-17 20:04:08,163 ERROR > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource: > Unexpected exception in ReplicationSource, > currentPath=hdfs://va1r5s41:10101/va1-backup/.logs/va1r3s24,10304,1347911704238/va1r3s24%2C10304%2C1347911704238.1347911706318 > java.lang.IndexOutOfBoundsException > at java.io.DataInputStream.readFully(DataInputStream.java:175) > at > org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:63) > at > org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:101) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:2001) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1901) > at > org.apache.hadoop.io.SequenceFile$Reader.next(SequenceFile.java:1947) > at > org.apache.hadoop.hbase.regionserver.wal.SequenceFileLogReader.next(SequenceFileLogReader.java:235) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.readAllEntriesToReplicateOrNextFile(ReplicationSource.java:394) > at > org.apache.hadoop.hbase.replication.regionserver.ReplicationSource.run(ReplicationSource.java:307) > {noformat} > There's something weird at the end of the file and it's killing replication. > We used to just retry. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
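The last line of the report ("We used to just retry") suggests the shape of the fix: treat an IndexOutOfBoundsException at the tail of the log as a partially-flushed entry rather than a fatal error. A minimal sketch of that idea is below; readOrRetryLater and the Supplier stand-in are hypothetical, not the actual ReplicationSource/SequenceFileLogReader API.

```java
import java.util.function.Supplier;

// Hypothetical sketch of the "just retry" behavior: a torn tail is reported
// as "no entry yet" so the source sleeps and re-reads on its next pass,
// instead of dying with an unexpected exception.
public class TailSafeRead {
    // read.get() stands in for the WAL reader's next(); it throws
    // IndexOutOfBoundsException when the last entry is only partially written.
    static String readOrRetryLater(Supplier<String> read) {
        try {
            return read.get();
        } catch (IndexOutOfBoundsException e) {
            // Torn tail: pretend we hit end-of-file and let the caller retry
            // after the writer has flushed the rest of the entry.
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(readOrRetryLater(() -> "edit-84"));
        System.out.println(readOrRetryLater(() -> {
            throw new IndexOutOfBoundsException();
        }));
    }
}
```

The real fix would also need to reposition the reader to the last known-good offset before retrying, which this sketch omits.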
[jira] [Updated] (HBASE-6849) Make StochasticLoadBalancer the default
[ https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-6849: - Attachment: HBASE-6849-0.patch * Made the StochasticLoadBalancer the default. * Turned off by-table load balancing, since that messes up the StochasticLoadBalancer. > Make StochasticLoadBalancer the default > --- > > Key: HBASE-6849 > URL: https://issues.apache.org/jira/browse/HBASE-6849 > Project: HBase > Issue Type: Improvement > Components: master >Reporter: Elliott Clark >Assignee: Elliott Clark > Fix For: 0.96.0 > > Attachments: HBASE-6849-0.patch > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459851#comment-13459851 ] Elliott Clark commented on HBASE-6410: -- Yep, I should have some time for this. Thanks for all of the work; it's really in a good place. > Move RegionServer Metrics to metrics2 > - > > Key: HBASE-6410 > URL: https://issues.apache.org/jira/browse/HBASE-6410 > Project: HBase > Issue Type: Sub-task > Components: metrics >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Alex Baranau >Priority: Blocker > Attachments: HBASE-6410-1.patch, HBASE-6410.patch > > > Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6848) Make hbase-hadoop-compat findbugs clean
[ https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459849#comment-13459849 ] Elliott Clark commented on HBASE-6848: -- Yes that's a bug fix that I found through findbugs. Findbugs was alerting that ritOldestAgeGauge wasn't being used. > Make hbase-hadoop-compat findbugs clean > --- > > Key: HBASE-6848 > URL: https://issues.apache.org/jira/browse/HBASE-6848 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Minor > Fix For: 0.96.0 > > Attachments: HBASE-6848-0.patch > > > There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, > and hbase-hadoop2-compat. Lets fix these up; since these are new modules it > would be nice to keep them with 0 findbugs errors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6410) Move RegionServer Metrics to metrics2
[ https://issues.apache.org/jira/browse/HBASE-6410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alex Baranau updated HBASE-6410: Attachment: HBASE-6410-1.patch Updated the patch with respect to the latest changes in the common classes (and the fixed HBASE-6501), which look good. With factories for metrics sources this looks closer to what I suggested during the initial discussion. Anyhow, this is what's left: * replacing (i.e. removing) the old RS Metrics classes * adding more metrics in the new RS MetricsSource Elliott, if you have time for the above and can complete it as part of this JIRA issue, feel free to take it from here. I definitely don't want to be a blocker. > Move RegionServer Metrics to metrics2 > - > > Key: HBASE-6410 > URL: https://issues.apache.org/jira/browse/HBASE-6410 > Project: HBase > Issue Type: Sub-task > Components: metrics >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Alex Baranau >Priority: Blocker > Attachments: HBASE-6410-1.patch, HBASE-6410.patch > > > Move RegionServer Metrics to metrics2 -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6848) Make hbase-hadoop-compat findbugs clean
[ https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459846#comment-13459846 ] stack commented on HBASE-6848: -- Is this intended: {code} -ritCountOverThresholdGauge.set(ritCount); +ritOldestAgeGauge.set(ritCount); {code} Here too... {code} -ritCountOverThresholdGauge.set(ritCount); +ritOldestAgeGauge.set(ritCount); {code} Looks like a bug fix? Otherwise the patch looks good. > Make hbase-hadoop-compat findbugs clean > --- > > Key: HBASE-6848 > URL: https://issues.apache.org/jira/browse/HBASE-6848 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Minor > Fix For: 0.96.0 > > Attachments: HBASE-6848-0.patch > > > There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, > and hbase-hadoop2-compat. Lets fix these up; since these are new modules it > would be nice to keep them with 0 findbugs errors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6849) Make StochasticLoadBalancer the default
[ https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-6849: - Component/s: master > Make StochasticLoadBalancer the default > --- > > Key: HBASE-6849 > URL: https://issues.apache.org/jira/browse/HBASE-6849 > Project: HBase > Issue Type: Improvement > Components: master >Reporter: Elliott Clark >Assignee: Elliott Clark > Fix For: 0.96.0 > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6849) Make StochasticLoadBalancer the default
[ https://issues.apache.org/jira/browse/HBASE-6849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-6849: - Fix Version/s: 0.96.0 > Make StochasticLoadBalancer the default > --- > > Key: HBASE-6849 > URL: https://issues.apache.org/jira/browse/HBASE-6849 > Project: HBase > Issue Type: Improvement > Components: master >Reporter: Elliott Clark >Assignee: Elliott Clark > Fix For: 0.96.0 > > -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6789) Convert test CoprocessorProtocol implementations to protocol buffer services
[ https://issues.apache.org/jira/browse/HBASE-6789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459842#comment-13459842 ] Andrew Purtell commented on HBASE-6789: --- I'd be +1 with dropping CoprocessorProtocol from 0.96 and up, given all of the other (deliberate) incompatibilities posed with RPC going from 0.94 to 0.96 and up. > Convert test CoprocessorProtocol implementations to protocol buffer services > > > Key: HBASE-6789 > URL: https://issues.apache.org/jira/browse/HBASE-6789 > Project: HBase > Issue Type: Sub-task > Components: coprocessors >Reporter: Gary Helmling > Fix For: 0.96.0 > > > With coprocessor endpoints now exposed as protobuf defined services, we > should convert over all of our built-in endpoints to PB services. > Several CoprocessorProtocol implementations are defined for tests: > * ColumnAggregationProtocol > * GenericProtocol > * TestServerCustomProtocol.PingProtocol > These should either be converted to PB services or removed if they duplicate > other tests/are no longer necessary. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6524) Hooks for hbase tracing
[ https://issues.apache.org/jira/browse/HBASE-6524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459838#comment-13459838 ] stack commented on HBASE-6524: -- OK if I integrate your doc into the book? > Hooks for hbase tracing > --- > > Key: HBASE-6524 > URL: https://issues.apache.org/jira/browse/HBASE-6524 > Project: HBase > Issue Type: Sub-task >Reporter: Jonathan Leavitt >Assignee: Jonathan Leavitt > Fix For: 0.96.0 > > Attachments: 6524.addendum, 6524-v2.txt, 6524v3.txt, > createTableTrace.png, hbase-6524.diff > > > Includes the hooks that use the [htrace|http://www.github.com/cloudera/htrace] > library to add dapper-like tracing to hbase. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6848) Make hbase-hadoop-compat findbugs clean
[ https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-6848: - Status: Patch Available (was: Open) > Make hbase-hadoop-compat findbugs clean > --- > > Key: HBASE-6848 > URL: https://issues.apache.org/jira/browse/HBASE-6848 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Minor > Fix For: 0.96.0 > > Attachments: HBASE-6848-0.patch > > > There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, > and hbase-hadoop2-compat. Lets fix these up; since these are new modules it > would be nice to keep them with 0 findbugs errors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (HBASE-6849) Make StochasticLoadBalancer the default
Elliott Clark created HBASE-6849: Summary: Make StochasticLoadBalancer the default Key: HBASE-6849 URL: https://issues.apache.org/jira/browse/HBASE-6849 Project: HBase Issue Type: Improvement Reporter: Elliott Clark Assignee: Elliott Clark -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-6702) ResourceChecker refinement
[ https://issues.apache.org/jira/browse/HBASE-6702?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13459841#comment-13459841 ] stack commented on HBASE-6702: -- Thanks [~nkeywal] > ResourceChecker refinement > -- > > Key: HBASE-6702 > URL: https://issues.apache.org/jira/browse/HBASE-6702 > Project: HBase > Issue Type: Improvement > Components: test >Affects Versions: 0.96.0 >Reporter: Jesse Yates >Priority: Critical > Fix For: 0.96.0 > > > This was based on some discussion from HBASE-6234. > The ResourceChecker was added by N. Keywal to help resolve some hadoop qa > issues, but has since not be widely utilized. Further, with modularization we > have had to drop the ResourceChecker from the tests that are moved into the > hbase-common module because bringing the ResourceChecker up to hbase-common > would involved bringing all its dependencies (which are quite far reaching). > The question then is, what should we do with it? Get rid of it? Refactor and > resuse? -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (HBASE-6848) Make hbase-hadoop-compat findbugs clean
[ https://issues.apache.org/jira/browse/HBASE-6848?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Elliott Clark updated HBASE-6848: - Attachment: HBASE-6848-0.patch I don't think that any of the issues were very big, but it's always nice to keep things clean. > Make hbase-hadoop-compat findbugs clean > --- > > Key: HBASE-6848 > URL: https://issues.apache.org/jira/browse/HBASE-6848 > Project: HBase > Issue Type: Improvement >Affects Versions: 0.96.0 >Reporter: Elliott Clark >Assignee: Elliott Clark >Priority: Minor > Fix For: 0.96.0 > > Attachments: HBASE-6848-0.patch > > > There are a few findbugs errors in hbase-hadoop-compat, hbase-hadoop1-compat, > and hbase-hadoop2-compat. Lets fix these up; since these are new modules it > would be nice to keep them with 0 findbugs errors. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira