[jira] [Commented] (HBASE-16610) Unify append, increment with AP
[ https://issues.apache.org/jira/browse/HBASE-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486332#comment-15486332 ] Hadoop QA commented on HBASE-16610:
---

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 16s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| 0 | mvndep | 0m 14s | Maven dependency ordering for branch |
| +1 | mvninstall | 2m 42s | master passed |
| +1 | compile | 0m 47s | master passed |
| +1 | checkstyle | 0m 41s | master passed |
| +1 | mvneclipse | 0m 20s | master passed |
| +1 | findbugs | 2m 20s | master passed |
| +1 | javadoc | 0m 40s | master passed |
| 0 | mvndep | 0m 8s | Maven dependency ordering for patch |
| +1 | mvninstall | 0m 59s | the patch passed |
| +1 | compile | 0m 47s | the patch passed |
| +1 | javac | 0m 47s | the patch passed |
| +1 | checkstyle | 0m 41s | the patch passed |
| +1 | mvneclipse | 0m 20s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | hadoopcheck | 25m 12s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| +1 | hbaseprotoc | 0m 20s | the patch passed |
| +1 | findbugs | 2m 46s | the patch passed |
| +1 | javadoc | 0m 40s | the patch passed |
| +1 | unit | 0m 51s | hbase-client in the patch passed. |
| -1 | unit | 84m 11s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 25s | The patch does not generate ASF License warnings. |
| | | 125m 52s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.regionserver.TestScannerHeartbeatMessages |
| | hadoop.hbase.client.TestMultiParallel |
| | hadoop.hbase.client.TestHCM |
| Timed out junit tests | org.apache.hadoop.hbase.client.TestReplicasClient |
| | org.apache.hadoop.hbase.client.TestFromClientSide |
| | org.apache.hadoop.hbase.client.TestIncrementFromClientSideWithCoprocessor |
| | org.apache.hadoop.hbase.client.TestMobCloneSnapshotFromClient |
| | org.apache.hadoop.hbase.client.TestMobSnapshotCloneIndependence |

|| Subsystem || Report/Notes ||
| Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828167/HBASE-16610.v1.patch |
| JIRA Issue | HBASE-16610 |
| Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 5f9505644894 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64
[jira] [Comment Edited] (HBASE-16388) Prevent client threads being blocked by only one slow region server
[ https://issues.apache.org/jira/browse/HBASE-16388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486310#comment-15486310 ] Phil Yang edited comment on HBASE-16388 at 9/13/16 5:30 AM:

If a user connects to a cluster with N region servers, we expect that if one region server is slow and all requests to it time out, then at a higher level, which can monitor the service using the HBase client, only about 1/N of requests should fail or time out. If the failed requests are more than 1/N, and especially if they are much more than 1/N, the limit should be changed to a lower number. And if there is no slow server in the cluster but clients still throw SBE, there may be two reasons: the limit is too strict, or the regions are not balanced and much more than 1/N of the keys belong to that server. So the proper limit should be several times 1/N * threadNumber: too high may reduce availability when there are slow servers, too low may reject normal requests. And a relatively balanced cluster is required.
> Prevent client threads being blocked by only one slow region server
> ---
>
> Key: HBASE-16388
> URL: https://issues.apache.org/jira/browse/HBASE-16388
> Project: HBase
> Issue Type: New Feature
> Reporter: Phil Yang
> Assignee: Phil Yang
> Attachments: HBASE-16388-branch-1-v1.patch, HBASE-16388-v1.patch, HBASE-16388-v2.patch, HBASE-16388-v2.patch, HBASE-16388-v2.patch, HBASE-16388-v2.patch
>
> It is a general use case for HBase users to have several threads/handlers in their service, with each handler holding its own Table/HTable instance. Generally users assume the handlers are independent and won't interact with each other.
> However, in an extreme case, if a region server is very slow, every request to this RS will time out, and the handlers of the users' service may be occupied by the long-waiting requests, so requests belonging to other RSs will also time out. For example:
> If we have 100 handlers in a client service (timeout is 1000ms) and HBase has 10 region servers whose average response time is 50ms, then when no region server is slow we can handle 2000 requests per second.
> Now suppose this service's QPS is 1000 and one region server is so slow that all requests to it time out. Users hope that only 10% of requests fail, and that 90% of requests still see a 50ms response time, because only 10% of requests are routed to the slow RS. However, each second there are 100 long-waiting requests, which occupy exactly all 100 handlers. So all handlers are blocked and the availability of this service is almost zero.
> To prevent this, we can limit the max concurrent requests to one RS at the process level. Requests exceeding the limit throw ServerBusyException (extends DoNotRetryIOE) immediately to users. In the above case, if we set this limit to 20, only 20 handlers will be occupied and the other 80 handlers can still handle requests to other RSs. The availability of this service is 90%, as expected.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
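The process-level limit described in this issue can be sketched roughly as follows. This is an illustrative Ruby sketch, not HBase's actual client code; the names `PerServerLimiter` and `ServerBusyError` are invented for the example.

```ruby
# Illustrative sketch: allow at most `max_per_server` in-flight requests
# per region server; excess requests fail fast instead of tying up a
# handler thread. Names here are invented, not HBase's client API.
class ServerBusyError < StandardError; end

class PerServerLimiter
  def initialize(max_per_server)
    @max = max_per_server
    @in_flight = Hash.new(0)
    @lock = Mutex.new
  end

  # Run the block as a request to `server`; fail fast (no retry, like a
  # DoNotRetryIOE) when the server already has @max requests in flight.
  def with_server(server)
    @lock.synchronize do
      if @in_flight[server] >= @max
        raise ServerBusyError, "#{server} already has #{@max} requests in flight"
      end
      @in_flight[server] += 1
    end
    begin
      yield
    ensure
      @lock.synchronize { @in_flight[server] -= 1 }
    end
  end
end
```

With a limit of 20 and 100 handlers, at most 20 handlers can be stuck on the slow server; requests to other servers are unaffected.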
[jira] [Commented] (HBASE-16381) Shell deleteall command should support row key prefixes
[ https://issues.apache.org/jira/browse/HBASE-16381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486311#comment-15486311 ] Jerry He commented on HBASE-16381:
--

{code}
+if list.size >= 100
+  @table.delete(list)
+  list = java.util.ArrayList.new
+end
{code}

Should you re-use the list? That is, send the list of deletes, clear the list, refill it, and send again?

> Shell deleteall command should support row key prefixes
> ---
>
> Key: HBASE-16381
> URL: https://issues.apache.org/jira/browse/HBASE-16381
> Project: HBase
> Issue Type: Improvement
> Components: shell
> Reporter: Andrew Purtell
> Assignee: Yi Liang
> Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-16381-V1.patch, HBASE-16381-V2.patch, HBASE-16381-V3.patch, HBASE-16381-V4.patch
>
> The shell's deleteall command should support deleting a row range using a row key prefix.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
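Jerry's suggestion (flush the batch, clear the same list, refill, send again) could look roughly like this. This is a plain-Ruby sketch with an invented `batched_delete` helper and a stand-in `table` object, not the actual shell patch, and it assumes `table.delete` consumes the batch synchronously before the caller clears it.

```ruby
# Plain-Ruby sketch: flush every BATCH_SIZE deletes and clear() the same
# list rather than allocating a new one per flush. The helper name and the
# `table` object are invented for illustration.
BATCH_SIZE = 100

def batched_delete(table, row_keys)
  list = []  # in the JRuby shell this would be a java.util.ArrayList
  row_keys.each do |key|
    list << key
    if list.size >= BATCH_SIZE
      table.delete(list)
      list.clear  # re-use the list: clear, refill, send again
    end
  end
  table.delete(list) unless list.empty?  # flush the remainder
end
```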
[jira] [Commented] (HBASE-16388) Prevent client threads being blocked by only one slow region server
[ https://issues.apache.org/jira/browse/HBASE-16388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486310#comment-15486310 ] Phil Yang commented on HBASE-16388:
---

If a user connects to a cluster with N region servers, we expect that if one region server is slow and all requests to it time out, then at a higher level, which can monitor the service using the HBase client, only about 1/N of requests should fail or time out. If the failed requests are more than 1/N, and especially if they are much more than 1/N, the limit should be changed to a lower number. And if there is no slow server in the cluster but clients still throw SBE, there may be two reasons: the limit is too strict, or the regions are not balanced and much more than 1/N of the keys belong to that server. So the proper limit should be several times 1/N * threadNumber: too high may reduce availability when there are slow servers, too low may reject normal requests.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14123) HBase Backup/Restore Phase 2
[ https://issues.apache.org/jira/browse/HBASE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486295#comment-15486295 ] Hadoop QA commented on HBASE-14123:
---

-1 overall

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 42m 48s | Docker mode activated. |
| 0 | patch | 0m 5s | The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for instructions. |
| 0 | shelldocs | 0m 9s | Shelldocs was not available. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 46 new or modified test files. |
| 0 | mvndep | 0m 26s | Maven dependency ordering for branch |
| +1 | mvninstall | 4m 43s | master passed |
| +1 | compile | 6m 26s | master passed |
| +1 | checkstyle | 18m 16s | master passed |
| +1 | mvneclipse | 2m 54s | master passed |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| -1 | findbugs | 0m 50s | hbase-common in master has 1 extant Findbugs warnings. |
| +1 | javadoc | 4m 31s | master passed |
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 7m 14s | the patch passed |
| +1 | compile | 6m 26s | the patch passed |
| +1 | cc | 6m 26s | the patch passed |
| +1 | javac | 6m 26s | the patch passed |
| +1 | checkstyle | 19m 0s | the patch passed |
| +1 | mvneclipse | 2m 51s | the patch passed |
| +1 | shellcheck | 0m 6s | There were no new shellcheck issues. |
| -1 | whitespace | 0m 1s | The patch has 538 line(s) that end in whitespace. Use git apply --whitespace=fix. |
| -1 | whitespace | 0m 18s | The patch 1 line(s) with tabs. |
| +1 | xml | 0m 4s | The patch has no ill-formed XML file. |
| +1 | hadoopcheck | 39m 19s | Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. |
| 0 | findbugs | 0m 0s | Skipped patched modules with no Java source: . |
| -1 | findbugs | 3m 2s | hbase-server generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -1 | javadoc | 0m 27s | hbase-client generated 4 new + 14 unchanged - 0 fixed = 18 total (was 14) |
| -1 | javadoc | 2m 49s | root generated 4 new + 20 unchanged - 0 fixed = 24 total (was 20) |
| +1 | unit | 0m 36s | hbase-protocol in the patch passed. |
| +1 |
[jira] [Commented] (HBASE-16604) Scanner retries on IOException can cause the scans to miss data
[ https://issues.apache.org/jira/browse/HBASE-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486292#comment-15486292 ] Devaraj Das commented on HBASE-16604:
---

Very good find, [~enis]. On the patch, just a thought: should we treat the ScannerResetException the same way as the UnknownScannerException (in terms of checking the timeout) in ClientScanner? That way, if the client's scanner timeout has expired, the client gets back an exception. Saying this because, if the IOException happened due to an underlying filesystem issue, the data might be unavailable for a longer duration (which might cause other, bigger issues, but still), and multiple retries may or may not help...

> Scanner retries on IOException can cause the scans to miss data
> ---
>
> Key: HBASE-16604
> URL: https://issues.apache.org/jira/browse/HBASE-16604
> Project: HBase
> Issue Type: Bug
> Components: regionserver, Scanners
> Reporter: Enis Soztutar
> Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 1.2.4
>
> Attachments: hbase-16604_v1.patch
>
> Debugging an ITBLL failure, where the Verify step did not "see" all the data in the cluster, I noticed that if we end up getting a generic IOException from the HFileReader level, we may end up missing the rest of the data in the region.
I was able to manually test this, and this stack trace helps to > understand what is going on: > {code} > 2016-09-09 16:27:15,633 INFO [hconnection-0x71ad3d8a-shared--pool21-t9] > client.ScannerCallable(376): Open scanner=1 for > scan={"loadColumnFamiliesOnDemand":null,"startRow":"","stopRow":"","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":2097152,"families":{"testFamily":["testFamily"]},"caching":100,"maxVersions":1,"timeRange":[0,9223372036854775807]} > on region > region=testScanThrowsException,,1473463632707.b2adfb618e5d0fe225c1dc40c0eabfee., > hostname=hw10676,51833,1473463626529, seqNum=2 > 2016-09-09 16:27:15,634 INFO > [B.fifo.QRpcServer.handler=5,queue=0,port=51833] > regionserver.RSRpcServices(2196): scan request:scanner_id: 1 number_of_rows: > 100 close_scanner: false next_call_seq: 0 client_handles_partials: true > client_handles_heartbeats: true renew: false > 2016-09-09 16:27:15,635 INFO > [B.fifo.QRpcServer.handler=5,queue=0,port=51833] > regionserver.RSRpcServices(2510): Rolling back next call seqId > 2016-09-09 16:27:15,635 INFO > [B.fifo.QRpcServer.handler=5,queue=0,port=51833] > regionserver.RSRpcServices(2565): Throwing new > ServiceExceptionjava.io.IOException: Could not reseek > StoreFileScanner[HFileScanner for reader > reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c, > compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, > currentSize=1567264, freeSize=1525578848, maxSize=1527146112, > heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, > multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, > cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, > cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, > prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, > 
lastKey=zzz/testFamily:testFamily/1473463634271/Put, avgKeyLen=35, > avgValueLen=3, entries=17576, length=866998, > cur=/testFamily:/OLDEST_TIMESTAMP/Minimum/vlen=0/seqid=0] to key > /testFamily:testFamily/LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0 > 2016-09-09 16:27:15,635 DEBUG > [B.fifo.QRpcServer.handler=5,queue=0,port=51833] ipc.CallRunner(110): > B.fifo.QRpcServer.handler=5,queue=0,port=51833: callId: 26 service: > ClientService methodName: Scan size: 26 connection: 192.168.42.75:51903 > java.io.IOException: Could not reseek StoreFileScanner[HFileScanner for > reader > reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c, > compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, > currentSize=1567264, freeSize=1525578848, maxSize=1527146112, > heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, > multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, > cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, > cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, > prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, > lastKey=zzz/testFamily:testFamily/1473463634271/Put, avgKeyLen=35, > avgValueLen=3, entries=17576, length=866998, >
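Devaraj's suggestion above, retrying on a scanner reset only while the client-side scanner timeout has not expired (the way UnknownScannerException is already handled), can be sketched as follows. The names `ScannerResetError` and `scan_with_retries` are invented for this sketch; this is not HBase's real ClientScanner logic.

```ruby
# Illustrative sketch: retry the scan call on a reset, but surface the
# error once the client-side scanner timeout has elapsed (e.g. when the
# underlying filesystem issue is long-lived and retries won't help).
class ScannerResetError < StandardError; end

def scan_with_retries(scanner_timeout_s, clock: -> { Time.now.to_f })
  start = clock.call
  attempts = 0
  begin
    attempts += 1
    yield
  rescue ScannerResetError
    # Timeout expired: stop retrying and let the caller see the failure.
    raise if clock.call - start > scanner_timeout_s
    retry
  end
  attempts
end
```

The injectable `clock` is only there to make the sketch testable without sleeping.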
[jira] [Commented] (HBASE-16257) Move staging dir to be under hbase root dir
[ https://issues.apache.org/jira/browse/HBASE-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486284#comment-15486284 ] Jerry He commented on HBASE-16257:
--

bq. Maybe extract the subdirs away, and add getters so that rest of the code does not have to deal with parsing the conf again:

Do you mean doing something similar to the temp dir?

{code}
this.tempdir = new Path(this.rootdir, HConstants.HBASE_TEMP_DIRECTORY);
...
/**
 * @return HBase temp dir.
 */
public Path getTempDir() {
  return this.tempdir;
}
{code}

> Move staging dir to be under hbase root dir
> ---
>
> Key: HBASE-16257
> URL: https://issues.apache.org/jira/browse/HBASE-16257
> Project: HBase
> Issue Type: Sub-task
> Reporter: Jerry He
> Assignee: Jerry He
> Priority: Blocker
> Fix For: 2.0.0
>
> Attachments: HBASE-16257-v1.patch, HBASE-16257-v2.patch, HBASE-16257-v3.patch
>
> The hbase.bulkload.staging.dir defaults to hbase.fs.tmp.dir, which in turn defaults to
> {code}
> public static final String DEFAULT_TEMPORARY_HDFS_DIRECTORY = "/user/"
>     + System.getProperty("user.name") + "/hbase-staging";
> {code}
> This default is a problem in the local-filesystem standalone case.
> We can move the staging dir to be under hbase.rootdir. We are bringing secure bulkload into the core, so it makes sense to bring this directory under core control as well, instead of leaving it an optional property.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16257) Move staging dir to be under hbase root dir
[ https://issues.apache.org/jira/browse/HBASE-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486278#comment-15486278 ] Jerry He commented on HBASE-16257:
--

bq. We create this dir with default perms. If there is a way to get the default perms, and after ensuring that we have x, setting it to root should be fine.

[~enis] Would this work for you? As the last step:

{code}
FsPermission currentRootPerms = fs.getFileStatus(this.rootdir).getPermission();
if (!currentRootPerms.getUserAction().implies(FsAction.EXECUTE)
    || !currentRootPerms.getGroupAction().implies(FsAction.EXECUTE)
    || !currentRootPerms.getOtherAction().implies(FsAction.EXECUTE)) {
  LOG.warn("rootdir permissions do not contain 'execute' for user, group or other. "
      + "Automatically adding 'execute' permission for all");
  fs.setPermission(this.rootdir,
    new FsPermission(currentRootPerms.getUserAction().or(FsAction.EXECUTE),
      currentRootPerms.getGroupAction().or(FsAction.EXECUTE),
      currentRootPerms.getOtherAction().or(FsAction.EXECUTE)));
}
{code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
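The permission check above boils down to OR-ing the execute bit into each permission class. A minimal sketch of that bit logic, using POSIX-style octal modes instead of Hadoop's FsPermission/FsAction API (the helper name is invented):

```ruby
# Sketch of the bit logic in the snippet above: if user, group, or other
# lacks the execute bit on rootdir, OR it in for all three classes so the
# directory can be traversed.
EXEC_ALL = 0o111  # --x--x--x

def ensure_execute_for_all(mode)
  if mode & EXEC_ALL == EXEC_ALL
    mode              # every class can already traverse the directory
  else
    mode | EXEC_ALL   # add execute for user, group and other
  end
end
```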
[jira] [Commented] (HBASE-16611) Flakey org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet
[ https://issues.apache.org/jira/browse/HBASE-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486279#comment-15486279 ] stack commented on HBASE-16611: --- Patch looks good. Keep running it here and if it passes, +1 on commit. Thanks [~chenheng] > Flakey org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet > - > > Key: HBASE-16611 > URL: https://issues.apache.org/jira/browse/HBASE-16611 > Project: HBase > Issue Type: Bug >Reporter: Heng Chen >Assignee: Heng Chen > Attachments: HBASE-16611.patch, HBASE-16611.v1.patch, > HBASE-16611.v1.patch, HBASE-16611.v2.patch > > > see > https://builds.apache.org/job/PreCommit-HBASE-Build/3494/artifact/patchprocess/patch-unit-hbase-server.txt > {code} > testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time > elapsed: 4.026 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:579) > Tests run: 26, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 94.401 sec - > in org.apache.hadoop.hbase.client.TestAdmin2 > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.861 sec - > in org.apache.hadoop.hbase.client.TestClientScannerRPCTimeout > Running > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 261.925 sec > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestReplicasClient > testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time > elapsed: 4.522 sec <<< FAILURE! 
> java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:581) > Running org.apache.hadoop.hbase.client.TestFastFail > Tests run: 2, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 3.648 sec - > in org.apache.hadoop.hbase.client.TestFastFail > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 277.894 sec > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestReplicasClient > testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time > elapsed: 5.359 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:579) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16615) Fix flaky TestScannerHeartbeatMessages
[ https://issues.apache.org/jira/browse/HBASE-16615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486276#comment-15486276 ] stack commented on HBASE-16615: --- Skimmed. Looks good. Try it. +1 > Fix flaky TestScannerHeartbeatMessages > -- > > Key: HBASE-16615 > URL: https://issues.apache.org/jira/browse/HBASE-16615 > Project: HBase > Issue Type: Bug > Components: Client, Scanners >Affects Versions: 2.0.0 >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0 > > Attachments: HBASE-16615.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16592) Unify Delete request with AP
[ https://issues.apache.org/jira/browse/HBASE-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486270#comment-15486270 ] Duo Zhang commented on HBASE-16592:
---

Add a fix version please? Thanks.

> Unify Delete request with AP
> ---
>
> Key: HBASE-16592
> URL: https://issues.apache.org/jira/browse/HBASE-16592
> Project: HBase
> Issue Type: Sub-task
> Reporter: Heng Chen
> Assignee: Heng Chen
> Attachments: HBASE-16592.patch, HBASE-16592.v1.patch, HBASE-16592.v1.patch, HBASE-16592.v2.patch
>
> This is the first step in trying to unify HTable with AP only. To extend AP so that it can process a single action, I introduced AbstractResponse; MultiResponse and SingleResponse (the latter introduced to deal with a single result) will extend this class.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16388) Prevent client threads being blocked by only one slow region server
[ https://issues.apache.org/jira/browse/HBASE-16388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486221#comment-15486221 ] stack commented on HBASE-16388: --- Where does the user get feedback on how well or how badly their setting of this config is doing [~yangzhe1991]? On SBE, I confused it with RegionTooBusyException.java. Pardon me. I should have seen it is a new Exception, so ignore my remark. Answer the above question and I'll commit (after changing the name of SBE to ServerTooBusyException and adding detail on this new config). Thanks [~yangzhe1991]
> Prevent client threads being blocked by only one slow region server
> ---
>
> Key: HBASE-16388
> URL: https://issues.apache.org/jira/browse/HBASE-16388
> Project: HBase
> Issue Type: New Feature
> Reporter: Phil Yang
> Assignee: Phil Yang
> Attachments: HBASE-16388-branch-1-v1.patch, HBASE-16388-v1.patch,
> HBASE-16388-v2.patch, HBASE-16388-v2.patch, HBASE-16388-v2.patch,
> HBASE-16388-v2.patch
>
> It is a common use case for HBase users to have several threads/handlers in
> their service, each handler with its own Table/HTable instance. Users
> generally assume the handlers are independent and won't interact with each
> other.
> However, in an extreme case, if one region server is very slow, every request
> to this RS will time out, and the handlers of the users' service may be
> occupied by the long-waiting requests, so even requests belonging to other
> RSs will also time out. For example:
> If we have 100 handlers in a client service (timeout 1000ms) and HBase has
> 10 region servers whose average response time is 50ms, then with no slow
> region server we can handle 2000 requests per second.
> Now suppose this service's QPS is 1000 and one region server becomes so slow
> that all requests to it time out. Users hope that only 10% of requests fail
> and that the other 90% still respond in 50ms, because only 10% of requests
> are routed to the slow RS.
> However, each second we get 100 long-waiting requests, which occupy exactly
> all 100 handlers. So all handlers are blocked and the availability of this
> service is almost zero.
> To prevent this, we can limit the max concurrent requests to one RS at the
> process level. Requests exceeding the limit throw ServerBusyException
> (extends DoNotRetryIOE) to users immediately. In the above case, if we set
> this limit to 20, only 20 handlers will be occupied and the other 80 handlers
> can still handle requests to other RSs. The availability of this service is
> 90%, as expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
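The process-level limit described in the issue can be sketched as a per-server in-flight counter that fails fast instead of blocking. This is a minimal illustrative sketch, not HBase's actual client implementation; the class and method names here are made up for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a process-level per-region-server concurrency limit.
// Names (PerServerLimiter, acquire/release) are illustrative, not HBase internals.
class PerServerLimiter {
    static class ServerBusyException extends RuntimeException {
        ServerBusyException(String server) {
            super("too many concurrent requests to " + server);
        }
    }

    private final int maxPerServer;
    private final Map<String, AtomicInteger> inFlight = new ConcurrentHashMap<>();

    PerServerLimiter(int maxPerServer) {
        this.maxPerServer = maxPerServer;
    }

    // Called before sending a request; throws instead of blocking the handler.
    void acquire(String server) {
        AtomicInteger n = inFlight.computeIfAbsent(server, s -> new AtomicInteger());
        if (n.incrementAndGet() > maxPerServer) {
            n.decrementAndGet();
            throw new ServerBusyException(server);  // fail fast, do not queue
        }
    }

    // Called when the request completes (success or failure).
    void release(String server) {
        inFlight.get(server).decrementAndGet();
    }
}
```

With a limit of 20 and 100 handlers, at most 20 handlers can be tied up by one slow server; the rest fail fast with ServerBusyException and stay available for other servers, which is the 90% availability outcome the issue describes.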
[jira] [Updated] (HBASE-16615) Fix flaky TestScannerHeartbeatMessages
[ https://issues.apache.org/jira/browse/HBASE-16615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-16615: -- Status: Patch Available (was: Open) > Fix flaky TestScannerHeartbeatMessages > -- > > Key: HBASE-16615 > URL: https://issues.apache.org/jira/browse/HBASE-16615 > Project: HBase > Issue Type: Bug > Components: Client, Scanners >Reporter: Duo Zhang > Attachments: HBASE-16615.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-16615) Fix flaky TestScannerHeartbeatMessages
[ https://issues.apache.org/jira/browse/HBASE-16615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang reassigned HBASE-16615: - Assignee: Duo Zhang > Fix flaky TestScannerHeartbeatMessages > -- > > Key: HBASE-16615 > URL: https://issues.apache.org/jira/browse/HBASE-16615 > Project: HBase > Issue Type: Bug > Components: Client, Scanners >Affects Versions: 2.0.0 >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0 > > Attachments: HBASE-16615.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16615) Fix flaky TestScannerHeartbeatMessages
[ https://issues.apache.org/jira/browse/HBASE-16615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-16615: -- Affects Version/s: 2.0.0 Fix Version/s: 2.0.0 > Fix flaky TestScannerHeartbeatMessages > -- > > Key: HBASE-16615 > URL: https://issues.apache.org/jira/browse/HBASE-16615 > Project: HBase > Issue Type: Bug > Components: Client, Scanners >Affects Versions: 2.0.0 >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0 > > Attachments: HBASE-16615.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16615) Fix flaky TestScannerHeartbeatMessages
[ https://issues.apache.org/jira/browse/HBASE-16615?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Duo Zhang updated HBASE-16615: -- Attachment: HBASE-16615.patch Split into 3 tests; the tests in one file will not be executed in parallel unless configured with annotations. Also increased the sleep interval to make the test more stable, since the TimeoutTimer has a small delay, i.e., a 200ms timeout usually fires after 20x ms. > Fix flaky TestScannerHeartbeatMessages > -- > > Key: HBASE-16615 > URL: https://issues.apache.org/jira/browse/HBASE-16615 > Project: HBase > Issue Type: Bug > Components: Client, Scanners >Reporter: Duo Zhang > Attachments: HBASE-16615.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16540) Scan should do additional validation on start and stop row
[ https://issues.apache.org/jira/browse/HBASE-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486175#comment-15486175 ] Hudson commented on HBASE-16540: FAILURE: Integrated in Jenkins build HBase-1.4 #409 (See [https://builds.apache.org/job/HBase-1.4/409/]) HBASE-16540 Adding checks in Scanner's setStartRow and setStopRow for (garyh: rev 7028a0d889b3649e5a0c4bd9532b852fb70d3ae3) * (edit) hbase-client/src/test/java/org/apache/hadoop/hbase/client/TestScan.java * (edit) hbase-client/src/main/java/org/apache/hadoop/hbase/client/Scan.java > Scan should do additional validation on start and stop row > -- > > Key: HBASE-16540 > URL: https://issues.apache.org/jira/browse/HBASE-16540 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Gary Helmling >Assignee: Dustin Pho > Attachments: HBASE-16540.002.patch, HBASE-16540.patch > > > Scan.setStartRow() and setStopRow() should validate the byte[] passed to > ensure it meets the criteria for a row key. If the byte[] length is greater > that Short.MAX_VALUE, we should throw an IllegalArgumentException in order to > fast fail and prevent server-side errors being thrown and retried. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
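The fail-fast validation described in HBASE-16540 can be sketched as a simple length check applied before the request ever reaches a server. This is an illustrative sketch only; the class and method names below are hypothetical, though the Short.MAX_VALUE row-length limit is the one stated in the issue.

```java
// Sketch of a setStartRow/setStopRow-style check: reject row keys longer than
// Short.MAX_VALUE with an IllegalArgumentException so the client fails fast
// instead of triggering server-side errors and retries.
class RowKeyValidator {
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    static byte[] checkRow(byte[] row) {
        if (row != null && row.length > MAX_ROW_LENGTH) {
            throw new IllegalArgumentException(
                "row key length " + row.length + " exceeds max " + MAX_ROW_LENGTH);
        }
        return row;
    }
}
```

IllegalArgumentException is unchecked, so the bad key is rejected at the call site (e.g. inside a setStartRow) rather than surfacing as a retried server-side failure.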
[jira] [Commented] (HBASE-16611) Flakey org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet
[ https://issues.apache.org/jira/browse/HBASE-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486164#comment-15486164 ] Heng Chen commented on HBASE-16611: --- All timeout testcase could pass locally, and TestReplicasClient could pass. Rebase on master, let me try it again. > Flakey org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet > - > > Key: HBASE-16611 > URL: https://issues.apache.org/jira/browse/HBASE-16611 > Project: HBase > Issue Type: Bug >Reporter: Heng Chen >Assignee: Heng Chen > Attachments: HBASE-16611.patch, HBASE-16611.v1.patch, > HBASE-16611.v1.patch, HBASE-16611.v2.patch > > > see > https://builds.apache.org/job/PreCommit-HBASE-Build/3494/artifact/patchprocess/patch-unit-hbase-server.txt > {code} > testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time > elapsed: 4.026 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:579) > Tests run: 26, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 94.401 sec - > in org.apache.hadoop.hbase.client.TestAdmin2 > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.861 sec - > in org.apache.hadoop.hbase.client.TestClientScannerRPCTimeout > Running > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 261.925 sec > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestReplicasClient > testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time > elapsed: 4.522 sec <<< FAILURE! 
> java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:581) > Running org.apache.hadoop.hbase.client.TestFastFail > Tests run: 2, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 3.648 sec - > in org.apache.hadoop.hbase.client.TestFastFail > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 277.894 sec > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestReplicasClient > testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time > elapsed: 5.359 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:579) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16611) Flakey org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet
[ https://issues.apache.org/jira/browse/HBASE-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heng Chen updated HBASE-16611: -- Attachment: HBASE-16611.v2.patch > Flakey org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet > - > > Key: HBASE-16611 > URL: https://issues.apache.org/jira/browse/HBASE-16611 > Project: HBase > Issue Type: Bug >Reporter: Heng Chen >Assignee: Heng Chen > Attachments: HBASE-16611.patch, HBASE-16611.v1.patch, > HBASE-16611.v1.patch, HBASE-16611.v2.patch > > > see > https://builds.apache.org/job/PreCommit-HBASE-Build/3494/artifact/patchprocess/patch-unit-hbase-server.txt > {code} > testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time > elapsed: 4.026 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:579) > Tests run: 26, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 94.401 sec - > in org.apache.hadoop.hbase.client.TestAdmin2 > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.861 sec - > in org.apache.hadoop.hbase.client.TestClientScannerRPCTimeout > Running > org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 261.925 sec > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestReplicasClient > testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time > elapsed: 4.522 sec <<< FAILURE! 
> java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:581) > Running org.apache.hadoop.hbase.client.TestFastFail > Tests run: 2, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 3.648 sec - > in org.apache.hadoop.hbase.client.TestFastFail > Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 277.894 sec > <<< FAILURE! - in org.apache.hadoop.hbase.client.TestReplicasClient > testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time > elapsed: 5.359 sec <<< FAILURE! > java.lang.AssertionError: null > at org.junit.Assert.fail(Assert.java:86) > at org.junit.Assert.assertTrue(Assert.java:41) > at org.junit.Assert.assertTrue(Assert.java:52) > at > org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:579) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16416) Make NoncedRegionServerCallable extends RegionServerCallable
[ https://issues.apache.org/jira/browse/HBASE-16416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guanghao Zhang updated HBASE-16416: --- Resolution: Duplicate Status: Resolved (was: Patch Available) > Make NoncedRegionServerCallable extends RegionServerCallable > > > Key: HBASE-16416 > URL: https://issues.apache.org/jira/browse/HBASE-16416 > Project: HBase > Issue Type: Improvement > Components: Client >Affects Versions: 2.0.0 >Reporter: Guanghao Zhang >Assignee: Guanghao Zhang >Priority: Minor > Attachments: HBASE-16416.patch > > > After HBASE-16308, there is a new class NoncedRegionServerCallable which > extends AbstractRegionServerCallable. But it has some duplicate methods with > RegionServerCallable, so we can make NoncedRegionServerCallable extend > RegionServerCallable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16388) Prevent client threads being blocked by only one slow region server
[ https://issues.apache.org/jira/browse/HBASE-16388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Phil Yang updated HBASE-16388: -- Release Note: Adds a new configuration, hbase.client.perserver.requests.threshold, to limit the max number of concurrent requests to one region server. If the user creates a new request after reaching the limit, the client throws ServerBusyException and does not send the request to the server. This is a client-side feature that prevents a client's threads from being blocked by one slow region server, which would otherwise make the client's availability much lower than the availability of the region servers.
> Prevent client threads being blocked by only one slow region server
> ---
>
> Key: HBASE-16388
> URL: https://issues.apache.org/jira/browse/HBASE-16388
> Project: HBase
> Issue Type: New Feature
> Reporter: Phil Yang
> Assignee: Phil Yang
> Attachments: HBASE-16388-branch-1-v1.patch, HBASE-16388-v1.patch,
> HBASE-16388-v2.patch, HBASE-16388-v2.patch, HBASE-16388-v2.patch,
> HBASE-16388-v2.patch
>
> It is a common use case for HBase users to have several threads/handlers in
> their service, each handler with its own Table/HTable instance. Users
> generally assume the handlers are independent and won't interact with each
> other.
> However, in an extreme case, if one region server is very slow, every request
> to this RS will time out, and the handlers of the users' service may be
> occupied by the long-waiting requests, so even requests belonging to other
> RSs will also time out. For example:
> If we have 100 handlers in a client service (timeout 1000ms) and HBase has
> 10 region servers whose average response time is 50ms, then with no slow
> region server we can handle 2000 requests per second.
> Now suppose this service's QPS is 1000 and one region server becomes so slow
> that all requests to it time out. Users hope that only 10% of requests fail
> and that the other 90% still respond in 50ms, because only 10% of requests
> are routed to the slow RS.
> However, each second we get 100 long-waiting requests, which occupy exactly
> all 100 handlers. So all handlers are blocked and the availability of this
> service is almost zero.
> To prevent this, we can limit the max concurrent requests to one RS at the
> process level. Requests exceeding the limit throw ServerBusyException
> (extends DoNotRetryIOE) to users immediately. In the above case, if we set
> this limit to 20, only 20 handlers will be occupied and the other 80 handlers
> can still handle requests to other RSs. The availability of this service is
> 90%, as expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16623) Unify Get with AP
Heng Chen created HBASE-16623: - Summary: Unify Get with AP Key: HBASE-16623 URL: https://issues.apache.org/jira/browse/HBASE-16623 Project: HBase Issue Type: Sub-task Reporter: Heng Chen Assignee: Heng Chen -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486125#comment-15486125 ] Hudson commented on HBASE-16616: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1591 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1591/]) HBASE-16616 Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry (Tomu (tedyu: rev 8855670cd701fdf9c2ab41907f9525d122608e6d) * (edit) hbase-common/src/main/java/org/apache/hadoop/hbase/util/Counter.java * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java > Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry > -- > > Key: HBASE-16616 > URL: https://issues.apache.org/jira/browse/HBASE-16616 > Project: HBase > Issue Type: Improvement > Components: Performance >Affects Versions: 1.2.2 >Reporter: Tomu Tsuruhara >Assignee: Tomu Tsuruhara > Fix For: 2.0.0, 1.4.0 > > Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, > HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png > > > In our HBase 1.2.2 cluster, some region servers showed a very bad > "QueueCallTime_99th_percentile", exceeding 10 seconds. > Most RPC handler threads were stuck in the ThreadLocalMap.expungeStaleEntry call at > that time. 
> {noformat} > "PriorityRpcServer.handler=18,queue=0,port=16020" #322 daemon prio=5 > os_prio=0 tid=0x7fd422062800 nid=0x19b89 runnable [0x7fcb8a821000] >java.lang.Thread.State: RUNNABLE > at > java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617) > at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499) > at > java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298) > at java.lang.ThreadLocal.remove(ThreadLocal.java:222) > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.unlockForRegularUsage(ExponentiallyDecayingSample.java:196) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:113) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81) > at > org.apache.hadoop.metrics2.lib.MutableHistogram.add(MutableHistogram.java:81) > at > org.apache.hadoop.metrics2.lib.MutableRangeHistogram.add(MutableRangeHistogram.java:59) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.dequeuedCall(MetricsHBaseServerSourceImpl.java:194) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServer.dequeuedCall(MetricsHBaseServer.java:76) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2192) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108) > at java.lang.Thread.run(Thread.java:745) > {noformat} > We were using jdk 1.8.0_92 and here is a snippet from ThreadLocal.java. 
> {code}
> 616:    while (tab[h] != null)
> 617:        h = nextIndex(h, len);
> {code}
> So I hypothesized that there were too many consecutive entries in the {{tab}}
> array, and I actually found them in the heap dump.
> !ScreenShot 2016-09-09 14.17.53.png|width=50%!
> Most of these entries pointed at instances of
> {{org.apache.hadoop.hbase.util.Counter$1}}, which is equivalent to the
> {{indexHolderThreadLocal}} instance variable in the {{Counter}} class.
> Because the {{RpcServer$Connection}} class creates a {{Counter}} instance
> {{rpcCount}} for every connection, it is possible to have lots of
> {{Counter#indexHolderThreadLocal}} instances in the RegionServer process when
> clients repeatedly connect and close. As a result, a ThreadLocalMap can have
> lots of consecutive entries.
> Usually, since each entry is a {{WeakReference}}, these entries are collected
> and removed by the garbage collector soon after the connection is closed.
> But if a connection's lifetime is long enough to survive young GC, its
> entries won't be collected until the old-gen collector runs.
> Furthermore, under a G1GC deployment, they may not be collected even by
> old-gen GC (mixed GC) if the entries sit in a region which doesn't have much
> garbage. We actually used G1GC when we encountered this problem.
> We should remove the entry from the ThreadLocalMap by calling
> ThreadLocal#remove explicitly. -- This message
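The leak pattern and its fix can be illustrated with a minimal stand-in: a per-connection object holding a ThreadLocal leaves an entry in each handler thread's ThreadLocalMap until ThreadLocal#remove is called. The class below is a simplified sketch, not HBase's actual Counter code.

```java
// Stand-in for a per-connection counter backed by a ThreadLocal (like
// Counter#indexHolderThreadLocal). Without destroy(), the map entry lingers
// in the handler thread's ThreadLocalMap after the connection closes,
// waiting on the garbage collector.
class PerConnectionCounter {
    private final ThreadLocal<int[]> slot = ThreadLocal.withInitial(() -> new int[1]);

    void increment() {
        slot.get()[0]++;
    }

    int get() {
        return slot.get()[0];
    }

    // The fix recommended above: explicitly remove this thread's entry from
    // its ThreadLocalMap when the owning connection is discarded.
    void destroy() {
        slot.remove();
    }
}
```

Calling remove() eagerly frees the slot instead of leaving a stale entry for expungeStaleEntry to walk past on every subsequent ThreadLocal operation.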
[jira] [Updated] (HBASE-16610) Unify append, increment with AP
[ https://issues.apache.org/jira/browse/HBASE-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heng Chen updated HBASE-16610: -- Attachment: HBASE-16610.v1.patch fix findbugs and trigger server tests > Unify append, increment with AP > --- > > Key: HBASE-16610 > URL: https://issues.apache.org/jira/browse/HBASE-16610 > Project: HBase > Issue Type: Sub-task >Reporter: Heng Chen >Assignee: Heng Chen > Attachments: HBASE-16610.patch, HBASE-16610.v1.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16229) Cleaning up size and heapSize calculation
[ https://issues.apache.org/jira/browse/HBASE-16229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486110#comment-15486110 ] Anoop Sam John commented on HBASE-16229: Have a +1 in RB from Ram. Will fix the nits (like whitespace) on commit. Test failures seem unrelated. > Cleaning up size and heapSize calculation > - > > Key: HBASE-16229 > URL: https://issues.apache.org/jira/browse/HBASE-16229 > Project: HBase > Issue Type: Sub-task >Affects Versions: 2.0.0 >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 2.0.0 > > Attachments: HBASE-16229.patch, HBASE-16229_V2.patch, > HBASE-16229_V3.patch, HBASE-16229_V4.patch, HBASE-16229_V5.patch, > HBASE-16229_V5.patch, HBASE-16229_V6.patch > >
> It is a bit ugly now. For example, in AbstractMemStore:
> {code}
> public final static long FIXED_OVERHEAD = ClassSize.align(
>     ClassSize.OBJECT +
>     (4 * ClassSize.REFERENCE) +
>     (2 * Bytes.SIZEOF_LONG));
> public final static long DEEP_OVERHEAD = ClassSize.align(FIXED_OVERHEAD +
>     (ClassSize.ATOMIC_LONG + ClassSize.TIMERANGE_TRACKER +
>     ClassSize.CELL_SKIPLIST_SET + ClassSize.CONCURRENT_SKIPLISTMAP));
> {code}
> We include the heap overhead of the Segment here as well. It would be better
> if the Segment contained its own overhead and the Memstore impl used the heap
> size of all of its segments to calculate its size.
> Also this:
> {code}
> public long heapSize() {
>     return getActive().getSize();
> }
> {code}
> heapSize() should consider all segments' sizes, not just the active one's. I
> am not able to see an override of this method in CompactingMemstore.
> This jira tries to solve some of these.
> When we create a Segment, we seem to pass some initial heap size value to it.
> Why? The segment object internally should know its own heap size, rather
> than have someone else dictate it.
> More to add when doing this cleanup -- This message was sent by Atlassian JIRA (v6.3.4#6332)
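The cleanup direction proposed in HBASE-16229 can be sketched in a few lines: each segment reports its own heap size (fixed overhead plus data), and the memstore derives its heapSize() by summing every segment instead of hard-coding segment internals or counting only the active one. All names and the overhead constant below are illustrative stand-ins, not HBase's real classes.

```java
import java.util.ArrayList;
import java.util.List;

// Each segment owns its overhead accounting.
class SegmentSketch {
    static final long FIXED_OVERHEAD = 64;  // stand-in for the ClassSize math
    private long dataSize;

    SegmentSketch(long dataSize) {
        this.dataSize = dataSize;
    }

    void add(long bytes) {
        dataSize += bytes;
    }

    long heapSize() {
        return FIXED_OVERHEAD + dataSize;
    }
}

// The memstore sums its segments rather than duplicating their internals.
class MemStoreSketch {
    final SegmentSketch active = new SegmentSketch(0);
    final List<SegmentSketch> pipeline = new ArrayList<>();

    // Considers every segment, not just the active one -- the behavior the
    // issue says the quoted heapSize() implementation is missing.
    long heapSize() {
        long sum = active.heapSize();
        for (SegmentSketch s : pipeline) {
            sum += s.heapSize();
        }
        return sum;
    }
}
```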
[jira] [Comment Edited] (HBASE-16620) Command-line tool usability issues
[ https://issues.apache.org/jira/browse/HBASE-16620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486105#comment-15486105 ] Vladimir Rodionov edited comment on HBASE-16620 at 9/13/16 3:25 AM: Updated command-line usage patch. cc: [~tedyu] was (Author: vrodionov): Updated command-line usage patch. > Command-line tool usability issues > -- > > Key: HBASE-16620 > URL: https://issues.apache.org/jira/browse/HBASE-16620 > Project: HBase > Issue Type: Bug >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Attachments: HBASE-16620-v1.patch > > > We need to address issues found by [~saint@gmail.com] > https://issues.apache.org/jira/browse/HBASE-7912?focusedCommentId=15484865=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15484865 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16620) Command-line tool usability issues
[ https://issues.apache.org/jira/browse/HBASE-16620?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-16620: -- Attachment: HBASE-16620-v1.patch Updated command-line usage patch. > Command-line tool usability issues > -- > > Key: HBASE-16620 > URL: https://issues.apache.org/jira/browse/HBASE-16620 > Project: HBase > Issue Type: Bug >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Attachments: HBASE-16620-v1.patch > > > We need to address issues found by [~saint@gmail.com] > https://issues.apache.org/jira/browse/HBASE-7912?focusedCommentId=15484865=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15484865 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16388) Prevent client threads being blocked by only one slow region server
[ https://issues.apache.org/jira/browse/HBASE-16388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486102#comment-15486102 ] Phil Yang commented on HBASE-16388: --- We limit the number of concurrent requests. The suitable value depends on the number of RSs and the number of threads accessing the Table API, but we only know the number of RSs, not the number of threads, so this should be set by users according to their own environment. SBE is thrown on the client side; we don't send the request at all, so this is a client-only fix. Once we have AsyncTable and users use it in an async way, meaning their threads will not be blocked, this config is still useful to prevent too many pending requests causing an OOM. And of course for AsyncTable users this limit can be much larger than for blocking Table users.
> Prevent client threads being blocked by only one slow region server
> ---
>
> Key: HBASE-16388
> URL: https://issues.apache.org/jira/browse/HBASE-16388
> Project: HBase
> Issue Type: New Feature
> Reporter: Phil Yang
> Assignee: Phil Yang
> Attachments: HBASE-16388-branch-1-v1.patch, HBASE-16388-v1.patch,
> HBASE-16388-v2.patch, HBASE-16388-v2.patch, HBASE-16388-v2.patch,
> HBASE-16388-v2.patch
>
> It is a common use case for HBase users to have several threads/handlers in
> their service, each handler with its own Table/HTable instance. Users
> generally assume the handlers are independent and won't interact with each
> other.
> However, in an extreme case, if one region server is very slow, every request
> to this RS will time out, and the handlers of the users' service may be
> occupied by the long-waiting requests, so even requests belonging to other
> RSs will also time out. For example:
> If we have 100 handlers in a client service (timeout 1000ms) and HBase has
> 10 region servers whose average response time is 50ms, then with no slow
> region server we can handle 2000 requests per second.
> Now suppose this service's QPS is 1000 and one region server becomes so slow
> that all requests to it time out. Users hope that only 10% of requests fail
> and that the other 90% still respond in 50ms, because only 10% of requests
> are routed to the slow RS.
> However, each second we get 100 long-waiting requests, which occupy exactly
> all 100 handlers. So all handlers are blocked and the availability of this
> service is almost zero.
> To prevent this, we can limit the max concurrent requests to one RS at the
> process level. Requests exceeding the limit throw ServerBusyException
> (extends DoNotRetryIOE) to users immediately. In the above case, if we set
> this limit to 20, only 20 handlers will be occupied and the other 80 handlers
> can still handle requests to other RSs. The availability of this service is
> 90%, as expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16604) Scanner retries on IOException can cause the scans to miss data
[ https://issues.apache.org/jira/browse/HBASE-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Enis Soztutar updated HBASE-16604: -- Attachment: hbase-16604_v1.patch Here is a patch which does option (3) as above. We close the scanner and throw a new ScannerResetException back to the client. On the client side, the ScannerCallable which is responsible for retrying the RPC re-throws the exception since this is a DNRIOE. The ClientScanner on the other hand knows about this exception, and handles it by resetting the scanner state and opening another region scanner. ClientScanner already handles this logic in loadCache() for DoNotRetryIOException subclasses. Reviews welcome since this is a quite important correctness patch which also has implications to the scanner RPC retries. > Scanner retries on IOException can cause the scans to miss data > > > Key: HBASE-16604 > URL: https://issues.apache.org/jira/browse/HBASE-16604 > Project: HBase > Issue Type: Bug > Components: regionserver, Scanners >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 1.2.4 > > Attachments: hbase-16604_v1.patch > > > Debugging an ITBLL failure, where the Verify did not "see" all the data in > the cluster, I've noticed that if we end up getting a generic IOException > from the HFileReader level, we may end up missing the rest of the data in the > region. 
I was able to manually test this, and this stack trace helps to > understand what is going on: > {code} > 2016-09-09 16:27:15,633 INFO [hconnection-0x71ad3d8a-shared--pool21-t9] > client.ScannerCallable(376): Open scanner=1 for > scan={"loadColumnFamiliesOnDemand":null,"startRow":"","stopRow":"","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":2097152,"families":{"testFamily":["testFamily"]},"caching":100,"maxVersions":1,"timeRange":[0,9223372036854775807]} > on region > region=testScanThrowsException,,1473463632707.b2adfb618e5d0fe225c1dc40c0eabfee., > hostname=hw10676,51833,1473463626529, seqNum=2 > 2016-09-09 16:27:15,634 INFO > [B.fifo.QRpcServer.handler=5,queue=0,port=51833] > regionserver.RSRpcServices(2196): scan request:scanner_id: 1 number_of_rows: > 100 close_scanner: false next_call_seq: 0 client_handles_partials: true > client_handles_heartbeats: true renew: false > 2016-09-09 16:27:15,635 INFO > [B.fifo.QRpcServer.handler=5,queue=0,port=51833] > regionserver.RSRpcServices(2510): Rolling back next call seqId > 2016-09-09 16:27:15,635 INFO > [B.fifo.QRpcServer.handler=5,queue=0,port=51833] > regionserver.RSRpcServices(2565): Throwing new > ServiceExceptionjava.io.IOException: Could not reseek > StoreFileScanner[HFileScanner for reader > reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c, > compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, > currentSize=1567264, freeSize=1525578848, maxSize=1527146112, > heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, > multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, > cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, > cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, > prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, > 
lastKey=zzz/testFamily:testFamily/1473463634271/Put, avgKeyLen=35, > avgValueLen=3, entries=17576, length=866998, > cur=/testFamily:/OLDEST_TIMESTAMP/Minimum/vlen=0/seqid=0] to key > /testFamily:testFamily/LATEST_TIMESTAMP/Maximum/vlen=0/seqid=0 > 2016-09-09 16:27:15,635 DEBUG > [B.fifo.QRpcServer.handler=5,queue=0,port=51833] ipc.CallRunner(110): > B.fifo.QRpcServer.handler=5,queue=0,port=51833: callId: 26 service: > ClientService methodName: Scan size: 26 connection: 192.168.42.75:51903 > java.io.IOException: Could not reseek StoreFileScanner[HFileScanner for > reader > reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c, > compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, > currentSize=1567264, freeSize=1525578848, maxSize=1527146112, > heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, > multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, > cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, > cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, > prefetchOnOpen=false, firstKey=aaa/testFamily:testFamily/1473463633859/Put, >
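The recovery path described in the patch comment — close the scanner server-side, throw a reset-style DoNotRetryIOException, and have the client reopen from where it left off rather than silently skipping rows — can be sketched with simplified stand-ins. The interfaces below are not the real ClientScanner/ScannerCallable API; they only illustrate the control flow.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Stand-in for the reset exception the server throws after closing the
// scanner (a DoNotRetryIOException subclass in the real patch).
class ScannerResetException extends IOException {}

class ResettingScanLoop {
    interface Scanner {
        String next() throws IOException;  // null means end of scan
    }

    // openScanner.apply(resumeAfter) opens a scanner positioned after the
    // given row (null means from the start). On a reset we reopen from the
    // last row actually delivered, so no data is missed.
    static List<String> scanAll(Function<String, Scanner> openScanner) throws IOException {
        List<String> out = new ArrayList<>();
        String resumeAfter = null;
        Scanner s = openScanner.apply(resumeAfter);
        while (true) {
            String row;
            try {
                row = s.next();
            } catch (ScannerResetException e) {
                // Reset scanner state and open a new region scanner.
                s = openScanner.apply(resumeAfter);
                continue;
            }
            if (row == null) {
                return out;
            }
            out.add(row);
            resumeAfter = row;
        }
    }
}
```

The key correctness property is that the retry resumes from `resumeAfter`, the last row handed to the caller — retrying a generic IOException without resetting position is exactly how the rest of the region's data could be skipped.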
[jira] [Commented] (HBASE-16610) Unify append, increment with AP
[ https://issues.apache.org/jira/browse/HBASE-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15486071#comment-15486071 ] Hadoop QA commented on HBASE-16610: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 3s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 8s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 51s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 16s {color} 
| {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 27m 34s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 8s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 57s {color} | {color:red} hbase-client generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 53s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 8s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 36m 33s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | FindBugs | module:hbase-client | | | org.apache.hadoop.hbase.client.HTable$7 defines compareTo(Object) and uses Object.equals() At HTable.java:Object.equals() At HTable.java:[line 734] | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12827962/HBASE-16610.patch | | JIRA Issue | HBASE-16610 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 520f67e340ab 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 831fb3c | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | findbugs | https://builds.apache.org/job/PreCommit-HBASE-Build/3528/artifact/patchprocess/new-findbugs-hbase-client.html | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/3528/testReport/ | | modules | C: hbase-client U: hbase-client | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/3528/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. >
[jira] [Updated] (HBASE-16610) Unify append, increment with AP
[ https://issues.apache.org/jira/browse/HBASE-16610?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heng Chen updated HBASE-16610: -- Status: Patch Available (was: Open) > Unify append, increment with AP > --- > > Key: HBASE-16610 > URL: https://issues.apache.org/jira/browse/HBASE-16610 > Project: HBase > Issue Type: Sub-task >Reporter: Heng Chen >Assignee: Heng Chen > Attachments: HBASE-16610.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16592) Unify Delete request with AP
[ https://issues.apache.org/jira/browse/HBASE-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heng Chen updated HBASE-16592: -- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) > Unify Delete request with AP > > > Key: HBASE-16592 > URL: https://issues.apache.org/jira/browse/HBASE-16592 > Project: HBase > Issue Type: Sub-task >Reporter: Heng Chen >Assignee: Heng Chen > Attachments: HBASE-16592.patch, HBASE-16592.v1.patch, > HBASE-16592.v1.patch, HBASE-16592.v2.patch > > > This is the first step in unifying HTable with the AP only. To extend the AP so it can process a single action, I introduced AbstractResponse; MultiResponse and SingleResponse (introduced to deal with single results) will extend this class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16592) Unify Delete request with AP
[ https://issues.apache.org/jira/browse/HBASE-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485978#comment-15485978 ] Heng Chen commented on HBASE-16592: --- Committed to master with the whitespace fixed. > Unify Delete request with AP > > > Key: HBASE-16592 > URL: https://issues.apache.org/jira/browse/HBASE-16592 > Project: HBase > Issue Type: Sub-task >Reporter: Heng Chen >Assignee: Heng Chen > Attachments: HBASE-16592.patch, HBASE-16592.v1.patch, > HBASE-16592.v1.patch, HBASE-16592.v2.patch > > > This is the first step in unifying HTable with the AP only. To extend the AP so it can process a single action, I introduced AbstractResponse; MultiResponse and SingleResponse (introduced to deal with single results) will extend this class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
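The response hierarchy described in HBASE-16592 can be sketched roughly as follows. Only the class names (AbstractResponse, MultiResponse, SingleResponse) come from the issue description; the enum, method shapes, and demo class name are illustrative assumptions, not the actual HBase client code.

```java
// Minimal sketch of the HBASE-16592 response hierarchy. Class names are
// from the issue description; everything else is an illustrative assumption.
public class ResponseHierarchyDemo {

    // Common parent so the AsyncProcess (AP) can handle both batched
    // and single-action results through one type.
    abstract static class AbstractResponse {
        enum ResponseType { SINGLE, MULTI }
        abstract ResponseType type();
    }

    // Result of a batched multi-action request (many actions, many regions).
    static class MultiResponse extends AbstractResponse {
        @Override ResponseType type() { return ResponseType.MULTI; }
    }

    // Result of a lone action such as a single Delete, so the AP can
    // process single requests without a special-cased HTable code path.
    static class SingleResponse extends AbstractResponse {
        @Override ResponseType type() { return ResponseType.SINGLE; }
    }

    public static void main(String[] args) {
        AbstractResponse r = new SingleResponse();
        System.out.println(r.type()); // SINGLE
    }
}
```

With a shared parent like this, the AP's callback plumbing can dispatch on the response type instead of HTable maintaining a separate single-request path.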
[jira] [Commented] (HBASE-16585) Rewrite the delegation token tests with Parameterized pattern
[ https://issues.apache.org/jira/browse/HBASE-16585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485972#comment-15485972 ] Duo Zhang commented on HBASE-16585: --- Could you please file a jira to fix it on branch-1? Thanks. > Rewrite the delegation token tests with Parameterized pattern > - > > Key: HBASE-16585 > URL: https://issues.apache.org/jira/browse/HBASE-16585 > Project: HBase > Issue Type: Improvement > Components: security >Affects Versions: 2.0.0 >Reporter: Duo Zhang >Assignee: Duo Zhang > Fix For: 2.0.0 > > Attachments: HBASE-16585-branch-1.patch, HBASE-16585.patch > > > TestDelegationTokenWithEncryption and TestGenerateDelegationToken. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16540) Scan should do additional validation on start and stop row
[ https://issues.apache.org/jira/browse/HBASE-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485948#comment-15485948 ] Duo Zhang commented on HBASE-16540: --- The commit message in master missed the issue number. > Scan should do additional validation on start and stop row > -- > > Key: HBASE-16540 > URL: https://issues.apache.org/jira/browse/HBASE-16540 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Gary Helmling >Assignee: Dustin Pho > Attachments: HBASE-16540.002.patch, HBASE-16540.patch > > > Scan.setStartRow() and setStopRow() should validate the byte[] passed to > ensure it meets the criteria for a row key. If the byte[] length is greater > than Short.MAX_VALUE, we should throw an IllegalArgumentException in order to > fast fail and prevent server-side errors being thrown and retried. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
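The fast-fail check proposed in HBASE-16540 can be sketched as below. The Short.MAX_VALUE limit and the IllegalArgumentException come from the issue description; the class name, helper name, and message text are assumptions, not the actual HBase client code.

```java
import java.util.Arrays;

// Sketch of the client-side row-key validation proposed in HBASE-16540:
// reject oversized start/stop rows up front instead of letting the server
// throw and the client retry. Names and message text are illustrative.
public class RowKeyValidation {
    // HBase row keys are length-prefixed with a short, so the maximum
    // legal row length is Short.MAX_VALUE.
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    static byte[] checkRow(byte[] row) {
        if (row != null && row.length > MAX_ROW_LENGTH) {
            throw new IllegalArgumentException(
                "Row length " + row.length + " exceeds max " + MAX_ROW_LENGTH);
        }
        return row;
    }

    public static void main(String[] args) {
        byte[] ok = "row-1".getBytes();
        // A normal-sized row passes through unchanged.
        System.out.println(Arrays.equals(checkRow(ok), ok));
        try {
            // One byte past the limit fails fast on the client.
            checkRow(new byte[Short.MAX_VALUE + 1]);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected oversized row");
        }
    }
}
```

Calling a check like this from setStartRow()/setStopRow() turns a retried server-side error into an immediate client-side failure.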
[jira] [Commented] (HBASE-15448) HBase Backup Phase 3: Restore optimization 2
[ https://issues.apache.org/jira/browse/HBASE-15448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485922#comment-15485922 ] Hadoop QA commented on HBASE-15448: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} | | {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 3s {color} | {color:red} HBASE-15448 does not apply to master. Rebase required? Wrong Branch? See https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for help. {color} | \\ \\ || Subsystem || Report/Notes || | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828156/HBASE-15448-v2.patch | | JIRA Issue | HBASE-15448 | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/3527/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > HBase Backup Phase 3: Restore optimization 2 > > > Key: HBASE-15448 > URL: https://issues.apache.org/jira/browse/HBASE-15448 > Project: HBase > Issue Type: Task >Affects Versions: 2.0.0 >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Critical > Labels: backup > Fix For: 2.0.0 > > Attachments: HBASE-15448-v1.patch, HBASE-15448-v2.patch > > > JIRA opened to continue work on restore optimization. > This will focus on the following > # During incremental backup image restore - restoring full backup into region > boundaries of the most recent incremental backup image. > # Combining multiple tables into single M/R job -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16616: --- Resolution: Fixed Hadoop Flags: Reviewed Status: Resolved (was: Patch Available) Thanks for the patch, Tomu. The remaining work would continue in HBASE-7612 > Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry > -- > > Key: HBASE-16616 > URL: https://issues.apache.org/jira/browse/HBASE-16616 > Project: HBase > Issue Type: Improvement > Components: Performance >Affects Versions: 1.2.2 >Reporter: Tomu Tsuruhara >Assignee: Tomu Tsuruhara > Fix For: 2.0.0, 1.4.0 > > Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, > HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png > > > In our HBase 1.2.2 cluster, some regionservers showed a very bad > "QueueCallTime_99th_percentile", exceeding 10 seconds. > Most rpc handler threads were stuck on the ThreadLocalMap.expungeStaleEntry call at > that time. > {noformat} > "PriorityRpcServer.handler=18,queue=0,port=16020" #322 daemon prio=5 > os_prio=0 tid=0x7fd422062800 nid=0x19b89 runnable [0x7fcb8a821000] >java.lang.Thread.State: RUNNABLE > at > java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617) > at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499) > at > java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298) > at java.lang.ThreadLocal.remove(ThreadLocal.java:222) > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.unlockForRegularUsage(ExponentiallyDecayingSample.java:196) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:113) 
> at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81) > at > org.apache.hadoop.metrics2.lib.MutableHistogram.add(MutableHistogram.java:81) > at > org.apache.hadoop.metrics2.lib.MutableRangeHistogram.add(MutableRangeHistogram.java:59) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.dequeuedCall(MetricsHBaseServerSourceImpl.java:194) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServer.dequeuedCall(MetricsHBaseServer.java:76) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2192) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108) > at java.lang.Thread.run(Thread.java:745) > {noformat} > We were using jdk 1.8.0_92 and here is a snippet from ThreadLocal.java. > {code} > 616:while (tab[h] != null) > 617:h = nextIndex(h, len); > {code} > So I hypothesized that there're too many consecutive entries in the {{tab}} array, > and I actually found them in the heap dump. > !ScreenShot 2016-09-09 14.17.53.png|width=50%! > Most of these entries pointed at an instance of > {{org.apache.hadoop.hbase.util.Counter$1}}, > which is equivalent to the {{indexHolderThreadLocal}} instance variable in the > {{Counter}} class. > Because the {{RpcServer$Connection}} class creates a {{Counter}} instance > {{rpcCount}} for every connection, > it is possible to have lots of {{Counter#indexHolderThreadLocal}} instances > in the RegionServer process > when we repeat connect-and-close from the client. As a result, a ThreadLocalMap > can have lots of consecutive > entries. > Usually, since each entry is a {{WeakReference}}, these entries are collected > and removed > by the garbage collector soon after the connection is closed. > But if the connection's lifetime was long enough to survive a young GC, it wouldn't > be collected until the old-gen collector runs. 
> Furthermore, under a G1GC deployment, it is possible for entries not to be collected even > by the old-gen GC (mixed GC) > if they sit in a region which doesn't have much garbage. > Actually, we used G1GC when we encountered this problem. > We should remove the entry from the ThreadLocalMap by calling ThreadLocal#remove > explicitly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
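The fix described above, calling ThreadLocal#remove explicitly when a connection closes, can be illustrated with a minimal stand-in counter. The shape loosely mirrors the described Counter#indexHolderThreadLocal usage, but this is a simplified assumption, not the actual org.apache.hadoop.hbase.util.Counter.

```java
// Illustrates the HBASE-16616 fix: a per-connection counter that holds a
// ThreadLocal must remove its entries explicitly on close, rather than
// waiting for the GC to clear the WeakReference keys in the ThreadLocalMap.
// Simplified stand-in, not the actual HBase Counter class.
public class ConnectionCounter {
    private final ThreadLocal<long[]> indexHolder =
        ThreadLocal.withInitial(() -> new long[1]);

    void increment() { indexHolder.get()[0]++; }

    long get() { return indexHolder.get()[0]; }

    // Called when the connection closes. Without this, each short-lived
    // connection leaves a stale ThreadLocalMap entry behind, and long runs
    // of such entries make expungeStaleEntry's linear probe expensive.
    void close() { indexHolder.remove(); }

    public static void main(String[] args) {
        ConnectionCounter rpcCount = new ConnectionCounter();
        rpcCount.increment();
        System.out.println(rpcCount.get()); // 1
        rpcCount.close(); // drops this thread's map entry explicitly
    }
}
```

The key point is the remove() call: it deletes the map entry immediately instead of leaving a stale slot that survives young GCs (or, under G1, possibly mixed GCs too).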
[jira] [Updated] (HBASE-15448) HBase Backup Phase 3: Restore optimization 2
[ https://issues.apache.org/jira/browse/HBASE-15448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vladimir Rodionov updated HBASE-15448: -- Attachment: HBASE-15448-v2.patch v2. > HBase Backup Phase 3: Restore optimization 2 > > > Key: HBASE-15448 > URL: https://issues.apache.org/jira/browse/HBASE-15448 > Project: HBase > Issue Type: Task >Affects Versions: 2.0.0 >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov >Priority: Critical > Labels: backup > Fix For: 2.0.0 > > Attachments: HBASE-15448-v1.patch, HBASE-15448-v2.patch > > > JIRA opened to continue work on restore optimization. > This will focus on the following > # During incremental backup image restore - restoring full backup into region > boundaries of the most recent incremental backup image. > # Combining multiple tables into single M/R job -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16622) Apache HBase ™ Reference Guide: HBase Java API example has several errors
[ https://issues.apache.org/jira/browse/HBASE-16622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485901#comment-15485901 ] Dima Spivak commented on HBASE-16622: - Seems reasonable. Wanna put together and upload a patch, [~alexxiyang]? > Apache HBase ™ Reference Guide: HBase Java API example has several errors > - > > Key: HBASE-16622 > URL: https://issues.apache.org/jira/browse/HBASE-16622 > Project: HBase > Issue Type: Bug >Reporter: alexxiyang > > 1. > {code} > if (admin.tableExists(tableName)) { > System.out.println("Table does not exist."); > System.exit(-1); > } > {code} > This should be > {code} > if (!admin.tableExists(tableName)) { > {code} > 2. > SNAPPY is not suitable for beginners. They may get exceptions like > {code} > Caused by: > org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException): > org.apache.hadoop.hbase.DoNotRetryIOException: Compression algorithm > 'snappy' previously failed test. Set hbase.table.sanity.checks to false at > conf or table descriptor if you want to bypass sanity checks > at > org.apache.hadoop.hbase.master.HMaster.warnOrThrowExceptionForFailure(HMaster.java:1701) > at > org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(HMaster.java:1569) > at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1491) > at > org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:462) > at > org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55682) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2178) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108) > at java.lang.Thread.run(Thread.java:745) > {code} > So the code below > {code} > table.addFamily(new > 
HColumnDescriptor(CF_DEFAULT).setCompressionType(Algorithm.SNAPPY)); > {code} > it is better to change it to > {code} > table.addFamily(new > HColumnDescriptor(CF_DEFAULT).setCompressionType(Algorithm.NONE)); > {code} > 3. > Before modifying a column family, get the table from the connection. > Change > {code} > HTableDescriptor table = new HTableDescriptor(tableName); > {code} > into > {code} > Table table = connection.getTable(TableName.valueOf(tablename)); > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485892#comment-15485892 ] Tomu Tsuruhara commented on HBASE-16616: Then, what's next for this issue? > Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry > -- > > Key: HBASE-16616 > URL: https://issues.apache.org/jira/browse/HBASE-16616 > Project: HBase > Issue Type: Improvement > Components: Performance >Affects Versions: 1.2.2 >Reporter: Tomu Tsuruhara >Assignee: Tomu Tsuruhara > Fix For: 2.0.0, 1.4.0 > > Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, > HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png > > > In our HBase 1.2.2 cluster, some regionservers showed a very bad > "QueueCallTime_99th_percentile", exceeding 10 seconds. > Most rpc handler threads were stuck on the ThreadLocalMap.expungeStaleEntry call at > that time. > {noformat} > "PriorityRpcServer.handler=18,queue=0,port=16020" #322 daemon prio=5 > os_prio=0 tid=0x7fd422062800 nid=0x19b89 runnable [0x7fcb8a821000] >java.lang.Thread.State: RUNNABLE > at > java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617) > at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499) > at > java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298) > at java.lang.ThreadLocal.remove(ThreadLocal.java:222) > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.unlockForRegularUsage(ExponentiallyDecayingSample.java:196) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:113) > at > 
com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81) > at > org.apache.hadoop.metrics2.lib.MutableHistogram.add(MutableHistogram.java:81) > at > org.apache.hadoop.metrics2.lib.MutableRangeHistogram.add(MutableRangeHistogram.java:59) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.dequeuedCall(MetricsHBaseServerSourceImpl.java:194) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServer.dequeuedCall(MetricsHBaseServer.java:76) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2192) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108) > at java.lang.Thread.run(Thread.java:745) > {noformat} > We were using jdk 1.8.0_92 and here is a snippet from ThreadLocal.java. > {code} > 616:while (tab[h] != null) > 617:h = nextIndex(h, len); > {code} > So I hypothesized that there're too many consecutive entries in the {{tab}} array, > and I actually found them in the heap dump. > !ScreenShot 2016-09-09 14.17.53.png|width=50%! > Most of these entries pointed at an instance of > {{org.apache.hadoop.hbase.util.Counter$1}}, > which is equivalent to the {{indexHolderThreadLocal}} instance variable in the > {{Counter}} class. > Because the {{RpcServer$Connection}} class creates a {{Counter}} instance > {{rpcCount}} for every connection, > it is possible to have lots of {{Counter#indexHolderThreadLocal}} instances > in the RegionServer process > when we repeat connect-and-close from the client. As a result, a ThreadLocalMap > can have lots of consecutive > entries. > Usually, since each entry is a {{WeakReference}}, these entries are collected > and removed > by the garbage collector soon after the connection is closed. > But if the connection's lifetime was long enough to survive a young GC, it wouldn't > be collected until the old-gen collector runs. 
> Furthermore, under a G1GC deployment, it is possible for entries not to be collected even > by the old-gen GC (mixed GC) > if they sit in a region which doesn't have much garbage. > Actually, we used G1GC when we encountered this problem. > We should remove the entry from the ThreadLocalMap by calling ThreadLocal#remove > explicitly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16257) Move staging dir to be under hbase root dir
[ https://issues.apache.org/jira/browse/HBASE-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485879#comment-15485879 ] Hadoop QA commented on HBASE-16257: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 17s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 17s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 37s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 15s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 5s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 35s {color} | {color:green} master passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 36s {color} | {color:red} hbase-common in master has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 3s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 4s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 1s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 25m 13s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 42s {color} | {color:green} hbase-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 81m 9s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 41s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 130m 54s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.wal.TestWALSplit | | | hadoop.hbase.regionserver.TestScannerHeartbeatMessages | | | hadoop.hbase.wal.TestWALSplitCompressed | | Timed out junit tests | org.apache.hadoop.hbase.client.TestMultiRespectsLimits | | | org.apache.hadoop.hbase.client.TestSnapshotMetadata | | | org.apache.hadoop.hbase.client.TestTableSnapshotScanner | | | org.apache.hadoop.hbase.client.TestCloneSnapshotFromClient | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828131/HBASE-16257-v3.patch | | JIRA Issue | HBASE-16257 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle
[jira] [Commented] (HBASE-16373) precommit needs a dockerfile with hbase prereqs
[ https://issues.apache.org/jira/browse/HBASE-16373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485873#comment-15485873 ] Duo Zhang commented on HBASE-16373: --- We describe branch-1.1 on jdk8 as 'Running with JDK 8 will work but is not well tested.', so I think it is OK to also run a jdk8 compile check for branch-1.1. And for 0.98, will yetus make use of java6? Is it enough to just install a jdk6 in the docker image? Thanks. > precommit needs a dockerfile with hbase prereqs > --- > > Key: HBASE-16373 > URL: https://issues.apache.org/jira/browse/HBASE-16373 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 1.3.0, 1.4.0, 1.1.6, 1.2.3, 0.98.22 >Reporter: Sean Busbey >Assignee: Duo Zhang >Priority: Critical > Fix For: 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4 > > Attachments: HBASE-16373-branch-1.patch > > > specifically, we need protoc. starting with the dockerfile used by default in > yetus and adding it will probably suffice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16622) Apache HBase ™ Reference Guide: HBase Java API example has several errors
alexxiyang created HBASE-16622: -- Summary: Apache HBase ™ Reference Guide: HBase Java API example has several errors Key: HBASE-16622 URL: https://issues.apache.org/jira/browse/HBASE-16622 Project: HBase Issue Type: Bug Reporter: alexxiyang 1. {code} if (admin.tableExists(tableName)) { System.out.println("Table does not exist."); System.exit(-1); } {code} This should be {code} if (!admin.tableExists(tableName)) { {code} 2. SNAPPY is not suitable for beginners. They may get exceptions like {code} Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(org.apache.hadoop.hbase.DoNotRetryIOException): org.apache.hadoop.hbase.DoNotRetryIOException: Compression algorithm 'snappy' previously failed test. Set hbase.table.sanity.checks to false at conf or table descriptor if you want to bypass sanity checks at org.apache.hadoop.hbase.master.HMaster.warnOrThrowExceptionForFailure(HMaster.java:1701) at org.apache.hadoop.hbase.master.HMaster.sanityCheckTableDescriptor(HMaster.java:1569) at org.apache.hadoop.hbase.master.HMaster.createTable(HMaster.java:1491) at org.apache.hadoop.hbase.master.MasterRpcServices.createTable(MasterRpcServices.java:462) at org.apache.hadoop.hbase.protobuf.generated.MasterProtos$MasterService$2.callBlockingMethod(MasterProtos.java:55682) at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2178) at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112) at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133) at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108) at java.lang.Thread.run(Thread.java:745) {code} So the code below {code} table.addFamily(new HColumnDescriptor(CF_DEFAULT).setCompressionType(Algorithm.SNAPPY)); {code} it is better to change it to {code} table.addFamily(new HColumnDescriptor(CF_DEFAULT).setCompressionType(Algorithm.NONE)); {code} 3. 
Before modifying a column family, get the table from the connection. Change {code} HTableDescriptor table = new HTableDescriptor(tableName); {code} into {code} Table table = connection.getTable(TableName.valueOf(tablename)); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16257) Move staging dir to be under hbase root dir
[ https://issues.apache.org/jira/browse/HBASE-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485860#comment-15485860 ] Hadoop QA commented on HBASE-16257: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 46s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 5 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 14s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 30s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 10s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 1s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 33s {color} | {color:green} master passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 0m 39s {color} | {color:red} hbase-common in master has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 59s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 1m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 1m 0s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} xml {color} | {color:green} 0m 2s {color} | {color:green} The patch has no ill-formed XML file. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 29m 2s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 4m 23s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 47s {color} | {color:green} hbase-common in the patch passed. 
{color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 53s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 96m 30s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 42s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 152m 9s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Timed out junit tests | org.apache.hadoop.hbase.snapshot.TestSnapshotClientRetries | | | org.apache.hadoop.hbase.snapshot.TestMobSecureExportSnapshot | | | org.apache.hadoop.hbase.filter.TestFilterListOrOperatorWithBlkCnt | | | org.apache.hadoop.hbase.snapshot.TestRestoreFlushSnapshotFromClient | | | org.apache.hadoop.hbase.client.TestHCM | | | org.apache.hadoop.hbase.snapshot.TestMobFlushSnapshotFromClient | | | org.apache.hadoop.hbase.snapshot.TestFlushSnapshotFromClient | | | org.apache.hadoop.hbase.TestMetaTableAccessor | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828125/HBASE-16257-v3.patch | | JIRA Issue |
[jira] [Commented] (HBASE-15806) An endpoint-based export tool
[ https://issues.apache.org/jira/browse/HBASE-15806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485812#comment-15485812 ] ChiaPing Tsai commented on HBASE-15806: --- [~mbertozzi] It seems to me this patch is ready. Could you please take a look at the latest patch? Thanks > An endpoint-based export tool > - > > Key: HBASE-15806 > URL: https://issues.apache.org/jira/browse/HBASE-15806 > Project: HBase > Issue Type: New Feature >Affects Versions: 2.0.0 >Reporter: ChiaPing Tsai >Assignee: ChiaPing Tsai >Priority: Minor > Fix For: 2.0.0 > > Attachments: Experiment.png, HBASE-15806-v1.patch, > HBASE-15806-v2.patch, HBASE-15806-v3.patch, HBASE-15806.patch, > HBASE-15806.v4.patch, HBASE-15806.v5.patch > > > The time for exporting a table can be reduced if we use the endpoint technique > to have the region server export the HDFS files rather than the HBase client. > In my experiments, the elapsed time of the endpoint-based export can be less than > half that of the current export tool (with HDFS compression enabled). > But the shortcoming is that we need to alter the table to deploy the endpoint. > Any comments about this? Thanks -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16540) Scan should do additional validation on start and stop row
[ https://issues.apache.org/jira/browse/HBASE-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485800#comment-15485800 ] Gary Helmling commented on HBASE-16540: --- Committed to master and branch-1. [~mantonov], please ack for pull into branch-1.3. Minor bug fix. > Scan should do additional validation on start and stop row > -- > > Key: HBASE-16540 > URL: https://issues.apache.org/jira/browse/HBASE-16540 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Gary Helmling >Assignee: Dustin Pho > Attachments: HBASE-16540.002.patch, HBASE-16540.patch > > > Scan.setStartRow() and setStopRow() should validate the byte[] passed to > ensure it meets the criteria for a row key. If the byte[] length is greater > than Short.MAX_VALUE, we should throw an IllegalArgumentException in order to > fast fail and prevent server-side errors being thrown and retried. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
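The fast-fail check described in the issue is small enough to sketch. The following is a hypothetical, self-contained version of the validation HBASE-16540 proposes; the class and method names are illustrative, not the committed patch:

```java
// Hypothetical sketch of the row-key length check proposed in HBASE-16540.
// In HBase, HConstants.MAX_ROW_LENGTH equals Short.MAX_VALUE; the value is
// inlined here so the sketch stands alone.
public class RowKeyValidation {
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    // Fail fast on the client instead of letting the server reject (and the
    // client retry) an impossible row key.
    static byte[] checkRow(byte[] row) {
        if (row != null && row.length > MAX_ROW_LENGTH) {
            throw new IllegalArgumentException("Row key is " + row.length
                + " bytes, exceeding the maximum of " + MAX_ROW_LENGTH);
        }
        return row;
    }
}
```

In this sketch, setStartRow() and setStopRow() would each call checkRow() on their argument before storing it.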
[jira] [Commented] (HBASE-16592) Unify Delete request with AP
[ https://issues.apache.org/jira/browse/HBASE-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485798#comment-15485798 ] Hadoop QA commented on HBASE-16592: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 58s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 26s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 4m 10s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 49s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 33s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile 
{color} | {color:green} 1m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 1m 1s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} whitespace {color} | {color:red} 0m 0s {color} | {color:red} The patch has 1 line(s) that end in whitespace. Use git apply --whitespace=fix. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 31m 10s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 0s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 113m 0s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 34s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 165m 26s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Failed junit tests | hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures | | | hadoop.hbase.regionserver.TestHRegion | | Timed out junit tests | org.apache.hadoop.hbase.security.access.TestWithDisabledAuthorization | | | org.apache.hadoop.hbase.security.access.TestAccessController | | | org.apache.hadoop.hbase.security.access.TestZKPermissionsWatcher | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.12.1 Server=1.12.1 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828121/HBASE-16592.v2.patch | | JIRA Issue | HBASE-16592 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux c77b963a7e03 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 552400e | | Default Java | 1.8.0_101 | |
[jira] [Commented] (HBASE-15297) error message is wrong when a wrong namspace is specified in grant in hbase shell
[ https://issues.apache.org/jira/browse/HBASE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485796#comment-15485796 ] Sean Busbey commented on HBASE-15297: - QA bot looks good. I'll commit this and see how far back it cherry picks this evening, unless someone beats me to it. > error message is wrong when a wrong namspace is specified in grant in hbase > shell > - > > Key: HBASE-15297 > URL: https://issues.apache.org/jira/browse/HBASE-15297 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 2.0.0, 1.2.0 >Reporter: Xiang Li >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4 > > Attachments: HBASE-15297.v1.patch > > > In HBase shell, specify a non-existing namespace in "grant" command, such as > {code} > hbase(main):001:0> grant 'a1', 'R', '@aaa'<--- there is no namespace > called "aaa" > {code} > The error message issued is not correct > {code} > ERROR: Unknown namespace a1! > {code} > a1 is the user name, not the namespace. > The following error message would be better > {code} > ERROR: Unknown namespace aaa! > {code} > or > {code} > Can't find a namespace: aaa > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15297) error message is wrong when a wrong namspace is specified in grant in hbase shell
[ https://issues.apache.org/jira/browse/HBASE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485762#comment-15485762 ] Hadoop QA commented on HBASE-15297: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} rubocop {color} | {color:blue} 0m 0s {color} | {color:blue} rubocop was not available. {color} | | {color:blue}0{color} | {color:blue} ruby-lint {color} | {color:blue} 0m 0s {color} | {color:blue} Ruby-lint was not available. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 
{color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 8s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 55s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 26s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 20s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 47s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 23s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 32s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 27s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 43s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 28m 53s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 29s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 7s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 26s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 58s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 24s {color} | {color:green} hbase-shell in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 14s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 48m 6s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828134/HBASE-15297.v1.patch | | JIRA Issue | HBASE-15297 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile rubocop ruby_lint | | uname | Linux 431b98d57bf8 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 8855670 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0
[jira] [Commented] (HBASE-16540) Scan should do additional validation on start and stop row
[ https://issues.apache.org/jira/browse/HBASE-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485749#comment-15485749 ] Hadoop QA commented on HBASE-16540: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 9s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 23s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 22s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 10s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 49s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 16s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
20s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 24m 43s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 51s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 8s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 33m 38s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828136/HBASE-16540.002.patch | | JIRA Issue | HBASE-16540 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 1a96162c1689 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 8855670 | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/3526/testReport/ | | modules | C: hbase-client U: hbase-client | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/3526/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Scan should do additional validation on start and stop row > -- > > Key: HBASE-16540 > URL: https://issues.apache.org/jira/browse/HBASE-16540 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Gary Helmling >Assignee: Dustin Pho > Attachments: HBASE-16540.002.patch, HBASE-16540.patch > > > Scan.setStartRow() and setStopRow() should validate
[jira] [Commented] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485715#comment-15485715 ] Hudson commented on HBASE-16616: FAILURE: Integrated in Jenkins build HBase-1.4 #408 (See [https://builds.apache.org/job/HBase-1.4/408/]) HBASE-16616 Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry (Tomu (tedyu: rev 8ad14bac6728792bee2b0deab0d65c8e083f4f19) * (edit) hbase-server/src/main/java/org/apache/hadoop/hbase/ipc/RpcServer.java * (edit) hbase-common/src/main/java/org/apache/hadoop/hbase/util/Counter.java > Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry > -- > > Key: HBASE-16616 > URL: https://issues.apache.org/jira/browse/HBASE-16616 > Project: HBase > Issue Type: Improvement > Components: Performance >Affects Versions: 1.2.2 >Reporter: Tomu Tsuruhara >Assignee: Tomu Tsuruhara > Fix For: 2.0.0, 1.4.0 > > Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, > HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png > > > In our HBase 1.2.2 cluster, some regionserver showed too bad > "QueueCallTime_99th_percentile" exceeding 10 seconds. > Most rpc handler threads stuck on ThreadLocalMap.expungeStaleEntry call at > that time. 
> {noformat} > "PriorityRpcServer.handler=18,queue=0,port=16020" #322 daemon prio=5 > os_prio=0 tid=0x7fd422062800 nid=0x19b89 runnable [0x7fcb8a821000] >java.lang.Thread.State: RUNNABLE > at > java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617) > at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499) > at > java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298) > at java.lang.ThreadLocal.remove(ThreadLocal.java:222) > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.unlockForRegularUsage(ExponentiallyDecayingSample.java:196) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:113) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81) > at > org.apache.hadoop.metrics2.lib.MutableHistogram.add(MutableHistogram.java:81) > at > org.apache.hadoop.metrics2.lib.MutableRangeHistogram.add(MutableRangeHistogram.java:59) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.dequeuedCall(MetricsHBaseServerSourceImpl.java:194) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServer.dequeuedCall(MetricsHBaseServer.java:76) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2192) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108) > at java.lang.Thread.run(Thread.java:745) > {noformat} > We were using jdk 1.8.0_92 and here is a snippet from ThreadLocal.java. 
> {code} > 616:while (tab[h] != null) > 617:h = nextIndex(h, len); > {code} > So I hypothesized that there're too many consecutive entries in the {{tab}} > array, and I actually found them in the heap dump. > !ScreenShot 2016-09-09 14.17.53.png|width=50%! > Most of these entries pointed at an instance of > {{org.apache.hadoop.hbase.util.Counter$1}}, > which is equivalent to the {{indexHolderThreadLocal}} instance variable in the > {{Counter}} class. > Because the {{RpcServer$Connection}} class creates a {{Counter}} instance > {{rpcCount}} for every connection, > it is possible to have lots of {{Counter#indexHolderThreadLocal}} instances > in a RegionServer process > when we repeatedly connect and close from the client. As a result, a ThreadLocalMap > can have lots of consecutive > entries. > Usually, since each entry is a {{WeakReference}}, these entries are collected > and removed > by the garbage collector soon after the connection is closed. > But if a connection's lifetime was long enough to survive a young GC, it wouldn't > be collected until the old-gen collector runs. > Furthermore, under a G1GC deployment, it is possible for them not to be collected even > by the old-gen GC (mixed GC) > if the entries sit in a region which doesn't have much garbage. > Actually, we used G1GC when we encountered this problem. > We should remove the entry from the ThreadLocalMap by calling ThreadLocal#remove > explicitly. -- This message was sent by Atlassian
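The leak pattern and fix described above can be sketched in a few lines. This is a simplified, hypothetical stand-in for the per-connection counter — not HBase's actual `Counter` class — showing why an explicit `remove()` on close matters:

```java
// Minimal sketch of the pattern described in HBASE-16616: a per-connection
// counter backed by a ThreadLocal. Without an explicit remove(), each handler
// thread keeps an entry in its ThreadLocalMap until the WeakReference is
// eventually collected, which under G1GC may take a long time.
public class ConnectionCounter {
    private final ThreadLocal<long[]> indexHolder =
        ThreadLocal.withInitial(() -> new long[1]);

    void increment() {
        indexHolder.get()[0]++;
    }

    long get() {
        return indexHolder.get()[0];
    }

    // The fix: explicitly drop this thread's entry when the connection closes,
    // rather than waiting for the garbage collector to expunge it.
    void close() {
        indexHolder.remove();
    }
}
```

After `close()`, a subsequent `get()` on the same thread sees a fresh initial value, confirming the stale entry was dropped rather than left to accumulate.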
[jira] [Commented] (HBASE-15297) error message is wrong when a wrong namspace is specified in grant in hbase shell
[ https://issues.apache.org/jira/browse/HBASE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485690#comment-15485690 ] Umesh Agashe commented on HBASE-15297: -- Thanks [~busbey] for the review! > error message is wrong when a wrong namspace is specified in grant in hbase > shell > - > > Key: HBASE-15297 > URL: https://issues.apache.org/jira/browse/HBASE-15297 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 2.0.0, 1.2.0 >Reporter: Xiang Li >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4 > > Attachments: HBASE-15297.v1.patch > > > In HBase shell, specify a non-existing namespace in "grant" command, such as > {code} > hbase(main):001:0> grant 'a1', 'R', '@aaa'<--- there is no namespace > called "aaa" > {code} > The error message issued is not correct > {code} > ERROR: Unknown namespace a1! > {code} > a1 is the user name, not the namespace. > The following error message would be better > {code} > ERROR: Unknown namespace aaa! > {code} > or > {code} > Can't find a namespace: aaa > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16491) A few org.apache.hadoop.hbase.rsgroup classes missing @InterfaceAudience annotation
[ https://issues.apache.org/jira/browse/HBASE-16491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485686#comment-15485686 ] Hudson commented on HBASE-16491: FAILURE: Integrated in Jenkins build HBase-Trunk_matrix #1590 (See [https://builds.apache.org/job/HBase-Trunk_matrix/1590/]) HBASE-16491 A few org.apache.hadoop.hbase.rsgroup classes missing (tedyu: rev 6f072809eee22a04be35a013ede41986484adc04) * (edit) hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/VerifyingRSGroupAdminClient.java * (edit) hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupAdminEndpoint.java * (edit) hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupSerDe.java * (edit) hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManager.java * (edit) hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java HBASE-16491 A few org.apache.hadoop.hbase.rsgroup classes missing (tedyu: rev 3642287b2f86a7c88c140bc9d9e35a9bff7253c4) * (edit) hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java * (edit) hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/VerifyingRSGroupAdminClient.java * (edit) hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManager.java * (edit) hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupSerDe.java * (edit) hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupAdminEndpoint.java HBASE-16491 A few org.apache.hadoop.hbase.rsgroup classes missing (tedyu: rev 552400e53641991d959da4e27042b2157172e373) * (edit) hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManagerImpl.java * (edit) hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupAdminEndpoint.java * (edit) hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/VerifyingRSGroupAdminClient.java * (edit) 
hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupInfoManager.java * (edit) hbase-rsgroup/src/main/java/org/apache/hadoop/hbase/rsgroup/RSGroupSerDe.java > A few org.apache.hadoop.hbase.rsgroup classes missing @InterfaceAudience > annotation > --- > > Key: HBASE-16491 > URL: https://issues.apache.org/jira/browse/HBASE-16491 > Project: HBase > Issue Type: Bug > Components: API, regionserver >Reporter: Ted Yu >Assignee: Umesh Agashe >Priority: Minor > Labels: beginner, rsgroup > Fix For: 2.0.0 > > Attachments: HBASE-16491.v1.patch > > > A few classes, such as RSGroupInfoManagerImpl.java, miss @InterfaceAudience > This was discovered when I reviewed HBASE-16456. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16257) Move staging dir to be under hbase root dir
[ https://issues.apache.org/jira/browse/HBASE-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485678#comment-15485678 ] Enis Soztutar commented on HBASE-16257: --- bq. Hmm where is the other place? Here: {code} --- a/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SecureBulkLoadManager.java +++ b/hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/SecureBulkLoadManager.java + private final static String BULKLOAD_STAGING_DIR = "hbase.bulkload.staging.dir"; {code} bq. Is the following ok? We create this dir with default perms. If there is a way to get the default perms, and after ensuring that we have {{x}}, setting it to root should be fine. > Move staging dir to be under hbase root dir > --- > > Key: HBASE-16257 > URL: https://issues.apache.org/jira/browse/HBASE-16257 > Project: HBase > Issue Type: Sub-task >Reporter: Jerry He >Assignee: Jerry He >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HBASE-16257-v1.patch, HBASE-16257-v2.patch, > HBASE-16257-v3.patch > > > The hbase.bulkload.staging.dir defaults to hbase.fs.tmp.dir which then > defaults to > {code} > public static final String DEFAULT_TEMPORARY_HDFS_DIRECTORY = "/user/" > + System.getProperty("user.name") + "/hbase-staging"; > {code} > This default would have problem on local file system standalone case. > We can move the staging dir to be under hbase.rootdir. We are bringing > secure bulkload to the core. It makes sense to bring it under core control as > well, instead of an optional property. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
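The default-resolution behavior discussed above can be sketched briefly. This is a hypothetical, simplified helper — the `resolve` method and the `"/staging"` fallback path are illustrative assumptions, not the actual patch:

```java
// Sketch of the staging-dir resolution HBASE-16257 discusses: today the
// default lands under /user/<name>/hbase-staging; the proposal is to fall
// back under the HBase root dir instead when nothing is configured.
public class StagingDir {
    // The current default quoted in the issue description.
    static final String DEFAULT_TEMPORARY_HDFS_DIRECTORY =
        "/user/" + System.getProperty("user.name") + "/hbase-staging";

    // Hypothetical helper: prefer an explicitly configured
    // hbase.bulkload.staging.dir, else place staging under hbase.rootdir.
    static String resolve(String configuredStagingDir, String hbaseRootDir) {
        if (configuredStagingDir != null) {
            return configuredStagingDir;
        }
        return hbaseRootDir + "/staging";
    }
}
```

The per-user default is what breaks the local-filesystem standalone case mentioned in the description, which is why anchoring under the root dir is attractive.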
[jira] [Commented] (HBASE-16540) Scan should do additional validation on start and stop row
[ https://issues.apache.org/jira/browse/HBASE-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485669#comment-15485669 ] Gary Helmling commented on HBASE-16540: --- +1 on v2. Thanks for the patch! I'll go ahead and commit. > Scan should do additional validation on start and stop row > -- > > Key: HBASE-16540 > URL: https://issues.apache.org/jira/browse/HBASE-16540 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Gary Helmling >Assignee: Dustin Pho > Attachments: HBASE-16540.002.patch, HBASE-16540.patch > > > Scan.setStartRow() and setStopRow() should validate the byte[] passed to > ensure it meets the criteria for a row key. If the byte[] length is greater > than Short.MAX_VALUE, we should throw an IllegalArgumentException in order to > fast fail and prevent server-side errors being thrown and retried. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16491) A few org.apache.hadoop.hbase.rsgroup classes missing @InterfaceAudience annotation
[ https://issues.apache.org/jira/browse/HBASE-16491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485666#comment-15485666 ] Umesh Agashe commented on HBASE-16491: -- Thanks [~ted_yu] for quickly committing these changes! > A few org.apache.hadoop.hbase.rsgroup classes missing @InterfaceAudience > annotation > --- > > Key: HBASE-16491 > URL: https://issues.apache.org/jira/browse/HBASE-16491 > Project: HBase > Issue Type: Bug > Components: API, regionserver >Reporter: Ted Yu >Assignee: Umesh Agashe >Priority: Minor > Labels: beginner, rsgroup > Fix For: 2.0.0 > > Attachments: HBASE-16491.v1.patch > > > A few classes, such as RSGroupInfoManagerImpl.java, miss @InterfaceAudience > This was discovered when I reviewed HBASE-16456. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15297) error message is wrong when a wrong namespace is specified in grant in hbase shell
[ https://issues.apache.org/jira/browse/HBASE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485665#comment-15485665 ] Sean Busbey commented on HBASE-15297: - patch looks good to me. I'm +1, presuming the QA bot doesn't come back with something surprising. > error message is wrong when a wrong namspace is specified in grant in hbase > shell > - > > Key: HBASE-15297 > URL: https://issues.apache.org/jira/browse/HBASE-15297 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 2.0.0, 1.2.0 >Reporter: Xiang Li >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4 > > Attachments: HBASE-15297.v1.patch > > > In HBase shell, specify a non-existing namespace in "grant" command, such as > {code} > hbase(main):001:0> grant 'a1', 'R', '@aaa'<--- there is no namespace > called "aaa" > {code} > The error message issued is not correct > {code} > ERROR: Unknown namespace a1! > {code} > a1 is the user name, not the namespace. > The following error message would be better > {code} > ERROR: Unknown namespace aaa! > {code} > or > {code} > Can't find a namespace: aaa > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-15297) error message is wrong when a wrong namespace is specified in grant in hbase shell
[ https://issues.apache.org/jira/browse/HBASE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485662#comment-15485662 ] Umesh Agashe commented on HBASE-15297: -- [~busbey] thats right! I tested is manually by enabling security on a standalone instance. Reproduced the problem without these changes and verified the behavior after applying the code changes. Here is the fixed output; {code} hbase(main):001:0> grant 'a1', 'R', '@aaa' ERROR: Can't find a namespace: aaa Here is some help for this command: Grant users specific rights. Syntax : grant , [, <@namespace> [, [, [, ]]] permissions is either zero or more letters from the set "RWXCA". READ('R'), WRITE('W'), EXEC('X'), CREATE('C'), ADMIN('A') Note: Groups and users are granted access in the same way, but groups are prefixed with an '@' character. In the same way, tables and namespaces are specified, but namespaces are prefixed with an '@' character. For example: hbase> grant 'bobsmith', 'RWXCA' hbase> grant '@admins', 'RWXCA' hbase> grant 'bobsmith', 'RWXCA', '@ns1' hbase> grant 'bobsmith', 'RW', 't1', 'f1', 'col1' hbase> grant 'bobsmith', 'RW', 'ns1:t1', 'f1', 'col1' Took 0.3230 seconds {code} > error message is wrong when a wrong namspace is specified in grant in hbase > shell > - > > Key: HBASE-15297 > URL: https://issues.apache.org/jira/browse/HBASE-15297 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 2.0.0, 1.2.0 >Reporter: Xiang Li >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4 > > Attachments: HBASE-15297.v1.patch > > > In HBase shell, specify a non-existing namespace in "grant" command, such as > {code} > hbase(main):001:0> grant 'a1', 'R', '@aaa'<--- there is no namespace > called "aaa" > {code} > The error message issued is not correct > {code} > ERROR: Unknown namespace a1! > {code} > a1 is the user name, not the namespace. > The following error message would be better > {code} > ERROR: Unknown namespace aaa! 
> {code} > or > {code} > Can't find a namespace: aaa > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16381) Shell deleteall command should support row key prefixes
[ https://issues.apache.org/jira/browse/HBASE-16381?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485657#comment-15485657 ] Hadoop QA commented on HBASE-16381: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 11s {color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} rubocop {color} | {color:blue} 0m 0s {color} | {color:blue} rubocop was not available. {color} | | {color:blue}0{color} | {color:blue} ruby-lint {color} | {color:blue} 0m 0s {color} | {color:blue} Ruby-lint was not available. {color} | | {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green} 0m 0s {color} | {color:green} Patch does not have any anti-patterns. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 6s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 12s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 7s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 12s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. 
{color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 24m 53s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 6s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 7m 13s {color} | {color:green} hbase-shell in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 7s {color} | {color:green} The patch does not generate ASF License warnings. {color} | | {color:black}{color} | {color:black} {color} | {color:black} 36m 36s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828129/HBASE-16381-V4.patch | | JIRA Issue | HBASE-16381 | | Optional Tests | asflicense javac javadoc unit rubocop ruby_lint | | uname | Linux 100d4a9d13cc 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 8855670 | | Default Java | 1.8.0_101 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/3524/testReport/ | | modules | C: hbase-shell U: hbase-shell | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/3524/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. 
> Shell deleteall command should support row key prefixes > --- > > Key: HBASE-16381 > URL: https://issues.apache.org/jira/browse/HBASE-16381 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Andrew Purtell >Assignee: Yi Liang >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-16381-V1.patch, HBASE-16381-V2.patch, > HBASE-16381-V3.patch, HBASE-16381-V4.patch > > > The shell's deleteall command should support deleting a row range using a row > key prefix. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
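For context on how a row-key prefix can define a contiguous delete range: the usual technique is to compute an exclusive stop row by incrementing the last non-0xFF byte of the prefix and truncating anything after it, so that [prefix, stop) covers exactly the rows starting with the prefix. A minimal, stand-alone Java sketch of that idea (illustrative only, not the code from the attached patches):

```java
import java.util.Arrays;

public class PrefixRange {
    // Compute the exclusive stop row for a prefix scan/delete: increment the
    // last byte that is not 0xFF and drop the trailing 0xFF bytes. If every
    // byte is 0xFF, return an empty array, meaning "scan to end of table".
    static byte[] stopRowForPrefix(byte[] prefix) {
        byte[] stop = Arrays.copyOf(prefix, prefix.length);
        for (int i = stop.length - 1; i >= 0; i--) {
            if (stop[i] != (byte) 0xFF) {
                stop[i]++;
                return Arrays.copyOf(stop, i + 1);
            }
        }
        return new byte[0];
    }
}
```

With this range in hand, a shell-level prefix delete reduces to a scan over [prefix, stop) issuing a Delete per row found.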
[jira] [Commented] (HBASE-16257) Move staging dir to be under hbase root dir
[ https://issues.apache.org/jira/browse/HBASE-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485653#comment-15485653 ] Jerry He commented on HBASE-16257: -- bq. Can we either use all upper case or all lower case for the names of these bq. Let's add a comment here talking about the layout of rootdir and the permissions of its subdirs. Will do. bq. This is now defined in two places Hmm where is the other place? bq. I was mentioning above that, for upgrades, we have to "open-up" the permissions for the root directory Is the following ok? {noformat} +// Handle the last few special files and set the final rootdir permissions +// rootdir needs 'x' for all for bulk load staging dir +if (isSecurityEnabled) { + fs.setPermission(new Path(rootdir, HConstants.VERSION_FILE_NAME), secureRootFilePerms); + fs.setPermission(new Path(rootdir, HConstants.CLUSTER_ID_FILE_NAME), secureRootFilePerms); +} +fs.setPermission(this.rootdir, this.rootPerms); {noformat} > Move staging dir to be under hbase root dir > --- > > Key: HBASE-16257 > URL: https://issues.apache.org/jira/browse/HBASE-16257 > Project: HBase > Issue Type: Sub-task >Reporter: Jerry He >Assignee: Jerry He >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HBASE-16257-v1.patch, HBASE-16257-v2.patch, > HBASE-16257-v3.patch > > > The hbase.bulkload.staging.dir defaults to hbase.fs.tmp.dir which then > defaults to > {code} > public static final String DEFAULT_TEMPORARY_HDFS_DIRECTORY = "/user/" > + System.getProperty("user.name") + "/hbase-staging"; > {code} > This default would have problem on local file system standalone case. > We can move the staging dir to be under hbase.rootdir. We are bringing > secure bulkload to the core. It makes sense to bring it under core control as > well, instead of an optional property. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16540) Scan should do additional validation on start and stop row
[ https://issues.apache.org/jira/browse/HBASE-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Pho updated HBASE-16540: --- Attachment: HBASE-16540.002.patch updated the javadocs to reference the constant used > Scan should do additional validation on start and stop row > -- > > Key: HBASE-16540 > URL: https://issues.apache.org/jira/browse/HBASE-16540 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Gary Helmling >Assignee: Dustin Pho > Attachments: HBASE-16540.002.patch, HBASE-16540.patch > > > Scan.setStartRow() and setStopRow() should validate the byte[] passed to > ensure it meets the criteria for a row key. If the byte[] length is greater > that Short.MAX_VALUE, we should throw an IllegalArgumentException in order to > fast fail and prevent server-side errors being thrown and retried. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
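The validation being discussed is a simple fail-fast length guard on the client side. A self-contained sketch of the idea, assuming the limit is Short.MAX_VALUE as the issue description states (names here are illustrative; this is not the patch code itself):

```java
public class RowKeyCheck {
    // Stand-in for HConstants.MAX_ROW_LENGTH, which the patch's javadoc links to.
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    // Reject oversized row keys up front so the client fails fast instead of
    // triggering server-side errors that would be retried.
    static byte[] checkRow(byte[] row) {
        if (row != null && row.length > MAX_ROW_LENGTH) {
            throw new IllegalArgumentException(
                "Row length " + row.length + " exceeds max " + MAX_ROW_LENGTH);
        }
        return row;
    }
}
```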
[jira] [Commented] (HBASE-15297) error message is wrong when a wrong namespace is specified in grant in hbase shell
[ https://issues.apache.org/jira/browse/HBASE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485640#comment-15485640 ] Sean Busbey commented on HBASE-15297: - thanks for taking this on [~uagashe]! Good digging. It looks like the friendly error message was already present, but never got displayed because {{namespace_exists?}} was throwing instead of returning false. Do I have that right? could you describe what testing you did to verify the change? > error message is wrong when a wrong namspace is specified in grant in hbase > shell > - > > Key: HBASE-15297 > URL: https://issues.apache.org/jira/browse/HBASE-15297 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 2.0.0, 1.2.0 >Reporter: Xiang Li >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4 > > Attachments: HBASE-15297.v1.patch > > > In HBase shell, specify a non-existing namespace in "grant" command, such as > {code} > hbase(main):001:0> grant 'a1', 'R', '@aaa'<--- there is no namespace > called "aaa" > {code} > The error message issued is not correct > {code} > ERROR: Unknown namespace a1! > {code} > a1 is the user name, not the namespace. > The following error message would be better > {code} > ERROR: Unknown namespace aaa! > {code} > or > {code} > Can't find a namespace: aaa > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
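Sean's reading — that {{namespace_exists?}} threw instead of returning false, so the friendly message was never reached — is the classic exists-check-via-exception pitfall. A hedged, stand-alone Java sketch of the fix pattern (all names below are stand-ins, not the actual shell or Admin code):

```java
public class NamespaceCheck {
    static class NamespaceNotFoundException extends RuntimeException {}

    // Stand-in for a lookup like Admin#getNamespaceDescriptor, which throws
    // when the namespace does not exist.
    static String describe(String ns) {
        if (!ns.equals("default")) {
            throw new NamespaceNotFoundException();
        }
        return ns;
    }

    // The fix: catch the not-found exception and return false, so the caller
    // can print its own "Can't find a namespace" message instead of letting
    // the raw exception escape and produce the wrong error text.
    static boolean namespaceExists(String ns) {
        try {
            describe(ns);
            return true;
        } catch (NamespaceNotFoundException e) {
            return false;
        }
    }
}
```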
[jira] [Updated] (HBASE-15297) error message is wrong when a wrong namespace is specified in grant in hbase shell
[ https://issues.apache.org/jira/browse/HBASE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Busbey updated HBASE-15297: Fix Version/s: (was: 1.2.0) 1.2.4 1.4.0 1.3.0 > error message is wrong when a wrong namspace is specified in grant in hbase > shell > - > > Key: HBASE-15297 > URL: https://issues.apache.org/jira/browse/HBASE-15297 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 2.0.0, 1.2.0 >Reporter: Xiang Li >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.2.4 > > Attachments: HBASE-15297.v1.patch > > > In HBase shell, specify a non-existing namespace in "grant" command, such as > {code} > hbase(main):001:0> grant 'a1', 'R', '@aaa'<--- there is no namespace > called "aaa" > {code} > The error message issued is not correct > {code} > ERROR: Unknown namespace a1! > {code} > a1 is the user name, not the namespace. > The following error message would be better > {code} > ERROR: Unknown namespace aaa! > {code} > or > {code} > Can't find a namespace: aaa > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15297) error message is wrong when a wrong namespace is specified in grant in hbase shell
[ https://issues.apache.org/jira/browse/HBASE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-15297: - Fix Version/s: 1.2.0 2.0.0 Affects Version/s: 2.0.0 Status: Patch Available (was: Open) Shell code for SecurityAdmin, method namespace_exists? is changed to catch NamespaceNotFoundException and java code is modified to document the exception. > error message is wrong when a wrong namspace is specified in grant in hbase > shell > - > > Key: HBASE-15297 > URL: https://issues.apache.org/jira/browse/HBASE-15297 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 1.2.0, 2.0.0 >Reporter: Xiang Li >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0, 1.2.0 > > Attachments: HBASE-15297.v1.patch > > > In HBase shell, specify a non-existing namespace in "grant" command, such as > {code} > hbase(main):001:0> grant 'a1', 'R', '@aaa'<--- there is no namespace > called "aaa" > {code} > The error message issued is not correct > {code} > ERROR: Unknown namespace a1! > {code} > a1 is the user name, not the namespace. > The following error message would be better > {code} > ERROR: Unknown namespace aaa! > {code} > or > {code} > Can't find a namespace: aaa > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-15297) error message is wrong when a wrong namespace is specified in grant in hbase shell
[ https://issues.apache.org/jira/browse/HBASE-15297?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Umesh Agashe updated HBASE-15297: - Attachment: HBASE-15297.v1.patch Shell code for SecurityAdmin, method namespace_exists? is changed to catch NamespaceNotFoundException and java code is modified to document the exception. > error message is wrong when a wrong namspace is specified in grant in hbase > shell > - > > Key: HBASE-15297 > URL: https://issues.apache.org/jira/browse/HBASE-15297 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 2.0.0, 1.2.0 >Reporter: Xiang Li >Assignee: Umesh Agashe >Priority: Minor > Fix For: 2.0.0, 1.2.0 > > Attachments: HBASE-15297.v1.patch > > > In HBase shell, specify a non-existing namespace in "grant" command, such as > {code} > hbase(main):001:0> grant 'a1', 'R', '@aaa'<--- there is no namespace > called "aaa" > {code} > The error message issued is not correct > {code} > ERROR: Unknown namespace a1! > {code} > a1 is the user name, not the namespace. > The following error message would be better > {code} > ERROR: Unknown namespace aaa! > {code} > or > {code} > Can't find a namespace: aaa > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485621#comment-15485621 ] Ted Yu commented on HBASE-16616: TestClusterId failure is tracked by HBASE-16349 > Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry > -- > > Key: HBASE-16616 > URL: https://issues.apache.org/jira/browse/HBASE-16616 > Project: HBase > Issue Type: Improvement > Components: Performance >Affects Versions: 1.2.2 >Reporter: Tomu Tsuruhara >Assignee: Tomu Tsuruhara > Fix For: 2.0.0, 1.4.0 > > Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, > HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png > > > In our HBase 1.2.2 cluster, some regionserver showed too bad > "QueueCallTime_99th_percentile" exceeding 10 seconds. > Most rpc handler threads stuck on ThreadLocalMap.expungeStaleEntry call at > that time. > {noformat} > "PriorityRpcServer.handler=18,queue=0,port=16020" #322 daemon prio=5 > os_prio=0 tid=0x7fd422062800 nid=0x19b89 runnable [0x7fcb8a821000] >java.lang.Thread.State: RUNNABLE > at > java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617) > at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499) > at > java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298) > at java.lang.ThreadLocal.remove(ThreadLocal.java:222) > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.unlockForRegularUsage(ExponentiallyDecayingSample.java:196) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:113) > at > 
com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81) > at > org.apache.hadoop.metrics2.lib.MutableHistogram.add(MutableHistogram.java:81) > at > org.apache.hadoop.metrics2.lib.MutableRangeHistogram.add(MutableRangeHistogram.java:59) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.dequeuedCall(MetricsHBaseServerSourceImpl.java:194) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServer.dequeuedCall(MetricsHBaseServer.java:76) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2192) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108) > at java.lang.Thread.run(Thread.java:745) > {noformat} > We were using jdk 1.8.0_92 and here is a snippet from ThreadLocal.java. > {code} > 616:while (tab[h] != null) > 617:h = nextIndex(h, len); > {code} > So I hypothesized that there're too many consecutive entries in {{tab}} array > and actually I found them in the heapdump. > !ScreenShot 2016-09-09 14.17.53.png|width=50%! > Most of these entries pointed at instance of > {{org.apache.hadoop.hbase.util.Counter$1}} > which is equivarent to {{indexHolderThreadLocal}} instance-variable in the > {{Counter}} class. > Because {{RpcServer$Connection}} class creates a {{Counter}} instance > {{rpcCount}} for every connections, > it is possible to have lots of {{Counter#indexHolderThreadLocal}} instances > in RegionServer process > when we repeat connect-and-close from client. As a result, a ThreadLocalMap > can have lots of consecutive > entires. > Usually, since each entry is a {{WeakReference}}, these entries are collected > and removed > by garbage-collector soon after connection closed. > But if connection's life-time was long enough to survive youngGC, it wouldn't > be collected until old-gen collector runs. 
> Furthermore, under G1GC deployment, it is possible not to be collected even > by old-gen GC(mixed GC) > if entries sit in a region which doesn't have much garbages. > Actually we used G1GC when we encountered this problem. > We should remove the entry from ThreadLocalMap by calling ThreadLocal#remove > explicitly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
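The proposed fix — calling {{ThreadLocal#remove}} explicitly rather than relying on the weak-reference entries being collected — can be illustrated with a small stand-alone sketch (the names are assumptions modeled on the description above, not the actual {{Counter}} code):

```java
public class CounterSketch {
    // Per-thread index holder, analogous to Counter#indexHolderThreadLocal.
    private final ThreadLocal<int[]> indexHolder =
        ThreadLocal.withInitial(() -> new int[1]);

    void increment() { indexHolder.get()[0]++; }

    int get() { return indexHolder.get()[0]; }

    // Explicitly drop this thread's entry from its ThreadLocalMap when the
    // owning connection closes, so stale entries do not pile up (and force
    // long expungeStaleEntry scans) across many short-lived connections.
    void destroy() { indexHolder.remove(); }
}
```

After {{destroy()}}, a subsequent {{get()}} on the same thread would re-create a fresh zeroed holder via the initializer; the point is that the old entry is gone without waiting for a GC cycle.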
[jira] [Commented] (HBASE-16540) Scan should do additional validation on start and stop row
[ https://issues.apache.org/jira/browse/HBASE-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485615#comment-15485615 ] Gary Helmling commented on HBASE-16540: --- The patch looks good. The only minor comment I have is on javadoc added for the methods. Where describing when IllegalArgumentException is thrown, it would be better to actually link to the constant used for reference, ie. in "(length exceeds MAX_ROW_LENGTH)" use "{@link HConstants#MAX_ROW_LENGTH}". > Scan should do additional validation on start and stop row > -- > > Key: HBASE-16540 > URL: https://issues.apache.org/jira/browse/HBASE-16540 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Gary Helmling >Assignee: Dustin Pho > Attachments: HBASE-16540.patch > > > Scan.setStartRow() and setStopRow() should validate the byte[] passed to > ensure it meets the criteria for a row key. If the byte[] length is greater > that Short.MAX_VALUE, we should throw an IllegalArgumentException in order to > fast fail and prevent server-side errors being thrown and retried. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485599#comment-15485599 ] Hadoop QA commented on HBASE-16616: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 19s {color} | {color:blue} Docker mode activated. {color} | | {color:blue}0{color} | {color:blue} patch {color} | {color:blue} 0m 2s {color} | {color:blue} The patch file was not named according to hbase's naming conventions. Please see https://yetus.apache.org/documentation/0.3.0/precommit-patchnames for instructions. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. 
{color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 48s {color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 46s {color} | {color:green} branch-1 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s {color} | {color:green} branch-1 passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s {color} | {color:green} branch-1 passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 29s {color} | {color:green} branch-1 passed {color} | | {color:red}-1{color} | {color:red} findbugs {color} | {color:red} 2m 1s {color} | {color:red} hbase-server in branch-1 has 1 extant Findbugs warnings. 
{color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 45s {color} | {color:green} branch-1 passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 54s {color} | {color:green} branch-1 passed with JDK v1.7.0_111 {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 8s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 53s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 53s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 56s {color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 56s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 30s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 18m 3s {color} | {color:green} The patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. 
{color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 31s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 35s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s {color} | {color:green} the patch passed with JDK v1.8.0_101 {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 56s {color} | {color:green} the patch passed with JDK v1.7.0_111 {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 1m 48s {color} | {color:green} hbase-common in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 86m 21s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 31s {color} | {color:green} The patch does not generate ASF
[jira] [Commented] (HBASE-16592) Unify Delete request with AP
[ https://issues.apache.org/jira/browse/HBASE-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485587#comment-15485587 ] Hadoop QA commented on HBASE-16592: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 12s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 2 new or modified test files. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 10s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 22s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 45s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 23s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 45s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 52s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 9s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 1m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile 
{color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 28m 11s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 103m 20s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 27s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 150m 37s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Timed out junit tests | org.apache.hadoop.hbase.TestZooKeeper | | | org.apache.hadoop.hbase.master.normalizer.TestSimpleRegionNormalizerOnCluster | | | org.apache.hadoop.hbase.master.TestRollingRestart | | | org.apache.hadoop.hbase.master.TestMasterShutdown | | | org.apache.hadoop.hbase.regionserver.throttle.TestFlushWithThroughputController | | | org.apache.hadoop.hbase.TestIOFencing | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828103/HBASE-16592.v1.patch | | JIRA Issue | HBASE-16592 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 39c11731d4a6 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh | | git revision | master / 552400e | | Default Java | 1.8.0_101 | |
[jira] [Commented] (HBASE-16257) Move staging dir to be under hbase root dir
[ https://issues.apache.org/jira/browse/HBASE-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485581#comment-15485581 ] Enis Soztutar commented on HBASE-16257: --- Can we either use all upper case or all lower case for the names of these: {code} + private final FsPermission rootPerms = FsPermission.valueOf("-rwxr-xr-x"); + // permissions for the files under rootDir that need secure protection + private final FsPermission secureRootFilePerms = FsPermission.valueOf("-rw---"); + // permissions for the directories under rootDir that need secure protection + private final FsPermission secureRootSubDirPerms; + // permissions for bulk load staging directory under rootDir + private final FsPermission PERM_HIDDEN = FsPermission.valueOf("-rwx--x--x"); {code} Let's add a comment here talking about the layout of rootdir and the permissions of its subdirs. This is now defined in two places: {code} + public static final String BULKLOAD_STAGING_DIR_NAME = "staging"; {code} As I mentioned above, for upgrades we have to "open-up" the permissions for the root directory, because if a branch-1 cluster is run in secure mode, the root dir is already with perms 700. After upgrade, the staging dir will be inside this root dir, and all bulk load operations will fail after the upgrade. {code} // Filesystem is good. Go ahead and check for hbase.rootdir. 
try { if (!fs.exists(rd)) { -if (isSecurityEnabled) { - fs.mkdirs(rd, rootDirPerms); -} else { - fs.mkdirs(rd); -} +fs.mkdirs(rd); {code} Maybe extract the subdirs away, and add getters so that the rest of the code does not have to deal with parsing the conf again: {code} +checkSubDir(new Path(this.rootdir, HConstants.BASE_NAMESPACE_DIR)); +checkSubDir(new Path(this.rootdir, HConstants.HFILE_ARCHIVE_DIRECTORY)); +checkSubDir(new Path(this.rootdir, HConstants.HREGION_LOGDIR_NAME)); {code} > Move staging dir to be under hbase root dir > --- > > Key: HBASE-16257 > URL: https://issues.apache.org/jira/browse/HBASE-16257 > Project: HBase > Issue Type: Sub-task >Reporter: Jerry He >Assignee: Jerry He >Priority: Blocker > Fix For: 2.0.0 > > Attachments: HBASE-16257-v1.patch, HBASE-16257-v2.patch, > HBASE-16257-v3.patch > > > The hbase.bulkload.staging.dir defaults to hbase.fs.tmp.dir which then > defaults to > {code} > public static final String DEFAULT_TEMPORARY_HDFS_DIRECTORY = "/user/" > + System.getProperty("user.name") + "/hbase-staging"; > {code} > This default would be a problem in the local file system standalone case. > We can move the staging dir to be under hbase.rootdir. We are bringing > secure bulkload to the core. It makes sense to bring it under core control as > well, instead of as an optional property. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
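Enis's suggestion to extract the subdirectories behind getters could look roughly like the sketch below. This is an illustrative stand-alone class using plain strings, not the actual patch code (the real code works with Hadoop `Path` and the HBase `Configuration`); the class name `RootDirLayout` is invented here, while the literal directory names mirror HConstants.BASE_NAMESPACE_DIR ("data"), HConstants.HFILE_ARCHIVE_DIRECTORY ("archive") and HConstants.HREGION_LOGDIR_NAME ("WALs").

```java
// Hypothetical sketch: resolve the well-known subdirectories of
// hbase.rootdir once, then expose them through getters so callers
// never have to re-parse the configuration themselves.
public class RootDirLayout {
    private final String namespaceDir;
    private final String archiveDir;
    private final String walDir;

    public RootDirLayout(String rootDir) {
        // Each well-known subdir of hbase.rootdir is computed exactly once.
        this.namespaceDir = rootDir + "/data";    // HConstants.BASE_NAMESPACE_DIR
        this.archiveDir = rootDir + "/archive";   // HConstants.HFILE_ARCHIVE_DIRECTORY
        this.walDir = rootDir + "/WALs";          // HConstants.HREGION_LOGDIR_NAME
    }

    public String getNamespaceDir() { return namespaceDir; }
    public String getArchiveDir() { return archiveDir; }
    public String getWalDir() { return walDir; }
}
```

With such getters, the checkSubDir calls in the patch could take the resolved paths directly instead of rebuilding them at each call site.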
[jira] [Commented] (HBASE-16611) Flakey org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet
[ https://issues.apache.org/jira/browse/HBASE-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485577#comment-15485577 ] Hadoop QA commented on HBASE-16611: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 15s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 22s {color} | {color:blue} Maven dependency ordering for branch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 24s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 58s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 24s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 51s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 50s {color} | {color:green} master passed {color} | | {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue} 0m 11s {color} | {color:blue} Maven dependency ordering for patch {color} | | {color:green}+1{color} | {color:green} mvninstall {color} 
| {color:green} 1m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 44s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 25s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 28m 4s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 22s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 3m 10s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 46s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 55s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 101m 5s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 29s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 148m 46s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Timed out junit tests | org.apache.hadoop.hbase.master.TestMasterFailover | | | org.apache.hadoop.hbase.master.TestTableLockManager | | | org.apache.hadoop.hbase.master.snapshot.TestSnapshotFileCache | | | org.apache.hadoop.hbase.replication.TestReplicationTableBase | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828102/HBASE-16611.v1.patch | | JIRA Issue | HBASE-16611 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 9d3161592681 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 552400e | | Default Java | 1.8.0_101 | |
[jira] [Updated] (HBASE-16257) Move staging dir to be under hbase root dir
[ https://issues.apache.org/jira/browse/HBASE-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-16257: - Attachment: (was: HBASE-16257-v3.patch)
[jira] [Updated] (HBASE-16257) Move staging dir to be under hbase root dir
[ https://issues.apache.org/jira/browse/HBASE-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-16257: - Attachment: HBASE-16257-v3.patch
[jira] [Updated] (HBASE-16381) Shell deleteall command should support row key prefixes
[ https://issues.apache.org/jira/browse/HBASE-16381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yi Liang updated HBASE-16381: - Attachment: HBASE-16381-V4.patch Added a test in the hbase shell tests. > Shell deleteall command should support row key prefixes > --- > > Key: HBASE-16381 > URL: https://issues.apache.org/jira/browse/HBASE-16381 > Project: HBase > Issue Type: Improvement > Components: shell >Reporter: Andrew Purtell >Assignee: Yi Liang >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-16381-V1.patch, HBASE-16381-V2.patch, > HBASE-16381-V3.patch, HBASE-16381-V4.patch > > > The shell's deleteall command should support deleting a row range using a row > key prefix.
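For readers following along, shell usage of the proposed feature would presumably look something like the snippet below. This is a hedged sketch only: the table name is illustrative, and the option name is borrowed from the scan command's existing ROWPREFIXFILTER convention; the actual syntax introduced by the HBASE-16381 patch may differ.

```ruby
# Hypothetical HBase shell usage: delete every row whose key starts
# with 'abc' (option name assumed from the scan command, not confirmed
# against the V4 patch).
deleteall 'test_table', {ROWPREFIXFILTER => 'abc'}
```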
[jira] [Assigned] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu reassigned HBASE-16616: -- Assignee: Tomu Tsuruhara > Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry > -- > > Key: HBASE-16616 > URL: https://issues.apache.org/jira/browse/HBASE-16616 > Project: HBase > Issue Type: Improvement > Components: Performance >Affects Versions: 1.2.2 >Reporter: Tomu Tsuruhara >Assignee: Tomu Tsuruhara > Fix For: 2.0.0, 1.4.0 > > Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, > HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png > > > In our HBase 1.2.2 cluster, some regionservers showed a very bad > "QueueCallTime_99th_percentile", exceeding 10 seconds. > Most rpc handler threads were stuck on the ThreadLocalMap.expungeStaleEntry call at > that time. > {noformat} > "PriorityRpcServer.handler=18,queue=0,port=16020" #322 daemon prio=5 > os_prio=0 tid=0x7fd422062800 nid=0x19b89 runnable [0x7fcb8a821000] >java.lang.Thread.State: RUNNABLE > at > java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617) > at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499) > at > java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298) > at java.lang.ThreadLocal.remove(ThreadLocal.java:222) > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.unlockForRegularUsage(ExponentiallyDecayingSample.java:196) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:113) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81) > at > org.apache.hadoop.metrics2.lib.MutableHistogram.add(MutableHistogram.java:81) > at > org.apache.hadoop.metrics2.lib.MutableRangeHistogram.add(MutableRangeHistogram.java:59) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.dequeuedCall(MetricsHBaseServerSourceImpl.java:194) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServer.dequeuedCall(MetricsHBaseServer.java:76) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2192) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108) > at java.lang.Thread.run(Thread.java:745) > {noformat} > We were using jdk 1.8.0_92 and here is a snippet from ThreadLocal.java. > {code} > 616:while (tab[h] != null) > 617:h = nextIndex(h, len); > {code} > So I hypothesized that there are too many consecutive entries in the {{tab}} array, > and actually I found them in the heapdump. > !ScreenShot 2016-09-09 14.17.53.png|width=50%! > Most of these entries pointed at instances of > {{org.apache.hadoop.hbase.util.Counter$1}}, > which is equivalent to the {{indexHolderThreadLocal}} instance variable in the > {{Counter}} class. > Because the {{RpcServer$Connection}} class creates a {{Counter}} instance > {{rpcCount}} for every connection, > it is possible to have lots of {{Counter#indexHolderThreadLocal}} instances > in the RegionServer process > when a client repeatedly connects and closes. As a result, a ThreadLocalMap > can have lots of consecutive > entries. > Usually, since each entry is a {{WeakReference}}, these entries are collected > and removed > by the garbage collector soon after the connection is closed. > But if the connection's lifetime was long enough to survive young GC, they wouldn't > be collected until the old-gen collector runs. 
> Furthermore, under a G1GC deployment, they may not be collected even > by the old-gen GC (mixed GC) > if the entries sit in a region which doesn't contain much garbage. > Actually, we were using G1GC when we encountered this problem. > We should remove the entry from the ThreadLocalMap by calling ThreadLocal#remove > explicitly.
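The fix the reporter proposes (calling {{ThreadLocal#remove}} explicitly when a connection goes away) can be sketched in isolation as below. The {{ConnectionCounter}} class is a hypothetical stand-in for the per-connection {{Counter}} ({{rpcCount}}) discussed above, not the actual o.a.h.hbase.util.Counter internals.

```java
// Minimal sketch of a per-connection counter that cleans up after itself.
public class ConnectionCounter {
    // One slot per handler thread, in the spirit of Counter's
    // indexHolderThreadLocal.
    private final ThreadLocal<long[]> indexHolder =
        ThreadLocal.withInitial(() -> new long[1]);

    public void increment() {
        indexHolder.get()[0]++;
    }

    public long get() {
        return indexHolder.get()[0];
    }

    // The proposed fix: on connection close, explicitly drop the calling
    // thread's entry from its ThreadLocalMap instead of leaving a stale
    // WeakReference for expungeStaleEntry to walk over later.
    public void destroy() {
        indexHolder.remove();
    }
}
```

Note that {{remove()}} only clears the calling thread's entry, so a real fix has to arrange for cleanup on each handler thread that touched the counter.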
[jira] [Updated] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16616: --- Fix Version/s: 1.4.0 2.0.0
[jira] [Commented] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485519#comment-15485519 ] Duo Zhang commented on HBASE-16616: --- [~enis] Agree. It will be a big patch so let's finish this issue first and do the replacement in HBASE-7612.
[jira] [Commented] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485508#comment-15485508 ] Enis Soztutar commented on HBASE-16616: --- bq. I think on master we could just use LongAdder to replace the Counter since we are now jdk8 only? +1. See HBASE-7612. However, we can do it in the patch for HBASE-7612.
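For reference, the jdk8-only replacement being discussed could look roughly like the sketch below. The {{RpcCount}} class name is invented for illustration; this is not the HBASE-7612 patch.

```java
import java.util.concurrent.atomic.LongAdder;

// LongAdder stripes its counter cells internally without registering
// per-instance ThreadLocalMap entries, so the expungeStaleEntry pile-up
// described in this issue cannot occur.
public class RpcCount {
    private final LongAdder count = new LongAdder();

    public void increment() { count.increment(); }
    public void decrement() { count.decrement(); }
    public long get() { return count.sum(); }
}
```

No explicit destroy/remove step is needed on connection close, which is the point of the suggestion.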
[jira] [Updated] (HBASE-16257) Move staging dir to be under hbase root dir
[ https://issues.apache.org/jira/browse/HBASE-16257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jerry He updated HBASE-16257: - Attachment: HBASE-16257-v3.patch Attached v3.patch.
[jira] [Commented] (HBASE-16540) Scan should do additional validation on start and stop row
[ https://issues.apache.org/jira/browse/HBASE-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485493#comment-15485493 ] Hadoop QA commented on HBASE-16540: --- | (/) *{color:green}+1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 13s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:green}+1{color} | {color:green} test4tests {color} | {color:green} 0m 0s {color} | {color:green} The patch appears to include 1 new or modified test files. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 2m 53s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 21s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 8s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 44s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 16s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 
21s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 8s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 25m 21s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 9s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 15s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} unit {color} | {color:green} 0m 53s {color} | {color:green} hbase-client in the patch passed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 6s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 33m 43s {color} | {color:black} {color} | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828120/HBASE-16540.patch | | JIRA Issue | HBASE-16540 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux 80e38487d21e 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed Sep 3 21:56:12 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh | | git revision | master / 552400e | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/3520/testReport/ | | modules | C: hbase-client U: hbase-client | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/3520/console | | Powered by | Apache Yetus 0.3.0 http://yetus.apache.org | This message was automatically generated. > Scan should do additional validation on start and stop row > -- > > Key: HBASE-16540 > URL: https://issues.apache.org/jira/browse/HBASE-16540 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Gary Helmling >Assignee: Dustin Pho > Attachments: HBASE-16540.patch > > > Scan.setStartRow() and setStopRow() should validate the byte[] passed to > ensure
[jira] [Commented] (HBASE-16592) Unify Delete request with AP
[ https://issues.apache.org/jira/browse/HBASE-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485440#comment-15485440 ] Ted Yu commented on HBASE-16592: +1 on v2. > Unify Delete request with AP > > > Key: HBASE-16592 > URL: https://issues.apache.org/jira/browse/HBASE-16592 > Project: HBase > Issue Type: Sub-task >Reporter: Heng Chen >Assignee: Heng Chen > Attachments: HBASE-16592.patch, HBASE-16592.v1.patch, > HBASE-16592.v1.patch, HBASE-16592.v2.patch > > > This is the first step try to unify the HTable with AP only, to extend AP > could process single action, i introduced AbstractResponse, multiResponse > and singleResponse (introduced to deal with single result) will extend this > class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-16621) HBCK should have -fixHFileLinks
Enis Soztutar created HBASE-16621: - Summary: HBCK should have -fixHFileLinks Key: HBASE-16621 URL: https://issues.apache.org/jira/browse/HBASE-16621 Project: HBase Issue Type: Bug Reporter: Enis Soztutar Fix For: 2.0.0, 1.4.0 Similar to {{-fixReferenceFiles}}, HBCK should be able to sideline dangling HFile Links as well. We have seen a couple of cases where, due to an HDFS-level fsck run that deleted files with missing blocks, the cluster is left with dangling HFile Links, and regions cannot be opened because of them. Only manual, error-prone finding and clearing of HFileLinks can save the table's regions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16592) Unify Delete request with AP
[ https://issues.apache.org/jira/browse/HBASE-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heng Chen updated HBASE-16592: -- Attachment: HBASE-16592.v2.patch Addressed [~tedyu]'s new comments, thanks. > Unify Delete request with AP > > > Key: HBASE-16592 > URL: https://issues.apache.org/jira/browse/HBASE-16592 > Project: HBase > Issue Type: Sub-task >Reporter: Heng Chen >Assignee: Heng Chen > Attachments: HBASE-16592.patch, HBASE-16592.v1.patch, > HBASE-16592.v1.patch, HBASE-16592.v2.patch > > > This is the first step try to unify the HTable with AP only, to extend AP > could process single action, i introduced AbstractResponse, multiResponse > and singleResponse (introduced to deal with single result) will extend this > class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16540) Scan should do additional validation on start and stop row
[ https://issues.apache.org/jira/browse/HBASE-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Pho updated HBASE-16540: --- Status: Patch Available (was: Open) > Scan should do additional validation on start and stop row > -- > > Key: HBASE-16540 > URL: https://issues.apache.org/jira/browse/HBASE-16540 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Gary Helmling >Assignee: Dustin Pho > Attachments: HBASE-16540.patch > > > Scan.setStartRow() and setStopRow() should validate the byte[] passed to > ensure it meets the criteria for a row key. If the byte[] length is greater > that Short.MAX_VALUE, we should throw an IllegalArgumentException in order to > fast fail and prevent server-side errors being thrown and retried. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-16540) Scan should do additional validation on start and stop row
[ https://issues.apache.org/jira/browse/HBASE-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dustin Pho updated HBASE-16540: --- Attachment: HBASE-16540.patch Added a validator for row keys passed to setStartRow and setStopRow such that they must be less than or equal to HConstant.MAX_ROW_LENGTH (Short.MAX_VALUE). > Scan should do additional validation on start and stop row > -- > > Key: HBASE-16540 > URL: https://issues.apache.org/jira/browse/HBASE-16540 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Gary Helmling >Assignee: Dustin Pho > Attachments: HBASE-16540.patch > > > Scan.setStartRow() and setStopRow() should validate the byte[] passed to > ensure it meets the criteria for a row key. If the byte[] length is greater > that Short.MAX_VALUE, we should throw an IllegalArgumentException in order to > fast fail and prevent server-side errors being thrown and retried. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
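The check described in this patch can be sketched as a minimal, hypothetical Java example. {{RowKeyValidator}} is an illustrative name, not an HBase class; it only mirrors the length limit the patch enforces (HConstants.MAX_ROW_LENGTH, which equals Short.MAX_VALUE):

```java
// Hypothetical sketch of the row-key length check described in the patch;
// RowKeyValidator is an illustrative name, not an actual HBase class.
public class RowKeyValidator {
    // Mirrors HConstants.MAX_ROW_LENGTH, which equals Short.MAX_VALUE.
    static final int MAX_ROW_LENGTH = Short.MAX_VALUE;

    // Throws IllegalArgumentException so the client fails fast instead of
    // sending an oversized row key to the server and retrying on errors.
    static byte[] checkRow(byte[] row) {
        if (row != null && row.length > MAX_ROW_LENGTH) {
            throw new IllegalArgumentException("Row key length " + row.length
                + " exceeds maximum " + MAX_ROW_LENGTH);
        }
        return row;
    }
}
```

With a check like this in setStartRow/setStopRow, an oversized key is rejected on the client rather than triggering server-side errors that get retried.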
[jira] [Updated] (HBASE-16540) Scan should do additional validation on start and stop row
[ https://issues.apache.org/jira/browse/HBASE-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Gary Helmling updated HBASE-16540: -- Assignee: Dustin Pho > Scan should do additional validation on start and stop row > -- > > Key: HBASE-16540 > URL: https://issues.apache.org/jira/browse/HBASE-16540 > Project: HBase > Issue Type: Bug > Components: Client >Reporter: Gary Helmling >Assignee: Dustin Pho > > Scan.setStartRow() and setStopRow() should validate the byte[] passed to > ensure it meets the criteria for a row key. If the byte[] length is greater > that Short.MAX_VALUE, we should throw an IllegalArgumentException in order to > fast fail and prevent server-side errors being thrown and retried. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16604) Scanner retries on IOException can cause the scans to miss data
[ https://issues.apache.org/jira/browse/HBASE-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485409#comment-15485409 ] Enis Soztutar commented on HBASE-16604: --- Some more info. KeyValueHeap.generalizedSeek() will leave the heap in a "dirty" state by setting {{current = null}} if it gets an IOException:
{code}
private boolean generalizedSeek(boolean isLazy, Cell seekKey,
    boolean forward, boolean useBloom) throws IOException {
  if (!isLazy && useBloom) {
    throw new IllegalArgumentException("Multi-column Bloom filter "
        + "optimization requires a lazy seek");
  }
  if (current == null) {
    return false;
  }
  heap.add(current);
  current = null;
  ...
{code}
On the next call, this will return false, indicating that there are no more values to return. We can deal with this in a couple of different ways: (1) Handle IOExceptions in the individual KVHeap methods and make sure that state is left consistent even in case of IOExceptions. (2) Handle IOExceptions in HRegionScannerImpl and reset the whole RegionScanner state before returning. (3) Bubble the exception up to the client, but make sure that the scanner is thrown away. The client will start another RegionScanner, possibly scanning again from the start of the row and throwing away partial results. I think doing (1) will be very fragile. (2) also will not work, since there is no way to reliably reset the scanner state in case of an IOException coming from deep down in the FS layer. If we are doing partial results, we may be left in the middle of a row, only partially seeked. Thus I think (2) also won't cut it. (3) is the simplest: it would reset the scanner back to the start of the row and make sure that the ScannerCallable returns. The challenge with (3) is that we want the ScannerCallable not to retry, but we want the ClientScanner to retry. 
ClientScanner only handles a couple of known exceptions which are derivatives of DNRIOE (UnknownScannerException, NotServingRegionException, OutOfOrderScannerNextException, etc). We can introduce another exception type (ResetScannerException), but we have to be careful for BC for existing clients. > Scanner retries on IOException can cause the scans to miss data > > > Key: HBASE-16604 > URL: https://issues.apache.org/jira/browse/HBASE-16604 > Project: HBase > Issue Type: Bug > Components: regionserver, Scanners >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 1.2.4 > > > Debugging an ITBLL failure, where the Verify did not "see" all the data in > the cluster, I've noticed that if we end up getting a generic IOException > from the HFileReader level, we may end up missing the rest of the data in the > region. I was able to manually test this, and this stack trace helps to > understand what is going on: > {code} > 2016-09-09 16:27:15,633 INFO [hconnection-0x71ad3d8a-shared--pool21-t9] > client.ScannerCallable(376): Open scanner=1 for > scan={"loadColumnFamiliesOnDemand":null,"startRow":"","stopRow":"","batch":-1,"cacheBlocks":true,"totalColumns":1,"maxResultSize":2097152,"families":{"testFamily":["testFamily"]},"caching":100,"maxVersions":1,"timeRange":[0,9223372036854775807]} > on region > region=testScanThrowsException,,1473463632707.b2adfb618e5d0fe225c1dc40c0eabfee., > hostname=hw10676,51833,1473463626529, seqNum=2 > 2016-09-09 16:27:15,634 INFO > [B.fifo.QRpcServer.handler=5,queue=0,port=51833] > regionserver.RSRpcServices(2196): scan request:scanner_id: 1 number_of_rows: > 100 close_scanner: false next_call_seq: 0 client_handles_partials: true > client_handles_heartbeats: true renew: false > 2016-09-09 16:27:15,635 INFO > [B.fifo.QRpcServer.handler=5,queue=0,port=51833] > regionserver.RSRpcServices(2510): Rolling back next call seqId > 2016-09-09 16:27:15,635 INFO > [B.fifo.QRpcServer.handler=5,queue=0,port=51833] > 
regionserver.RSRpcServices(2565): Throwing new > ServiceExceptionjava.io.IOException: Could not reseek > StoreFileScanner[HFileScanner for reader > reader=hdfs://localhost:51795/user/enis/test-data/d6fb1c70-93c1-4099-acb7-5723fc05a737/data/default/testScanThrowsException/b2adfb618e5d0fe225c1dc40c0eabfee/testFamily/5a213cc23b714e5e8e1a140ebbe72f2c, > compression=none, cacheConf=blockCache=LruBlockCache{blockCount=0, > currentSize=1567264, freeSize=1525578848, maxSize=1527146112, > heapSize=1567264, minSize=1450788736, minFactor=0.95, multiSize=725394368, > multiFactor=0.5, singleSize=362697184, singleFactor=0.25}, > cacheDataOnRead=true, cacheDataOnWrite=false, cacheIndexesOnWrite=false, > cacheBloomsOnWrite=false, cacheEvictOnClose=false, cacheDataCompressed=false, > prefetchOnOpen=false,
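The failure mode described in this issue — a seek that fails mid-flight and leaves the heap looking exhausted — can be modeled with a minimal sketch. This is illustrative code, not the actual HBase classes:

```java
import java.io.IOException;
import java.util.PriorityQueue;

// Minimal model of the bug: `current` is detached from the heap before
// the fallible seek, so an IOException leaves `current == null` and the
// next call wrongly reports end-of-data.
public class LeakyHeap {
    private final PriorityQueue<String> heap = new PriorityQueue<>();
    private String current;

    public LeakyHeap(String first) { this.current = first; }

    // `fail` simulates an IOException thrown by the FS layer mid-seek.
    public boolean seek(boolean fail) throws IOException {
        if (current == null) {
            return false;          // looks like "no more values to return"
        }
        heap.add(current);
        current = null;            // state mutated before the risky I/O
        if (fail) {
            throw new IOException("simulated FS error"); // heap now "dirty"
        }
        current = heap.poll();
        return current != null;
    }
}
```

A caller that swallows the IOException and seeks again gets false even though the heap still holds data, so the remainder of the region is silently skipped — which is why the discussion favors throwing the whole scanner away rather than retrying in place.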
[jira] [Comment Edited] (HBASE-16604) Scanner retries on IOException can cause the scans to miss data
[ https://issues.apache.org/jira/browse/HBASE-16604?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485409#comment-15485409 ] Enis Soztutar edited comment on HBASE-16604 at 9/12/16 9:55 PM: Some more info. KeyValueHeap.generalizedSeek() will leave the heap in a "dirty" state by setting {{current = null}} if it gets an IOException:
{code}
private boolean generalizedSeek(boolean isLazy, Cell seekKey,
    boolean forward, boolean useBloom) throws IOException {
  if (!isLazy && useBloom) {
    throw new IllegalArgumentException("Multi-column Bloom filter "
        + "optimization requires a lazy seek");
  }
  if (current == null) {
    return false;
  }
  heap.add(current);
  current = null;
  ...
{code}
On the next call, this will return false, indicating that there are no more values to return. We can deal with this in a couple of different ways: (1) Handle IOExceptions in the individual KVHeap methods and make sure that state is left consistent even in case of IOExceptions. (2) Handle IOExceptions in HRegionScannerImpl and reset the whole RegionScanner state before returning. (3) Bubble the exception up to the client, but make sure that the scanner is thrown away. The client will start another RegionScanner, possibly scanning again from the start of the row and throwing away partial results. I think doing (1) will be very fragile. (2) also will not work, since there is no way to reliably reset the scanner state in case of an IOException coming from deep down in the FS layer. If we are doing partial results, we may be left in the middle of a row, only partially seeked. Thus I think (2) also won't cut it. (3) is the simplest and most reliable: it would reset the scanner back to the start of the row and make sure that the ScannerCallable returns. The challenge with (3) is that we want the ScannerCallable not to retry, but we want the ClientScanner to retry. 
ClientScanner only handles a couple of known exceptions which are derivatives of DNRIOE (UnknownScannerException, NotServingRegionException, OutOfOrderScannerNextException, etc). We can introduce another exception type (ResetScannerException), but we have to be careful for BC for existing clients. was (Author: enis): Some more info. KeyValueHeap.generalizedSeek() will leave the heap in "dirty" state by setting the {{current = null}} if it gets an IOException: {code } private boolean generalizedSeek(boolean isLazy, Cell seekKey, boolean forward, boolean useBloom) throws IOException { if (!isLazy && useBloom) { throw new IllegalArgumentException("Multi-column Bloom filter " + "optimization requires a lazy seek"); } if (current == null) { return false; } heap.add(current); current = null; ... {code} On the next call, this will return false, indicating that there are no more values to return. We can deal with this in a couple of different ways: (1) Handle IOExceptions in individual KVHeap methods and make sure that state is left consistent even in case of IOExceptions. (2) Handle IOExceptions in HRegionScannerImpl and reset the whole RegionScanner state before returning (3) Bubble the exception to the client, but make sure that the scanner is thrown away. The client will restart another RegionScanner, and possibly start from scanning from the start of the row throwing away partial results. I think, doing (1) will be very fragile. (2) also will not work, since there should be a way to reset the scanner state reliable in case of an IOException coming deep down from FS layer. If we are doing partial results, we maybe left in the middle of a row, but with partially seek'ed. Thus I think (2) also won't cut. (3) is the simplest, which would reset the scanner back to the start of the row, and makes sure that the ScannerCallable returns. The challenge with (3) is that, we want the ScannerCallable to not retry, but we want the ClientScanner to retry. 
ClientScanner only handles a couple of known exceptions which are derivatives of DNRIOE (UnknownScannerException, NotServingRegionException, OutOfOrderScannerNextException, etc). We can introduce another exception type (ResetScannerException), but we have to be careful for BC for existing clients. > Scanner retries on IOException can cause the scans to miss data > > > Key: HBASE-16604 > URL: https://issues.apache.org/jira/browse/HBASE-16604 > Project: HBase > Issue Type: Bug > Components: regionserver, Scanners >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 2.0.0, 1.3.0, 1.4.0, 1.1.7, 1.2.4 > > > Debugging an ITBLL failure, where the Verify did not "see" all the data in > the cluster,
[jira] [Commented] (HBASE-16592) Unify Delete request with AP
[ https://issues.apache.org/jira/browse/HBASE-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485393#comment-15485393 ] Ted Yu commented on HBASE-16592: Looks good overall. Since AbstractResponse is marked Private, the following is not needed: {code} + @InterfaceAudience.Private + public enum ResponseType { {code} {code} +SINGLE(0), +MULTI (1); {code} nit: please align the ordinal values. {code} +CompareType compareType = null; +if (compareOp != null) { {code} The above is needed after unification ? > Unify Delete request with AP > > > Key: HBASE-16592 > URL: https://issues.apache.org/jira/browse/HBASE-16592 > Project: HBase > Issue Type: Sub-task >Reporter: Heng Chen >Assignee: Heng Chen > Attachments: HBASE-16592.patch, HBASE-16592.v1.patch, > HBASE-16592.v1.patch > > > This is the first step try to unify the HTable with AP only, to extend AP > could process single action, i introduced AbstractResponse, multiResponse > and singleResponse (introduced to deal with single result) will extend this > class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
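For orientation, the response hierarchy under review might look roughly like this. This is a hypothetical sketch using the names from the discussion (AbstractResponse, ResponseType, SingleResponse, MultiResponse), not the actual patch:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the hierarchy this sub-task describes, not the
// actual patch. Since the base class itself would be audience-private,
// the nested enum needs no annotation of its own, per the review comment.
abstract class AbstractResponse {
    enum ResponseType { SINGLE, MULTI }
    abstract ResponseType type();
}

// Carries the one result of a single action (e.g. a Delete).
class SingleResponse extends AbstractResponse {
    final Object result;
    SingleResponse(Object result) { this.result = result; }
    @Override ResponseType type() { return ResponseType.SINGLE; }
}

// Aggregates the per-action results of a batched multi request.
class MultiResponse extends AbstractResponse {
    final List<Object> results = new ArrayList<>();
    @Override ResponseType type() { return ResponseType.MULTI; }
}
```

With a common base type, the AsyncProcess can return either shape from the same code path and callers branch on type() (or simply on the subclass) as needed.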
[jira] [Updated] (HBASE-14123) HBase Backup/Restore Phase 2
[ https://issues.apache.org/jira/browse/HBASE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14123: --- Attachment: 14123-master.v19.txt Patch v19 is up to commit 44812cf1ed6255649bd0d67b1cfe46940f11fc1a > HBase Backup/Restore Phase 2 > > > Key: HBASE-14123 > URL: https://issues.apache.org/jira/browse/HBASE-14123 > Project: HBase > Issue Type: Umbrella >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Attachments: 14123-master.v14.txt, 14123-master.v15.txt, > 14123-master.v16.txt, 14123-master.v17.txt, 14123-master.v18.txt, > 14123-master.v19.txt, 14123-master.v2.txt, 14123-master.v3.txt, > 14123-master.v5.txt, 14123-master.v6.txt, 14123-master.v7.txt, > 14123-master.v8.txt, 14123-master.v9.txt, 14123-v14.txt, > HBASE-14123-for-7912-v1.patch, HBASE-14123-for-7912-v6.patch, > HBASE-14123-v1.patch, HBASE-14123-v10.patch, HBASE-14123-v11.patch, > HBASE-14123-v12.patch, HBASE-14123-v13.patch, HBASE-14123-v15.patch, > HBASE-14123-v16.patch, HBASE-14123-v2.patch, HBASE-14123-v3.patch, > HBASE-14123-v4.patch, HBASE-14123-v5.patch, HBASE-14123-v6.patch, > HBASE-14123-v7.patch, HBASE-14123-v9.patch > > > Phase 2 umbrella JIRA. See HBASE-7912 for design document and description. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16373) precommit needs a dockerfile with hbase prereqs
[ https://issues.apache.org/jira/browse/HBASE-16373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485341#comment-15485341 ] Sean Busbey commented on HBASE-16373: - looks like Yetus doesn't consider a patch that impacts the specified Dockerfile as cause for re-exec, which sounds like a Yetus bug. patch looks good. +1. We'll need a different image for 0.98, because it needs jdk6 and needs to not have jdk8. Might also need an image w/o jdk8 on branch-1.1. > precommit needs a dockerfile with hbase prereqs > --- > > Key: HBASE-16373 > URL: https://issues.apache.org/jira/browse/HBASE-16373 > Project: HBase > Issue Type: Bug > Components: build >Affects Versions: 1.3.0, 1.4.0, 1.1.6, 1.2.3, 0.98.22 >Reporter: Sean Busbey >Assignee: Duo Zhang >Priority: Critical > Fix For: 1.3.0, 1.4.0, 1.1.7, 0.98.23, 1.2.4 > > Attachments: HBASE-16373-branch-1.patch > > > specifically, we need protoc. starting with the dockerfile used by default in > yetus and adding it will probably suffice. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16592) Unify Delete request with AP
[ https://issues.apache.org/jira/browse/HBASE-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485322#comment-15485322 ] Heng Chen commented on HBASE-16592: --- The failed test cases are unrelated to the patch; all of them pass locally. [~tedyu], any chance of a +1 here? It is a blocker for moving on. > Unify Delete request with AP > > > Key: HBASE-16592 > URL: https://issues.apache.org/jira/browse/HBASE-16592 > Project: HBase > Issue Type: Sub-task >Reporter: Heng Chen >Assignee: Heng Chen > Attachments: HBASE-16592.patch, HBASE-16592.v1.patch, > HBASE-16592.v1.patch > > > This is the first step try to unify the HTable with AP only, to extend AP > could process single action, i introduced AbstractResponse, multiResponse > and singleResponse (introduced to deal with single result) will extend this > class. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485310#comment-15485310 ] Tomu Tsuruhara commented on HBASE-16616: Ted: Sure, I will > Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry > -- > > Key: HBASE-16616 > URL: https://issues.apache.org/jira/browse/HBASE-16616 > Project: HBase > Issue Type: Improvement > Components: Performance >Affects Versions: 1.2.2 >Reporter: Tomu Tsuruhara > Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, > HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png > > > In our HBase 1.2.2 cluster, some regionserver showed too bad > "QueueCallTime_99th_percentile" exceeding 10 seconds. > Most rpc handler threads stuck on ThreadLocalMap.expungeStaleEntry call at > that time. > {noformat} > "PriorityRpcServer.handler=18,queue=0,port=16020" #322 daemon prio=5 > os_prio=0 tid=0x7fd422062800 nid=0x19b89 runnable [0x7fcb8a821000] >java.lang.Thread.State: RUNNABLE > at > java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617) > at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499) > at > java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298) > at java.lang.ThreadLocal.remove(ThreadLocal.java:222) > at > java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426) > at > java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341) > at > java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.unlockForRegularUsage(ExponentiallyDecayingSample.java:196) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:113) > at > com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81) > at > 
org.apache.hadoop.metrics2.lib.MutableHistogram.add(MutableHistogram.java:81) > at > org.apache.hadoop.metrics2.lib.MutableRangeHistogram.add(MutableRangeHistogram.java:59) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.dequeuedCall(MetricsHBaseServerSourceImpl.java:194) > at > org.apache.hadoop.hbase.ipc.MetricsHBaseServer.dequeuedCall(MetricsHBaseServer.java:76) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2192) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112) > at > org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108) > at java.lang.Thread.run(Thread.java:745) > {noformat} > We were using jdk 1.8.0_92 and here is a snippet from ThreadLocal.java. > {code} > 616:while (tab[h] != null) > 617:h = nextIndex(h, len); > {code} > So I hypothesized that there're too many consecutive entries in {{tab}} array > and actually I found them in the heapdump. > !ScreenShot 2016-09-09 14.17.53.png|width=50%! > Most of these entries pointed at instance of > {{org.apache.hadoop.hbase.util.Counter$1}} > which is equivarent to {{indexHolderThreadLocal}} instance-variable in the > {{Counter}} class. > Because {{RpcServer$Connection}} class creates a {{Counter}} instance > {{rpcCount}} for every connections, > it is possible to have lots of {{Counter#indexHolderThreadLocal}} instances > in RegionServer process > when we repeat connect-and-close from client. As a result, a ThreadLocalMap > can have lots of consecutive > entires. > Usually, since each entry is a {{WeakReference}}, these entries are collected > and removed > by garbage-collector soon after connection closed. > But if connection's life-time was long enough to survive youngGC, it wouldn't > be collected until old-gen collector runs. 
> Furthermore, under G1GC deployment, it is possible not to be collected even > by old-gen GC(mixed GC) > if entries sit in a region which doesn't have much garbages. > Actually we used G1GC when we encountered this problem. > We should remove the entry from ThreadLocalMap by calling ThreadLocal#remove > explicitly. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
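The fix direction stated above — calling ThreadLocal#remove explicitly when the connection closes — can be sketched as follows. This is illustrative code; the real per-connection counter is org.apache.hadoop.hbase.util.Counter, which is more elaborate:

```java
// Illustrative per-connection counter; the real HBase Counter is more
// elaborate. The point is the explicit remove(): without it, the stale
// ThreadLocalMap entry lingers in every handler thread until GC clears
// the WeakReference key, which G1 may defer, causing the long
// expungeStaleEntry probes seen in the thread dump.
public class ConnectionCounter {
    private final ThreadLocal<long[]> slot =
        ThreadLocal.withInitial(() -> new long[1]);

    public void increment() { slot.get()[0]++; }

    public long get() { return slot.get()[0]; }

    // Call when the connection closes so the current thread's entry is
    // removed eagerly instead of waiting to be expunged during a later
    // linear probe of the ThreadLocalMap.
    public void close() { slot.remove(); }
}
```

After close(), a subsequent get() on the same thread re-initializes a fresh slot, so the counter reads zero rather than a stale value.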
[jira] [Commented] (HBASE-16618) Procedure v2 - Add base class for table and ns procedures
[ https://issues.apache.org/jira/browse/HBASE-16618?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485286#comment-15485286 ] Hadoop QA commented on HBASE-16618: --- | (x) *{color:red}-1 overall{color}* | \\ \\ || Vote || Subsystem || Runtime || Comment || | {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 18s {color} | {color:blue} Docker mode activated. {color} | | {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} | | {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 3m 11s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 39s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 46s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 1m 51s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 30s {color} | {color:green} master passed {color} | | {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 37s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 37s 
{color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 47s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} The patch has no whitespace issues. {color} | | {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 26m 35s {color} | {color:green} Patch does not cause any errors with Hadoop 2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.1 2.6.2 2.6.3 2.7.1. {color} | | {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green} 0m 11s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s {color} | {color:green} the patch passed {color} | | {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 0m 28s {color} | {color:green} the patch passed {color} | | {color:red}-1{color} | {color:red} unit {color} | {color:red} 84m 9s {color} | {color:red} hbase-server in the patch failed. {color} | | {color:green}+1{color} | {color:green} asflicense {color} | {color:green} 0m 13s {color} | {color:green} The patch does not generate ASF License warnings. 
{color} | | {color:black}{color} | {color:black} {color} | {color:black} 124m 0s {color} | {color:black} {color} | \\ \\ || Reason || Tests || | Timed out junit tests | org.apache.hadoop.hbase.client.TestReplicasClient | | | org.apache.hadoop.hbase.client.TestFromClientSide3 | | | org.apache.hadoop.hbase.client.TestTableSnapshotScanner | \\ \\ || Subsystem || Report/Notes || | Docker | Client=1.11.2 Server=1.11.2 Image:yetus/hbase:7bda515 | | JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12828087/HBASE-16618-v0.patch | | JIRA Issue | HBASE-16618 | | Optional Tests | asflicense javac javadoc unit findbugs hadoopcheck hbaseanti checkstyle compile | | uname | Linux f9f3f5192cf7 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux | | Build tool | maven | | Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build@2/component/dev-support/hbase-personality.sh | | git revision | master / 8290b2c | | Default Java | 1.8.0_101 | | findbugs | v3.0.0 | | unit | https://builds.apache.org/job/PreCommit-HBASE-Build/3514/artifact/patchprocess/patch-unit-hbase-server.txt | | unit test logs | https://builds.apache.org/job/PreCommit-HBASE-Build/3514/artifact/patchprocess/patch-unit-hbase-server.txt | | Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/3514/testReport/ | | modules | C: hbase-server U: hbase-server | | Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/3514/console | | Powered by | Apache Yetus
[jira] [Commented] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485282#comment-15485282 ] Ted Yu commented on HBASE-16616: Tomu: Can you attach a new patch for master (see Duo's comment above)?
> Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
> --
>
> Key: HBASE-16616
> URL: https://issues.apache.org/jira/browse/HBASE-16616
> Project: HBase
> Issue Type: Improvement
> Components: Performance
> Affects Versions: 1.2.2
> Reporter: Tomu Tsuruhara
> Attachments: 16616.branch-1.v2.txt, HBASE-16616.master.001.patch, HBASE-16616.master.002.patch, ScreenShot 2016-09-09 14.17.53.png
>
> In our HBase 1.2.2 cluster, some regionservers showed a very poor "QueueCallTime_99th_percentile", exceeding 10 seconds.
> Most RPC handler threads were stuck in the ThreadLocalMap.expungeStaleEntry call at that time.
> {noformat}
> "PriorityRpcServer.handler=18,queue=0,port=16020" #322 daemon prio=5 os_prio=0 tid=0x7fd422062800 nid=0x19b89 runnable [0x7fcb8a821000]
>    java.lang.Thread.State: RUNNABLE
> at java.lang.ThreadLocal$ThreadLocalMap.expungeStaleEntry(ThreadLocal.java:617)
> at java.lang.ThreadLocal$ThreadLocalMap.remove(ThreadLocal.java:499)
> at java.lang.ThreadLocal$ThreadLocalMap.access$200(ThreadLocal.java:298)
> at java.lang.ThreadLocal.remove(ThreadLocal.java:222)
> at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:426)
> at java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341)
> at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881)
> at com.yammer.metrics.stats.ExponentiallyDecayingSample.unlockForRegularUsage(ExponentiallyDecayingSample.java:196)
> at com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:113)
> at com.yammer.metrics.stats.ExponentiallyDecayingSample.update(ExponentiallyDecayingSample.java:81)
> at org.apache.hadoop.metrics2.lib.MutableHistogram.add(MutableHistogram.java:81)
> at org.apache.hadoop.metrics2.lib.MutableRangeHistogram.add(MutableRangeHistogram.java:59)
> at org.apache.hadoop.hbase.ipc.MetricsHBaseServerSourceImpl.dequeuedCall(MetricsHBaseServerSourceImpl.java:194)
> at org.apache.hadoop.hbase.ipc.MetricsHBaseServer.dequeuedCall(MetricsHBaseServer.java:76)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2192)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:112)
> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:133)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:108)
> at java.lang.Thread.run(Thread.java:745)
> {noformat}
> We were using JDK 1.8.0_92, and here is a snippet from ThreadLocal.java.
> {code}
> 616:        while (tab[h] != null)
> 617:            h = nextIndex(h, len);
> {code}
> So I hypothesized that there were too many consecutive entries in the {{tab}} array, and indeed I found them in the heap dump.
> !ScreenShot 2016-09-09 14.17.53.png|width=50%!
> Most of these entries pointed at instances of {{org.apache.hadoop.hbase.util.Counter$1}}, which corresponds to the {{indexHolderThreadLocal}} instance variable in the {{Counter}} class.
> Because the {{RpcServer$Connection}} class creates a {{Counter}} instance {{rpcCount}} for each connection, it is possible to accumulate many {{Counter#indexHolderThreadLocal}} instances in the RegionServer process when clients repeatedly connect and close. As a result, a ThreadLocalMap can contain long runs of consecutive entries.
> Usually, since each entry is a {{WeakReference}}, these entries are collected and removed by the garbage collector soon after the connection closes.
> But if a connection lives long enough to survive a young GC, its entry won't be collected until the old-generation collector runs.
> Furthermore, under G1GC it is possible for the entry not to be collected even by old-generation (mixed) GC, if it sits in a region that doesn't contain much garbage.
> We were in fact using G1GC when we encountered this problem.
> We should remove the entry from the ThreadLocalMap by calling ThreadLocal#remove explicitly.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
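The fix the reporter proposes, calling ThreadLocal#remove explicitly, can be sketched as follows. This is an illustrative, simplified stand-in for HBase's org.apache.hadoop.hbase.util.Counter, not the actual patch; the class and method names (CounterSketch, destroy) are assumptions for the example.

```java
// Illustrative sketch (not HBase's actual Counter): a per-connection counter
// holding a ThreadLocal index holder, mirroring Counter#indexHolderThreadLocal.
// The key point is destroy(): it calls ThreadLocal#remove so the entry is
// dropped from the current thread's ThreadLocalMap eagerly, instead of
// lingering as a stale WeakReference that expungeStaleEntry must walk past
// until the GC finally clears it.
public class CounterSketch {
    private final ThreadLocal<int[]> indexHolder =
            ThreadLocal.withInitial(() -> new int[1]);

    public void increment() {
        indexHolder.get()[0]++;
    }

    public int value() {
        return indexHolder.get()[0];
    }

    /** Called when the owning connection closes. */
    public void destroy() {
        indexHolder.remove();  // eagerly drop the ThreadLocalMap entry
    }

    public static void main(String[] args) {
        CounterSketch rpcCount = new CounterSketch();
        rpcCount.increment();
        rpcCount.increment();
        System.out.println(rpcCount.value()); // prints 2
        rpcCount.destroy(); // no stale entry left behind for this thread
    }
}
```

After destroy(), a subsequent get() on the same thread would simply re-run the initializer; the benefit is that short-lived connections no longer leave long runs of stale entries for ThreadLocalMap's linear probing to traverse.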
[jira] [Updated] (HBASE-16616) Rpc handlers stuck on ThreadLocalMap.expungeStaleEntry
[ https://issues.apache.org/jira/browse/HBASE-16616?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-16616: --- Attachment: 16616.branch-1.v2.txt Same patch as v2 for master.
[jira] [Commented] (HBASE-16578) Mob data loss after mob compaction and normal compaction
[ https://issues.apache.org/jira/browse/HBASE-16578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485253#comment-15485253 ] huaxiang sun commented on HBASE-16578: -- Thanks [~jingcheng...@intel.com]! I am still digesting the info and will probably come back tomorrow. I have another test case to verify, since the first case does not reliably reproduce the failure.
> Mob data loss after mob compaction and normal compaction
> --
>
> Key: HBASE-16578
> URL: https://issues.apache.org/jira/browse/HBASE-16578
> Project: HBase
> Issue Type: Bug
> Components: mob
> Affects Versions: 2.0.0
> Reporter: huaxiang sun
> Assignee: huaxiang sun
> Attachments: TestMobCompaction.java, TestMobCompaction.java
>
> I have a unit test case that can expose the mob data loss issue. The root cause is the following line:
> https://github.com/apache/hbase/blob/master/hbase-server/src/main/java/org/apache/hadoop/hbase/mob/compactions/PartitionedMobCompactor.java#L625
> It makes the old mob reference cell win during compaction.
[jira] [Commented] (HBASE-14123) HBase Backup/Restore Phase 2
[ https://issues.apache.org/jira/browse/HBASE-14123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15485223#comment-15485223 ] Ted Yu commented on HBASE-14123: [~stack]: Here is the Review Board entry for mega patch v18: https://reviews.apache.org/r/51823/
> HBase Backup/Restore Phase 2
>
> Key: HBASE-14123
> URL: https://issues.apache.org/jira/browse/HBASE-14123
> Project: HBase
> Issue Type: Umbrella
> Reporter: Vladimir Rodionov
> Assignee: Vladimir Rodionov
> Attachments: 14123-master.v14.txt, 14123-master.v15.txt, 14123-master.v16.txt, 14123-master.v17.txt, 14123-master.v18.txt, 14123-master.v2.txt, 14123-master.v3.txt, 14123-master.v5.txt, 14123-master.v6.txt, 14123-master.v7.txt, 14123-master.v8.txt, 14123-master.v9.txt, 14123-v14.txt, HBASE-14123-for-7912-v1.patch, HBASE-14123-for-7912-v6.patch, HBASE-14123-v1.patch, HBASE-14123-v10.patch, HBASE-14123-v11.patch, HBASE-14123-v12.patch, HBASE-14123-v13.patch, HBASE-14123-v15.patch, HBASE-14123-v16.patch, HBASE-14123-v2.patch, HBASE-14123-v3.patch, HBASE-14123-v4.patch, HBASE-14123-v5.patch, HBASE-14123-v6.patch, HBASE-14123-v7.patch, HBASE-14123-v9.patch
>
> Phase 2 umbrella JIRA. See HBASE-7912 for the design document and description.
[jira] [Updated] (HBASE-16592) Unify Delete request with AP
[ https://issues.apache.org/jira/browse/HBASE-16592?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heng Chen updated HBASE-16592: -- Attachment: HBASE-16592.v1.patch
> Unify Delete request with AP
>
> Key: HBASE-16592
> URL: https://issues.apache.org/jira/browse/HBASE-16592
> Project: HBase
> Issue Type: Sub-task
> Reporter: Heng Chen
> Assignee: Heng Chen
> Attachments: HBASE-16592.patch, HBASE-16592.v1.patch, HBASE-16592.v1.patch
>
> This is the first step in unifying HTable on AP only. To extend AP so that it can process single actions, I introduced AbstractResponse; MultiResponse and SingleResponse (introduced to deal with a single result) will extend this class.
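The class split described in the issue can be sketched roughly as follows. Only the names AbstractResponse, MultiResponse, and SingleResponse come from the issue text; every field and method here is an assumed illustration of the design, not the patch's actual code.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the response hierarchy so AsyncProcess (AP) can handle both
// batch and single actions through one code path. Members are illustrative.
abstract class AbstractResponse {
    enum ResponseType { SINGLE, MULTI }
    abstract ResponseType type();
}

// Wraps the result of a single action (e.g. a lone Delete).
class SingleResponse extends AbstractResponse {
    private final Object result;
    SingleResponse(Object result) { this.result = result; }
    Object getResult() { return result; }
    @Override ResponseType type() { return ResponseType.SINGLE; }
}

// Wraps the results of a multi (batch) request.
class MultiResponse extends AbstractResponse {
    private final List<Object> results = new ArrayList<>();
    void add(Object r) { results.add(r); }
    List<Object> getResults() { return results; }
    @Override ResponseType type() { return ResponseType.MULTI; }
}
```

With a shared base type, AP can dispatch on type() (or instanceof) after an RPC completes, letting HTable route a single Delete through the same AsyncProcess machinery it already uses for batches.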
[jira] [Updated] (HBASE-16611) Flakey org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet
[ https://issues.apache.org/jira/browse/HBASE-16611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Heng Chen updated HBASE-16611: -- Attachment: HBASE-16611.v1.patch
> Flakey org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet
> --
>
> Key: HBASE-16611
> URL: https://issues.apache.org/jira/browse/HBASE-16611
> Project: HBase
> Issue Type: Bug
> Reporter: Heng Chen
> Assignee: Heng Chen
> Attachments: HBASE-16611.patch, HBASE-16611.v1.patch, HBASE-16611.v1.patch
>
> See https://builds.apache.org/job/PreCommit-HBASE-Build/3494/artifact/patchprocess/patch-unit-hbase-server.txt
> {code}
> testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time elapsed: 4.026 sec <<< FAILURE!
> java.lang.AssertionError: null
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:579)
> Tests run: 26, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 94.401 sec - in org.apache.hadoop.hbase.client.TestAdmin2
> Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.861 sec - in org.apache.hadoop.hbase.client.TestClientScannerRPCTimeout
> Running org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClientWithRegionReplicas
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 261.925 sec <<< FAILURE! - in org.apache.hadoop.hbase.client.TestReplicasClient
> testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time elapsed: 4.522 sec <<< FAILURE!
> java.lang.AssertionError: null
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:581)
> Running org.apache.hadoop.hbase.client.TestFastFail
> Tests run: 2, Failures: 0, Errors: 0, Skipped: 1, Time elapsed: 3.648 sec - in org.apache.hadoop.hbase.client.TestFastFail
> Tests run: 1, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 277.894 sec <<< FAILURE! - in org.apache.hadoop.hbase.client.TestReplicasClient
> testCancelOfMultiGet(org.apache.hadoop.hbase.client.TestReplicasClient) Time elapsed: 5.359 sec <<< FAILURE!
> java.lang.AssertionError: null
> at org.junit.Assert.fail(Assert.java:86)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertTrue(Assert.java:52)
> at org.apache.hadoop.hbase.client.TestReplicasClient.testCancelOfMultiGet(TestReplicasClient.java:579)
> {code}