[jira] [Commented] (HBASE-13770) Programmatic JAAS configuration option for secure zookeeper may be broken
[ https://issues.apache.org/jira/browse/HBASE-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944603#comment-14944603 ]

Maddineni Sukumar commented on HBASE-13770:
---

Thanks [~apurtell] for reviewing and pushing this.

> Programmatic JAAS configuration option for secure zookeeper may be broken
> -
>
> Key: HBASE-13770
> URL: https://issues.apache.org/jira/browse/HBASE-13770
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.13, 1.2.0
> Reporter: Andrew Purtell
> Assignee: Maddineni Sukumar
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: HBASE-13770-0.98.patch, HBASE-13770-0.98.patch,
> HBASE-13770-v1.patch, HBASE-13770-v2-0.98.patch, HBASE-13770-v2.patch,
> HBASE-13770-v3-0.98.patch, HBASE-13770-v4-0.98.patch,
> HBASE-13770-v4-master.patch
>
>
> While verifying the patch fix for HBASE-13768 we were unable to successfully
> test the programmatic JAAS configuration option for secure ZooKeeper
> integration. Unclear if that was due to a bug or incorrect test configuration.
> Update the security section of the online book with clear instructions for
> setting up the programmatic JAAS configuration option for secure ZooKeeper
> integration.
> Verify it works.
> Fix as necessary.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944604#comment-14944604 ]

Mikhail Antonov commented on HBASE-14559:
-

I think there are several things here (or maybe I'm missing something).

Conceptually, it seems logical to me that on a large cluster, overall responsiveness to admin commands would be improved by serving them from a separate threadpool (is that actually right? If not, can we revert it completely?). There were some deadlock-type bugs (the two referenced by [~eclark]), but those should be fixed? In theory, having yet another threadpool to separate admin commands from things like meta lookups might help prevent them, but that looks like overkill.

For the tests, this behavior shows up as a corner case: in the minicluster everything runs as admin, so the admin threadpool is overloaded.

On current master, I see in HConstants:

bq. public static final int DEFAULT_REGION_SERVER_HIGH_PRIORITY_HANDLER_COUNT = 20;

as was set in HBASE-13351.

[~stack] so in these two one-liners, you're dropping the number of high-priority handlers (to tighten up thread usage?), right?

> branch-1 test tweeks; disable assert explicit region lands post-restart and
> up a few handlers
> -
>
> Key: HBASE-14559
> URL: https://issues.apache.org/jira/browse/HBASE-14559
> Project: HBase
> Issue Type: Sub-task
> Components: test
> Reporter: stack
> Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14559.branch-1.txt, 14559.master.txt
>
>
> Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2).
> Small tweaks get tests to pass. Small one-liners that up the priority handler
> count and disable an assert that seems wrong -- that we'll always get an explicit
> region to land on a newly started server.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (HBASE-13773) Replication should not use ZooKeeper at all for coordination
[ https://issues.apache.org/jira/browse/HBASE-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Maddineni Sukumar reassigned HBASE-13773:
-

Assignee: Maddineni Sukumar

> Replication should not use ZooKeeper at all for coordination
>
>
> Key: HBASE-13773
> URL: https://issues.apache.org/jira/browse/HBASE-13773
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.12, 1.2.0
> Reporter: Andrew Purtell
> Assignee: Maddineni Sukumar
> Priority: Critical
>
> Introduce a new system table for replication state and use this table for
> coordination instead of znodes.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944668#comment-14944668 ] Hudson commented on HBASE-14559: FAILURE: Integrated in HBase-1.2 #229 (See [https://builds.apache.org/job/HBase-1.2/229/]) HBASE-14559 branch-1 test tweeks; disable assert explicit region lands (stack: rev e568cda7f05880577b27a8e693f8788cae372596) * hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestNamespaceCommands.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestEnableTableHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithACL.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java > branch-1 test tweeks; disable assert explicit region lands post-restart and > up a few handlers > - > > Key: HBASE-14559 > URL: https://issues.apache.org/jira/browse/HBASE-14559 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14559.branch-1.txt, 14559.master.txt > > > Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). > Small tweaks get tests to pass. Small one liners that up priority handler > count and disable assert that seems wrong -- that we'll always get an explcit > region to land on a newly started server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14432) Procedure V2 - enforce ACL on procedure admin tasks
[ https://issues.apache.org/jira/browse/HBASE-14432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944679#comment-14944679 ] Hadoop QA commented on HBASE-14432: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12765127/HBASE-14432.v1-branch-1.patch against branch-1 branch at commit ed4c734b15c83b7f6b8ec1d170abccae9de1b193. ATTACHMENT ID: 12765127 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 13 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:red}-1 core tests{color}. The patch failed these unit tests: org.apache.hadoop.hbase.master.handler.TestEnableTableHandler {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15881//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15881//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15881//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15881//console This message is automatically generated. > Procedure V2 - enforce ACL on procedure admin tasks > --- > > Key: HBASE-14432 > URL: https://issues.apache.org/jira/browse/HBASE-14432 > Project: HBase > Issue Type: Sub-task > Components: proc-v2 >Affects Versions: 2.0.0, 1.3.0 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Labels: security > Attachments: HBASE-14432.v1-branch-1.patch, > HBASE-14432.v1-master.patch > > > In the Procedure class, the owner field is never set. We need to set it so > that we can enforce ACLs on admin tasks such as whether a user has privilege > to abort a procedure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944702#comment-14944702 ] Hudson commented on HBASE-14559: SUCCESS: Integrated in HBase-1.2-IT #192 (See [https://builds.apache.org/job/HBase-1.2-IT/192/]) HBASE-14559 branch-1 test tweeks; disable assert explicit region lands (stack: rev e568cda7f05880577b27a8e693f8788cae372596) * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java * hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestNamespaceCommands.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestEnableTableHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithACL.java > branch-1 test tweeks; disable assert explicit region lands post-restart and > up a few handlers > - > > Key: HBASE-14559 > URL: https://issues.apache.org/jira/browse/HBASE-14559 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14559.branch-1.txt, 14559.master.txt > > > Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). > Small tweaks get tests to pass. Small one liners that up priority handler > count and disable assert that seems wrong -- that we'll always get an explcit > region to land on a newly started server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14558) Document ChaosMonkey enhancements from HBASE-14261
[ https://issues.apache.org/jira/browse/HBASE-14558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944706#comment-14944706 ] Hadoop QA commented on HBASE-14558: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12765131/HBASE-14558.patch against master branch at commit ed4c734b15c83b7f6b8ec1d170abccae9de1b193. ATTACHMENT ID: 12765131 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 1 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:red}-1 lineLengths{color}. The patch introduces the following lines longer than 100: +HBase 0.96 introduced a tool named `ChaosMonkey`, modeled after link:http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html +policy, which is configured with all the available actions. It chose to run `RestartActiveMaster` and `RestartRandomRs` actions. +$ bin/hbase org.apache.hadoop.hbase.IntegrationTestIngest -m slowDeterministic -monkeyProps monkey.properties {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. 
{color:green}+1 core tests{color}. The patch passed unit tests in . Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15882//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15882//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15882//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15882//console This message is automatically generated. > Document ChaosMonkey enhancements from HBASE-14261 > -- > > Key: HBASE-14558 > URL: https://issues.apache.org/jira/browse/HBASE-14558 > Project: HBase > Issue Type: Sub-task > Components: documentation >Affects Versions: 2.0.0 >Reporter: Misty Stanley-Jones >Assignee: Misty Stanley-Jones > Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3 > > Attachments: HBASE-14558.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944733#comment-14944733 ] Hudson commented on HBASE-14559: SUCCESS: Integrated in HBase-1.3-IT #212 (See [https://builds.apache.org/job/HBase-1.3-IT/212/]) HBASE-14559 branch-1 test tweeks; disable assert explicit region lands (stack: rev 80961187aa7053d886c88be56311b88a4e02d28f) * hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestEnableTableHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithACL.java * hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestNamespaceCommands.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java > branch-1 test tweeks; disable assert explicit region lands post-restart and > up a few handlers > - > > Key: HBASE-14559 > URL: https://issues.apache.org/jira/browse/HBASE-14559 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14559.branch-1.txt, 14559.master.txt > > > Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). > Small tweaks get tests to pass. Small one liners that up priority handler > count and disable assert that seems wrong -- that we'll always get an explcit > region to land on a newly started server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944737#comment-14944737 ] Hudson commented on HBASE-14559: FAILURE: Integrated in HBase-1.3 #237 (See [https://builds.apache.org/job/HBase-1.3/237/]) HBASE-14559 branch-1 test tweeks; disable assert explicit region lands (stack: rev 80961187aa7053d886c88be56311b88a4e02d28f) * hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestEnableTableHandler.java * hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithACL.java * hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestNamespaceCommands.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java > branch-1 test tweeks; disable assert explicit region lands post-restart and > up a few handlers > - > > Key: HBASE-14559 > URL: https://issues.apache.org/jira/browse/HBASE-14559 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14559.branch-1.txt, 14559.master.txt > > > Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). > Small tweaks get tests to pass. Small one liners that up priority handler > count and disable assert that seems wrong -- that we'll always get an explcit > region to land on a newly started server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14525) Append and increment operation throws NullPointerException on non-existing column families.
[ https://issues.apache.org/jira/browse/HBASE-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944852#comment-14944852 ]

Anoop Sam John commented on HBASE-14525:

Patch LGTM

> Append and increment operation throws NullPointerException on non-existing
> column families.
> ---
>
> Key: HBASE-14525
> URL: https://issues.apache.org/jira/browse/HBASE-14525
> Project: HBase
> Issue Type: Bug
> Components: shell
> Affects Versions: 2.0.0
> Reporter: Abhishek Kumar
> Assignee: Abhishek Kumar
> Priority: Minor
> Attachments: HBASE-14525-V1.patch, HBASE-14525.patch
>
>
> When performing an append operation on a non-existing column family, a
> NullPointerException is thrown in the hbase shell, as shown below:
> {noformat}
> hbase(main):007:0> append 't1', 'r1', 'none:c1', '123'
> ERROR: java.io.IOException
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:106)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.regionserver.HRegion.doGet(HRegion.java:6987)
>     at org.apache.hadoop.hbase.regionserver.HRegion.append(HRegion.java:7048)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.append(RSRpcServices.java:580)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2206)
>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32452)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2133)
>     ... 4 more
> {noformat}
> This seems to be caused by the absence of a check for valid family names, as is
> done in other operations like 'Put' in HRegion.java

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
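The issue description above attributes the NullPointerException to a missing family-name check before the append touches per-family state. A minimal sketch of that kind of fail-fast validation follows. It is not the HBASE-14525 patch and does not use HBase classes: `FamilyCheckSketch`, `checkFamily`, and the string-based "append" are hypothetical names chosen for illustration, and `IllegalArgumentException` merely stands in for whatever exception the real code raises.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical stand-in for a table/region; NOT HBase's HRegion.
class FamilyCheckSketch {
    private final Set<String> families;

    FamilyCheckSketch(String... families) {
        this.families = new TreeSet<>(Arrays.asList(families));
    }

    // Reject unknown families up front with a descriptive message,
    // instead of letting a later lookup return null and blow up.
    void checkFamily(String family) {
        if (!families.contains(family)) {
            throw new IllegalArgumentException(
                "Column family " + family + " does not exist; known families: " + families);
        }
    }

    // Validate first, then do the (stubbed) append work.
    String append(String column, String value) {
        int sep = column.indexOf(':');
        String family = sep < 0 ? column : column.substring(0, sep);
        checkFamily(family);
        return family + " <- " + value; // placeholder for the real append
    }
}
```

With a table that only has family `f1`, `append("none:c1", "123")` in this sketch fails with a message naming the missing family rather than with a bare NullPointerException.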
[jira] [Updated] (HBASE-14525) Append and increment operation throws NullPointerException on non-existing column families.
[ https://issues.apache.org/jira/browse/HBASE-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Anoop Sam John updated HBASE-14525:
---
Hadoop Flags: Reviewed
Status: Patch Available (was: Open)

> Append and increment operation throws NullPointerException on non-existing
> column families.
> ---
>
> Key: HBASE-14525
> URL: https://issues.apache.org/jira/browse/HBASE-14525
> Project: HBase
> Issue Type: Bug
> Components: shell
> Affects Versions: 2.0.0
> Reporter: Abhishek Kumar
> Assignee: Abhishek Kumar
> Priority: Minor
> Attachments: HBASE-14525-V1.patch, HBASE-14525.patch
>
>
> When performing an append operation on a non-existing column family, a
> NullPointerException is thrown in the hbase shell, as shown below:
> {noformat}
> hbase(main):007:0> append 't1', 'r1', 'none:c1', '123'
> ERROR: java.io.IOException
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
>     at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:106)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
>     at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
>     at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
>     at org.apache.hadoop.hbase.regionserver.HRegion.doGet(HRegion.java:6987)
>     at org.apache.hadoop.hbase.regionserver.HRegion.append(HRegion.java:7048)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.append(RSRpcServices.java:580)
>     at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2206)
>     at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32452)
>     at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2133)
>     ... 4 more
> {noformat}
> This seems to be caused by the absence of a check for valid family names, as is
> done in other operations like 'Put' in HRegion.java

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12790) Support fairness across parallelized scans
[ https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944894#comment-14944894 ]

ramkrishna.s.vasudevan commented on HBASE-12790:

bq. Did you mean: instead of waiting 20 seconds for one count query now we will see several point queries completing during that interval?
Yes [~apurtell]. That is right.

bq. Should see clear improvement when the count query is running with the patch applied. (smile) The count query still takes the same amount of time; it is the smaller queries stuck behind the bigger ones that benefit. I think that is a valid case: without the patch, I can see the point queries lagging because the queues are filled with the parallel scans launched by the bigger count query. Let me see how to present these results.

> Support fairness across parallelized scans
> --
>
> Key: HBASE-12790
> URL: https://issues.apache.org/jira/browse/HBASE-12790
> Project: HBase
> Issue Type: New Feature
> Reporter: James Taylor
> Assignee: ramkrishna.s.vasudevan
> Labels: Phoenix
> Attachments: AbstractRoundRobinQueue.java, HBASE-12790.patch,
> HBASE-12790_1.patch, HBASE-12790_5.patch, HBASE-12790_callwrapper.patch,
> HBASE-12790_trunk_1.patch
>
>
> Some HBase clients parallelize the execution of a scan to reduce latency in
> getting back results. This can lead to starvation with a loaded cluster and
> interleaved scans, since the RPC queue will be ordered and processed on a
> FIFO basis. For example, suppose two clients, A and B, submit largish
> scans at the same time. Say each scan is broken down into 100 scans by the
> client (broken down into equal-depth chunks along the row key), and the 100
> scans of client A are queued first, followed immediately by the 100 scans of
> client B. In this case, client B will be starved out of getting any results
> back until the scans for client A complete.
> One solution to this is to use the attached AbstractRoundRobinQueue instead
> of the standard FIFO queue. The queue to be used could be (maybe it already
> is) configurable based on a new config parameter. Using this queue would
> require the client to have the same identifier for all of the 100 parallel
> scans that represent a single logical scan from the client's point of view.
> With this information, the round robin queue would pick off a task from the
> queue in a round robin fashion (instead of a strictly FIFO manner) to prevent
> starvation over interleaved parallelized scans.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
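The round-robin idea in the description above can be illustrated with a small sketch. This is not the attached AbstractRoundRobinQueue.java, whose contents are not shown in this thread; `RoundRobinQueueSketch`, `offer`, `poll`, and the string `scanId` are hypothetical names, and the class is deliberately not thread-safe. It assumes, as the description requires, that every chunk of one logical scan carries the same identifier.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical, single-threaded sketch of a round-robin task queue.
class RoundRobinQueueSketch<T> {
    // One FIFO sub-queue per logical scan id; map order is the service order.
    private final LinkedHashMap<String, Queue<T>> groups = new LinkedHashMap<>();

    // Enqueue a task under the id shared by all chunks of one logical scan.
    void offer(String scanId, T task) {
        groups.computeIfAbsent(scanId, k -> new ArrayDeque<>()).add(task);
    }

    // Take one task from the group at the front, then rotate that group
    // to the back so the next poll serves a different logical scan.
    T poll() {
        Iterator<Map.Entry<String, Queue<T>>> it = groups.entrySet().iterator();
        if (!it.hasNext()) {
            return null; // nothing queued
        }
        Map.Entry<String, Queue<T>> front = it.next();
        String scanId = front.getKey();
        Queue<T> tasks = front.getValue();
        it.remove();
        T task = tasks.poll();
        if (!tasks.isEmpty()) {
            groups.put(scanId, tasks); // re-insert at the back
        }
        return task;
    }
}
```

If client A's chunks A1, A2, A3 are queued before client B's chunks B1, B2, this sketch hands them out as A1, B1, A2, B2, A3, so B is no longer starved until A finishes. A production queue would additionally need to be thread-safe and blocking.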
[jira] [Updated] (HBASE-14366) NPE in case visibility expression is not present in labels table during importtsv run
[ https://issues.apache.org/jira/browse/HBASE-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bhupendra Kumar Jain updated HBASE-14366:
-
Attachment: HBASE-14366-0.98.patch
HBASE-14366-branch-1.patch

Attached patches for 0.98 and branch-1. Please review.

> NPE in case visibility expression is not present in labels table during
> importtsv run
> -
>
> Key: HBASE-14366
> URL: https://issues.apache.org/jira/browse/HBASE-14366
> Project: HBase
> Issue Type: Bug
> Affects Versions: 2.0.0
> Reporter: Y. SREENIVASULU REDDY
> Assignee: Bhupendra Kumar Jain
> Priority: Minor
> Attachments: 0001-HBASE-14366.patch, 0001-HBASE-14366_1.patch,
> HBASE-14366-0.98.patch, HBASE-14366-branch-1.patch, HBASE-14366_2(1).patch,
> HBASE-14366_2.patch
>
>
> The exception below is shown in the logs if a visibility expression is not
> present in the labels table during an importtsv run. An appropriate exception /
> message should be logged so that the user can take further action.
> {code}
> WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NullPointerException
>     at org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver$1.getLabelOrdinal(DefaultVisibilityExpressionResolver.java:127)
>     at org.apache.hadoop.hbase.security.visibility.VisibilityUtils.getLabelOrdinals(VisibilityUtils.java:358)
>     at org.apache.hadoop.hbase.security.visibility.VisibilityUtils.createVisibilityExpTags(VisibilityUtils.java:323)
>     at org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver.createVisibilityExpTags(DefaultVisibilityExpressionResolver.java:137)
>     at org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.populatePut(TsvImporterMapper.java:205)
>     at org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.map(TsvImporterMapper.java:165)
>     at org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.map(TsvImporterMapper.java:1)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> {code}

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14520) Optimize the number of calls for tags creation in bulk load
[ https://issues.apache.org/jira/browse/HBASE-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ted Yu updated HBASE-14520:
---
Resolution: Fixed
Hadoop Flags: Reviewed
Status: Resolved (was: Patch Available)

Thanks for the patch, Bhupendra.
Thanks for the review, Anoop.

> Optimize the number of calls for tags creation in bulk load
> ---
>
> Key: HBASE-14520
> URL: https://issues.apache.org/jira/browse/HBASE-14520
> Project: HBase
> Issue Type: Improvement
> Affects Versions: 2.0.0
> Reporter: Bhupendra Kumar Jain
> Assignee: Bhupendra Kumar Jain
> Fix For: 2.0.0
>
> Attachments: HBASE-14520.patch
>
>
> At present, the TTL and visibility expression are given once per TSV line, i.e.
> the values and the tags remain the same for all the columns present in that line.
> In the current code, a list of tags is created for each cell. Instead of creating
> new tags for each cell, the tags created once for the line can be reused by the
> other cells.
> Assume 1 million rows and 1000 columns. Currently, tag creation happens
> 1M * 1000 times. If the tags are reused, tag creation drops to 1M
> times (i.e. once per TSV line).
> This applies to both the TsvImporterMapper and TextSortReducer logic.

-- This message was sent by Atlassian JIRA (v6.3.4#6332)
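The saving described above (one tag creation per TSV line instead of one per cell) can be sketched as follows. This is not the HBASE-14520 patch and does not use HBase's Tag or Cell types: `TagReuseSketch`, `createTags`, `perCell`, and `perLine` are hypothetical names, tags are modeled as plain strings, and the counter exists only to make the difference visible.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch; tags are plain strings rather than HBase Tag objects.
class TagReuseSketch {
    static int tagCreations = 0; // counts how often the tag list is built

    // Stand-in for building the TTL and visibility tags of one TSV line.
    static List<String> createTags(long ttl, String visibility) {
        tagCreations++;
        List<String> tags = new ArrayList<>();
        tags.add("ttl=" + ttl);
        tags.add("vis=" + visibility);
        return Collections.unmodifiableList(tags);
    }

    // "Before": a fresh tag list is built for every cell of the line.
    static List<List<String>> perCell(String[] columns, long ttl, String vis) {
        List<List<String>> cellTags = new ArrayList<>();
        for (int i = 0; i < columns.length; i++) {
            cellTags.add(createTags(ttl, vis));
        }
        return cellTags;
    }

    // "After": the list is built once per line and shared by all its cells.
    static List<List<String>> perLine(String[] columns, long ttl, String vis) {
        List<String> shared = createTags(ttl, vis);
        List<List<String>> cellTags = new ArrayList<>();
        for (int i = 0; i < columns.length; i++) {
            cellTags.add(shared);
        }
        return cellTags;
    }
}
```

Sharing is safe in this sketch because the list is unmodifiable; with 1 million lines of 1000 columns each, `perCell` would invoke `createTags` 1M * 1000 times and `perLine` 1M times, matching the arithmetic in the description.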
[jira] [Commented] (HBASE-14525) Append and increment operation throws NullPointerException on non-existing column families.
[ https://issues.apache.org/jira/browse/HBASE-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945049#comment-14945049 ] Hadoop QA commented on HBASE-14525: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12765143/HBASE-14525-V1.patch against master branch at commit ed4c734b15c83b7f6b8ec1d170abccae9de1b193. ATTACHMENT ID: 12765143 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): at org.apache.ambari.server.upgrade.UpgradeCatalog211Test.testExecuteDDLUpdates(UpgradeCatalog211Test.java:73) Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15883//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15883//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15883//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15883//console This message is automatically generated. > Append and increment operation throws NullPointerException on non-existing > column families. > --- > > Key: HBASE-14525 > URL: https://issues.apache.org/jira/browse/HBASE-14525 > Project: HBase > Issue Type: Bug > Components: shell >Affects Versions: 2.0.0 >Reporter: Abhishek Kumar >Assignee: Abhishek Kumar >Priority: Minor > Attachments: HBASE-14525-V1.patch, HBASE-14525.patch > > > When performing append operation on non-existing column families, > NullPointerException is thrown in hbase shell as shown below: > {noformat} > hbase(main):007:0> append 't1', 'r1', 'none:c1', '123' > ERROR: java.io.IOException > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175) > at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:106) > at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130) > at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.NullPointerException > at org.apache.hadoop.hbase.regionserver.HRegion.doGet(HRegion.java:6987) > at org.apache.hadoop.hbase.regionserver.HRegion.append(HRegion.java:7048) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.append(RSRpcServices.java:580) > at > org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2206) > at > 
org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32452) > at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2133) > ... 4 more > {noformat} > This seems to be caused by absence of check for valid family names as done in > other operations like 'Put' in HRegion.java -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14366) NPE in case visibility expression is not present in labels table during importtsv run
[ https://issues.apache.org/jira/browse/HBASE-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945153#comment-14945153 ] Hadoop QA commented on HBASE-14366: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12765169/HBASE-14366-0.98.patch against 0.98 branch at commit ed4c734b15c83b7f6b8ec1d170abccae9de1b193. ATTACHMENT ID: 12765169 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:red}-1 javadoc{color}. The javadoc tool appears to have generated 29 warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . 
Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15884//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15884//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15884//artifact/patchprocess/checkstyle-aggregate.html Javadoc warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15884//artifact/patchprocess/patchJavadocWarnings.txt Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15884//console This message is automatically generated. > NPE in case visibility expression is not present in labels table during > importtsv run > - > > Key: HBASE-14366 > URL: https://issues.apache.org/jira/browse/HBASE-14366 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Y. SREENIVASULU REDDY >Assignee: Bhupendra Kumar Jain >Priority: Minor > Attachments: 0001-HBASE-14366.patch, 0001-HBASE-14366_1.patch, > HBASE-14366-0.98.patch, HBASE-14366-branch-1.patch, HBASE-14366_2(1).patch, > HBASE-14366_2.patch > > > Below exception is shown in logs if visibility expression is not present in > labels table during importtsv run. Appropriate exception / message should be > logged for the user to take further action. 
> {code} > WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : > java.lang.NullPointerException > at > org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver$1.getLabelOrdinal(DefaultVisibilityExpressionResolver.java:127) > at > org.apache.hadoop.hbase.security.visibility.VisibilityUtils.getLabelOrdinals(VisibilityUtils.java:358) > at > org.apache.hadoop.hbase.security.visibility.VisibilityUtils.createVisibilityExpTags(VisibilityUtils.java:323) > at > org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver.createVisibilityExpTags(DefaultVisibilityExpressionResolver.java:137) > at > org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.populatePut(TsvImporterMapper.java:205) > at > org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.map(TsvImporterMapper.java:165) > at > org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.map(TsvImporterMapper.java:1) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945189#comment-14945189 ] stack commented on HBASE-14559: --- Thanks for noticing lads. I'm running a little rig here w/ branch-1. The tests in the patch pass with the one-liner (upped priorities) where w/o they failed near 100% of the time. At this stage in the zombie stomping session, I'm -- ahem -- a little less concerned about root cause of a failure and more about just getting stuff going again so did not spend much time on why these fixes are needed in branch-1 and not on master. bq. stack so in these 2 oneliners, you're dropping down the number of high priority handlers (to tighten up thread usage?) right? No sir. HBASE-14290 set the number of handlers down when I noticed tests with 500 threads running... that failed to run on my local machine because OOME, could not create thread. So, here, I'm upping the handlers on a few tests. I'd already done a pass on master -- a few tests there needed more handlers or they hung (we need to fix!) -- so was a bit surprised this necessary in branch-1 but it looks like you and [~eclark] have identified why. We should revert HBASE-13635 and HBASE-14322 from branch-1? > branch-1 test tweeks; disable assert explicit region lands post-restart and > up a few handlers > - > > Key: HBASE-14559 > URL: https://issues.apache.org/jira/browse/HBASE-14559 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14559.branch-1.txt, 14559.master.txt > > > Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). > Small tweaks get tests to pass. Small one liners that up priority handler > count and disable assert that seems wrong -- that we'll always get an explcit > region to land on a newly started server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12790) Support fairness across parallelized scans
[ https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945217#comment-14945217 ] Andrew Purtell commented on HBASE-12790: bq. The count query still runs with the same amount of time but it is the smaller queries that stays behind the bigger queries gets benefited. Yes, that's what I mean. And when no count query is running the point queries shouldn't show a penalty (or if they do then we discuss) > Support fairness across parallelized scans > -- > > Key: HBASE-12790 > URL: https://issues.apache.org/jira/browse/HBASE-12790 > Project: HBase > Issue Type: New Feature >Reporter: James Taylor >Assignee: ramkrishna.s.vasudevan > Labels: Phoenix > Attachments: AbstractRoundRobinQueue.java, HBASE-12790.patch, > HBASE-12790_1.patch, HBASE-12790_5.patch, HBASE-12790_callwrapper.patch, > HBASE-12790_trunk_1.patch > > > Some HBase clients parallelize the execution of a scan to reduce latency in > getting back results. This can lead to starvation with a loaded cluster and > interleaved scans, since the RPC queue will be ordered and processed on a > FIFO basis. For example, if there are two clients, A & B that submit largish > scans at the same time. Say each scan is broken down into 100 scans by the > client (broken down into equal depth chunks along the row key), and the 100 > scans of client A are queued first, followed immediately by the 100 scans of > client B. In this case, client B will be starved out of getting any results > back until the scans for client A complete. > One solution to this is to use the attached AbstractRoundRobinQueue instead > of the standard FIFO queue. The queue to be used could be (maybe it already > is) configurable based on a new config parameter. Using this queue would > require the client to have the same identifier for all of the 100 parallel > scans that represent a single logical scan from the clients point of view. 
> With this information, the round robin queue would pick off a task from the > queue in a round robin fashion (instead of a strictly FIFO manner) to prevent > starvation over interleaved parallelized scans. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
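The round-robin idea in the description above can be sketched as follows. This is a minimal illustration and not the attached AbstractRoundRobinQueue: the class name RoundRobinScanQueue, the String scan identifier, and the offer/poll methods are assumptions made for this example. Tasks that share a scan identifier stay in FIFO order among themselves, while poll() rotates across identifiers.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a round-robin queue keyed by logical scan id. */
class RoundRobinScanQueue<T> {
  // One FIFO sub-queue per logical scan id; map order is the rotation order.
  private final LinkedHashMap<String, Deque<T>> queues = new LinkedHashMap<>();

  /** Enqueue a task under the identifier shared by all chunks of one logical scan. */
  synchronized void offer(String scanId, T task) {
    queues.computeIfAbsent(scanId, k -> new ArrayDeque<>()).addLast(task);
  }

  /** Take the next task, rotating across scan ids; returns null when empty. */
  synchronized T poll() {
    Iterator<Map.Entry<String, Deque<T>>> it = queues.entrySet().iterator();
    if (!it.hasNext()) {
      return null;
    }
    Map.Entry<String, Deque<T>> head = it.next();
    it.remove();
    T task = head.getValue().pollFirst();
    if (!head.getValue().isEmpty()) {
      // Re-insert at the back so the other scan ids get their turn first.
      queues.put(head.getKey(), head.getValue());
    }
    return task;
  }

  public static void main(String[] args) {
    RoundRobinScanQueue<String> q = new RoundRobinScanQueue<>();
    // Client A queues three chunks before client B queues two.
    q.offer("A", "A1"); q.offer("A", "A2"); q.offer("A", "A3");
    q.offer("B", "B1"); q.offer("B", "B2");
    for (String t = q.poll(); t != null; t = q.poll()) {
      System.out.print(t + " "); // prints: A1 B1 A2 B2 A3
    }
    System.out.println();
  }
}
```

With a strict FIFO queue the same input would drain as A1 A2 A3 B1 B2, which is the starvation the issue describes.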
[jira] [Commented] (HBASE-12790) Support fairness across parallelized scans
[ https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945252#comment-14945252 ] ramkrishna.s.vasudevan commented on HBASE-12790: bq.And when no count query is running the point queries shouldn't show a penalty (or if they do then we discuss) That does not happen and I have verified that. Thanks Andy.
[jira] [Commented] (HBASE-14432) Procedure V2 - enforce ACL on procedure admin tasks
[ https://issues.apache.org/jira/browse/HBASE-14432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945268#comment-14945268 ] Stephen Yuan Jiang commented on HBASE-14432: The {{org.apache.hadoop.hbase.master.handler.TestEnableTableHandler.testEnableTableWithNoRegionServers}} failure is a bad assert, it was just fixed by stack after this patch was submitted (HBASE-14559). Sync the latest branch-1 and re-run test, the problem went away. > Procedure V2 - enforce ACL on procedure admin tasks > --- > > Key: HBASE-14432 > URL: https://issues.apache.org/jira/browse/HBASE-14432 > Project: HBase > Issue Type: Sub-task > Components: proc-v2 >Affects Versions: 2.0.0, 1.3.0 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Labels: security > Attachments: HBASE-14432.v1-branch-1.patch, > HBASE-14432.v1-master.patch > > > In the Procedure class, the owner field is never set. We need to set it so > that we can enforce ACLs on admin tasks such as whether a user has privilege > to abort a procedure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14432) Procedure V2 - enforce ACL on procedure admin tasks
[ https://issues.apache.org/jira/browse/HBASE-14432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Stephen Yuan Jiang updated HBASE-14432: --- Resolution: Fixed Fix Version/s: 1.3.0 2.0.0 Status: Resolved (was: Patch Available) > Procedure V2 - enforce ACL on procedure admin tasks > --- > > Key: HBASE-14432 > URL: https://issues.apache.org/jira/browse/HBASE-14432 > Project: HBase > Issue Type: Sub-task > Components: proc-v2 >Affects Versions: 2.0.0, 1.3.0 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Labels: security > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-14432.v1-branch-1.patch, > HBASE-14432.v1-master.patch > > > In the Procedure class, the owner field is never set. We need to set it so > that we can enforce ACLs on admin tasks such as whether a user has privilege > to abort a procedure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14509) Configurable sparse indexes?
[ https://issues.apache.org/jira/browse/HBASE-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945304#comment-14945304 ] stack commented on HBASE-14509: --- bq. We could add a method to filter, which is passed an HFile or a FileInfo or something, and based on that gets to decide whether to include the HFile or not. Is filter Interface, like CP, operating at too high a level for the ruling in/out of hfile? bq. The other question is whether HFile is too large of a unit. On whether an hfile is too large a unit, block is the next natural construct; a BF of CQ per block so can skip blocks at a time? The sparse index would go into the current block index as ancillary data rather than add at the head of a data block... We already load the hfile index BF per CQ or min/max could be part of this? bq. Or we punt and just add the building blocks: Sounds like extra config/options to me... so no (smile). Could we start small? Add extra generic info on index -- a BF or min/max -- just so we can skip blocks as we scan? min/max in hfile would be useful too... so could skip whole hfile (would be rare event but great when it happens) > Configurable sparse indexes? > > > Key: HBASE-14509 > URL: https://issues.apache.org/jira/browse/HBASE-14509 > Project: HBase > Issue Type: Brainstorming >Reporter: Lars Hofhansl > > This idea just popped up today and I wanted to record it for discussion: > What if we kept sparse column indexes per region or HFile or per configurable > range? > I.e. For any given CQ we record the lowest and highest value for a particular > range (HFile, Region, or a custom range like the Phoenix guide post). > By tweaking the size of these ranges we can control the size of the index, vs > its selectivity. > For example if we kept it by HFile we can almost instantly decide whether we > need to scan a particular HFile at all to find a particular value in a Cell. 
> We can also collect min/max values for each n MB of data, for example when we > scan the region the first time. Assuming ranges are large enough we can always > keep the index in memory together with the region. > Kind of a sparse local index. Might be much easier than the buddy region stuff > we've been discussing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
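The min/max pruning proposed above can be sketched as follows. This is only an illustration under stated assumptions: the names SparseMinMaxIndex, RangeStats and rangesToScan are hypothetical, and values are modelled as long for brevity, whereas real HBase cell values are byte arrays. Each entry records the lowest and highest value seen for one column qualifier within one range (an HFile, a block, or a custom range), and a lookup skips every range whose interval cannot contain the value.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a sparse min/max index over ranges of data. */
class SparseMinMaxIndex {

  /** Lowest and highest value recorded for one range (e.g. one HFile or block). */
  static final class RangeStats {
    final String rangeName;
    final long min;
    final long max;

    RangeStats(String rangeName, long min, long max) {
      this.rangeName = rangeName;
      this.min = min;
      this.max = max;
    }

    /** False means the range definitely does not hold the value and can be skipped. */
    boolean mayContain(long value) {
      return value >= min && value <= max;
    }
  }

  /** Returns the names of the ranges that still have to be scanned for the value. */
  static List<String> rangesToScan(List<RangeStats> index, long value) {
    List<String> result = new ArrayList<>();
    for (RangeStats stats : index) {
      if (stats.mayContain(value)) {
        result.add(stats.rangeName);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<RangeStats> index = new ArrayList<>();
    index.add(new RangeStats("hfile-1", 0, 99));
    index.add(new RangeStats("hfile-2", 100, 199));
    index.add(new RangeStats("hfile-3", 150, 400));
    System.out.println(rangesToScan(index, 160)); // prints: [hfile-2, hfile-3]
  }
}
```

Smaller ranges make the index larger but more selective, which is the size versus selectivity trade-off mentioned in the description.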
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945329#comment-14945329 ] Elliott Clark commented on HBASE-14559: --- bq.We should revert HBASE-13635 and HBASE-14322 from branch-1? Nope those two jiras are partial reverts of HBASE-13375. I'm asking if we should remove HBASE-13375 completely. Since we've had to remove it when requests are going to master and we are upping the number of threads that a regionserver needs.
[jira] [Commented] (HBASE-14557) MapReduce WALPlayer issue with NoTagsKeyValue
[ https://issues.apache.org/jira/browse/HBASE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945321#comment-14945321 ] Jerry He commented on HBASE-14557: -- Yes. [~ram_krish] The Cells in the WALEdit can be KeyValue or NoTagsKeyValue depending on whether they have tags when written by the server. Right? Then we will have a problem setting the OutputValueClass to either of the two classes. > MapReduce WALPlayer issue with NoTagsKeyValue > - > > Key: HBASE-14557 > URL: https://issues.apache.org/jira/browse/HBASE-14557 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0 >Reporter: Jerry He > > Running MapReduce WALPlayer to convert WAL into HFiles: > {noformat} > 15/10/05 20:28:08 INFO mapred.JobClient: Task Id : > attempt_201508031611_0029_m_00_0, Status : FAILED > java.io.IOException: Type mismatch in value from map: expected > org.apache.hadoop.hbase.KeyValue, recieved > org.apache.hadoop.hbase.NoTagsKeyValue > at > org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:997) > at > org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:689) > at > org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89) > at > org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112) > at > org.apache.hadoop.hbase.mapreduce.WALPlayer$WALKeyValueMapper.map(WALPlayer.java:111) > at > org.apache.hadoop.hbase.mapreduce.WALPlayer$WALKeyValueMapper.map(WALPlayer.java:96) > at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) > at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:751) > at org.apache.hadoop.mapred.MapTask.run(MapTask.java:368) > at org.apache.hadoop.mapred.Child$4.run(Child.java:255) > at > java.security.AccessController.doPrivileged(AccessController.java:369) > at javax.security.auth.Subject.doAs(Subject.java:572) > at > 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1502) > at org.apache.hadoop.mapred.Child.main(Child.java:249) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14558) Document ChaosMonkey enhancements from HBASE-14261
[ https://issues.apache.org/jira/browse/HBASE-14558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945334#comment-14945334 ] Elliott Clark commented on HBASE-14558: --- {code}HBase 1.02 and newer adds the ability to restart{code} 1.0.2 {code}have no reasonable defaults{code} Should we call out that they have no default because it's deployment specific ? {code}in your ChaosMonkey properties file.{code} This can be in hbase-site.xml > Document ChaosMonkey enhancements from HBASE-14261 > -- > > Key: HBASE-14558 > URL: https://issues.apache.org/jira/browse/HBASE-14558 > Project: HBase > Issue Type: Sub-task > Components: documentation >Affects Versions: 2.0.0 >Reporter: Misty Stanley-Jones >Assignee: Misty Stanley-Jones > Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3 > > Attachments: HBASE-14558.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14432) Procedure V2 - enforce ACL on procedure admin tasks
[ https://issues.apache.org/jira/browse/HBASE-14432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945344#comment-14945344 ] Hudson commented on HBASE-14432: SUCCESS: Integrated in HBase-1.3-IT #213 (See [https://builds.apache.org/job/HBase-1.3-IT/213/]) HBASE-14432 Procedure V2 - enforce ACL on procedure admin tasks (Stephen (syuanjiangdev: rev a6d90bcc97ea6e00d2d75381db0b598ab6c71026) * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyNamespaceProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteTableProcedure.java * hbase-server/pom.xml * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java * hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java * hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java * hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java * hbase-server/src/test/protobuf/TestProcedure.proto * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyColumnFamilyProcedure.java * hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/protobuf/generated/TestProcedureProtos.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java * hbase-common/src/main/java/org/apache/hadoop/hbase/ProcedureInfo.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DisableTableProcedure.java * hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateTableProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteNamespaceProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java * hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterAndRegionObserver.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/EnableTableProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteColumnFamilyProcedure.java * hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AddColumnFamilyProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateNamespaceProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyTableProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java > Procedure V2 - enforce ACL on procedure admin tasks > --- > > Key: HBASE-14432 > URL: https://issues.apache.org/jira/browse/HBASE-14432 > Project: HBase > Issue Type: Sub-task > Components: proc-v2 >Affects Versions: 2.0.0, 1.3.0 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Labels: security > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-14432.v1-branch-1.patch, > HBASE-14432.v1-master.patch > > > In the Procedure class, the owner field is never set. We need to set it so > that we can enforce ACLs on admin tasks such as whether a user has privilege > to abort a procedure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14520) Optimize the number of calls for tags creation in bulk load
[ https://issues.apache.org/jira/browse/HBASE-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945362#comment-14945362 ] Hudson commented on HBASE-14520: FAILURE: Integrated in HBase-TRUNK #6877 (See [https://builds.apache.org/job/HBase-TRUNK/6877/]) HBASE-14520 Optimize the number of calls for tags creation in bulk load (tedyu: rev 23079c02bf40c318fff4f77fa9182ebdfb230e90) * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java * hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TextSortReducer.java > Optimize the number of calls for tags creation in bulk load > --- > > Key: HBASE-14520 > URL: https://issues.apache.org/jira/browse/HBASE-14520 > Project: HBase > Issue Type: Improvement >Affects Versions: 2.0.0 >Reporter: Bhupendra Kumar Jain >Assignee: Bhupendra Kumar Jain > Fix For: 2.0.0 > > Attachments: HBASE-14520.patch > > > At present, ttl and Visibility expr is one per tsv line i.e. the values and > the tags remain same for all the columns present in that line. As per the > code, List of tags are created for each cell, Instead of creating new tags > for each cell, tags created once for the line can be reused by other cells. > Assume 1Million rows and 1000 columns. Currently tags creation will happen > for 1M * 1000 times. If reuse the tags, the tags creation can reduce to 1M > times. (i.e. one per tsv line). > This is applicable in both TsvImporterMapper and TextSortReducer logic. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
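The saving described above can be sketched as follows. This is a minimal illustration and not the code in TsvImporterMapper or TextSortReducer: the class TagReuseSketch, the createTags helper, the Cell holder and the call counter are all hypothetical stand-ins. Because the TTL and visibility expression are the same for every column of a TSV line, the tag list can be built once per line and shared by all cells of that line.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch comparing per-cell and per-line tag creation. */
class TagReuseSketch {
  // Counts how often tags are built, to make the difference visible.
  static int tagCreations = 0;

  /** Stand-in for the comparatively expensive construction of the tag list. */
  static List<String> createTags(long ttl, String visibility) {
    tagCreations++;
    return Arrays.asList("ttl=" + ttl, "vis=" + visibility);
  }

  /** A cell value together with the tag list attached to it. */
  static final class Cell {
    final String value;
    final List<String> tags;

    Cell(String value, List<String> tags) {
      this.value = value;
      this.tags = tags;
    }
  }

  /** Builds a new tag list for every column of the line. */
  static List<Cell> tagsPerCell(String[] columns, long ttl, String visibility) {
    List<Cell> cells = new ArrayList<>();
    for (String column : columns) {
      cells.add(new Cell(column, createTags(ttl, visibility)));
    }
    return cells;
  }

  /** Builds the tag list once per line and reuses it for every column. */
  static List<Cell> tagsPerLine(String[] columns, long ttl, String visibility) {
    List<String> tags = createTags(ttl, visibility);
    List<Cell> cells = new ArrayList<>();
    for (String column : columns) {
      cells.add(new Cell(column, tags));
    }
    return cells;
  }
}
```

For a line with 1000 columns the first variant calls createTags 1000 times and the second calls it once, which matches the 1M * 1000 versus 1M figures in the description. Sharing one list assumes no cell mutates it afterwards.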
[jira] [Commented] (HBASE-14268) Improve KeyLocker
[ https://issues.apache.org/jira/browse/HBASE-14268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945398#comment-14945398 ] Yu Li commented on HBASE-14268: --- [~jingcheng...@intel.com] I guess we're discussing lock fairness here? If so, it seems to me the original implementation also uses unfair lock and early waiting thread might also starve. Maybe another point to improve, though. > Improve KeyLocker > - > > Key: HBASE-14268 > URL: https://issues.apache.org/jira/browse/HBASE-14268 > Project: HBase > Issue Type: Improvement > Components: util >Reporter: Hiroshi Ikeda >Assignee: Hiroshi Ikeda >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: 14268-V5.patch, HBASE-14268-V2.patch, > HBASE-14268-V3.patch, HBASE-14268-V4.patch, HBASE-14268-V5.patch, > HBASE-14268-V5.patch, HBASE-14268-V6.patch, HBASE-14268-V7.patch, > HBASE-14268-V7.patch, HBASE-14268-V7.patch, HBASE-14268-V7.patch, > HBASE-14268.patch, KeyLockerIncrKeysPerformance.java, > KeyLockerPerformance.java, ReferenceTestApp.java > > > 1. In the implementation of {{KeyLocker}} it uses atomic variables inside a > synchronized block, which doesn't make sense. Moreover, logic inside the > synchronized block is not trivial so that it performs worse in a heavily > multi-threaded environment. > 2. {{KeyLocker}} gives an instance of {{ReentrantLock}} which is already > locked, but it doesn't follow the contract of {{ReentrantLock}} because you > are not allowed to freely invoke lock/unlock methods under that contract. > That introduces a potential risk; Whenever you see a variable of the type > {{ReentrantLock}}, you should pay attention to where the included instance is > coming from. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14535) Unit test for rpc connection concurrency / deadlock testing
[ https://issues.apache.org/jira/browse/HBASE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945401#comment-14945401 ] Enis Soztutar commented on HBASE-14535: --- bq. I agree with Andrew (till I understand more). We've already got a fleet of 'non-deterministic' tests, so many, our CI runs are of no value A non-deterministic test in this context is different from a flaky test. We have a bunch of flaky tests that fail due to false negatives making jenkins runs useless. This test in particular will not fail for false negatives, but it might fail to catch deadlocks (false positive). If this test fails, I imagine we take a look at it rather than classify as a flaky test. > Unit test for rpc connection concurrency / deadlock testing > > > Key: HBASE-14535 > URL: https://issues.apache.org/jira/browse/HBASE-14535 > Project: HBase > Issue Type: Sub-task > Components: rpc >Reporter: Enis Soztutar >Assignee: Enis Soztutar > Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16 > > Attachments: hbase-14535_v1.patch, hbase-14535_v2.patch > > > As per parent jira and recent jiras HBASE-14449 + HBASE-14241 and > HBASE-14313, we seem to be lacking some testing rpc connection concurrency > issues in a UT env. > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14529) Respond to SIGHUP to reload config
[ https://issues.apache.org/jira/browse/HBASE-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945408#comment-14945408 ] stack commented on HBASE-14529: --- Ok on 2.0 then with fat release note. Where else should it go? 1.3. Want this for 1.2 [~busbey]? > Respond to SIGHUP to reload config > -- > > Key: HBASE-14529 > URL: https://issues.apache.org/jira/browse/HBASE-14529 > Project: HBase > Issue Type: New Feature >Affects Versions: 1.2.0 >Reporter: Elliott Clark >Assignee: Elliott Clark > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-14529-v1.patch, HBASE-14529-v2.patch, > HBASE-14529.patch > > > SIGHUP is the way everyone since the dawn of unix has done config reload. > Lets not be a special unique snowflake. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14535) Unit test for rpc connection concurrency / deadlock testing
[ https://issues.apache.org/jira/browse/HBASE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945439#comment-14945439 ] stack commented on HBASE-14535: --- bq. I imagine we take a look at it rather than classify as a flaky test. Thanks for the explanation. How to designate difference between a flakey and a test that might fail with 'real' issue that needs looking at? An outsider like myself trying to cleanup test failures would need to be able to distinguish between the two. Devs trying to get a clean run against their patch would need to be able to look at results and see that the fail was not theirs but because the test is a 'non-deterministic'.
[jira] [Updated] (HBASE-14497) Reverse Scan threw StackOverflow caused by readPt checking
[ https://issues.apache.org/jira/browse/HBASE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14497: --- Attachment: 14497-branch-1-v6.patch > Reverse Scan threw StackOverflow caused by readPt checking > -- > > Key: HBASE-14497 > URL: https://issues.apache.org/jira/browse/HBASE-14497 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 0.98.14, 1.3.0 >Reporter: Yerui Sun >Assignee: Yerui Sun > Fix For: 2.0.0 > > Attachments: 14497-branch-1-v6.patch, 14497-master-v6.patch, > HBASE-14497-0.98.patch, HBASE-14497-branch-1-v2.patch, > HBASE-14497-branch-1-v3.patch, HBASE-14497-branch-1-v6.patch, > HBASE-14497-branch-1.patch, HBASE-14497-master-v2.patch, > HBASE-14497-master-v3.patch, HBASE-14497-master-v3.patch, > HBASE-14497-master-v4.patch, HBASE-14497-master-v5.patch, > HBASE-14497-master.patch > > > I hit a stack overflow error in StoreFileScanner.seekToPreviousRow using > reversed scan. I searched and found HBASE-14155, but it seems to have a > different cause. > seekToPreviousRow fetches the closest preceding row and compares its > mvcc to the readPt, which was acquired when the scanner was created. If the row's mvcc is > bigger than readPt, a recursive call of seekToPreviousRow is invoked to > find the next closest preceding row. > Consider that we created a scanner for a reversed scan, and some data with > smaller rows was written and flushed before calling scanner next. When > seekToPreviousRow is invoked, it calls itself recursively until all > rows written after the scanner was created have been iterated. The depth of the > recursive call stack depends on the count of rows; a stack overflow > error will be thrown if the count of rows is large, like 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
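The failure mode described above, and one common way to avoid it, can be sketched as follows. This is a simplified model and not StoreFileScanner code, nor necessarily what the attached patches do: the class ReverseSeekSketch, the array of per-row mvcc values and the method name are assumptions for illustration. The point is only that skipping rows whose mvcc is greater than readPt in a loop keeps the stack depth constant, whereas one recursive call per skipped row grows the stack with the number of newer rows.

```java
/** Hypothetical sketch: skip rows newer than readPt with a loop, not recursion. */
class ReverseSeekSketch {

  /**
   * Returns the index of the closest row strictly before {@code from} whose
   * mvcc is not greater than {@code readPt}, or -1 if there is no such row.
   * rowMvcc[i] models the mvcc of the i-th row in ascending row order.
   */
  static int seekToPreviousVisibleRow(long[] rowMvcc, int from, long readPt) {
    int i = from - 1;
    while (i >= 0 && rowMvcc[i] > readPt) {
      i--; // row was written after the scanner was created, so skip it
    }
    return i;
  }

  public static void main(String[] args) {
    long[] rowMvcc = {1, 2, 9, 9, 9};
    // The scanner was created at readPt 5, so the three newest rows are invisible.
    System.out.println(seekToPreviousVisibleRow(rowMvcc, rowMvcc.length, 5)); // prints: 1
  }
}
```

The loop handles any number of invisible rows without deepening the call stack.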
[jira] [Commented] (HBASE-14529) Respond to SIGHUP to reload config
[ https://issues.apache.org/jira/browse/HBASE-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945481#comment-14945481 ] Sean Busbey commented on HBASE-14529: - +1 for 1.2 with release note.
[jira] [Created] (HBASE-14560) TestNamespacesInstanceModel#testToXML fails when JDK 1.8 is used
Ted Yu created HBASE-14560: -- Summary: TestNamespacesInstanceModel#testToXML fails when JDK 1.8 is used Key: HBASE-14560 URL: https://issues.apache.org/jira/browse/HBASE-14560 Project: HBase Issue Type: Test Reporter: Ted Yu Priority: Minor From https://builds.apache.org/job/HBase-1.3/jdk=latest1.8,label=Hadoop/237/consoleFull : {code} org.apache.hadoop.hbase.rest.model.TestNamespacesInstanceModel testToXML(org.apache.hadoop.hbase.rest.model.TestNamespacesInstanceModel) Time elapsed: 0.017 sec <<< FAILURE! junit.framework.ComparisonFailure: expected:<...perties>[NAMEtestNamespaceKEY_2VALUE_2KEY_1VALUE_1] but was:<...perties>[KEY_1VALUE_1KEY_2VALUE_2NAMEtestNamespace] at junit.framework.Assert.assertEquals(Assert.java:100) at junit.framework.Assert.assertEquals(Assert.java:107) at junit.framework.TestCase.assertEquals(TestCase.java:269) at org.apache.hadoop.hbase.rest.model.TestModelBase.testToXML(TestModelBase.java:115) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) {code} The above test failure can be reproduced locally. It was likely caused by the different behavior w.r.t. JAXBContext between JDK 1.7 and 1.8 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14432) Procedure V2 - enforce ACL on procedure admin tasks
[ https://issues.apache.org/jira/browse/HBASE-14432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945514#comment-14945514 ] Hudson commented on HBASE-14432: FAILURE: Integrated in HBase-1.3 #238 (See [https://builds.apache.org/job/HBase-1.3/238/]) HBASE-14432 Procedure V2 - enforce ACL on procedure admin tasks (Stephen (syuanjiangdev: rev a6d90bcc97ea6e00d2d75381db0b598ab6c71026) * hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DisableTableProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterAndRegionObserver.java * hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteColumnFamilyProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/EnableTableProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteNamespaceProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java * hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/protobuf/generated/TestProcedureProtos.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyTableProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java * hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java * hbase-server/pom.xml * hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyColumnFamilyProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteTableProcedure.java * hbase-server/src/test/protobuf/TestProcedure.proto * 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyNamespaceProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java * hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateTableProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateNamespaceProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AddColumnFamilyProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java * hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java * hbase-common/src/main/java/org/apache/hadoop/hbase/ProcedureInfo.java > Procedure V2 - enforce ACL on procedure admin tasks > --- > > Key: HBASE-14432 > URL: https://issues.apache.org/jira/browse/HBASE-14432 > Project: HBase > Issue Type: Sub-task > Components: proc-v2 >Affects Versions: 2.0.0, 1.3.0 >Reporter: Stephen Yuan Jiang >Assignee: Stephen Yuan Jiang > Labels: security > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-14432.v1-branch-1.patch, > HBASE-14432.v1-master.patch > > > In the Procedure class, the owner field is never set. We need to set it so > that we can enforce ACLs on admin tasks such as whether a user has privilege > to abort a procedure. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14386) Reset MutableHistogram's min/max/sum after snapshot
[ https://issues.apache.org/jira/browse/HBASE-14386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14386: --- Status: Open (was: Patch Available) > Reset MutableHistogram's min/max/sum after snapshot > --- > > Key: HBASE-14386 > URL: https://issues.apache.org/jira/browse/HBASE-14386 > Project: HBase > Issue Type: Bug >Reporter: binlijin >Assignee: Oliver > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-14386.patch > > > Currently MutableHistogram does not reset min/max/sum after a snapshot, so the > reported values are affected by historical data. For example, when I monitor QueueCallTime_mean, > I see that one host's QueueCallTime_mean metric is high, but when I trace the > host's regionserver log I see that QueueCallTime_mean has since dropped, while the > metric is still high. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945602#comment-14945602 ] stack commented on HBASE-14559: --- These fixes here made it so branch-1 now passes on my internal rig. I can't speak to whether we should remove HBASE-13375 completely. It would seem to explain why some of the tweaks here were necessary in branch-1 but not in master -- that helps. > branch-1 test tweeks; disable assert explicit region lands post-restart and > up a few handlers > - > > Key: HBASE-14559 > URL: https://issues.apache.org/jira/browse/HBASE-14559 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14559.branch-1.txt, 14559.master.txt > > > Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). > Small tweaks get tests to pass. Small one liners that up priority handler > count and disable assert that seems wrong -- that we'll always get an explicit > region to land on a newly started server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14420) Zombie Stomping Session
[ https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945611#comment-14945611 ] stack commented on HBASE-14420: --- Master and branch-1 pass on my internal rig reliably without leaving zombies. I'm now into a new phase of zombie stomping. I am just going to disable hangers from here on out if I can't find anything obvious within a few minutes (I spent a good while on TestHFileOutputFormat2 yesterday). > Zombie Stomping Session > --- > > Key: HBASE-14420 > URL: https://issues.apache.org/jira/browse/HBASE-14420 > Project: HBase > Issue Type: Umbrella > Components: test >Reporter: stack >Assignee: stack >Priority: Critical > Attachments: hangers.txt > > > Patch builds are now failing most of the time because we are dropping zombies. > I confirm we are doing this on non-apache build boxes too. > Left-over zombies consume resources on build boxes (OOME cannot create native > threads). Having to do multiple test runs in the hope that we can get a > non-zombie-making build or making (arbitrary) rulings that the zombies are > 'not related' is a productivity sink. And so on... > This is an umbrella issue for a zombie stomping session that started earlier > this week. Will hang sub-issues of this one. Am running builds back-to-back > on little cluster to turn out the monsters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14386) Reset MutableHistogram's min/max/sum after snapshot
[ https://issues.apache.org/jira/browse/HBASE-14386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945618#comment-14945618 ] stack commented on HBASE-14386: --- Any numbers? Do we need to lock? In the past, a mistake in how metrics were made cost tens of percent of read latency. Thanks. > Reset MutableHistogram's min/max/sum after snapshot > --- > > Key: HBASE-14386 > URL: https://issues.apache.org/jira/browse/HBASE-14386 > Project: HBase > Issue Type: Bug >Reporter: binlijin >Assignee: Oliver > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-14386.patch > > > Currently MutableHistogram does not reset min/max/sum after a snapshot, so the > reported values are affected by historical data. For example, when I monitor QueueCallTime_mean, > I see that one host's QueueCallTime_mean metric is high, but when I trace the > host's regionserver log I see that QueueCallTime_mean has since dropped, while the > metric is still high. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
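To make the behavior requested in this issue concrete, here is a minimal sketch of a statistics holder whose min/max/sum are reset each time a snapshot is taken, so that each snapshot reflects only the values added since the previous one. `ResettingStats` is a hypothetical class written for illustration; it is not the real MutableHistogram and not the attached patch. It uses `synchronized` purely for simplicity, and whether that locking cost is acceptable on a hot path is exactly the open question raised in the comment above.

```java
// Hypothetical class for illustration only; not the real MutableHistogram.
class ResettingStats {
  private long count;
  private long sum;
  private long min = Long.MAX_VALUE;
  private long max = Long.MIN_VALUE;

  /** Records one value for the current interval. */
  synchronized void add(long value) {
    count++;
    sum += value;
    min = Math.min(min, value);
    max = Math.max(max, value);
  }

  /**
   * Returns {count, sum, min, max} for the values added since the last
   * snapshot, then resets all four so old data cannot leak into the next one.
   * min and max are reported as 0 when no values were added.
   */
  synchronized long[] snapshotAndReset() {
    long[] snapshot = {count, sum, count == 0 ? 0 : min, count == 0 ? 0 : max};
    count = 0;
    sum = 0;
    min = Long.MAX_VALUE;
    max = Long.MIN_VALUE;
    return snapshot;
  }
}
```

In this sketch a high value recorded in one interval no longer inflates the max or the mean (sum divided by count) reported for later intervals.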
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945623#comment-14945623 ] Mikhail Antonov commented on HBASE-14559: - Thanks for clarifying, [~stack]. Regarding tests - I remember we upped the number of high priority handlers used in all tests (and I admit to having suggested/supported that :), since after finding several tests where the default number of threads wasn't enough following the changes made in HBASE-13375, bumping the default number of threads looked like an easy thing to do). Should we also lower DEFAULT_REGION_SERVER_HIGH_PRIORITY_HANDLER_COUNT back to 10 as the default (it's 20 now on master, as far as I can see) and set it higher only in selected tests? Regarding removing it completely - I guess that's up to the judgement of folks running big clusters. The way we treat admin user requests _seems_ logical to me, but I definitely don't want to argue with production observations using logical conclusions :) > branch-1 test tweeks; disable assert explicit region lands post-restart and > up a few handlers > - > > Key: HBASE-14559 > URL: https://issues.apache.org/jira/browse/HBASE-14559 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14559.branch-1.txt, 14559.master.txt > > > Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). > Small tweaks get tests to pass. Small one liners that up priority handler > count and disable assert that seems wrong -- that we'll always get an explicit > region to land on a newly started server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945634#comment-14945634 ] stack commented on HBASE-14559: --- bq. Should we also lower DEFAULT_REGION_SERVER_HIGH_PRIORITY_HANDLER_COUNT back to 10 as default (it's 20 now on master, as I see, and set higher only in selected tests?) I'm just looking at tests at the moment. I saw priority handlers set to 40 in a few instances, which seemed excessive. Other tests with many regions had reams of handlers just sitting there doing nothing, clouding thread dumps where I was trying to figure out why the test was hung... bq. The way we treat admin user requests seems logical to me, but I definitely don't want to argue with production observations using logical conclusions I'm with you. Let's get other input. > branch-1 test tweeks; disable assert explicit region lands post-restart and > up a few handlers > - > > Key: HBASE-14559 > URL: https://issues.apache.org/jira/browse/HBASE-14559 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14559.branch-1.txt, 14559.master.txt > > > Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). > Small tweaks get tests to pass. Small one liners that up priority handler > count and disable assert that seems wrong -- that we'll always get an explicit > region to land on a newly started server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14561) Disable zombie TestReplicationShell
stack created HBASE-14561: - Summary: Disable zombie TestReplicationShell Key: HBASE-14561 URL: https://issues.apache.org/jira/browse/HBASE-14561 Project: HBase Issue Type: Sub-task Reporter: stack It hung three times in last 40 test runs. Will file issue to reenable it when someone has chance to look at why it is hanging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14562) Fix and reenable zombie TestReplicationShell
stack created HBASE-14562: - Summary: Fix and reenable zombie TestReplicationShell Key: HBASE-14562 URL: https://issues.apache.org/jira/browse/HBASE-14562 Project: HBase Issue Type: Bug Reporter: stack Was disabled over in HBASE-14561 because it hangs with some regularity. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14561) Disable zombie TestReplicationShell
[ https://issues.apache.org/jira/browse/HBASE-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14561: -- Attachment: 14561.txt This is what I pushed to master. > Disable zombie TestReplicationShell > --- > > Key: HBASE-14561 > URL: https://issues.apache.org/jira/browse/HBASE-14561 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack > Attachments: 14561.txt > > > It hung three times in last 40 test runs. Will file issue to reenable it when > someone has chance to look at why it is hanging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14561) Disable zombie TestReplicationShell
[ https://issues.apache.org/jira/browse/HBASE-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945663#comment-14945663 ] stack commented on HBASE-14561: --- Link to issue to reenable this test when fixed. > Disable zombie TestReplicationShell > --- > > Key: HBASE-14561 > URL: https://issues.apache.org/jira/browse/HBASE-14561 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack > Attachments: 14561.txt > > > It hung three times in last 40 test runs. Will file issue to reenable it when > someone has chance to look at why it is hanging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-14561) Disable zombie TestReplicationShell
[ https://issues.apache.org/jira/browse/HBASE-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-14561. --- Resolution: Fixed Assignee: stack Fix Version/s: 2.0.0 Pushed to master. > Disable zombie TestReplicationShell > --- > > Key: HBASE-14561 > URL: https://issues.apache.org/jira/browse/HBASE-14561 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0 > > Attachments: 14561.txt > > > It hung three times in last 40 test runs. Will file issue to reenable it when > someone has chance to look at why it is hanging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14420) Zombie Stomping Session
[ https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945672#comment-14945672 ] stack commented on HBASE-14420: --- Going over the last 40 patch builds: TestReplicationShell hangs three times. Was added to master only. HBASE-13084 adds it by running all shell commands again plus the new replication_admin_test.rb command. I'm going to disable it for now. HBASE-14561. TestHFileOutputFormat2 failed 5 times in last 40 runs. I spent time on it yesterday. Seems to be a reliance on test order but was having networking issues which complicated my being able to do diagnosis. It seems like an ambitious amount of work to get done in a unit test: {code} * Simple test for {@link CellSortReducer} and {@link HFileOutputFormat2}. * Sets up and runs a mapreduce job that writes hfile output. * Creates a few inner classes to implement splits and an inputformat that * emits keys and values like those of {@link PerformanceEvaluation}. {code} Was added a good while ago, here: commit e4f8a7419fb4bd0102eaf91e9747de6261e0b5c5 Author: jxiang Date: Fri Feb 21 20:39:21 2014 + HBASE-10526 Using Cell instead of KeyValue in HFileOutputFormat git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1570702 13f79535-47bb-0310-9956-ffa450edef68 I'm just going to disable it until someone wants to work on it. 
Here is the list of all test failures and their counts: 2 Hanging test : org.apache.hadoop.hbase.TestNodeHealthCheckChore 1 Hanging test : org.apache.hadoop.hbase.TestPartialResultsFromClientSide 2 Hanging test : org.apache.hadoop.hbase.client.TestFromClientSide 1 Hanging test : org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor 1 Hanging test : org.apache.hadoop.hbase.client.TestReplicasClient 3 Hanging test : org.apache.hadoop.hbase.client.TestReplicationShell 1 Hanging test : org.apache.hadoop.hbase.constraint.TestConstraint 1 Hanging test : org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd 1 Hanging test : org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite 2 Hanging test : org.apache.hadoop.hbase.mapreduce.TestCopyTable 1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat 5 Hanging test : org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat2 1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat 1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestTableInputFormat 1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2 1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestTableMapReduce 1 Hanging test : org.apache.hadoop.hbase.replication.TestMasterReplication 1 Hanging test : org.apache.hadoop.hbase.replication.TestReplicationKillMasterRSCompressed 1 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint 1 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpointNoMaster 1 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestReplicationWALReaderManager 1 Hanging test : org.apache.hadoop.hbase.security.access.TestAccessController 1 Hanging test : org.apache.hadoop.hbase.security.access.TestCellACLs 1 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelReplicationWithExpAsString 1 Hanging test : 
org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes 1 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDistributedLogReplay 1 Hanging test : org.apache.hadoop.hbase.snapshot.TestExportSnapshot 1 Hanging test : org.apache.hadoop.hbase.snapshot.TestMobExportSnapshot 1 Hanging test : org.apache.hadoop.hbase.snapshot.TestMobFlushSnapshotFromClient 1 Hanging test : org.apache.hadoop.hbase.snapshot.TestMobSecureExportSnapshot 1 Hanging test : org.apache.hadoop.hbase.snapshot.TestSecureExportSnapshot > Zombie Stomping Session > --- > > Key: HBASE-14420 > URL: https://issues.apache.org/jira/browse/HBASE-14420 > Project: HBase > Issue Type: Umbrella > Components: test >Reporter: stack >Assignee: stack >Priority: Critical > Attachments: hangers.txt > > > Patch build are now failing most of the time because we are dropping zombies. > I confirm we are doing this on non-apache build boxes too. > Left-over zombies consume resources on build boxes (OOME cannot create native > threads). Having to do multiple test runs in the hope that we can get a > non-zombie-making build or making (arbitrary) rulings that the zombies are > 'not related' is a productivity sink. And so on... > This is an umbrella issue for a zombie stomping session that started earlier > this week. Will hang sub-issues of t
[jira] [Created] (HBASE-14563) Disable zombie TestHFileOutputFormat2
stack created HBASE-14563: - Summary: Disable zombie TestHFileOutputFormat2 Key: HBASE-14563 URL: https://issues.apache.org/jira/browse/HBASE-14563 Project: HBase Issue Type: Sub-task Reporter: stack Assignee: stack Disabling until someone has a chance to look at it. I watched it in jvisualvm a while. It's starting and stopping clusters multiple times and then running MR jobs. Needs a rewrite at least and some shrinking of scope on what is tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (HBASE-14564) Fix and reenable TestHFileOutputFormat2
stack created HBASE-14564: - Summary: Fix and reenable TestHFileOutputFormat2 Key: HBASE-14564 URL: https://issues.apache.org/jira/browse/HBASE-14564 Project: HBase Issue Type: Bug Reporter: stack Was disabled as part of the zombie stomping session over in HBASE-14420. Test needs a rewrite and/or being split up. Scope of the test needs to be shrunk and made more targeted. Currently it does everything. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14563) Disable zombie TestHFileOutputFormat2
[ https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945723#comment-14945723 ] stack commented on HBASE-14563: --- From parent issue: {code} TestHFileOutputFormat2 failed 5 times in last 40 runs. I spent time on it yesterday. Seems to be a reliance on test order but was having networking issues which complicated my being able to do diagnosis. It seems like an ambitious amount of work to get done in a unit test: * Simple test for {@link CellSortReducer} and {@link HFileOutputFormat2}. * Sets up and runs a mapreduce job that writes hfile output. * Creates a few inner classes to implement splits and an inputformat that * emits keys and values like those of {@link PerformanceEvaluation}. Was added a good while ago, here: commit e4f8a7419fb4bd0102eaf91e9747de6261e0b5c5 Author: jxiang Date: Fri Feb 21 20:39:21 2014 + HBASE-10526 Using Cell instead of KeyValue in HFileOutputFormat git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1570702 13f79535-47bb-0310-9956-ffa450edef68 {code} The test stops and starts clusters a few times and then runs MR jobs. Needs shrinking in size and scope. Needs to be more focused on testing a particular issue. HBASE-14564 is issue to reenable. > Disable zombie TestHFileOutputFormat2 > - > > Key: HBASE-14563 > URL: https://issues.apache.org/jira/browse/HBASE-14563 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > > Disabling until someone has a chance to look at it. > I watched it in jvisualvm a while. It's starting and stopping clusters > multiple times and then running MR jobs. Needs a rewrite at least and some > shrinking of scope on what is tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14563) Disable zombie TestHFileOutputFormat2
[ https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945733#comment-14945733 ] stack commented on HBASE-14563: --- Looks like some of the tests were disabled in this suite already. > Disable zombie TestHFileOutputFormat2 > - > > Key: HBASE-14563 > URL: https://issues.apache.org/jira/browse/HBASE-14563 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > > Disabling until someone has a chance to look at it. > I watched it in jvisualvm a while. It's starting and stopping clusters > multiple times and then running MR jobs. Needs a rewrite at least and some > shrinking of scope on what is tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14563) Disable zombie TestHFileOutputFormat2
[ https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14563: -- Attachment: 14563.txt What I pushed to master, branch-1, and branch-1.2. > Disable zombie TestHFileOutputFormat2 > - > > Key: HBASE-14563 > URL: https://issues.apache.org/jira/browse/HBASE-14563 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Attachments: 14563.txt > > > Disabling until someone has a chance to look at it. > I watched it in jvisualvm a while. It's starting and stopping clusters > multiple times and then running MR jobs. Needs a rewrite at least and some > shrinking of scope on what is tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14497) Reverse Scan threw StackOverflow caused by readPt checking
[ https://issues.apache.org/jira/browse/HBASE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945745#comment-14945745 ] Hadoop QA commented on HBASE-14497: --- {color:red}-1 overall{color}. Here are the results of testing the latest attachment http://issues.apache.org/jira/secure/attachment/12765206/14497-branch-1-v6.patch against branch-1 branch at commit 23079c02bf40c318fff4f77fa9182ebdfb230e90. ATTACHMENT ID: 12765206 {color:green}+1 @author{color}. The patch does not contain any @author tags. {color:green}+1 tests included{color}. The patch appears to include 4 new or modified tests. {color:green}+1 hadoop versions{color}. The patch compiles with all supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 2.7.1) {color:green}+1 javac{color}. The applied patch does not increase the total number of javac compiler warnings. {color:green}+1 protoc{color}. The applied patch does not increase the total number of protoc compiler warnings. {color:green}+1 javadoc{color}. The javadoc tool did not generate any warning messages. {color:green}+1 checkstyle{color}. The applied patch does not increase the total number of checkstyle errors {color:green}+1 findbugs{color}. The patch does not introduce any new Findbugs (version 2.0.3) warnings. {color:green}+1 release audit{color}. The applied patch does not increase the total number of release audit warnings. {color:green}+1 lineLengths{color}. The patch does not introduce lines longer than 100 {color:green}+1 site{color}. The mvn post-site goal succeeds with this patch. {color:green}+1 core tests{color}. The patch passed unit tests in . {color:red}-1 core zombie tests{color}. 
There are 1 zombie test(s): Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/15885//testReport/ Release Findbugs (version 2.0.3)warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/15885//artifact/patchprocess/newFindbugsWarnings.html Checkstyle Errors: https://builds.apache.org/job/PreCommit-HBASE-Build/15885//artifact/patchprocess/checkstyle-aggregate.html Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/15885//console This message is automatically generated. > Reverse Scan threw StackOverflow caused by readPt checking > -- > > Key: HBASE-14497 > URL: https://issues.apache.org/jira/browse/HBASE-14497 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 0.98.14, 1.3.0 >Reporter: Yerui Sun >Assignee: Yerui Sun > Fix For: 2.0.0 > > Attachments: 14497-branch-1-v6.patch, 14497-master-v6.patch, > HBASE-14497-0.98.patch, HBASE-14497-branch-1-v2.patch, > HBASE-14497-branch-1-v3.patch, HBASE-14497-branch-1-v6.patch, > HBASE-14497-branch-1.patch, HBASE-14497-master-v2.patch, > HBASE-14497-master-v3.patch, HBASE-14497-master-v3.patch, > HBASE-14497-master-v4.patch, HBASE-14497-master-v5.patch, > HBASE-14497-master.patch > > > I hit a stack overflow error in StoreFileScanner.seekToPreviousRow using a > reversed scan. I searched and found HBASE-14155, but it seems to have a > different cause. > seekToPreviousRow fetches the closest preceding row and compares > its mvcc to the readPt, which was acquired when the scanner was created. If the row's mvcc is > bigger than readPt, a recursive call of seekToPreviousRow is invoked to > find the next closest preceding row. > Suppose we created a scanner for a reversed scan, and some data with > smaller rows was written and flushed before calling scanner next. When > seekToPreviousRow is invoked, it calls itself recursively until all > rows which were written after the scanner was created have been iterated. 
The depth of > the recursive call stack depends on the count of rows; a stack overflow > error will be thrown if the count of rows is large, like 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
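The unbounded recursion described in this issue can be avoided by looping. The following is a minimal sketch under stated assumptions and is not necessarily the change made in the attached patches: `PreviousRowSeek` and its `TreeMap` from row key to mvcc are hypothetical stand-ins for the real StoreFileScanner, used only to show that an iterative search keeps the stack depth constant no matter how many newer rows have to be skipped.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical model for illustration only; not the real StoreFileScanner.
class PreviousRowSeek {

  /**
   * Returns the closest row strictly before 'row' whose mvcc is not
   * greater than readPt, or null if there is no such row.
   */
  static String seekToPreviousRow(TreeMap<String, Long> rows, String row, long readPt) {
    Map.Entry<String, Long> candidate = rows.lowerEntry(row);
    // Loop instead of recursing: each skipped row costs one iteration,
    // not one stack frame.
    while (candidate != null && candidate.getValue() > readPt) {
      candidate = rows.lowerEntry(candidate.getKey());
    }
    return candidate == null ? null : candidate.getKey();
  }
}
```

In this model, rows written after the scanner was created carry an mvcc above the scanner's readPt, and the loop simply steps past them one at a time.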
[jira] [Resolved] (HBASE-14563) Disable zombie TestHFileOutputFormat2
[ https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-14563. --- Resolution: Fixed Fix Version/s: 1.3.0 1.2.0 2.0.0 Pushed to branch-1.2+ > Disable zombie TestHFileOutputFormat2 > - > > Key: HBASE-14563 > URL: https://issues.apache.org/jira/browse/HBASE-14563 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14563.txt > > > Disabling until someone has a chance to look at it. > I watched it in jvisualvm a while. It's starting and stopping clusters > multiple times and then running MR jobs. Needs a rewrite at least and some > shrinking of scope on what is tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14420) Zombie Stomping Session
[ https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14420: -- Status: Patch Available (was: Open) Submitting a non-patch > Zombie Stomping Session > --- > > Key: HBASE-14420 > URL: https://issues.apache.org/jira/browse/HBASE-14420 > Project: HBase > Issue Type: Umbrella > Components: test >Reporter: stack >Assignee: stack >Priority: Critical > Attachments: hangers.txt, none_fix.txt > > > Patch build are now failing most of the time because we are dropping zombies. > I confirm we are doing this on non-apache build boxes too. > Left-over zombies consume resources on build boxes (OOME cannot create native > threads). Having to do multiple test runs in the hope that we can get a > non-zombie-making build or making (arbitrary) rulings that the zombies are > 'not related' is a productivity sink. And so on... > This is an umbrella issue for a zombie stomping session that started earlier > this week. Will hang sub-issues of this one. Am running builds back-to-back > on little cluster to turn out the monsters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14420) Zombie Stomping Session
[ https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14420: -- Attachment: none_fix.txt A non-fix just to see how patch build is doing. It is currently quiet. > Zombie Stomping Session > --- > > Key: HBASE-14420 > URL: https://issues.apache.org/jira/browse/HBASE-14420 > Project: HBase > Issue Type: Umbrella > Components: test >Reporter: stack >Assignee: stack >Priority: Critical > Attachments: hangers.txt, none_fix.txt > > > Patch build are now failing most of the time because we are dropping zombies. > I confirm we are doing this on non-apache build boxes too. > Left-over zombies consume resources on build boxes (OOME cannot create native > threads). Having to do multiple test runs in the hope that we can get a > non-zombie-making build or making (arbitrary) rulings that the zombies are > 'not related' is a productivity sink. And so on... > This is an umbrella issue for a zombie stomping session that started earlier > this week. Will hang sub-issues of this one. Am running builds back-to-back > on little cluster to turn out the monsters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14519) Purge TestFavoredNodeAssignmentHelper, a test for an abandoned feature that can hang
[ https://issues.apache.org/jira/browse/HBASE-14519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14519: -- Resolution: Fixed Status: Resolved (was: Patch Available) This has been pushed. Resolving. > Purge TestFavoredNodeAssignmentHelper, a test for an abandoned feature that > can hang > > > Key: HBASE-14519 > URL: https://issues.apache.org/jira/browse/HBASE-14519 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 0.98.16 > > Attachments: 14519.txt, 14519v2.txt > > > It came in here: > commit 7a7ab8b8da795177f42e434b1ab1b468e5cd035a > Author: Devaraj Das > Date: Sun May 12 06:47:39 2013 + > HBASE-7932. Introduces Favored Nodes for region files. Adds a balancer > called FavoredNodeLoadBalancer that will honor favored nodes in the process > of balancing but the balance operation is currently a no-op (Devaraj Das) > git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1481476 > 13f79535-47bb-0310-9956-ffa450edef68 > I've already purged the other test that came in on this patch... over in > HBASE-14486 > The test hung here: > https://builds.apache.org/job/PreCommit-HBASE-Build/15823//console > ... though we seemed to have exited abnormally. > Will let this issue hang around a while in case someone disagrees on removal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14497) Reverse Scan threw StackOverflow caused by readPt checking
[ https://issues.apache.org/jira/browse/HBASE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945767#comment-14945767 ] Ted Yu commented on HBASE-14497: Test suite passed: {code} Fetching https://builds.apache.org/job/PreCommit-HBASE-Build/15885/consoleFull Building remotely on H0 (Hadoop Tez) in workspace /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build Testing patch for HBASE-14497. Testing patch on branch branch-1. [INFO] Apache HBase .. SUCCESS [2.722s] [INFO] Apache HBase - Checkstyle . SUCCESS [0.509s] [INFO] Apache HBase - Resource Bundle SUCCESS [0.162s] [INFO] Apache HBase - Annotations SUCCESS [0.911s] [INFO] Apache HBase - Protocol ... SUCCESS [11.037s] [INFO] Apache HBase - Common . SUCCESS [1:28.402s] [INFO] Apache HBase - Procedure .. SUCCESS [1:52.825s] [INFO] Apache HBase - Client . SUCCESS [1:20.748s] [INFO] Apache HBase - Hadoop Compatibility ... SUCCESS [7.393s] [INFO] Apache HBase - Hadoop Two Compatibility ... SUCCESS [7.066s] [INFO] Apache HBase - Prefix Tree SUCCESS [9.676s] [INFO] Apache HBase - Server . SUCCESS [1:36:58.528s] [INFO] Apache HBase - Testing Util ... SUCCESS [1.222s] [INFO] Apache HBase - Thrift . SUCCESS [3:20.890s] [INFO] Apache HBase - Rest ... SUCCESS [9:11.530s] [INFO] Apache HBase - Shell .. SUCCESS [5:26.924s] [INFO] Apache HBase - Integration Tests .. SUCCESS [1.363s] [INFO] Apache HBase - Examples ... SUCCESS [8.626s] [INFO] Apache HBase - External Block Cache ... SUCCESS [0.606s] [INFO] Apache HBase - Assembly ... SUCCESS [1.394s] [INFO] Apache HBase - Shaded . 
SUCCESS [0.083s] [INFO] Apache HBase - Shaded - Client SUCCESS [0.359s] [INFO] Apache HBase - Shaded - Server SUCCESS [0.483s] Printing hanging tests Printing Failing tests {code} > Reverse Scan threw StackOverflow caused by readPt checking > -- > > Key: HBASE-14497 > URL: https://issues.apache.org/jira/browse/HBASE-14497 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 0.98.14, 1.3.0 >Reporter: Yerui Sun >Assignee: Yerui Sun > Fix For: 2.0.0 > > Attachments: 14497-branch-1-v6.patch, 14497-master-v6.patch, > HBASE-14497-0.98.patch, HBASE-14497-branch-1-v2.patch, > HBASE-14497-branch-1-v3.patch, HBASE-14497-branch-1-v6.patch, > HBASE-14497-branch-1.patch, HBASE-14497-master-v2.patch, > HBASE-14497-master-v3.patch, HBASE-14497-master-v3.patch, > HBASE-14497-master-v4.patch, HBASE-14497-master-v5.patch, > HBASE-14497-master.patch > > > I hit a stack overflow error in StoreFileScanner.seekToPreviousRow while using a > reversed scan. I searched and found HBASE-14155, but it seems to have a > different cause. > seekToPreviousRow fetches the closest preceding row and compares its > mvcc to the readPt, which is acquired when the scanner is created. If the row's mvcc is > bigger than readPt, a recursive call of seekToPreviousRow is invoked to > find the next closest preceding row. > Consider that we created a scanner for a reversed scan, and some data with > smaller rows was written and flushed before calling scanner next. When > seekToPreviousRow is invoked, it calls itself recursively until all > rows written after the scanner was created have been iterated. The depth of the > recursive call stack depends on the count of rows; a stack overflow > error will be thrown if the count of rows is large, like 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
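The recursion described in HBASE-14497 can be illustrated with a toy model. This is only a sketch under stated assumptions: the class, the method names, and the `long[] mvcc` array (one sequence id per row, in sorted row order) are hypothetical and are not taken from StoreFileScanner or from the attached patches. It contrasts a recursive backward seek, which uses one stack frame per row newer than readPt, with an iterative one that returns the same answer at constant stack depth.

```java
import java.util.Arrays;

final class ReverseSeekSketch {
    // Hypothetical model: mvcc[i] is the sequence id of the row at sorted position i.

    /** Recursive form: one stack frame per skipped row that is newer than readPt. */
    static int seekPreviousRecursive(long[] mvcc, int from, long readPt) {
        int prev = from - 1;
        if (prev < 0) {
            return -1;                      // ran off the start: no visible row
        }
        if (mvcc[prev] <= readPt) {
            return prev;                    // row is visible to this scanner
        }
        return seekPreviousRecursive(mvcc, prev, readPt);
    }

    /** Iterative form: same result, constant stack depth. */
    static int seekPreviousIterative(long[] mvcc, int from, long readPt) {
        for (int prev = from - 1; prev >= 0; prev--) {
            if (mvcc[prev] <= readPt) {
                return prev;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long[] mvcc = new long[1_000_000];
        Arrays.fill(mvcc, 9L);              // rows written after the scanner's readPt
        mvcc[0] = 1L;                       // only the first row is visible
        System.out.println(seekPreviousIterative(mvcc, mvcc.length, 5L));
        try {
            System.out.println(seekPreviousRecursive(mvcc, mvcc.length, 5L));
        } catch (StackOverflowError e) {
            System.out.println("recursive form overflowed the stack");
        }
    }
}
```

Whether the recursive call overflows in `main` depends on the JVM's thread stack size; the iterative form does not depend on it.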
[jira] [Commented] (HBASE-14511) StoreFile.Writer Meta Plugin
[ https://issues.apache.org/jira/browse/HBASE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945771#comment-14945771 ] Devaraj Das commented on HBASE-14511: - [~vladrodionov] when you say it doesn't work with MOB, could you say what's not working? Is there some test to repro the failure? > StoreFile.Writer Meta Plugin > > > Key: HBASE-14511 > URL: https://issues.apache.org/jira/browse/HBASE-14511 > Project: HBase > Issue Type: New Feature >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Attachments: HBASE-14511.v1.patch, HBASE-14511.v2.patch > > > During my work on new compaction policies (HBASE-14468, HBASE-14477) I had > to modify the existing code of StoreFile.Writer to add additional meta-info > required by these new policies. I think that it should be done by means of a > new Plugin framework, because this seems to be a general capability/feature. > As a future enhancement this can become a part of a more general > StoreFileWriter/Reader plugin architecture. But I need only the Meta section of a > store file. > This could be used, for example, to collect rowkey distribution information > during hfile creation. This info can be used later to find the optimal region > split key or to create an optimal set of sub-regions for M/R jobs or other jobs > which can operate on a sub-region level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14497) Reverse Scan threw StackOverflow caused by readPt checking
[ https://issues.apache.org/jira/browse/HBASE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14497: --- Fix Version/s: 1.3.0 > Reverse Scan threw StackOverflow caused by readPt checking > -- > > Key: HBASE-14497 > URL: https://issues.apache.org/jira/browse/HBASE-14497 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 0.98.14, 1.3.0 >Reporter: Yerui Sun >Assignee: Yerui Sun > Fix For: 2.0.0, 1.3.0 > > Attachments: 14497-branch-1-v6.patch, 14497-master-v6.patch, > HBASE-14497-0.98.patch, HBASE-14497-branch-1-v2.patch, > HBASE-14497-branch-1-v3.patch, HBASE-14497-branch-1-v6.patch, > HBASE-14497-branch-1.patch, HBASE-14497-master-v2.patch, > HBASE-14497-master-v3.patch, HBASE-14497-master-v3.patch, > HBASE-14497-master-v4.patch, HBASE-14497-master-v5.patch, > HBASE-14497-master.patch > > > I hit a stack overflow error in StoreFileScanner.seekToPreviousRow while using a > reversed scan. I searched and found HBASE-14155, but it seems to have a > different cause. > seekToPreviousRow fetches the closest preceding row and compares its > mvcc to the readPt, which is acquired when the scanner is created. If the row's mvcc is > bigger than readPt, a recursive call of seekToPreviousRow is invoked to > find the next closest preceding row. > Consider that we created a scanner for a reversed scan, and some data with > smaller rows was written and flushed before calling scanner next. When > seekToPreviousRow is invoked, it calls itself recursively until all > rows written after the scanner was created have been iterated. The depth of the > recursive call stack depends on the count of rows; a stack overflow > error will be thrown if the count of rows is large, like 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945789#comment-14945789 ] Mikhail Antonov commented on HBASE-14559: - [~stack] when you traced the tests using excessive number of threads, did they timeout because they run slower with lower number of threads, or did they deadlock? I think there're still bugs lurking around in the implementation :( If we have number of thread handlers 3 rather than 40, I might expect things running noticeably slower, but not the deadlock? > branch-1 test tweeks; disable assert explicit region lands post-restart and > up a few handlers > - > > Key: HBASE-14559 > URL: https://issues.apache.org/jira/browse/HBASE-14559 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14559.branch-1.txt, 14559.master.txt > > > Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). > Small tweaks get tests to pass. Small one liners that up priority handler > count and disable assert that seems wrong -- that we'll always get an explcit > region to land on a newly started server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (HBASE-14436) HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create new Configuration
[ https://issues.apache.org/jira/browse/HBASE-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack resolved HBASE-14436. --- Resolution: Fixed Assignee: stack Hadoop Flags: Reviewed Fix Version/s: 0.98.16 1.1.3 1.0.3 1.3.0 1.2.0 2.0.0 Pushed to 0.98+ > HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create > new Configuration > --- > > Key: HBASE-14436 > URL: https://issues.apache.org/jira/browse/HBASE-14436 > Project: HBase > Issue Type: Improvement > Components: Coprocessors >Affects Versions: 1.2.1 >Reporter: Jianwei Cui >Assignee: stack >Priority: Minor > Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16 > > Attachments: HBASE-14436-trunk-v1.patch, HBASE-14436-trunk-v2.patch > > > HTableDescriptor#addCoprocessor sets the coprocessor value in the following > format: > {code} > public HTableDescriptor addCoprocessor(String className, Path jarFilePath, > int priority, final Map kvs) > throws IOException { > ... > String value = ((jarFilePath == null)? "" : jarFilePath.toString()) + > "|" + className + "|" + Integer.toString(priority) + "|" + > kvString.toString(); > ... > } > {code} > If the 'jarFilePath' is null, the 'value' will always have the format > '|className|priority|' even if 'kvs' is null, which means no extra arguments > for the coprocessor. Then, on the server side, > RegionCoprocessorHost#getTableCoprocessorAttrsFromSchema will load the table > coprocessors as: > {code} > static List > getTableCoprocessorAttrsFromSchema(Configuration conf, > HTableDescriptor htd) { > ... > try { > cfgSpec = matcher.group(4); // => cfgSpec will be '|' for the > format '|className|priority|' > } catch (IndexOutOfBoundsException ex) { > // ignore > } > Configuration ourConf; > if (cfgSpec != null) { // => cfgSpec will be '|' for the format > '|className|priority|' > ourConf = new Configuration(false); > HBaseConfiguration.merge(ourConf, conf); > } > ... 
> } > {code} > The 'cfgSpec' will be '|' for a coprocessor formatted as > '|className|priority|', so a new Configuration is always created. > In our production, there are a lot of tables having table-level coprocessors, > so the region server will create a new Configuration for each region of > the table; this consumes a certain amount of memory when we have many > such regions. > To fix the problem, we can make the HTableDescriptor not append the '|' if there are no > extra arguments for the coprocessor, or check the 'cfgSpec' more strictly on the > server side, which would also avoid creating new Configurations for such existing > regions after they are reopened. Discussions and suggestions are welcome. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
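The second option in HBASE-14436 (checking the argument field more strictly on the server side) might look roughly like the sketch below. It is an illustration only: `CoprocessorSpecSketch`, `cfgSpecOf` and `needsOwnConfiguration` are hypothetical names, and the plain `split` on '|' is a simplification that stands in for the regular expression HBase actually uses, which this sketch does not reproduce. The idea is that a private Configuration is only worth creating when the fourth field carries real key/value arguments.

```java
final class CoprocessorSpecSketch {
    /** Returns the trimmed fourth field of "path|class|priority|args", or "" if it is absent. */
    static String cfgSpecOf(String value) {
        // A limit of -1 keeps the trailing empty field produced by a final '|'.
        String[] parts = value.split("\\|", -1);
        return parts.length > 3 ? parts[3].trim() : "";
    }

    /** True only when the spec carries extra arguments, so a separate Configuration is justified. */
    static boolean needsOwnConfiguration(String value) {
        return !cfgSpecOf(value).isEmpty();
    }

    public static void main(String[] args) {
        // Hypothetical coprocessor specs, for illustration only.
        System.out.println(needsOwnConfiguration("|com.example.Observer|1001|"));
        System.out.println(needsOwnConfiguration("|com.example.Observer|1001|k1=v1,k2=v2"));
    }
}
```

With a check of this kind, regions whose stored spec already ends in a bare '|' could share the region server's Configuration without the table descriptors having to be rewritten.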
[jira] [Created] (HBASE-14565) Make ZK connection timeout configurable in MiniZooKeeperCluster
Ted Yu created HBASE-14565: -- Summary: Make ZK connection timeout configurable in MiniZooKeeperCluster Key: HBASE-14565 URL: https://issues.apache.org/jira/browse/HBASE-14565 Project: HBase Issue Type: Improvement Reporter: Ted Yu Assignee: Ted Yu This request was made by [~swagle] who works on Ambari Metrics System. Currently a hardcoded timeout of 30s is used by MiniZooKeeperCluster This affects operation of Ambari Metrics System in standalone mode. This JIRA is to make the connection timeout configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14565) Make ZK connection timeout configurable in MiniZooKeeperCluster
[ https://issues.apache.org/jira/browse/HBASE-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14565: --- Status: Patch Available (was: Open) > Make ZK connection timeout configurable in MiniZooKeeperCluster > --- > > Key: HBASE-14565 > URL: https://issues.apache.org/jira/browse/HBASE-14565 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 14565-v1.txt > > > This request was made by [~swagle] who works on Ambari Metrics System. > Currently a hardcoded timeout of 30s is used by MiniZooKeeperCluster > This affects operation of Ambari Metrics System in standalone mode. > This JIRA is to make the connection timeout configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14565) Make ZK connection timeout configurable in MiniZooKeeperCluster
[ https://issues.apache.org/jira/browse/HBASE-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14565: --- Attachment: 14565-v1.txt > Make ZK connection timeout configurable in MiniZooKeeperCluster > --- > > Key: HBASE-14565 > URL: https://issues.apache.org/jira/browse/HBASE-14565 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 14565-v1.txt > > > This request was made by [~swagle] who works on Ambari Metrics System. > Currently a hardcoded timeout of 30s is used by MiniZooKeeperCluster > This affects operation of Ambari Metrics System in standalone mode. > This JIRA is to make the connection timeout configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14563) Disable zombie TestHFileOutputFormat2
[ https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945844#comment-14945844 ] Hudson commented on HBASE-14563: SUCCESS: Integrated in HBase-1.3-IT #214 (See [https://builds.apache.org/job/HBase-1.3-IT/214/]) HBASE-14563 Disable zombie TestHFileOutputFormat2 (stack: rev aeb3a624590be8bd276e58bba9d4debfb3e7759f) * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java > Disable zombie TestHFileOutputFormat2 > - > > Key: HBASE-14563 > URL: https://issues.apache.org/jira/browse/HBASE-14563 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14563.txt > > > Disabling until someone has a chance to look at it. > I watched it in jvisualvm a while. It's starting and stopping clusters > multiple times and then running MR jobs. Needs a rewrite at least and some > shrinking of scope on what is tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14511) StoreFile.Writer Meta Plugin
[ https://issues.apache.org/jira/browse/HBASE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945848#comment-14945848 ] Vladimir Rodionov commented on HBASE-14511: --- Yes, two or three MOB tests fail constantly if there is additional data in the meta section of a store file. I think it is a MOB issue. > StoreFile.Writer Meta Plugin > > > Key: HBASE-14511 > URL: https://issues.apache.org/jira/browse/HBASE-14511 > Project: HBase > Issue Type: New Feature >Reporter: Vladimir Rodionov >Assignee: Vladimir Rodionov > Attachments: HBASE-14511.v1.patch, HBASE-14511.v2.patch > > > During my work on new compaction policies (HBASE-14468, HBASE-14477) I had > to modify the existing code of StoreFile.Writer to add additional meta-info > required by these new policies. I think that it should be done by means of a > new Plugin framework, because this seems to be a general capability/feature. > As a future enhancement this can become a part of a more general > StoreFileWriter/Reader plugin architecture. But I need only the Meta section of a > store file. > This could be used, for example, to collect rowkey distribution information > during hfile creation. This info can be used later to find the optimal region > split key or to create an optimal set of sub-regions for M/R jobs or other jobs > which can operate on a sub-region level. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14561) Disable zombie TestReplicationShell
[ https://issues.apache.org/jira/browse/HBASE-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945862#comment-14945862 ] Hudson commented on HBASE-14561: FAILURE: Integrated in HBase-TRUNK #6878 (See [https://builds.apache.org/job/HBase-TRUNK/6878/]) HBASE-14561 Disable zombie TestReplicationShell (stack: rev fd6acbbf51998a964b6dc0c7d3ee471399a03baa) * hbase-shell/src/test/java/org/apache/hadoop/hbase/client/TestReplicationShell.java > Disable zombie TestReplicationShell > --- > > Key: HBASE-14561 > URL: https://issues.apache.org/jira/browse/HBASE-14561 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0 > > Attachments: 14561.txt > > > It hung three times in last 40 test runs. Will file issue to reenable it when > someone has chance to look at why it is hanging. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers
[ https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945864#comment-14945864 ] stack commented on HBASE-14559: --- bq. stack when you traced the tests using excessive number of threads, did they timeout because they run slower with lower number of threads, or did they deadlock? At the extreme, the test would not run (OOME could not create native thread). I cut the thread count down and then they would not complete (smile). bq. I think there're still bugs lurking around in the implementation If we have number of thread handlers 3 rather than 40, I might expect things running noticeably slower, but not the deadlock? The deadlock was handlers all occupied.. not enough for the test to complete. > branch-1 test tweeks; disable assert explicit region lands post-restart and > up a few handlers > - > > Key: HBASE-14559 > URL: https://issues.apache.org/jira/browse/HBASE-14559 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14559.branch-1.txt, 14559.master.txt > > > Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). > Small tweaks get tests to pass. Small one liners that up priority handler > count and disable assert that seems wrong -- that we'll always get an explcit > region to land on a newly started server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
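The "handlers all occupied" situation discussed in HBASE-14559 can be reproduced in miniature with a plain fixed-size thread pool. This is a generic sketch, not HBase's RPC scheduler, and it assumes (as one plausible reading of the thread) that a handler blocks while waiting on work that needs another handler from the same pool. `HandlerStarvationSketch` and `completes` are hypothetical names.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class HandlerStarvationSketch {
    /** Returns true if a request that waits on a nested request finishes within the timeout. */
    static boolean completes(int handlers, long timeoutMs) {
        ExecutorService pool = Executors.newFixedThreadPool(handlers);
        try {
            Future<String> outer = pool.submit(() -> {
                // The nested request needs a free handler from the same pool.
                Future<String> inner = pool.submit(() -> "done");
                return inner.get();         // this handler blocks until the nested one runs
            });
            outer.get(timeoutMs, TimeUnit.MILLISECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;                   // every handler was busy waiting: nothing progressed
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdownNow();             // interrupt any handler still blocked
        }
    }

    public static void main(String[] args) {
        System.out.println("1 handler:  " + completes(1, 200));
        System.out.println("2 handlers: " + completes(2, 5000));
    }
}
```

In this toy, the single-handler pool does not merely run slower; the outer task can never finish because the only handler is the one doing the waiting. That is why a lower handler count can look like a deadlock rather than a slowdown.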
[jira] [Updated] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer
[ https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-13819: -- Attachment: HBASE-13819_branch-1.patch Retry Commit I'd say [~anoop.hbase] Needs a release note sir. > Make RPC layer CellBlock buffer a DirectByteBuffer > -- > > Key: HBASE-13819 > URL: https://issues.apache.org/jira/browse/HBASE-13819 > Project: HBase > Issue Type: Sub-task > Components: Scanners >Reporter: Anoop Sam John >Assignee: Anoop Sam John > Fix For: 2.0.0, 1.3.0 > > Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, > HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch > > > In the RPC layer, when we make a cellBlock to put as RPC payload, we make an > on heap byte buffer (via BoundedByteBufferPool). The pool will keep up to a > certain number of buffers. This jira aims at testing the possibility of making > these buffers off heap ones (DBB). The advantages: > 1. Unsafe based writes to off heap are faster than those to on heap. Now we are > not using unsafe based writes at all. Even if we add them, DBB will be better > 2. When Cells are backed by off heap (HBASE-11425), off heap to off heap > writes will be better > 3. Looking at the code in the SocketChannel impl, if we pass a HeapByteBuffer > to the socket channel, it will create a temp DBB and copy data to there and > only DBBs will be moved to Sockets. If we make a DBB first hand, we can > avoid this one more level of copying. > Will do different perf testing with the change and report back. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12615) Document GC conserving guidelines for contributors
[ https://issues.apache.org/jira/browse/HBASE-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945889#comment-14945889 ] Jonathan Hsieh commented on HBASE-12615: Wow that is a lot of trailing space removal. caught this in one of them: nit : This will output smt like: -> This will output something like: lgtm +1. > Document GC conserving guidelines for contributors > -- > > Key: HBASE-12615 > URL: https://issues.apache.org/jira/browse/HBASE-12615 > Project: HBase > Issue Type: Bug > Components: documentation >Reporter: Andrew Purtell >Assignee: Misty Stanley-Jones > Attachments: HBASE-12615.patch > > > LinkedIn put up a blog post with a nice concise list of GC conserving > techniques we should document for contributors. Additionally, when we're at a > point our build supports custom error-prone plugins, we can develop warnings > for some of them. > Source: > http://engineering.linkedin.com/performance/linkedin-feed-faster-less-jvm-garbage > - Be careful with Iterators > - Estimate the size of a collection when initializing > - Defer expression evaluation > - Compile the regex patterns in advance > - Cache it if you can > - String Interns are useful but dangerous > All good advice and practice that I know we aim for. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
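Three of the techniques listed in HBASE-12615 (compile regex patterns in advance, estimate collection size when initializing, defer expression evaluation) fit in a few lines of plain Java. The sketch below is illustrative only; the class and method names are made up here and do not come from the HBase book or the patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Supplier;
import java.util.regex.Pattern;

final class GcFriendlySketch {
    // Compile the regex once; Pattern instances are immutable and safe to share.
    private static final Pattern COMMA = Pattern.compile(",");

    /** Splits a line, presizing the list so it does not have to grow repeatedly. */
    static List<String> splitFields(String line, int expectedFields) {
        List<String> out = new ArrayList<>(expectedFields);
        for (String field : COMMA.split(line)) {
            out.add(field);
        }
        return out;
    }

    /** Defers building the message: the supplier only runs when logging is enabled. */
    static void debug(boolean enabled, Supplier<String> message) {
        if (enabled) {
            System.out.println(message.get());
        }
    }

    public static void main(String[] args) {
        List<String> fields = splitFields("row1,cf:q,value", 3);
        // No string concatenation happens here because logging is disabled.
        debug(false, () -> "parsed " + fields.size() + " fields: " + fields);
        debug(true, () -> "parsed " + fields.size() + " fields");
    }
}
```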
[jira] [Commented] (HBASE-12983) HBase book mentions hadoop.ssl.enabled when it should be hbase.ssl.enabled
[ https://issues.apache.org/jira/browse/HBASE-12983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945891#comment-14945891 ] Jonathan Hsieh commented on HBASE-12983: +1 lgtm. > HBase book mentions hadoop.ssl.enabled when it should be hbase.ssl.enabled > -- > > Key: HBASE-12983 > URL: https://issues.apache.org/jira/browse/HBASE-12983 > Project: HBase > Issue Type: Bug > Components: documentation >Reporter: Esteban Gutierrez >Assignee: Misty Stanley-Jones > Attachments: HBASE-12983.patch > > > In the HBase book we say the following: > {quote} > A default HBase install uses insecure HTTP connections for web UIs for the > master and region servers. To enable secure HTTP (HTTPS) connections instead, > set *hadoop.ssl.enabled* to true in hbase-site.xml. This does not change the > port used by the Web UI. To change the port for the web UI for a given HBase > component, configure that port’s setting in hbase-site.xml. These settings > are: > {quote} > The property should be *hbase.ssl.enabled* instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13425) Documentation nit in REST Gateway impersonation section
[ https://issues.apache.org/jira/browse/HBASE-13425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945902#comment-14945902 ] Jonathan Hsieh commented on HBASE-13425: +1 lgtm. > Documentation nit in REST Gateway impersonation section > --- > > Key: HBASE-13425 > URL: https://issues.apache.org/jira/browse/HBASE-13425 > Project: HBase > Issue Type: Improvement > Components: documentation >Affects Versions: 2.0.0 >Reporter: Jeremie Gomez >Assignee: Misty Stanley-Jones >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-13425.patch > > > In section "55.8. REST Gateway Impersonation Configuration", there is another > property that needs to be set (and thus documented). > After this sentence ("To enable REST gateway impersonation, add the following > to the hbase-site.xml file for every REST gateway."), we should add: > <property> > <name>hbase.rest.support.proxyuser</name> > <value>true</value> > </property> > If not set, doing a curl call on the REST gateway gives the error "support > for proxyuser is not configured". -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13478) Document the change of default master ports being used .
[ https://issues.apache.org/jira/browse/HBASE-13478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945903#comment-14945903 ] Jonathan Hsieh commented on HBASE-13478: +1 lgtm. > Document the change of default master ports being used . > > > Key: HBASE-13478 > URL: https://issues.apache.org/jira/browse/HBASE-13478 > Project: HBase > Issue Type: Sub-task > Components: documentation >Reporter: Srikanth Srungarapu >Assignee: Misty Stanley-Jones >Priority: Minor > Attachments: HBASE-13478.patch > > > In 1.0.x, master by default binds to the region server ports. But in the 1.1 and > 2.0 branches, we have undone this change and brought back the usage of the old > master ports to make the migration from 0.98 -> 1.1 hassle free. Please see > the parent jira for more background. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14424) Document that DisabledRegionSplitPolicy blocks manual splits
[ https://issues.apache.org/jira/browse/HBASE-14424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945907#comment-14945907 ] Jonathan Hsieh commented on HBASE-14424: Change 'DelimitedKeyPrefixRegionSplitPolicy, KeyPrefixRegionSplitPolicy etc. DisabledRegionSplitPolicy' to 'DelimitedKeyPrefixRegionSplitPolicy, KeyPrefixRegionSplitPolicy, DisabledRegionSplitPolicy, etc.' then +1 lgtm. > Document that DisabledRegionSplitPolicy blocks manual splits > > > Key: HBASE-14424 > URL: https://issues.apache.org/jira/browse/HBASE-14424 > Project: HBase > Issue Type: Task > Components: documentation >Reporter: Misty Stanley-Jones >Assignee: Misty Stanley-Jones >Priority: Minor > Attachments: HBASE-14424.patch > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner
[ https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945926#comment-14945926 ] stack commented on HBASE-13082: --- bq. But in case of bulk loaded files, currently in between a scan if a new file is bulk loaded it gets included, so after this it will not be. is that behavioral change fine? Sorry, say more [~ram_krish]. So, bulk load won't show mid-scan... you have to get to the end? That would be fine. On the patch, can we get more of Lars's comments in on what is going on? Could we get rid of some of these getReaderLocks too... in hstorefile, in hstore, etc. Would be good to not let this stuff out if we can. > Coarsen StoreScanner locks to RegionScanner > --- > > Key: HBASE-13082 > URL: https://issues.apache.org/jira/browse/HBASE-13082 > Project: HBase > Issue Type: Bug >Reporter: Lars Hofhansl >Assignee: ramkrishna.s.vasudevan > Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, > 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, > next.png, next.png > > > Continuing where HBASE-10015 left off. > We can avoid locking (and memory fencing) inside StoreScanner by deferring to > the lock already held by the RegionScanner. > In tests this shows quite a scan improvement and reduced CPU (the fences make > the cores wait for memory fetches). > There are some drawbacks too: > * All calls to RegionScanner need to remain synchronized > * Implementors of coprocessors need to be diligent in following the locking > contract. For example Phoenix does not lock RegionScanner.nextRaw() as > required in the documentation (not picking on Phoenix, this one is my fault > as I told them it's OK) > * possible starving of flushes and compaction with heavy read load. > RegionScanner operations would keep getting the locks and the > flushes/compactions would not be able to finalize the set of files. > I'll have a patch soon. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14565) Make ZK connection timeout configurable in MiniZooKeeperCluster
[ https://issues.apache.org/jira/browse/HBASE-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14565: --- Attachment: 14565-v1.txt > Make ZK connection timeout configurable in MiniZooKeeperCluster > --- > > Key: HBASE-14565 > URL: https://issues.apache.org/jira/browse/HBASE-14565 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 14565-v1.txt > > > This request was made by [~swagle] who works on Ambari Metrics System. > Currently a hardcoded timeout of 30s is used by MiniZooKeeperCluster > This affects operation of Ambari Metrics System in standalone mode. > This JIRA is to make the connection timeout configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14565) Make ZK connection timeout configurable in MiniZooKeeperCluster
[ https://issues.apache.org/jira/browse/HBASE-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14565: --- Attachment: (was: 14565-v1.txt) > Make ZK connection timeout configurable in MiniZooKeeperCluster > --- > > Key: HBASE-14565 > URL: https://issues.apache.org/jira/browse/HBASE-14565 > Project: HBase > Issue Type: Improvement >Reporter: Ted Yu >Assignee: Ted Yu > Attachments: 14565-v1.txt > > > This request was made by [~swagle] who works on Ambari Metrics System. > Currently a hardcoded timeout of 30s is used by MiniZooKeeperCluster > This affects operation of Ambari Metrics System in standalone mode. > This JIRA is to make the connection timeout configurable. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14420) Zombie Stomping Session
[ https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945948#comment-14945948 ] stack commented on HBASE-14420: --- Looking at trunk builds, I see these failures in last twenty builds:
{code}
2 Hanging test : org.apache.hadoop.hbase.TestMovedRegionsCleaner
2 Hanging test : org.apache.hadoop.hbase.TestMultiVersions
2 Hanging test : org.apache.hadoop.hbase.TestPartialResultsFromClientSide
2 Hanging test : org.apache.hadoop.hbase.backup.TestHFileArchiving
10 Hanging test : org.apache.hadoop.hbase.client.TestFromClientSide
8 Hanging test : org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor
2 Hanging test : org.apache.hadoop.hbase.client.TestReplicasClient
2 Hanging test : org.apache.hadoop.hbase.http.TestHttpServer
2 Hanging test : org.apache.hadoop.hbase.mapred.TestMultiTableSnapshotInputFormat
4 Hanging test : org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat
2 Hanging test : org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
4 Hanging test : org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat2
2 Hanging test : org.apache.hadoop.hbase.mapreduce.TestImportExport
2 Hanging test : org.apache.hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFilesSplitRecovery
2 Hanging test : org.apache.hadoop.hbase.master.TestDistributedLogSplitting
4 Hanging test : org.apache.hadoop.hbase.master.TestSplitLogManager
2 Hanging test : org.apache.hadoop.hbase.master.TestTableLockManager
2 Hanging test : org.apache.hadoop.hbase.master.TestWarmupRegion
2 Hanging test : org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer
2 Hanging test : org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer2
4 Hanging test : org.apache.hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures
1 Hanging test : org.apache.hadoop.hbase.regionserver.TestDefaultCompactSelection
2 Hanging test : org.apache.hadoop.hbase.regionserver.TestMobStoreScanner
2 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint
2 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestReplicationWALReaderManager
2 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes
2 Hanging test : org.apache.hadoop.hbase.util.TestHBaseFsck
2 Hanging test : org.apache.hadoop.hbase.util.TestMiniClusterLoadEncoded
2 Hanging test : org.apache.hadoop.hbase.util.TestMiniClusterLoadParallel
2 Hanging test : org.apache.hadoop.hbase.util.TestRegionSplitter
2 Hanging test : org.apache.hadoop.hbase.wal.TestWALFiltering
2 Hanging test : org.apache.hadoop.hbase.wal.TestWALSplit
2 Hanging test : org.apache.hadoop.hbase.zookeeper.TestHQuorumPeer
{code}
Here is branch-1.1 builds:
{code}
1 Hanging test : org.apache.hadoop.hbase.TestPartialResultsFromClientSide
1 Hanging test : org.apache.hadoop.hbase.client.TestAdmin1
1 Hanging test : org.apache.hadoop.hbase.client.TestCloneSnapshotFromClient
1 Hanging test : org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor
1 Hanging test : org.apache.hadoop.hbase.client.TestHCM
1 Hanging test : org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient
1 Hanging test : org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat
1 Hanging test : org.apache.hadoop.hbase.quotas.TestQuotaAdmin
1 Hanging test : org.apache.hadoop.hbase.quotas.TestQuotaThrottle
1 Hanging test : org.apache.hadoop.hbase.regionserver.TestJoinedScanners
1 Hanging test : org.apache.hadoop.hbase.regionserver.TestTags
1 Hanging test : org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster
1 Hanging test : org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
1 Hanging test : org.apache.hadoop.hbase.replication.TestMasterReplication
1 Hanging test : org.apache.hadoop.hbase.replication.TestReplicationSmallTests
1 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint
1 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestReplicationWALReaderManager
1 Hanging test : org.apache.hadoop.hbase.security.access.TestAccessController
1 Hanging test : org.apache.hadoop.hbase.security.access.TestAccessController2
1 Hanging test : org.apache.hadoop.hbase.security.access.TestTablePermissions
1 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsReplication
1 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDefaultVisLabelService
2 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes
1 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDistributedLogReplay
{code}
> Zombie Stomping Session > --- > >
[jira] [Updated] (HBASE-12911) Client-side metrics
[ https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Dimiduk updated HBASE-12911: - Attachment: 12911.yammer.v03.branch-1.patch Here's a backport of yammer.v03 to branch-1. [~busbey] I think we're close here. Are you interested in this for 1.2? > Client-side metrics > --- > > Key: HBASE-12911 > URL: https://issues.apache.org/jira/browse/HBASE-12911 > Project: HBase > Issue Type: New Feature > Components: Client, Operability, Performance >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk > Fix For: 2.0.0, 1.3.0 > > Attachments: 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, 12911-0.98.00.patch, > 12911-branch-1.00.patch, 12911.yammer.jpg, 12911.yammer.v00.patch, > 12911.yammer.v01.patch, 12911.yammer.v02.patch, 12911.yammer.v02.patch, > 12911.yammer.v03.branch-1.patch, 12911.yammer.v03.patch, > 12911.yammer.v03.patch, am.jpg, client metrics RS-Master.jpg, client metrics > client.jpg, conn_agg.jpg, connection attributes.jpg, ltt.jpg, standalone.jpg > > > There's very little visibility into the hbase client. Folks who care to add > some kind of metrics collection end up wrapping Table method invocations with > {{System.currentTimeMillis()}}. For a crude example of this, have a look at > what I did in {{PerformanceEvaluation}} for exposing requests latencies up to > {{IntegrationTestRegionReplicaPerf}}. The client is quite complex, there's a > lot going on under the hood that is impossible to see right now without a > profiler. 
Since the client is a crucial part of the performance of this distributed system, > we should have deeper visibility into how it functions. > I'm not sure that wiring into the hadoop metrics system is the right choice > because the client is often embedded as a library in a user's application. We > should have integration with our metrics tools so that, e.g., a client > embedded in a coprocessor can report metrics through the usual RS channels, > or a client used in an MR job can do the same. > I would propose an interface-based system with pluggable implementations. Out > of the box we'd include a hadoop-metrics implementation and one other, > possibly [dropwizard/metrics|https://github.com/dropwizard/metrics]. > Thoughts? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
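The "interface-based system with pluggable implementations" proposed in the description could look roughly like the sketch below. Everything here is hypothetical and for illustration only: `ClientMetricsSink`, `InMemorySink` and `MeteredClient` are made-up names, not classes from the attached patches or from HBase, and the in-memory sink merely stands in for a hadoop-metrics or dropwizard-backed implementation.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical plug-in point: the client only ever talks to this interface. */
interface ClientMetricsSink {
    void recordLatency(String operation, long nanos);
}

/** Hypothetical implementation that keeps per-operation counters in memory. */
final class InMemorySink implements ClientMetricsSink {
    private final Map<String, LongAdder> counts = new ConcurrentHashMap<>();
    private final Map<String, LongAdder> totalNanos = new ConcurrentHashMap<>();

    @Override
    public void recordLatency(String operation, long nanos) {
        counts.computeIfAbsent(operation, k -> new LongAdder()).increment();
        totalNanos.computeIfAbsent(operation, k -> new LongAdder()).add(nanos);
    }

    /** Number of calls recorded for the operation, 0 if none. */
    long count(String operation) {
        LongAdder adder = counts.get(operation);
        return adder == null ? 0L : adder.sum();
    }
}

/** Hypothetical client wrapper that times each call and reports it to the sink. */
final class MeteredClient {
    private final ClientMetricsSink sink;

    MeteredClient(ClientMetricsSink sink) {
        this.sink = sink;
    }

    <T> T timed(String operation, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            // Recorded even when the call throws, so failures are still counted.
            sink.recordLatency(operation, System.nanoTime() - start);
        }
    }

    public static void main(String[] args) {
        InMemorySink sink = new InMemorySink();
        MeteredClient client = new MeteredClient(sink);
        client.timed("get", () -> "row1");
        System.out.println("get calls recorded: " + sink.count("get"));
    }
}
```

With this shape, a client embedded in a coprocessor or an MR job would be handed a different `ClientMetricsSink` without the calling code changing, which is the point of the pluggable design.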
[jira] [Commented] (HBASE-12593) Tags and Tag dictionary to work with BB
[ https://issues.apache.org/jira/browse/HBASE-12593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945953#comment-14945953 ] stack commented on HBASE-12593: --- What needs to be done here? > Tags and Tag dictionary to work with BB > --- > > Key: HBASE-12593 > URL: https://issues.apache.org/jira/browse/HBASE-12593 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Reporter: ramkrishna.s.vasudevan >Assignee: Anoop Sam John > > Adding the subtask so that we don't forget it. Came up while reviewing the > items required for this parent task. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14398) Create the fake keys required in the scan path to avoid copy to byte[]
[ https://issues.apache.org/jira/browse/HBASE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945962#comment-14945962 ] stack commented on HBASE-14398: --- bq. This was a long discussion that we had before finalising. Yeah. I remember that one. Seems to be about a topic that is a little different to the question here. Why does ByteBufferedCell have to have getFamilyPositionInByteBuffer at all? Why can't I just call getFamilyOffset on the ByteBufferedCell implementation and have it return an offset that makes sense on the ByteBuffer returned out of getFamilyByteBuffer? (A Cell can't be simultaneously onheap and offheap at the same time, right?) > Create the fake keys required in the scan path to avoid copy to byte[] > -- > > Key: HBASE-14398 > URL: https://issues.apache.org/jira/browse/HBASE-14398 > Project: HBase > Issue Type: Sub-task >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: HBASE-14398.patch, HBASE-14398_1.patch > > > We have already created some fake keys for the ByteBufferedCells so that we > can avoid the copy required to create fake keys. This JIRA aims to cover > all such places so that the offheap BBs are not copied to onheap byte[]. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14221) Reduce the number of time row comparison is done in a Scan
[ https://issues.apache.org/jira/browse/HBASE-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945966#comment-14945966 ] stack commented on HBASE-14221: --- [~lhofhansl] For you... > Reduce the number of time row comparison is done in a Scan > -- > > Key: HBASE-14221 > URL: https://issues.apache.org/jira/browse/HBASE-14221 > Project: HBase > Issue Type: Sub-task > Components: Scanners >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > Fix For: 2.0.0 > > Attachments: HBASE-14221.patch, HBASE-14221_1.patch, > HBASE-14221_1.patch, HBASE-14221_6.patch, withmatchingRowspatch.png, > withoutmatchingRowspatch.png > > > When we tried to do some profiling with the PE tool found this. > Currently we do row comparisons in 3 places in a simple Scan case. > 1) ScanQueryMatcher > {code} >int ret = this.rowComparator.compareRows(curCell, cell); > if (!this.isReversed) { > if (ret <= -1) { > return MatchCode.DONE; > } else if (ret >= 1) { > // could optimize this, if necessary? > // Could also be called SEEK_TO_CURRENT_ROW, but this > // should be rare/never happens. > return MatchCode.SEEK_NEXT_ROW; > } > } else { > if (ret <= -1) { > return MatchCode.SEEK_NEXT_ROW; > } else if (ret >= 1) { > return MatchCode.DONE; > } > } > {code} > 2) In StoreScanner next() while starting to scan the row > {code} > if (!scannerContext.hasAnyLimit(LimitScope.BETWEEN_CELLS) || > matcher.curCell == null || > isNewRow || !CellUtil.matchingRow(peeked, matcher.curCell)) { > this.countPerRow = 0; > matcher.setToNewRow(peeked); > } > {code} > Particularly to see if we are in a new row. > 3) In HRegion > {code} > scannerContext.setKeepProgress(true); > heap.next(results, scannerContext); > scannerContext.setKeepProgress(tmpKeepProgress); > nextKv = heap.peek(); > moreCellsInRow = moreCellsInRow(nextKv, currentRowCell); > {code} > Here again there are cases where we need to careful for a MultiCF case. 
I was > trying to solve this for the MultiCF case, but it has a lot of cases to > solve. At least for the single CF case, though, I think these comparisons can be > reduced. > For a single CF, the SQM (ScanQueryMatcher) can tell whether we have crossed a > row using the code pasted above. That comparison is definitely needed. > With a single CF the HRegion is going to have only one element in > the heap, so the 3rd comparison can surely be avoided if > StoreScanner.next() finished because the SQM returned MatchCode.DONE. > Coming to the 2nd compareRows that we do in StoreScanner.next(): even that > can be avoided if we know that the previous next() call finished because of a new > row. Doing all this, I found that compareRows in the profiler, which was at > 19%, dropped to 13%. Initially we can solve the single CF case, which can later > be extended to MultiCF cases. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14117) Check DBEs where fields are being read from Bytebuffers but unused.
[ https://issues.apache.org/jira/browse/HBASE-14117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945968#comment-14945968 ] stack commented on HBASE-14117: --- What else is to be done here? If it is a speculative benefit, move it out as a subtask of the parent issue? > Check DBEs where fields are being read from Bytebuffers but unused. > --- > > Key: HBASE-14117 > URL: https://issues.apache.org/jira/browse/HBASE-14117 > Project: HBase > Issue Type: Sub-task >Reporter: ramkrishna.s.vasudevan >Assignee: Jingcheng Du > > {code} > public Cell getFirstKeyCellInBlock(ByteBuff block) { > block.mark(); > block.position(Bytes.SIZEOF_INT); > int keyLength = ByteBuff.readCompressedInt(block); > // TODO : See if we can avoid these reads as the read values are not > getting used > ByteBuff.readCompressedInt(block); > {code} > In the DBEs (data block encoders) there are many places where we read integers just to skip them. This JIRA is to > see if we can avoid this and go position-based instead, as per a review > comment in HBASE-12213. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
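To make "read the integer just to skip it" versus "go position-based" concrete, here is a minimal sketch on a plain `java.nio.ByteBuffer`. It assumes a 7-bits-per-byte encoding with a continuation flag in the high bit; that encoding, the class name `CompressedIntSkipSketch` and all three helper methods are assumptions made for illustration and are not HBase's `ByteBuff` API. Note that a variable-length integer still has to be inspected to find where it ends, so in this sketch the saving is only the decode work (shifts and ORs), not the byte accesses.

```java
import java.nio.ByteBuffer;

final class CompressedIntSkipSketch {
    /** Writes a non-negative int, 7 bits per byte, high bit set on all but the last byte. */
    static void writeCompressedInt(ByteBuffer buf, int value) {
        while ((value & ~0x7F) != 0) {
            buf.put((byte) ((value & 0x7F) | 0x80));
            value >>>= 7;
        }
        buf.put((byte) value);
    }

    /** Decodes one integer and advances the buffer position past it. */
    static int readCompressedInt(ByteBuffer buf) {
        int result = 0;
        int shift = 0;
        byte b;
        do {
            b = buf.get();
            result |= (b & 0x7F) << shift;
            shift += 7;
        } while ((b & 0x80) != 0);
        return result;
    }

    /** Advances the position past one integer without reconstructing its value. */
    static void skipCompressedInt(ByteBuffer buf) {
        int pos = buf.position();
        // Absolute gets: scan forward until the byte whose continuation bit is clear.
        while ((buf.get(pos++) & 0x80) != 0) {
            // nothing to do, only looking for the terminating byte
        }
        buf.position(pos);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        writeCompressedInt(buf, 300); // a field whose value the caller does not need
        writeCompressedInt(buf, 5);   // the field the caller does need
        buf.flip();
        skipCompressedInt(buf);
        System.out.println("second field: " + readCompressedInt(buf));
    }
}
```

Where the skipped fields have a fixed width, the skip collapses to a single `position(position() + width)` call, which is closer to what the review comment seems to have in mind.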
[jira] [Commented] (HBASE-12291) Create Read only buffers where ever possible
[ https://issues.apache.org/jira/browse/HBASE-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945970#comment-14945970 ] stack commented on HBASE-12291: --- That'd be nice. Does it have to be part of the parent issue given it is speculative? > Create Read only buffers where ever possible > > > Key: HBASE-12291 > URL: https://issues.apache.org/jira/browse/HBASE-12291 > Project: HBase > Issue Type: Sub-task > Components: regionserver, Scanners >Reporter: ramkrishna.s.vasudevan >Assignee: ramkrishna.s.vasudevan > > This issue is to see if we can really create a read-only buffer in the read > path. Later we can see if this needs to be BR or our own BB impl. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14497) Reverse Scan threw StackOverflow caused by readPt checking
[ https://issues.apache.org/jira/browse/HBASE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ted Yu updated HBASE-14497: --- Resolution: Fixed Status: Resolved (was: Patch Available) > Reverse Scan threw StackOverflow caused by readPt checking > -- > > Key: HBASE-14497 > URL: https://issues.apache.org/jira/browse/HBASE-14497 > Project: HBase > Issue Type: Bug >Affects Versions: 2.0.0, 0.98.14, 1.3.0 >Reporter: Yerui Sun >Assignee: Yerui Sun > Fix For: 2.0.0, 1.3.0 > > Attachments: 14497-branch-1-v6.patch, 14497-master-v6.patch, > HBASE-14497-0.98.patch, HBASE-14497-branch-1-v2.patch, > HBASE-14497-branch-1-v3.patch, HBASE-14497-branch-1-v6.patch, > HBASE-14497-branch-1.patch, HBASE-14497-master-v2.patch, > HBASE-14497-master-v3.patch, HBASE-14497-master-v3.patch, > HBASE-14497-master-v4.patch, HBASE-14497-master-v5.patch, > HBASE-14497-master.patch > > > I hit a stack overflow error in StoreFileScanner.seekToPreviousRow when using a > reversed scan. I searched and found HBASE-14155, but it seems to have a > different cause. > seekToPreviousRow fetches the closest row before the current one and compares its > mvcc to the readPt, which is acquired when the scanner is created. If the row's mvcc is > bigger than readPt, seekToPreviousRow is invoked recursively to > find the next closest row before that. > Suppose we create a scanner for a reversed scan, and some data with > smaller rows is written and flushed before scanner next is called. When > seekToPreviousRow is invoked, it calls itself recursively until all > rows written after the scanner was created have been iterated. The depth of the > recursive call stack depends on the count of rows, so a stack overflow > error will be thrown if the count of rows is large, like 1. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
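The recursion problem described in the issue can be reproduced in miniature. The sketch below is not the HBase code and not the committed fix: `ReverseSeekSketch`, `Row`, `seekRecursive` and `seekIterative` are hypothetical names, and a sorted list of rows with an `mvcc` number stands in for the store file. It only shows why one stack frame per skipped row is a problem and how a loop avoids it.

```java
import java.util.ArrayList;
import java.util.List;

final class ReverseSeekSketch {
    /** Hypothetical stand-in for a stored row; list index is sort order, mvcc is its write number. */
    static final class Row {
        final long mvcc;
        Row(long mvcc) { this.mvcc = mvcc; }
    }

    /** Builds `visible` old rows (mvcc 1) followed by `hidden` rows written later (mvcc 100). */
    static List<Row> build(int visible, int hidden) {
        List<Row> rows = new ArrayList<>(visible + hidden);
        for (int i = 0; i < visible; i++) rows.add(new Row(1L));
        for (int i = 0; i < hidden; i++) rows.add(new Row(100L));
        return rows;
    }

    /** Recursive form: one stack frame per row whose mvcc is newer than readPt. Returns -1 if none. */
    static int seekRecursive(List<Row> rows, int from, long readPt) {
        int idx = from - 1;
        if (idx < 0) return -1;
        if (rows.get(idx).mvcc > readPt) {
            return seekRecursive(rows, idx, readPt); // depth grows with the number of skipped rows
        }
        return idx;
    }

    /** Iterative form: same result, constant stack depth. */
    static int seekIterative(List<Row> rows, int from, long readPt) {
        int idx = from - 1;
        while (idx >= 0 && rows.get(idx).mvcc > readPt) {
            idx--;
        }
        return idx; // -1 when every earlier row is newer than readPt
    }

    public static void main(String[] args) {
        List<Row> rows = build(1, 1_000_000);
        System.out.println("iterative: " + seekIterative(rows, rows.size(), 10L));
        try {
            System.out.println("recursive: " + seekRecursive(rows, rows.size(), 10L));
        } catch (StackOverflowError e) {
            // Expected with default JVM stack sizes when many rows must be skipped.
            System.out.println("recursive: StackOverflowError");
        }
    }
}
```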
[jira] [Commented] (HBASE-14436) HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create new Configuration
[ https://issues.apache.org/jira/browse/HBASE-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945999#comment-14945999 ] Hudson commented on HBASE-14436: FAILURE: Integrated in HBase-1.0 #1073 (See [https://builds.apache.org/job/HBase-1.0/1073/]) HBASE-14436 HTableDescriptor#addCoprocessor will always make (stack: rev c1890b5b15a3cb3ed9c00f4326e4eb6b583c55a6) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java > HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create > new Configuration > --- > > Key: HBASE-14436 > URL: https://issues.apache.org/jira/browse/HBASE-14436 > Project: HBase > Issue Type: Improvement > Components: Coprocessors >Affects Versions: 1.2.1 >Reporter: Jianwei Cui >Assignee: stack >Priority: Minor > Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16 > > Attachments: HBASE-14436-trunk-v1.patch, HBASE-14436-trunk-v2.patch > > > HTableDescriptor#addCoprocessor sets the coprocessor value in the following > format: > {code} > public HTableDescriptor addCoprocessor(String className, Path jarFilePath, > int priority, final Map kvs) > throws IOException { > ... > String value = ((jarFilePath == null)? "" : jarFilePath.toString()) + > "|" + className + "|" + Integer.toString(priority) + "|" + > kvString.toString(); > ... > } > {code} > If 'jarFilePath' is null, the 'value' will always have the format > '|className|priority|' even if 'kvs' is null, which means there are no extra arguments > for the coprocessor. Then, on the server side, > RegionCoprocessorHost#getTableCoprocessorAttrsFromSchema will load the table > coprocessors as: > {code} > static List > getTableCoprocessorAttrsFromSchema(Configuration conf, > HTableDescriptor htd) { > ... 
> try { > cfgSpec = matcher.group(4); // => cfgSpec will be '|' for the > format '|className|priority|' > } catch (IndexOutOfBoundsException ex) { > // ignore > } > Configuration ourConf; > if (cfgSpec != null) { // => cfgSpec will be '|' for the format > '|className|priority|' > ourConf = new Configuration(false); > HBaseConfiguration.merge(ourConf, conf); > } > ... > } > {code} > The 'cfgSpec' will be '|' for a coprocessor formatted as > '|className|priority|', so a new Configuration is always created. > In our production cluster, a lot of tables have table-level coprocessors, > so the region server creates a new Configuration for each region of > such tables; this consumes a noticeable amount of memory when there are many > such regions. > To fix the problem, we can make HTableDescriptor not append the '|' when there are no > extra arguments for the coprocessor, or check the 'cfgSpec' more strictly on the > server side, which would also avoid creating new Configurations for existing > regions of this kind after they are reopened. Discussions and suggestions are welcome. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
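The second option from the description, checking 'cfgSpec' more strictly on the server side, can be sketched as follows. This is only an illustration under stated assumptions: the regular expression below is written from the format given in the description and is not copied from HBase, and `CoprocessorSpecSketch`, `needsOwnConfLoose` and `needsOwnConfStrict` are hypothetical names. Whether the committed patch takes this route is not stated in the thread.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CoprocessorSpecSketch {
    // Assumed shape of the attribute value: jarPath|className|priority|k1=v1,k2=v2
    // Group 4 is the optional trailing "|..." section carrying the extra arguments.
    static final Pattern SPEC = Pattern.compile("^([^|]*)\\|([^|]+)\\|\\s*(\\d*)\\s*(\\|.*)?$");

    /** Mirrors the string concatenation quoted in the description. */
    static String build(String jarPath, String className, int priority, String kvString) {
        return (jarPath == null ? "" : jarPath) + "|" + className + "|" + priority + "|" + kvString;
    }

    private static String cfgSpecOf(String spec) {
        Matcher m = SPEC.matcher(spec);
        if (!m.matches()) {
            throw new IllegalArgumentException("not a coprocessor spec: " + spec);
        }
        return m.group(4); // null when there is no trailing section at all
    }

    /** The check as quoted: any non-null cfgSpec, including a bare "|", triggers a new Configuration. */
    static boolean needsOwnConfLoose(String spec) {
        return cfgSpecOf(spec) != null;
    }

    /** Stricter check: only a cfgSpec with something after the leading "|" counts. */
    static boolean needsOwnConfStrict(String spec) {
        String cfgSpec = cfgSpecOf(spec);
        return cfgSpec != null && !cfgSpec.substring(1).trim().isEmpty();
    }

    public static void main(String[] args) {
        String bare = build(null, "org.example.MyObserver", 1001, "");
        System.out.println(bare + " loose=" + needsOwnConfLoose(bare)
            + " strict=" + needsOwnConfStrict(bare));
    }
}
```

The strict variant has the property the reporter asks for: table descriptors already stored with the trailing '|' stop triggering a per-region Configuration once the region is reopened, without rewriting the descriptors.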
[jira] [Commented] (HBASE-14517) Show regionserver's version in master status page
[ https://issues.apache.org/jira/browse/HBASE-14517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946000#comment-14946000 ] stack commented on HBASE-14517: --- This looks really nice [~liushaohui] Operators will like it. Why did you move the VersionInfo from RPC to HBase protos? Will that break anyone? (I don't think so, since you do not change the pb data structure.) > Show regionserver's version in master status page > - > > Key: HBASE-14517 > URL: https://issues.apache.org/jira/browse/HBASE-14517 > Project: HBase > Issue Type: Improvement > Components: monitoring >Reporter: Liu Shaohui >Assignee: Liu Shaohui >Priority: Minor > Fix For: 2.0.0 > > Attachments: HBASE-14517-v1.diff > > > In a production env, regionservers may be removed from the cluster because of hardware > problems and rejoin the cluster after the repair. There is a potential risk > that the version of a rejoined regionserver differs from the others because the > cluster has been upgraded through many versions in the meantime. > To solve this, we can show every regionserver's version in the server list > of the master's status page, and highlight a regionserver when its version is > different from the master's version, similar to HDFS-3245. > Suggestions are welcome~ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14493) Upgrade the jamon-runtime dependency to the newer version MPL 2.0
[ https://issues.apache.org/jira/browse/HBASE-14493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946001#comment-14946001 ] stack commented on HBASE-14493: --- +1 from me [~apurtell] Yeah, you good w/ this [~busbey]? > Upgrade the jamon-runtime dependency to the newer version MPL 2.0 > - > > Key: HBASE-14493 > URL: https://issues.apache.org/jira/browse/HBASE-14493 > Project: HBase > Issue Type: Task >Affects Versions: 1.1.1 >Reporter: Newton Alex >Assignee: Andrew Purtell >Priority: Minor > Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.16 > > Attachments: HBASE-14493-0.98.patch, HBASE-14493-branch-1.patch, > HBASE-14493.patch, HBASE-14493.patch > > > Current version of HBase uses MPL 1.1 which has legal restrictions. Newer > versions of jamon-runtime appear to be MPL 2.0. HBase should upgrade to a > safer licensed version of jamon. > 2.4.0 is MPL 1.1 : > http://grepcode.com/snapshot/repo1.maven.org/maven2/org.jamon/jamon-runtime/2.4.0 > 2.4.1 is MPL 2.0 : > http://grepcode.com/snapshot/repo1.maven.org/maven2/org.jamon/jamon-runtime/2.4.1 > Here’s a comparison of the equivalent sections of the respective licenses > dealing w/ Termination: > MPL 1.1 - Section 8 (Termination) Subsection 2: > 8.2. 
If You initiate litigation by asserting a patent infringement claim > (excluding declatory judgment actions) against Initial Developer or a > Contributor (the Initial Developer or Contributor against whom You file such > action is referred to as "Participant") alleging that: > such Participant's Contributor Version directly or indirectly infringes any > patent, then any and all rights granted by such Participant to You under > Sections 2.1 and/or 2.2 of this License shall, upon 60 days notice from > Participant terminate prospectively, unless if within 60 days after receipt > of notice You either: (i) agree in writing to pay Participant a mutually > agreeable reasonable royalty for Your past and future use of Modifications > made by such Participant, or (ii) withdraw Your litigation claim with respect > to the Contributor Version against such Participant. If within 60 days of > notice, a reasonable royalty and payment arrangement are not mutually agreed > upon in writing by the parties or the litigation claim is not withdrawn, the > rights granted by Participant to You under Sections 2.1 and/or 2.2 > automatically terminate at the expiration of the 60 day notice period > specified above. > any software, hardware, or device, other than such Participant's Contributor > Version, directly or indirectly infringes any patent, then any rights granted > to You by such Participant under Sections 2.1(b) and 2.2(b) are revoked > effective as of the date You first made, used, sold, distributed, or had > made, Modifications made by that Participant. > MPL 2.0 - Section 5 (Termination) Subsection 2: > 5.2. 
If You initiate litigation against any entity by asserting a patent > infringement claim (excluding declaratory judgment actions, counter-claims, > and cross-claims) alleging that a Contributor Version directly or indirectly > infringes any patent, then the rights granted to You by any and all > Contributors for the Covered Software under Section 2.1 of this License shall > terminate. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14563) Disable zombie TestHFileOutputFormat2
[ https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946013#comment-14946013 ] Hudson commented on HBASE-14563: FAILURE: Integrated in HBase-1.2 #230 (See [https://builds.apache.org/job/HBase-1.2/230/]) HBASE-14563 Disable zombie TestHFileOutputFormat2 (stack: rev 22c87d9644c600788a0df5456333464cba969c49) * hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java > Disable zombie TestHFileOutputFormat2 > - > > Key: HBASE-14563 > URL: https://issues.apache.org/jira/browse/HBASE-14563 > Project: HBase > Issue Type: Sub-task > Components: test >Reporter: stack >Assignee: stack > Fix For: 2.0.0, 1.2.0, 1.3.0 > > Attachments: 14563.txt > > > Disabling until someone has a chance to look at it. > I watched it in jvisualvm for a while. It's starting and stopping clusters > multiple times and then running MR jobs. Needs a rewrite at least and some > shrinking of scope on what is tested. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14268) Improve KeyLocker
[ https://issues.apache.org/jira/browse/HBASE-14268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14268: -- Attachment: HBASE-14268-V7.patch Reattach. [~ikeda] That is interesting. Weak references will be collected by GC if in new gen but not if it makes it up into old gen (You should do a blog post on your findings here). > Improve KeyLocker > - > > Key: HBASE-14268 > URL: https://issues.apache.org/jira/browse/HBASE-14268 > Project: HBase > Issue Type: Improvement > Components: util >Reporter: Hiroshi Ikeda >Assignee: Hiroshi Ikeda >Priority: Minor > Fix For: 2.0.0, 1.3.0 > > Attachments: 14268-V5.patch, HBASE-14268-V2.patch, > HBASE-14268-V3.patch, HBASE-14268-V4.patch, HBASE-14268-V5.patch, > HBASE-14268-V5.patch, HBASE-14268-V6.patch, HBASE-14268-V7.patch, > HBASE-14268-V7.patch, HBASE-14268-V7.patch, HBASE-14268-V7.patch, > HBASE-14268-V7.patch, HBASE-14268.patch, KeyLockerIncrKeysPerformance.java, > KeyLockerPerformance.java, ReferenceTestApp.java > > > 1. The implementation of {{KeyLocker}} uses atomic variables inside a > synchronized block, which doesn't make sense. Moreover, the logic inside the > synchronized block is not trivial, so it hurts performance in a heavily > multi-threaded environment. > 2. {{KeyLocker}} hands out an instance of {{ReentrantLock}} which is already > locked, but that instance doesn't follow the contract of {{ReentrantLock}} because you > are not allowed to freely invoke its lock/unlock methods as that contract would permit. > That introduces a potential risk: whenever you see a variable of the type > {{ReentrantLock}}, you have to pay attention to where the instance > came from. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
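Both points in the description can be illustrated with a small per-key lock that (1) keeps its bookkeeping inside `ConcurrentHashMap.compute` instead of a synchronized block and (2) returns a narrow handle instead of a raw, already-locked `ReentrantLock`. This is a sketch of one possible shape only. `KeyLockSketch` and `Handle` are hypothetical names, and this is not the approach of the attached patches, which the comment above suggests rely on weak references.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

final class KeyLockSketch<K> {
    /** Lock plus a reference count; refs is only touched inside compute(), which is atomic per key. */
    private static final class Entry {
        final ReentrantLock lock = new ReentrantLock();
        int refs;
    }

    /** Callers get this instead of the ReentrantLock, so the only thing they can do is release. */
    interface Handle extends AutoCloseable {
        @Override
        void close();
    }

    private final ConcurrentHashMap<K, Entry> locks = new ConcurrentHashMap<>();

    /** Blocks until the key's lock is held by the calling thread. Close the handle once, on the same thread. */
    Handle acquire(K key) {
        Entry entry = locks.compute(key, (k, current) -> {
            Entry e = (current == null) ? new Entry() : current;
            e.refs++; // registered before blocking, so the entry cannot be removed under us
            return e;
        });
        entry.lock.lock();
        return () -> {
            entry.lock.unlock();
            locks.compute(key, (k, current) -> {
                current.refs--;
                return current.refs == 0 ? null : current; // returning null drops the unused entry
            });
        };
    }

    /** Number of keys that currently have a holder or a waiter. */
    int size() {
        return locks.size();
    }

    public static void main(String[] args) {
        KeyLockSketch<String> locker = new KeyLockSketch<>();
        try (KeyLockSketch.Handle handle = locker.acquire("region-a")) {
            System.out.println("keys in use while held: " + locker.size());
        }
        System.out.println("keys in use after release: " + locker.size());
    }
}
```

Because the handle implements `AutoCloseable`, try-with-resources gives the lock/unlock pairing that a bare `ReentrantLock` leaves to the caller.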
[jira] [Commented] (HBASE-14436) HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create new Configuration
[ https://issues.apache.org/jira/browse/HBASE-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946028#comment-14946028 ] Hudson commented on HBASE-14436: FAILURE: Integrated in HBase-1.1 #697 (See [https://builds.apache.org/job/HBase-1.1/697/]) HBASE-14436 HTableDescriptor#addCoprocessor will always make (stack: rev 2c662898037b6ad9e17399f0c7914bc785622202) * hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java > HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create > new Configuration > --- > > Key: HBASE-14436 > URL: https://issues.apache.org/jira/browse/HBASE-14436 > Project: HBase > Issue Type: Improvement > Components: Coprocessors >Affects Versions: 1.2.1 >Reporter: Jianwei Cui >Assignee: stack >Priority: Minor > Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16 > > Attachments: HBASE-14436-trunk-v1.patch, HBASE-14436-trunk-v2.patch > > > HTableDescriptor#addCoprocessor sets the coprocessor value in the following > format: > {code} > public HTableDescriptor addCoprocessor(String className, Path jarFilePath, > int priority, final Map kvs) > throws IOException { > ... > String value = ((jarFilePath == null)? "" : jarFilePath.toString()) + > "|" + className + "|" + Integer.toString(priority) + "|" + > kvString.toString(); > ... > } > {code} > If 'jarFilePath' is null, the 'value' will always have the format > '|className|priority|' even if 'kvs' is null, which means there are no extra arguments > for the coprocessor. Then, on the server side, > RegionCoprocessorHost#getTableCoprocessorAttrsFromSchema will load the table > coprocessors as: > {code} > static List > getTableCoprocessorAttrsFromSchema(Configuration conf, > HTableDescriptor htd) { > ... 
> try { > cfgSpec = matcher.group(4); // => cfgSpec will be '|' for the > format '|className|priority|' > } catch (IndexOutOfBoundsException ex) { > // ignore > } > Configuration ourConf; > if (cfgSpec != null) { // => cfgSpec will be '|' for the format > '|className|priority|' > ourConf = new Configuration(false); > HBaseConfiguration.merge(ourConf, conf); > } > ... > } > {code} > The 'cfgSpec' will be '|' for a coprocessor formatted as > '|className|priority|', so a new Configuration is always created. > In our production cluster, a lot of tables have table-level coprocessors, > so the region server creates a new Configuration for each region of > such tables; this consumes a noticeable amount of memory when there are many > such regions. > To fix the problem, we can make HTableDescriptor not append the '|' when there are no > extra arguments for the coprocessor, or check the 'cfgSpec' more strictly on the > server side, which would also avoid creating new Configurations for existing > regions of this kind after they are reopened. Discussions and suggestions are welcome. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-12911) Client-side metrics
[ https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946029#comment-14946029 ] stack commented on HBASE-12911: --- I'm good w/ use of pb (there is no unpacking that I saw...) +1'd the patch. How does an operator use this stuff? They'd have to look for client jmx footprint on a machine? Needs a bit of doc in the release note. Nice addition. > Client-side metrics > --- > > Key: HBASE-12911 > URL: https://issues.apache.org/jira/browse/HBASE-12911 > Project: HBase > Issue Type: New Feature > Components: Client, Operability, Performance >Reporter: Nick Dimiduk >Assignee: Nick Dimiduk > Fix For: 2.0.0, 1.3.0 > > Attachments: 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, > 0001-HBASE-12911-Client-side-metrics.patch, 12911-0.98.00.patch, > 12911-branch-1.00.patch, 12911.yammer.jpg, 12911.yammer.v00.patch, > 12911.yammer.v01.patch, 12911.yammer.v02.patch, 12911.yammer.v02.patch, > 12911.yammer.v03.branch-1.patch, 12911.yammer.v03.patch, > 12911.yammer.v03.patch, am.jpg, client metrics RS-Master.jpg, client metrics > client.jpg, conn_agg.jpg, connection attributes.jpg, ltt.jpg, standalone.jpg > > > There's very little visibility into the hbase client. Folks who care to add > some kind of metrics collection end up wrapping Table method invocations with > {{System.currentTimeMillis()}}. For a crude example of this, have a look at > what I did in {{PerformanceEvaluation}} for exposing requests latencies up to > {{IntegrationTestRegionReplicaPerf}}. 
The client is quite complex; there's a > lot going on under the hood that is impossible to see right now without a > profiler. Since the client is a crucial part of the performance of this distributed system, > we should have deeper visibility into how it functions. > I'm not sure that wiring into the hadoop metrics system is the right choice > because the client is often embedded as a library in a user's application. We > should have integration with our metrics tools so that, e.g., a client > embedded in a coprocessor can report metrics through the usual RS channels, > or a client used in an MR job can do the same. > I would propose an interface-based system with pluggable implementations. Out > of the box we'd include a hadoop-metrics implementation and one other, > possibly [dropwizard/metrics|https://github.com/dropwizard/metrics]. > Thoughts? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14451) Move on to htrace-4.0.1 (from htrace-3.2.0)
[ https://issues.apache.org/jira/browse/HBASE-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14451: -- Attachment: 14451.v10.txt Retry. Rebase. > Move on to htrace-4.0.1 (from htrace-3.2.0) > --- > > Key: HBASE-14451 > URL: https://issues.apache.org/jira/browse/HBASE-14451 > Project: HBase > Issue Type: Task >Reporter: stack >Assignee: stack > Attachments: 14451.txt, 14451.v10.txt, 14451v2.txt, 14451v3.txt, > 14451v4.txt, 14451v5.txt, 14451v6.txt, 14451v7.txt, 14451v8.txt, 14451v9.txt > > > htrace-4.0.0 was just release with a new API. Get up on it. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] stack updated HBASE-14479: -- Attachment: HBASE-14479-V2.patch Retry. > Apply the Leader/Followers pattern to RpcServer's Reader > > > Key: HBASE-14479 > URL: https://issues.apache.org/jira/browse/HBASE-14479 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, Performance >Reporter: Hiroshi Ikeda >Assignee: Hiroshi Ikeda >Priority: Minor > Attachments: HBASE-14479-V2.patch, HBASE-14479-V2.patch, > HBASE-14479.patch > > > {{RpcServer}} uses multiple selectors to read data for load distribution, but > the distribution is just done by round-robin. It is uncertain, especially over a > long run, whether load is equally divided and resources are used without > being wasted. > Moreover, multiple selectors may cause excessive context switches which give > priority to low latency (while we just add the requests to queues), and this > can reduce the throughput of the whole server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader
[ https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946052#comment-14946052 ] stack commented on HBASE-14479: --- Here's a link: http://www.kircher-schwanninger.de/michael/publications/lf.pdf I like the explanation here too: http://stackoverflow.com/questions/3058272/explain-leader-follower-pattern Patch seems good. Have you tried it, [~ikeda]? (If you'd messed up, unit tests would be failing...) Any way we could figure out if there is a benefit? I can try running on a cluster and see. Thanks [~ikeda] > Apply the Leader/Followers pattern to RpcServer's Reader > > > Key: HBASE-14479 > URL: https://issues.apache.org/jira/browse/HBASE-14479 > Project: HBase > Issue Type: Improvement > Components: IPC/RPC, Performance >Reporter: Hiroshi Ikeda >Assignee: Hiroshi Ikeda >Priority: Minor > Attachments: HBASE-14479-V2.patch, HBASE-14479-V2.patch, > HBASE-14479.patch > > > {{RpcServer}} uses multiple selectors to read data for load distribution, but > the distribution is just done by round-robin. It is uncertain, especially over a > long run, whether load is equally divided and resources are used without > being wasted. > Moreover, multiple selectors may cause excessive context switches which give > priority to low latency (while we just add the requests to queues), and this > can reduce the throughput of the whole server. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HBASE-14458) AsyncRpcClient#createRpcChannel() should check and remove dead channel before creating new one to same server
[ https://issues.apache.org/jira/browse/HBASE-14458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-14458:
    Attachment: HBASE-14458 (1).patch

Retry

> AsyncRpcClient#createRpcChannel() should check and remove dead channel before creating new one to same server
>
> Key: HBASE-14458
> URL: https://issues.apache.org/jira/browse/HBASE-14458
> Project: HBase
> Issue Type: Bug
> Components: IPC/RPC
> Affects Versions: 2.0.0, 1.2.0, 1.3.0, 1.1.3
> Reporter: Samir Ahmic
> Assignee: Samir Ahmic
> Priority: Critical
> Fix For: 2.0.0
> Attachments: HBASE-14458 (1).patch, HBASE-14458.patch, HBASE-14458.patch
>
> I noticed this issue while testing the master branch in distributed mode. Reproduction steps:
> 1. Write some data with hbase ltt
> 2. While ltt is writing, execute $graceful_stop.sh --restart --reload [rs]
> 3. Wait until the script starts to reload regions to the restarted server. At that moment ltt will stop writing and eventually fail.
> After some digging I noticed that while ltt is working correctly there is a single connection per regionserver (lsof for a single connection; 27109 is the ltt PID):
> {code}
> java 27109 hbase 143u 210579579 0t0 TCP hnode1:40423->hnode5:16020 (ESTABLISHED)
> {code}
> When, in this example, the hnode5 server is restarted and the script starts to reload regions on it, ltt starts creating thousands of new TCP connections to this server:
> {code}
> java 27109 hbase *623u 210674415 0t0 TCP hnode1:52948->hnode5:16020 (ESTABLISHED)
> java 27109 hbase *624u 210674416 0t0 TCP hnode1:52949->hnode5:16020 (ESTABLISHED)
> java 27109 hbase *625u 210674417 0t0 TCP hnode1:52950->hnode5:16020 (ESTABLISHED)
> java 27109 hbase *627u 210674419 0t0 TCP hnode1:52952->hnode5:16020 (ESTABLISHED)
> java 27109 hbase *628u 210674420 0t0 TCP hnode1:52953->hnode5:16020 (ESTABLISHED)
> java 27109 hbase *633u 210674425 0t0 TCP hnode1:52958->hnode5:16020 (ESTABLISHED)
> ...
> {code}
> So here is what happened, based on some additional logging and debugging:
> - AsyncRpcClient never detected that the regionserver was restarted, because the regions had been moved away, there were no read/write requests to this server, and there is no heartbeat mechanism implemented
> - because of the above, the dead {code}AsyncRpcChannel{code} stayed in {code}PoolMap connections{code}
> - when ltt detected that the regions were moved back to hnode5, it tried to reconnect to hnode5, leading to this issue
> I was able to resolve this issue by adding the following to AsyncRpcClient#createRpcChannel():
> {code}
>   synchronized (connections) {
>     if (closed) {
>       throw new StoppedRpcClientException();
>     }
>     rpcChannel = connections.get(hashCode);
> +   if (rpcChannel != null && !rpcChannel.isAlive()) {
> +     LOG.debug("Removing dead channel from " + rpcChannel.address.toString());
> +     connections.remove(hashCode);
> +   }
>     if (rpcChannel == null || !rpcChannel.isAlive()) {
>       rpcChannel = new AsyncRpcChannel(this.bootstrap, this, ticket, serviceName, location);
>       connections.put(hashCode, rpcChannel);
> {code}
> I will attach a patch after some more testing.
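The check-and-evict step proposed in HBASE-14458 can be shown in isolation with a small, self-contained sketch. This is not the HBase code: {{ChannelCacheSketch}}, its nested {{Channel}} class, and the use of the server address as the map key are hypothetical stand-ins for {{AsyncRpcClient}}, {{AsyncRpcChannel}}, and the {{PoolMap}} keyed by hash code.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a client-side cache of RPC channels, one per server. */
final class ChannelCacheSketch {

    /** Hypothetical stand-in for a pooled channel that can die without the cache noticing. */
    static final class Channel {
        final String address;
        private volatile boolean alive = true;

        Channel(String address) {
            this.address = address;
        }

        boolean isAlive() {
            return alive;
        }

        void close() {
            alive = false;
        }
    }

    private final Map<String, Channel> connections = new HashMap<>();
    private int created;

    /** Returns the cached live channel, evicting a dead one before creating a replacement. */
    Channel getOrCreate(String address) {
        synchronized (connections) {
            Channel channel = connections.get(address);
            if (channel != null && !channel.isAlive()) {
                // The cached entry is dead: drop it so it cannot linger in the map.
                connections.remove(address);
                channel = null;
            }
            if (channel == null) {
                channel = new Channel(address);
                connections.put(address, channel);
                created++;
            }
            return channel;
        }
    }

    int size() {
        synchronized (connections) {
            return connections.size();
        }
    }

    int createdCount() {
        synchronized (connections) {
            return created;
        }
    }
}
```

In this sketch a restarted server costs exactly one new channel: the dead entry is removed and replaced under the same lock, so later callers find the live replacement rather than creating further connections.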
[jira] [Commented] (HBASE-12911) Client-side metrics
[ https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946058#comment-14946058 ]

Nick Dimiduk commented on HBASE-12911:

bq. there is no unpacking that I saw...

Unpacking in that I'm reading into the PB Method and switching on the index of the entry; that index comes from the generated code, so I assume it's an implementation detail that could change in the future. See {{MetricsConnection#updateRpc}}.

bq. How does an operator use this stuff?

Let me add a release note. Right now they have to look at the JMX of the machine running the client. After HBASE-14381 we'll be exposing the metrics programmatically. Do we want another follow-on to allow changing the reporter? This version of yammer also ships with a {{ConsoleReporter}} that allows reporting to System.out. What about disabling client-side metrics collection entirely?

> Client-side metrics
>
> Key: HBASE-12911
> URL: https://issues.apache.org/jira/browse/HBASE-12911
> Project: HBase
> Issue Type: New Feature
> Components: Client, Operability, Performance
> Reporter: Nick Dimiduk
> Assignee: Nick Dimiduk
> Fix For: 2.0.0, 1.3.0
> Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch, 0001-HBASE-12911-Client-side-metrics.patch, 12911-0.98.00.patch, 12911-branch-1.00.patch, 12911.yammer.jpg, 12911.yammer.v00.patch, 12911.yammer.v01.patch, 12911.yammer.v02.patch, 12911.yammer.v02.patch, 12911.yammer.v03.branch-1.patch, 12911.yammer.v03.patch, 12911.yammer.v03.patch, am.jpg, client metrics RS-Master.jpg, client metrics client.jpg, conn_agg.jpg, connection attributes.jpg, ltt.jpg, standalone.jpg
>
> There's very little visibility into the hbase client. Folks who care to add some kind of metrics collection end up wrapping Table method invocations with {{System.currentTimeMillis()}}. For a crude example of this, have a look at what I did in {{PerformanceEvaluation}} for exposing request latencies up to {{IntegrationTestRegionReplicaPerf}}. The client is quite complex; there's a lot going on under the hood that is impossible to see right now without a profiler. Because the client is a crucial part of the performance of this distributed system, we should have deeper visibility into its function.
> I'm not sure that wiring into the hadoop metrics system is the right choice because the client is often embedded as a library in a user's application. We should have integration with our metrics tools so that, e.g., a client embedded in a coprocessor can report metrics through the usual RS channels, or a client used in a MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out of the box we'd include a hadoop-metrics implementation and one other, possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?
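To make the "interface-based system with pluggable implementations" proposal in HBASE-12911 concrete, here is a minimal sketch. None of these names come from HBase or from the attached patches: {{ClientMetricsSketch}}, {{Sink}}, {{InMemorySink}}, and {{timed}} are hypothetical. The point is only that callers time an operation against an interface, so the backend (hadoop-metrics, dropwizard, or a no-op that disables collection) can be swapped without touching call sites.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical sketch of pluggable client-side metrics; not the HBase API. */
final class ClientMetricsSketch {

    /** The pluggable part: any backend only has to accept one latency sample. */
    interface Sink {
        void recordLatency(String operation, long nanos);
    }

    /** A sink that drops every sample, i.e. metrics collection disabled. */
    static final Sink NO_OP = (operation, nanos) -> { };

    /** A trivial in-memory backend, useful for tests or a console dump. */
    static final class InMemorySink implements Sink {
        private final ConcurrentHashMap<String, LongAdder> counts = new ConcurrentHashMap<>();
        private final ConcurrentHashMap<String, LongAdder> totals = new ConcurrentHashMap<>();

        @Override
        public void recordLatency(String operation, long nanos) {
            counts.computeIfAbsent(operation, k -> new LongAdder()).increment();
            totals.computeIfAbsent(operation, k -> new LongAdder()).add(nanos);
        }

        long count(String operation) {
            LongAdder adder = counts.get(operation);
            return adder == null ? 0 : adder.sum();
        }

        long totalNanos(String operation) {
            LongAdder adder = totals.get(operation);
            return adder == null ? 0 : adder.sum();
        }
    }

    /** Runs the call and reports its latency to the sink, even if the call throws. */
    static <T> T timed(Sink sink, String operation, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            sink.recordLatency(operation, System.nanoTime() - start);
        }
    }
}
```

A caller would wrap an invocation as {{ClientMetricsSketch.timed(sink, "get", () -> lookup(row))}}, where {{lookup}} is whatever client call is being measured. This replaces the hand-rolled {{System.currentTimeMillis()}} wrapping the issue describes.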
[jira] [Commented] (HBASE-14563) Disable zombie TestHFileOutputFormat2
[ https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946064#comment-14946064 ]

Hudson commented on HBASE-14563:

FAILURE: Integrated in HBase-TRUNK #6879 (See [https://builds.apache.org/job/HBase-TRUNK/6879/])
HBASE-14563 Disable zombie TestHFileOutputFormat2 (stack: rev 8fcc8155042766121cb4e99433f23affe2d9ae2d)
* hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java

> Disable zombie TestHFileOutputFormat2
>
> Key: HBASE-14563
> URL: https://issues.apache.org/jira/browse/HBASE-14563
> Project: HBase
> Issue Type: Sub-task
> Components: test
> Reporter: stack
> Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
> Attachments: 14563.txt
>
> Disabling until someone has a chance to look at it.
> I watched it in jvisualvm for a while. It's starting and stopping clusters multiple times and then running MR jobs. It needs a rewrite at least, and some shrinking of the scope of what is tested.
[jira] [Commented] (HBASE-14436) HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create new Configuration
[ https://issues.apache.org/jira/browse/HBASE-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946065#comment-14946065 ]

Hudson commented on HBASE-14436:

FAILURE: Integrated in HBase-TRUNK #6879 (See [https://builds.apache.org/job/HBase-TRUNK/6879/])
HBASE-14436 HTableDescriptor#addCoprocessor will always make (stack: rev 0ea1f8122709302ee19279aaa438b37dac30c25b)
* hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java

> HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create new Configuration
>
> Key: HBASE-14436
> URL: https://issues.apache.org/jira/browse/HBASE-14436
> Project: HBase
> Issue Type: Improvement
> Components: Coprocessors
> Affects Versions: 1.2.1
> Reporter: Jianwei Cui
> Assignee: stack
> Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
> Attachments: HBASE-14436-trunk-v1.patch, HBASE-14436-trunk-v2.patch
>
> HTableDescriptor#addCoprocessor sets the coprocessor value in the following format:
> {code}
>   public HTableDescriptor addCoprocessor(String className, Path jarFilePath,
>       int priority, final Map<String, String> kvs)
>   throws IOException {
>     ...
>     String value = ((jarFilePath == null)? "" : jarFilePath.toString()) +
>         "|" + className + "|" + Integer.toString(priority) + "|" +
>         kvString.toString();
>     ...
>   }
> {code}
> If 'jarFilePath' is null, 'value' always has the format '|className|priority|', even when 'kvs' is null (that is, when the coprocessor has no extra arguments). Then, on the server side, RegionCoprocessorHost#getTableCoprocessorAttrsFromSchema loads the table coprocessors as:
> {code}
>   static List<TableCoprocessorAttribute> getTableCoprocessorAttrsFromSchema(Configuration conf,
>       HTableDescriptor htd) {
>     ...
>     try {
>       cfgSpec = matcher.group(4); // => cfgSpec will be '|' for the format '|className|priority|'
>     } catch (IndexOutOfBoundsException ex) {
>       // ignore
>     }
>     Configuration ourConf;
>     if (cfgSpec != null) { // => cfgSpec will be '|' for the format '|className|priority|'
>       ourConf = new Configuration(false);
>       HBaseConfiguration.merge(ourConf, conf);
>     }
>     ...
>   }
> {code}
> 'cfgSpec' is '|' for a coprocessor formatted as '|className|priority|', so a new Configuration is always created.
> In our production cluster, many tables have table-level coprocessors, so the region server creates a new Configuration for each region of such a table; this consumes a certain amount of memory when there are many such regions.
> To fix the problem, we can either make HTableDescriptor not append the trailing '|' when the coprocessor has no extra arguments, or check 'cfgSpec' more strictly on the server side, which would also avoid creating new Configurations for existing regions of this kind once they are reopened. Discussion and suggestions are welcome.
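The second option in HBASE-14436 (a stricter server-side check) can be sketched as follows. This is an illustration only, not the committed fix and not HBase's own regex-based parsing: {{CoprocessorSpecSketch}} and its methods are hypothetical, and the assumption that extra arguments are written as comma-separated key=value pairs is made for the example. The idea is to decide whether a private Configuration is needed from the parsed arguments rather than from whether the fourth field is non-null.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical, simplified parser for specs shaped like "jarPath|className|priority|k1=v1,k2=v2". */
final class CoprocessorSpecSketch {

    /** Returns the key=value arguments in the fourth field, or an empty map if there are none. */
    static Map<String, String> parseArgs(String spec) {
        Map<String, String> args = new LinkedHashMap<>();
        // Limit -1 keeps trailing empty fields, so "|cls|1001|" yields four parts.
        String[] parts = spec.split("\\|", -1);
        if (parts.length < 4 || parts[3].trim().isEmpty()) {
            return args; // no fourth field, or only the trailing '|'
        }
        for (String pair : parts[3].split(",")) {
            int eq = pair.indexOf('=');
            if (eq > 0) { // skip fragments without a key
                args.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
            }
        }
        return args;
    }

    /** Strict check: a per-coprocessor Configuration is only worth creating if there are arguments. */
    static boolean needsPrivateConfiguration(String spec) {
        return !parseArgs(spec).isEmpty();
    }
}
```

With a check of this kind, a spec such as '|className|priority|' would reuse the shared Configuration, and only coprocessors that carry arguments would pay for a copy.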