date:20151006

[jira] [Commented] (HBASE-13770) Programmatic JAAS configuration option for secure zookeeper may be broken

2015-10-06 Thread Maddineni Sukumar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13770?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944603#comment-14944603
 ] 

Maddineni Sukumar commented on HBASE-13770:
---

Thanks [~apurtell] for reviewing and pushing this.

> Programmatic JAAS configuration option for secure zookeeper may be broken
> -
>
> Key: HBASE-13770
> URL: https://issues.apache.org/jira/browse/HBASE-13770
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.13, 1.2.0
>Reporter: Andrew Purtell
>Assignee: Maddineni Sukumar
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: HBASE-13770-0.98.patch, HBASE-13770-0.98.patch, 
> HBASE-13770-v1.patch, HBASE-13770-v2-0.98.patch, HBASE-13770-v2.patch, 
> HBASE-13770-v3-0.98.patch, HBASE-13770-v4-0.98.patch, 
> HBASE-13770-v4-master.patch
>
>
> While verifying the patch fix for HBASE-13768 we were unable to successfully 
> test the programmatic JAAS configuration option for secure ZooKeeper 
> integration. Unclear if that was due to a bug or incorrect test configuration.
> Update the security section of the online book with clear instructions for 
> setting up the programmatic JAAS configuration option for secure ZooKeeper 
> integration.
> Verify it works.
> Fix as necessary.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread Mikhail Antonov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944604#comment-14944604
 ] 

Mikhail Antonov commented on HBASE-14559:
-

I think there are several things going on here (or maybe I'm missing something).

Conceptually, it seems logical to me that on a large cluster, overall 
responsiveness to admin commands would improve if they were served from a 
separate thread pool (is that actually right? If not, can we revert it 
completely?). There were some deadlock-type bugs (the two referenced by 
[~eclark]), but those should be fixed by now? In theory, yet another thread 
pool that separates admin commands from things like meta lookups might help 
prevent them, but that looks like overkill.

As for the tests, this behavior shows up as a corner case: in the minicluster 
everything runs as admin, so the admin thread pool is overloaded. On current 
master I see this in HConstants:

bq. public static final int DEFAULT_REGION_SERVER_HIGH_PRIORITY_HANDLER_COUNT = 20;

as was set in HBASE-13351.

[~stack], so in these two one-liners you're dropping the number of 
high-priority handlers (to tighten up thread usage?), right?
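
The separate-pool idea discussed above can be pictured with a minimal sketch. This is not HBase's RPC scheduler: the class name, the PRIORITY_THRESHOLD cut-off, and the general pool size of 30 are assumptions made for illustration. Only the priority pool size of 20 comes from the constant quoted in the comment.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Hypothetical sketch: route calls to one of two handler pools by priority. */
class HandlerPoolSketch {
  // Assumed cut-off; calls at or above it count as high priority (e.g. admin).
  static final int PRIORITY_THRESHOLD = 10;

  final ExecutorService generalHandlers;
  final ExecutorService priorityHandlers;

  HandlerPoolSketch(int generalCount, int priorityCount) {
    this.generalHandlers = Executors.newFixedThreadPool(generalCount);
    this.priorityHandlers = Executors.newFixedThreadPool(priorityCount);
  }

  /** Picks the pool for a call, so user traffic cannot occupy priority handlers. */
  ExecutorService poolFor(int callPriority) {
    return callPriority >= PRIORITY_THRESHOLD ? priorityHandlers : generalHandlers;
  }

  void shutdown() {
    generalHandlers.shutdown();
    priorityHandlers.shutdown();
  }

  public static void main(String[] args) {
    HandlerPoolSketch pools = new HandlerPoolSketch(30, 20);
    pools.poolFor(0).submit(() -> System.out.println("user call on general pool"));
    pools.poolFor(100).submit(() -> System.out.println("admin call on priority pool"));
    pools.shutdown(); // already-submitted tasks still run before the pools stop
  }
}
```

In a minicluster where every call runs as admin, every call would land in the priority pool in this sketch, which matches the overload described in the comment.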




> branch-1 test tweeks; disable assert explicit region lands post-restart and 
> up a few handlers
> -
>
> Key: HBASE-14559
> URL: https://issues.apache.org/jira/browse/HBASE-14559
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14559.branch-1.txt, 14559.master.txt
>
>
> Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). 
> Small tweaks get tests to pass: small one-liners that up the priority handler 
> count and disable an assert that seems wrong -- that we'll always get an 
> explicit region to land on a newly started server.




[jira] [Assigned] (HBASE-13773) Replication should not use ZooKeeper at all for coordination

2015-10-06 Thread Maddineni Sukumar (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13773?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Maddineni Sukumar reassigned HBASE-13773:
-

Assignee: Maddineni Sukumar

> Replication should not use ZooKeeper at all for coordination
> 
>
> Key: HBASE-13773
> URL: https://issues.apache.org/jira/browse/HBASE-13773
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0, 1.0.1, 1.1.0, 0.98.12, 1.2.0
>Reporter: Andrew Purtell
>Assignee: Maddineni Sukumar
>Priority: Critical
>
> Introduce a new system table for replication state and use this table for 
> coordination instead of znodes.




[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944668#comment-14944668
 ] 

Hudson commented on HBASE-14559:


FAILURE: Integrated in HBase-1.2 #229 (See [https://builds.apache.org/job/HBase-1.2/229/])
HBASE-14559 branch-1 test tweeks; disable assert explicit region lands (stack: rev e568cda7f05880577b27a8e693f8788cae372596)
* hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestNamespaceCommands.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestEnableTableHandler.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithACL.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java



[jira] [Commented] (HBASE-14432) Procedure V2 - enforce ACL on procedure admin tasks

2015-10-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944679#comment-14944679
 ] 

Hadoop QA commented on HBASE-14432:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12765127/HBASE-14432.v1-branch-1.patch
  against branch-1 branch at commit ed4c734b15c83b7f6b8ec1d170abccae9de1b193.
  ATTACHMENT ID: 12765127

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 13 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

 {color:red}-1 core tests{color}.  The patch failed these unit tests:
   
org.apache.hadoop.hbase.master.handler.TestEnableTableHandler

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15881//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15881//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15881//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15881//console

This message is automatically generated.

> Procedure V2 - enforce ACL on procedure admin tasks
> ---
>
> Key: HBASE-14432
> URL: https://issues.apache.org/jira/browse/HBASE-14432
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Stephen Yuan Jiang
>Assignee: Stephen Yuan Jiang
>  Labels: security
> Attachments: HBASE-14432.v1-branch-1.patch, 
> HBASE-14432.v1-master.patch
>
>
> In the Procedure class, the owner field is never set. We need to set it so 
> that we can enforce ACLs on admin tasks such as whether a user has privilege 
> to abort a procedure.




[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944702#comment-14944702
 ] 

Hudson commented on HBASE-14559:


SUCCESS: Integrated in HBase-1.2-IT #192 (See [https://builds.apache.org/job/HBase-1.2-IT/192/])
HBASE-14559 branch-1 test tweeks; disable assert explicit region lands (stack: rev e568cda7f05880577b27a8e693f8788cae372596)
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestNamespaceCommands.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestEnableTableHandler.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithACL.java



[jira] [Commented] (HBASE-14558) Document ChaosMonkey enhancements from HBASE-14261

2015-10-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944706#comment-14944706
 ] 

Hadoop QA commented on HBASE-14558:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12765131/HBASE-14558.patch
  against master branch at commit ed4c734b15c83b7f6b8ec1d170abccae9de1b193.
  ATTACHMENT ID: 12765131

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 lineLengths{color}.  The patch introduces the following lines 
longer than 100:
+HBase 0.96 introduced a tool named `ChaosMonkey`, modeled after 
link:http://techblog.netflix.com/2012/07/chaos-monkey-released-into-wild.html
+policy, which is configured with all the available actions. It chose to run 
`RestartActiveMaster` and `RestartRandomRs` actions.
+$ bin/hbase org.apache.hadoop.hbase.IntegrationTestIngest -m slowDeterministic 
-monkeyProps monkey.properties

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15882//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15882//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15882//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15882//console

This message is automatically generated.

> Document ChaosMonkey enhancements from HBASE-14261
> --
>
> Key: HBASE-14558
> URL: https://issues.apache.org/jira/browse/HBASE-14558
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: HBASE-14558.patch
>
>





[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944733#comment-14944733
 ] 

Hudson commented on HBASE-14559:


SUCCESS: Integrated in HBase-1.3-IT #212 (See [https://builds.apache.org/job/HBase-1.3-IT/212/])
HBASE-14559 branch-1 test tweeks; disable assert explicit region lands (stack: rev 80961187aa7053d886c88be56311b88a4e02d28f)
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestEnableTableHandler.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithACL.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestNamespaceCommands.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java



[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944737#comment-14944737
 ] 

Hudson commented on HBASE-14559:


FAILURE: Integrated in HBase-1.3 #237 (See [https://builds.apache.org/job/HBase-1.3/237/])
HBASE-14559 branch-1 test tweeks; disable assert explicit region lands (stack: rev 80961187aa7053d886c88be56311b88a4e02d28f)
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/handler/TestEnableTableHandler.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/security/visibility/TestVisibilityLabelsWithACL.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestNamespaceCommands.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestAssignmentManager.java



[jira] [Commented] (HBASE-14525) Append and increment operation throws NullPointerException on non-existing column families.

2015-10-06 Thread Anoop Sam John (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944852#comment-14944852
 ] 

Anoop Sam John commented on HBASE-14525:


Patch LGTM

> Append and increment operation throws NullPointerException on non-existing 
> column families.
> ---
>
> Key: HBASE-14525
> URL: https://issues.apache.org/jira/browse/HBASE-14525
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0
>Reporter: Abhishek Kumar
>Assignee: Abhishek Kumar
>Priority: Minor
> Attachments: HBASE-14525-V1.patch, HBASE-14525.patch
>
>
> When performing an append operation on a non-existent column family, a 
> NullPointerException is thrown in the hbase shell, as shown below:
> {noformat}
> hbase(main):007:0> append 't1', 'r1', 'none:c1', '123'
> ERROR: java.io.IOException
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:106)
> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hbase.regionserver.HRegion.doGet(HRegion.java:6987)
> at org.apache.hadoop.hbase.regionserver.HRegion.append(HRegion.java:7048)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.append(RSRpcServices.java:580)
> at org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2206)
> at org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32452)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2133)
> ... 4 more
> {noformat}
> This seems to be caused by the absence of a check for valid family names, as 
> is done for other operations like 'Put' in HRegion.java.
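
The kind of check the description asks for can be sketched as follows. This is an illustration only, not the attached patch and not HRegion code: the class, the in-memory map standing in for per-family stores, and the use of IllegalArgumentException are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch: validate the column family before touching its store. */
class FamilyCheckSketch {
  // family name -> (qualifier -> value); stand-in for a region's per-family stores
  private final Map<String, Map<String, String>> stores = new HashMap<>();

  FamilyCheckSketch(String... families) {
    for (String family : families) {
      stores.put(family, new HashMap<>());
    }
  }

  /** Appends value to the existing cell, rejecting unknown families up front. */
  String append(String family, String qualifier, String value) {
    Map<String, String> store = stores.get(family);
    if (store == null) {
      // Without this check, the lookup below would fail with a NullPointerException.
      throw new IllegalArgumentException("Column family " + family + " does not exist");
    }
    String updated = store.getOrDefault(qualifier, "") + value;
    store.put(qualifier, updated);
    return updated;
  }

  public static void main(String[] args) {
    FamilyCheckSketch region = new FamilyCheckSketch("f1");
    System.out.println(region.append("f1", "c1", "123"));
    try {
      region.append("none", "c1", "123");
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The point of the sketch is that the caller gets a descriptive error naming the missing family instead of a bare NullPointerException from deeper in the call stack.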




[jira] [Updated] (HBASE-14525) Append and increment operation throws NullPointerException on non-existing column families.

2015-10-06 Thread Anoop Sam John (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anoop Sam John updated HBASE-14525:
---
Hadoop Flags: Reviewed
  Status: Patch Available  (was: Open)


[jira] [Commented] (HBASE-12790) Support fairness across parallelized scans

2015-10-06 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14944894#comment-14944894
 ] 

ramkrishna.s.vasudevan commented on HBASE-12790:


bq. Did you mean: instead of waiting 20 seconds for one count query now we will see several point queries completing during that interval?
Yes [~apurtell], that is right. 
bq. Should see clear improvement when the count query is running with the patch applied. (smile)
The count query still takes the same amount of time; it is the smaller 
queries stuck behind the bigger ones that benefit. I think that is a valid 
case: without the patch I can see the point queries lagging because the 
queues are filled with the parallel scans launched by the bigger count query. 
Let me see how to present these results. 

> Support fairness across parallelized scans
> --
>
> Key: HBASE-12790
> URL: https://issues.apache.org/jira/browse/HBASE-12790
> Project: HBase
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: ramkrishna.s.vasudevan
>  Labels: Phoenix
> Attachments: AbstractRoundRobinQueue.java, HBASE-12790.patch, 
> HBASE-12790_1.patch, HBASE-12790_5.patch, HBASE-12790_callwrapper.patch, 
> HBASE-12790_trunk_1.patch
>
>
> Some HBase clients parallelize the execution of a scan to reduce latency in 
> getting back results. This can lead to starvation with a loaded cluster and 
> interleaved scans, since the RPC queue will be ordered and processed on a 
> FIFO basis. For example, suppose two clients, A and B, submit largish scans 
> at the same time. Say each scan is broken down into 100 scans by the client 
> (broken down into equal-depth chunks along the row key), and the 100 scans of 
> client A are queued first, followed immediately by the 100 scans of client B. 
> In this case, client B will be starved out of getting any results back until 
> the scans for client A complete.
> One solution to this is to use the attached AbstractRoundRobinQueue instead 
> of the standard FIFO queue. The queue to be used could be (maybe it already 
> is) configurable based on a new config parameter. Using this queue would 
> require the client to have the same identifier for all of the 100 parallel 
> scans that represent a single logical scan from the client's point of view. 
> With this information, the round robin queue would pick off a task from the 
> queue in a round robin fashion (instead of a strictly FIFO manner) to prevent 
> starvation over interleaved parallelized scans.
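
The round-robin idea in the description can be sketched in a few lines. This is not the attached AbstractRoundRobinQueue, whose contents are not shown here; the class name, the string scan identifier, and the single-threaded design are assumptions made for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: round-robin across logical scans instead of strict FIFO. */
class RoundRobinSketch<T> {
  // One FIFO sub-queue per logical scan id; map order is the rotation order.
  private final Map<String, Deque<T>> groups = new LinkedHashMap<>();

  /** Queues a task under the identifier shared by all parts of one logical scan. */
  void offer(String scanId, T task) {
    groups.computeIfAbsent(scanId, k -> new ArrayDeque<>()).addLast(task);
  }

  /** Takes one task from the group at the head, then moves that group to the back. */
  T poll() {
    if (groups.isEmpty()) {
      return null;
    }
    Map.Entry<String, Deque<T>> head = groups.entrySet().iterator().next();
    String scanId = head.getKey();
    Deque<T> queue = head.getValue();
    groups.remove(scanId);
    T task = queue.pollFirst();
    if (!queue.isEmpty()) {
      groups.put(scanId, queue); // re-inserted at the tail of the rotation
    }
    return task;
  }

  public static void main(String[] args) {
    RoundRobinSketch<String> queue = new RoundRobinSketch<>();
    queue.offer("A", "A1");
    queue.offer("A", "A2");
    queue.offer("B", "B1");
    queue.offer("B", "B2");
    // Prints A1 B1 A2 B2, one per line: B is served before A finishes.
    for (String task = queue.poll(); task != null; task = queue.poll()) {
      System.out.println(task);
    }
  }
}
```

A real RPC queue would also need thread safety and blocking semantics, which this sketch leaves out on purpose.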



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HBASE-14366) NPE in case visibility expression is not present in labels table during importtsv run

2015-10-06 Thread Bhupendra Kumar Jain (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Bhupendra Kumar Jain updated HBASE-14366:
-
Attachment: HBASE-14366-0.98.patch
HBASE-14366-branch-1.patch

Attached patches for 0.98 and branch-1. Please review 

> NPE in case visibility expression is not present in labels table during 
> importtsv run
> -
>
> Key: HBASE-14366
> URL: https://issues.apache.org/jira/browse/HBASE-14366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Y. SREENIVASULU REDDY
>Assignee: Bhupendra Kumar Jain
>Priority: Minor
> Attachments: 0001-HBASE-14366.patch, 0001-HBASE-14366_1.patch, 
> HBASE-14366-0.98.patch, HBASE-14366-branch-1.patch, HBASE-14366_2(1).patch, 
> HBASE-14366_2.patch
>
>
> The exception below is shown in the logs if the visibility expression is not 
> present in the labels table during an importtsv run. An appropriate exception 
> or message should be logged so the user can take further action.
> {code}
> WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NullPointerException
> at org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver$1.getLabelOrdinal(DefaultVisibilityExpressionResolver.java:127)
> at org.apache.hadoop.hbase.security.visibility.VisibilityUtils.getLabelOrdinals(VisibilityUtils.java:358)
> at org.apache.hadoop.hbase.security.visibility.VisibilityUtils.createVisibilityExpTags(VisibilityUtils.java:323)
> at org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver.createVisibilityExpTags(DefaultVisibilityExpressionResolver.java:137)
> at org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.populatePut(TsvImporterMapper.java:205)
> at org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.map(TsvImporterMapper.java:165)
> at org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.map(TsvImporterMapper.java:1)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> {code}




[jira] [Updated] (HBASE-14520) Optimize the number of calls for tags creation in bulk load

2015-10-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14520:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

Thanks for the patch, Bhupendra

Thanks for the review, Anoop.

> Optimize the number of calls for tags creation in bulk load
> ---
>
> Key: HBASE-14520
> URL: https://issues.apache.org/jira/browse/HBASE-14520
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Bhupendra Kumar Jain
>Assignee: Bhupendra Kumar Jain
> Fix For: 2.0.0
>
> Attachments: HBASE-14520.patch
>
>
> At present, the TTL and visibility expression are given once per TSV line, 
> i.e. the values and the tags are the same for all columns in that line. In 
> the current code, a list of tags is created for each cell. Instead of 
> creating new tags for each cell, the tags created once for the line can be 
> reused by the other cells. 
> Assume 1 million rows and 1000 columns. Currently tag creation happens 
> 1M * 1000 times. If the tags are reused, tag creation drops to 1M times 
> (i.e. once per TSV line). 
> This applies to both the TsvImporterMapper and TextSortReducer logic. 
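
The saving described above can be illustrated with a small counting sketch. It does not reproduce TsvImporterMapper or TextSortReducer; the class, the buildTags stand-in, and the string-based tags are hypothetical, and the row and column counts are scaled down from the example in the description.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: count tag-list creations per cell versus per line. */
class TagReuseSketch {
  static int tagListsBuilt = 0;

  // Stand-in for turning a line's TTL and visibility expression into tags.
  static List<String> buildTags(long ttl, String visibility) {
    tagListsBuilt++;
    List<String> tags = new ArrayList<>();
    tags.add("ttl=" + ttl);
    tags.add("vis=" + visibility);
    return tags;
  }

  /** Builds a fresh tag list for every cell: rows * cols creations. */
  static int perCell(int rows, int cols) {
    tagListsBuilt = 0;
    for (int r = 0; r < rows; r++) {
      for (int c = 0; c < cols; c++) {
        List<String> cellTags = buildTags(1000L, "secret");
      }
    }
    return tagListsBuilt;
  }

  /** Builds the tag list once per line and shares it across the line's cells. */
  static int perLine(int rows, int cols) {
    tagListsBuilt = 0;
    for (int r = 0; r < rows; r++) {
      List<String> lineTags = buildTags(1000L, "secret");
      for (int c = 0; c < cols; c++) {
        List<String> cellTags = lineTags; // reuse, no new creation
      }
    }
    return tagListsBuilt;
  }

  public static void main(String[] args) {
    System.out.println("per cell: " + perCell(1000, 100)); // per cell: 100000
    System.out.println("per line: " + perLine(1000, 100)); // per line: 1000
  }
}
```

Sharing one list across cells is only safe if the cells never modify it, which the sketch assumes.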




[jira] [Commented] (HBASE-14525) Append and increment operation throws NullPointerException on non-existing column families.

2015-10-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945049#comment-14945049
 ] 

Hadoop QA commented on HBASE-14525:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12765143/HBASE-14525-V1.patch
  against master branch at commit ed4c734b15c83b7f6b8ec1d170abccae9de1b193.
  ATTACHMENT ID: 12765143

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s):   
at 
org.apache.ambari.server.upgrade.UpgradeCatalog211Test.testExecuteDDLUpdates(UpgradeCatalog211Test.java:73)

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15883//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15883//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15883//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15883//console

This message is automatically generated.

> Append and increment operation throws NullPointerException on non-existing 
> column families.
> ---
>
> Key: HBASE-14525
> URL: https://issues.apache.org/jira/browse/HBASE-14525
> Project: HBase
>  Issue Type: Bug
>  Components: shell
>Affects Versions: 2.0.0
>Reporter: Abhishek Kumar
>Assignee: Abhishek Kumar
>Priority: Minor
> Attachments: HBASE-14525-V1.patch, HBASE-14525.patch
>
>
> When performing append operation on non-existing column families, 
> NullPointerException is thrown in hbase shell as shown below:
> {noformat}
> hbase(main):007:0> append 't1', 'r1', 'none:c1', '123'
> ERROR: java.io.IOException
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2175)
> at org.apache.hadoop.hbase.ipc.CallRunner.run(CallRunner.java:106)
> at org.apache.hadoop.hbase.ipc.RpcExecutor.consumerLoop(RpcExecutor.java:130)
> at org.apache.hadoop.hbase.ipc.RpcExecutor$1.run(RpcExecutor.java:107)
> at java.lang.Thread.run(Thread.java:745)
> Caused by: java.lang.NullPointerException
> at org.apache.hadoop.hbase.regionserver.HRegion.doGet(HRegion.java:6987)
> at org.apache.hadoop.hbase.regionserver.HRegion.append(HRegion.java:7048)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.append(RSRpcServices.java:580)
> at 
> org.apache.hadoop.hbase.regionserver.RSRpcServices.mutate(RSRpcServices.java:2206)
> at 
> org.apache.hadoop.hbase.protobuf.generated.ClientProtos$ClientService$2.callBlockingMethod(ClientProtos.java:32452)
> at org.apache.hadoop.hbase.ipc.RpcServer.call(RpcServer.java:2133)
> ... 4 more
> {noformat}
> This seems to be caused by the absence of a check for valid family names, as is done in 
> other operations like 'Put' in HRegion.java
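A minimal sketch of the kind of up-front family check the description asks for. Everything here is hypothetical (the class FamilyCheckSketch, its methods, and the exception type are illustration only, not HBase code); it only shows validating the family name before any per-family state is touched, so the caller sees a descriptive error rather than a NullPointerException.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a region that knows its column families; not HBase code.
class FamilyCheckSketch {
    private final Set<String> families;

    FamilyCheckSketch(String... knownFamilies) {
        this.families = new HashSet<>(Arrays.asList(knownFamilies));
    }

    // Fails fast with a descriptive message instead of letting a later lookup return null.
    void checkFamily(String family) {
        if (!families.contains(family)) {
            throw new IllegalArgumentException(
                "Column family " + family + " does not exist in this table");
        }
    }

    // Simulated append: validate first, then do the (placeholder) work.
    String append(String family, String qualifier, String value) {
        checkFamily(family);
        return family + ":" + qualifier + "=" + value;
    }
}
```

With a check like this, the shell command from the report would presumably fail with a message naming the family 'none' rather than a bare stack trace.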



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14366) NPE in case visibility expression is not present in labels table during importtsv run

2015-10-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945153#comment-14945153
 ] 

Hadoop QA commented on HBASE-14366:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12765169/HBASE-14366-0.98.patch
  against 0.98 branch at commit ed4c734b15c83b7f6b8ec1d170abccae9de1b193.
  ATTACHMENT ID: 12765169

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:red}-1 javadoc{color}.  The javadoc tool appears to have generated 
29 warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15884//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15884//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15884//artifact/patchprocess/checkstyle-aggregate.html

  Javadoc warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15884//artifact/patchprocess/patchJavadocWarnings.txt
Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15884//console

This message is automatically generated.

> NPE in case visibility expression is not present in labels table during 
> importtsv run
> -
>
> Key: HBASE-14366
> URL: https://issues.apache.org/jira/browse/HBASE-14366
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Y. SREENIVASULU REDDY
>Assignee: Bhupendra Kumar Jain
>Priority: Minor
> Attachments: 0001-HBASE-14366.patch, 0001-HBASE-14366_1.patch, 
> HBASE-14366-0.98.patch, HBASE-14366-branch-1.patch, HBASE-14366_2(1).patch, 
> HBASE-14366_2.patch
>
>
> The exception below is shown in the logs if a visibility expression is not present in the 
> labels table during an importtsv run. An appropriate exception / message should be 
> logged so the user can take further action.
> {code}
> WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : 
> java.lang.NullPointerException
> at 
> org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver$1.getLabelOrdinal(DefaultVisibilityExpressionResolver.java:127)
> at 
> org.apache.hadoop.hbase.security.visibility.VisibilityUtils.getLabelOrdinals(VisibilityUtils.java:358)
> at 
> org.apache.hadoop.hbase.security.visibility.VisibilityUtils.createVisibilityExpTags(VisibilityUtils.java:323)
> at 
> org.apache.hadoop.hbase.mapreduce.DefaultVisibilityExpressionResolver.createVisibilityExpTags(DefaultVisibilityExpressionResolver.java:137)
> at 
> org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.populatePut(TsvImporterMapper.java:205)
> at 
> org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.map(TsvImporterMapper.java:165)
> at 
> org.apache.hadoop.hbase.mapreduce.TsvImporterMapper.map(TsvImporterMapper.java:1)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:146)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
> {code}
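A small sketch of one way a lookup like this can fail, and how a clearer message could be produced. It assumes the NPE comes from unboxing a missing map entry, which the stack trace does not confirm; LabelOrdinalSketch and its methods are hypothetical names, not the HBase resolver.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical label table; illustrates reporting a missing label explicitly.
class LabelOrdinalSketch {
    private final Map<String, Integer> labels = new HashMap<>();

    void addLabel(String label, int ordinal) {
        labels.put(label, ordinal);
    }

    // Returning labels.get(label) directly as an int would throw a bare
    // NullPointerException for an unknown label; check for null instead.
    int getLabelOrdinal(String label) {
        Integer ordinal = labels.get(label);
        if (ordinal == null) {
            throw new IllegalArgumentException(
                "Visibility label '" + label + "' is not present in the labels table");
        }
        return ordinal;
    }
}
```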



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945189#comment-14945189
 ] 

stack commented on HBASE-14559:
---

Thanks for noticing lads.

I'm running a little rig here w/ branch-1. The tests in the patch pass with the 
one-liner (upped priorities) where w/o they failed near 100% of the time. At 
this stage in the zombie stomping session, I'm -- ahem -- a little less 
concerned about root cause of a failure and more about just getting stuff going 
again so did not spend much time on why these fixes are needed in branch-1 and 
not on master. 

bq. stack so in these 2 oneliners, you're dropping down the number of high 
priority handlers (to tighten up thread usage?) right?

No sir. HBASE-14290 set the number of handlers down when I noticed tests with 
500 threads running... that failed to run on my local machine because OOME, 
could not create thread.  So, here, I'm upping the handlers on a few tests. I'd 
already done a pass on master -- a few tests there needed more handlers or they 
hung (we need to fix!) -- so was a bit surprised this was necessary in branch-1 but 
it looks like you and [~eclark] have identified why.

We should revert HBASE-13635 and HBASE-14322 from branch-1?


> branch-1 test tweeks; disable assert explicit region lands post-restart and 
> up a few handlers
> -
>
> Key: HBASE-14559
> URL: https://issues.apache.org/jira/browse/HBASE-14559
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14559.branch-1.txt, 14559.master.txt
>
>
> Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). 
> Small tweaks get tests to pass. Small one liners that up priority handler 
> count and disable assert that seems wrong -- that we'll always get an explicit 
> region to land on a newly started server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-12790) Support fairness across parallelized scans

2015-10-06 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945217#comment-14945217
 ] 

Andrew Purtell commented on HBASE-12790:


bq. The count query still runs with the same amount of time but it is the 
smaller queries that stays behind the bigger queries gets benefited.   

Yes, that's what I mean. And when no count query is running the point queries 
shouldn't show a penalty (or if they do then we discuss)

> Support fairness across parallelized scans
> --
>
> Key: HBASE-12790
> URL: https://issues.apache.org/jira/browse/HBASE-12790
> Project: HBase
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: ramkrishna.s.vasudevan
>  Labels: Phoenix
> Attachments: AbstractRoundRobinQueue.java, HBASE-12790.patch, 
> HBASE-12790_1.patch, HBASE-12790_5.patch, HBASE-12790_callwrapper.patch, 
> HBASE-12790_trunk_1.patch
>
>
> Some HBase clients parallelize the execution of a scan to reduce latency in 
> getting back results. This can lead to starvation with a loaded cluster and 
> interleaved scans, since the RPC queue will be ordered and processed on a 
> FIFO basis. For example, suppose two clients, A & B, submit largish 
> scans at the same time. Say each scan is broken down into 100 scans by the 
> client (broken down into equal depth chunks along the row key), and the 100 
> scans of client A are queued first, followed immediately by the 100 scans of 
> client B. In this case, client B will be starved out of getting any results 
> back until the scans for client A complete.
> One solution to this is to use the attached AbstractRoundRobinQueue instead 
> of the standard FIFO queue. The queue to be used could be (maybe it already 
> is) configurable based on a new config parameter. Using this queue would 
> require the client to have the same identifier for all of the 100 parallel 
> scans that represent a single logical scan from the client's point of view. 
> With this information, the round robin queue would pick off a task from the 
> queue in a round robin fashion (instead of a strictly FIFO manner) to prevent 
> starvation over interleaved parallelized scans.
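The round-robin idea above can be sketched as follows. This is not the attached AbstractRoundRobinQueue, whose contents are not shown here; RoundRobinSketch and the groupId parameter are hypothetical, and the class is deliberately single-threaded to keep the dispatch order easy to follow.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

// Hypothetical round-robin queue: tasks sharing a groupId form one logical scan.
// Not thread-safe; a real RPC queue would need synchronization.
class RoundRobinSketch<T> {
    private final Map<String, Queue<T>> groups = new HashMap<>();
    private final Queue<String> order = new ArrayDeque<>(); // groups awaiting their turn

    void offer(String groupId, T task) {
        Queue<T> q = groups.get(groupId);
        if (q == null) {
            q = new ArrayDeque<>();
            groups.put(groupId, q);
            order.add(groupId); // a new group joins the back of the rotation
        }
        q.add(task);
    }

    // Takes one task from the group at the front, then moves that group to the back.
    T poll() {
        String groupId = order.poll();
        if (groupId == null) {
            return null; // nothing queued
        }
        Queue<T> q = groups.get(groupId);
        T task = q.poll();
        if (q.isEmpty()) {
            groups.remove(groupId);
        } else {
            order.add(groupId);
        }
        return task;
    }
}
```

With client A's chunks queued before client B's, polling alternates A, B, A, B instead of draining A first, which is the starvation the description wants to avoid.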



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-12790) Support fairness across parallelized scans

2015-10-06 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945252#comment-14945252
 ] 

ramkrishna.s.vasudevan commented on HBASE-12790:


bq.And when no count query is running the point queries shouldn't show a 
penalty (or if they do then we discuss)
That does not happen and I have verified that. Thanks Andy.

> Support fairness across parallelized scans
> --
>
> Key: HBASE-12790
> URL: https://issues.apache.org/jira/browse/HBASE-12790
> Project: HBase
>  Issue Type: New Feature
>Reporter: James Taylor
>Assignee: ramkrishna.s.vasudevan
>  Labels: Phoenix
> Attachments: AbstractRoundRobinQueue.java, HBASE-12790.patch, 
> HBASE-12790_1.patch, HBASE-12790_5.patch, HBASE-12790_callwrapper.patch, 
> HBASE-12790_trunk_1.patch
>
>
> Some HBase clients parallelize the execution of a scan to reduce latency in 
> getting back results. This can lead to starvation with a loaded cluster and 
> interleaved scans, since the RPC queue will be ordered and processed on a 
> FIFO basis. For example, suppose two clients, A & B, submit largish 
> scans at the same time. Say each scan is broken down into 100 scans by the 
> client (broken down into equal depth chunks along the row key), and the 100 
> scans of client A are queued first, followed immediately by the 100 scans of 
> client B. In this case, client B will be starved out of getting any results 
> back until the scans for client A complete.
> One solution to this is to use the attached AbstractRoundRobinQueue instead 
> of the standard FIFO queue. The queue to be used could be (maybe it already 
> is) configurable based on a new config parameter. Using this queue would 
> require the client to have the same identifier for all of the 100 parallel 
> scans that represent a single logical scan from the client's point of view. 
> With this information, the round robin queue would pick off a task from the 
> queue in a round robin fashion (instead of a strictly FIFO manner) to prevent 
> starvation over interleaved parallelized scans.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14432) Procedure V2 - enforce ACL on procedure admin tasks

2015-10-06 Thread Stephen Yuan Jiang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945268#comment-14945268
 ] 

Stephen Yuan Jiang commented on HBASE-14432:


The 
{{org.apache.hadoop.hbase.master.handler.TestEnableTableHandler.testEnableTableWithNoRegionServers}}
 failure is a bad assert; it was just fixed by stack after this patch was 
submitted (HBASE-14559).  After syncing the latest branch-1 and re-running the test, the problem 
went away.

> Procedure V2 - enforce ACL on procedure admin tasks
> ---
>
> Key: HBASE-14432
> URL: https://issues.apache.org/jira/browse/HBASE-14432
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Stephen Yuan Jiang
>Assignee: Stephen Yuan Jiang
>  Labels: security
> Attachments: HBASE-14432.v1-branch-1.patch, 
> HBASE-14432.v1-master.patch
>
>
> In the Procedure class, the owner field is never set. We need to set it so 
> that we can enforce ACLs on admin tasks such as whether a user has privilege 
> to abort a procedure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Updated] (HBASE-14432) Procedure V2 - enforce ACL on procedure admin tasks

2015-10-06 Thread Stephen Yuan Jiang (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stephen Yuan Jiang updated HBASE-14432:
---
   Resolution: Fixed
Fix Version/s: 1.3.0
   2.0.0
   Status: Resolved  (was: Patch Available)

> Procedure V2 - enforce ACL on procedure admin tasks
> ---
>
> Key: HBASE-14432
> URL: https://issues.apache.org/jira/browse/HBASE-14432
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Stephen Yuan Jiang
>Assignee: Stephen Yuan Jiang
>  Labels: security
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-14432.v1-branch-1.patch, 
> HBASE-14432.v1-master.patch
>
>
> In the Procedure class, the owner field is never set. We need to set it so 
> that we can enforce ACLs on admin tasks such as whether a user has privilege 
> to abort a procedure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14509) Configurable sparse indexes?

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945304#comment-14945304
 ] 

stack commented on HBASE-14509:
---

bq. We could add a method to filter, which is passed an HFile or a FileInfo or 
something, and based on that gets to decide whether to include the HFile or 
not. 

Is filter Interface, like CP, operating at too high a level for the ruling 
in/out of hfile?

bq. The other question is whether HFile is too large of a unit. 

On whether an hfile is too large a unit, block is the next natural construct; a 
BF of CQ per block so can skip blocks at a time?  The sparse index would go 
into the current block index as ancillary data rather than add at the head of a 
data block... We already load the hfile index BF per CQ or min/max could be 
part of this?

bq. Or we punt and just add the building blocks: 

Sounds like extra config/options to me... so no (smile). Could we start small? 
Add extra generic info on index -- a BF or min/max -- just so we can skip 
blocks as we scan?  min/max in hfile would be useful too... so could skip whole 
hfile (would be rare event but great when it happens)

> Configurable sparse indexes?
> 
>
> Key: HBASE-14509
> URL: https://issues.apache.org/jira/browse/HBASE-14509
> Project: HBase
>  Issue Type: Brainstorming
>Reporter: Lars Hofhansl
>
> This idea just popped up today and I wanted to record it for discussion:
> What if we kept sparse column indexes per region or HFile or per configurable 
> range?
> I.e. For any given CQ we record the lowest and highest value for a particular 
> range (HFile, Region, or a custom range like the Phoenix guide post).
> By tweaking the size of these ranges we can control the size of the index, vs 
> its selectivity.
> For example if we kept it by HFile we can almost instantly decide whether we 
> need to scan a particular HFile at all to find a particular value in a Cell.
> We can also collect min/max values for each n MB of data, for example when we 
> scan the region the first time. Assuming ranges are large enough we can always 
> keep the index in memory together with the region.
> Kind of a sparse local index. Might be much easier than the buddy region stuff 
> we've been discussing.
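A rough sketch of the min/max idea, assuming numeric values and one index entry per range (for example per HFile). All names (SparseIndexSketch, Range, candidates) are hypothetical and nothing here is an existing HBase API; it only shows how a per-range minimum and maximum lets a reader rule ranges out without scanning them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not an HBase API.
class SparseIndexSketch {
    // Lowest and highest value of one column qualifier within one range.
    static final class Range {
        final String name;
        final long min;
        final long max;

        Range(String name, long min, long max) {
            this.name = name;
            this.min = min;
            this.max = max;
        }

        // Builds the entry by passing over the values once.
        static Range of(String name, long... values) {
            if (values.length == 0) {
                throw new IllegalArgumentException("no values");
            }
            long min = values[0];
            long max = values[0];
            for (long v : values) {
                min = Math.min(min, v);
                max = Math.max(max, v);
            }
            return new Range(name, min, max);
        }

        // False: the range definitely lacks the value. True: it still has to be scanned.
        boolean mayContain(long value) {
            return value >= min && value <= max;
        }
    }

    // Names of the ranges that cannot be skipped for this value.
    static List<String> candidates(List<Range> index, long value) {
        List<String> result = new ArrayList<>();
        for (Range r : index) {
            if (r.mayContain(value)) {
                result.add(r.name);
            }
        }
        return result;
    }
}
```

Smaller ranges make the index more selective but larger, which is the size versus selectivity trade-off mentioned in the description.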



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread Elliott Clark (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945329#comment-14945329
 ] 

Elliott Clark commented on HBASE-14559:
---

bq.We should revert HBASE-13635 and HBASE-14322 from branch-1?
Nope, those two jiras are partial reverts of HBASE-13375. I'm asking if we 
should remove HBASE-13375 completely, since we've had to remove it when 
requests are going to master and we are upping the number of threads that a 
regionserver needs.

> branch-1 test tweeks; disable assert explicit region lands post-restart and 
> up a few handlers
> -
>
> Key: HBASE-14559
> URL: https://issues.apache.org/jira/browse/HBASE-14559
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14559.branch-1.txt, 14559.master.txt
>
>
> Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). 
> Small tweaks get tests to pass. Small one liners that up priority handler 
> count and disable assert that seems wrong -- that we'll always get an explicit 
> region to land on a newly started server.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14557) MapReduce WALPlayer issue with NoTagsKeyValue

2015-10-06 Thread Jerry He (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945321#comment-14945321
 ] 

Jerry He commented on HBASE-14557:
--

Yes. [~ram_krish]
The Cells in the WALEdit can be KeyValue or NoTagsKeyValue depending on whether 
they have tags when written by the server.  Right?
Then we will have a problem setting the OutputValueClass to either of the two 
classes.

> MapReduce WALPlayer issue with NoTagsKeyValue
> -
>
> Key: HBASE-14557
> URL: https://issues.apache.org/jira/browse/HBASE-14557
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0
>Reporter: Jerry He
>
> Running MapReduce WALPlayer to convert WAL into HFiles:
> {noformat}
> 15/10/05 20:28:08 INFO mapred.JobClient: Task Id : 
> attempt_201508031611_0029_m_00_0, Status : FAILED
> java.io.IOException: Type mismatch in value from map: expected 
> org.apache.hadoop.hbase.KeyValue, recieved 
> org.apache.hadoop.hbase.NoTagsKeyValue
> at 
> org.apache.hadoop.mapred.MapTask$MapOutputBuffer.collect(MapTask.java:997)
> at 
> org.apache.hadoop.mapred.MapTask$NewOutputCollector.write(MapTask.java:689)
> at 
> org.apache.hadoop.mapreduce.task.TaskInputOutputContextImpl.write(TaskInputOutputContextImpl.java:89)
> at 
> org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.write(WrappedMapper.java:112)
> at 
> org.apache.hadoop.hbase.mapreduce.WALPlayer$WALKeyValueMapper.map(WALPlayer.java:111)
> at 
> org.apache.hadoop.hbase.mapreduce.WALPlayer$WALKeyValueMapper.map(WALPlayer.java:96)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:751)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:368)
> at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
> at 
> java.security.AccessController.doPrivileged(AccessController.java:369)
> at javax.security.auth.Subject.doAs(Subject.java:572)
> at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1502)
> at org.apache.hadoop.mapred.Child.main(Child.java:249)
> {noformat}
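The "Type mismatch in value from map" message is consistent with a collector that compares the value's runtime class to the configured output value class exactly, rather than accepting subclasses. The sketch below illustrates that distinction only; it assumes NoTagsKeyValue is a subclass of KeyValue, and the nested classes and method names are stand-ins, not the Hadoop or HBase types.

```java
// Stand-in classes; the real KeyValue and NoTagsKeyValue live in org.apache.hadoop.hbase.
class TypeCheckSketch {
    static class KeyValue {}
    static class NoTagsKeyValue extends KeyValue {}

    // Exact comparison: a subclass instance does not match the declared class.
    static boolean exactMatch(Class<?> declared, Object value) {
        return value.getClass() == declared;
    }

    // Assignability check: a subclass instance is accepted.
    static boolean assignable(Class<?> declared, Object value) {
        return declared.isInstance(value);
    }
}
```

Under an exact comparison, declaring either class as the output value class rejects instances of the other, which matches the concern that neither setting works when a WALEdit mixes both.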



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14558) Document ChaosMonkey enhancements from HBASE-14261

2015-10-06 Thread Elliott Clark (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945334#comment-14945334
 ] 

Elliott Clark commented on HBASE-14558:
---

{code}HBase 1.02 and newer adds the ability to restart{code}
1.0.2

{code}have no reasonable defaults{code}
Should we call out that they have no default because it's deployment-specific?

{code}in your ChaosMonkey properties file.{code}
This can be in hbase-site.xml 

> Document ChaosMonkey enhancements from HBASE-14261
> --
>
> Key: HBASE-14558
> URL: https://issues.apache.org/jira/browse/HBASE-14558
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.15, 1.0.3, 1.1.3
>
> Attachments: HBASE-14558.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14432) Procedure V2 - enforce ACL on procedure admin tasks

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945344#comment-14945344
 ] 

Hudson commented on HBASE-14432:


SUCCESS: Integrated in HBase-1.3-IT #213 (See 
[https://builds.apache.org/job/HBase-1.3-IT/213/])
HBASE-14432 Procedure V2 - enforce ACL on procedure admin tasks (Stephen 
(syuanjiangdev: rev a6d90bcc97ea6e00d2d75381db0b598ab6c71026)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyNamespaceProcedure.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteTableProcedure.java
* hbase-server/pom.xml
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
* 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* hbase-server/src/test/protobuf/TestProcedure.proto
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyColumnFamilyProcedure.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/protobuf/generated/TestProcedureProtos.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/ProcedureInfo.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DisableTableProcedure.java
* 
hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateTableProcedure.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteNamespaceProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterAndRegionObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/EnableTableProcedure.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteColumnFamilyProcedure.java
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AddColumnFamilyProcedure.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateNamespaceProcedure.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyTableProcedure.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java


> Procedure V2 - enforce ACL on procedure admin tasks
> ---
>
> Key: HBASE-14432
> URL: https://issues.apache.org/jira/browse/HBASE-14432
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Stephen Yuan Jiang
>Assignee: Stephen Yuan Jiang
>  Labels: security
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-14432.v1-branch-1.patch, 
> HBASE-14432.v1-master.patch
>
>
> In the Procedure class, the owner field is never set. We need to set it so 
> that we can enforce ACLs on admin tasks such as whether a user has privilege 
> to abort a procedure.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14520) Optimize the number of calls for tags creation in bulk load

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14520?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945362#comment-14945362
 ] 

Hudson commented on HBASE-14520:


FAILURE: Integrated in HBase-TRUNK #6877 (See 
[https://builds.apache.org/job/HBase-TRUNK/6877/])
HBASE-14520 Optimize the number of calls for tags creation in bulk load (tedyu: 
rev 23079c02bf40c318fff4f77fa9182ebdfb230e90)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TsvImporterMapper.java
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/mapreduce/TextSortReducer.java


> Optimize the number of calls for tags creation in bulk load
> ---
>
> Key: HBASE-14520
> URL: https://issues.apache.org/jira/browse/HBASE-14520
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 2.0.0
>Reporter: Bhupendra Kumar Jain
>Assignee: Bhupendra Kumar Jain
> Fix For: 2.0.0
>
> Attachments: HBASE-14520.patch
>
>
> At present, the TTL and visibility expression are one per TSV line, i.e. the values and 
> the tags remain the same for all the columns present in that line. As per the 
> code, a list of tags is created for each cell. Instead of creating new tags 
> for each cell, tags created once for the line can be reused by other cells.  
> Assume 1 million rows and 1000 columns. Currently tag creation will happen 
> 1M * 1000 times. If we reuse the tags, tag creation can be reduced to 1M 
> times (i.e. once per TSV line). 
> This is applicable in both the TsvImporterMapper and TextSortReducer logic. 
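The arithmetic above can be illustrated with a small counter. TagReuseSketch and its methods are hypothetical and do not reflect the actual TsvImporterMapper code; the tag list is just a list of strings, and the only point is how many times it gets built under each approach.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration of per-cell versus per-line tag creation.
class TagReuseSketch {
    static int tagListsCreated = 0;

    // Stand-in for building the TTL and visibility tags of one TSV line.
    static List<String> createTags(String visibility, long ttl) {
        tagListsCreated++;
        List<String> tags = new ArrayList<>();
        tags.add("vis=" + visibility);
        tags.add("ttl=" + ttl);
        return Collections.unmodifiableList(tags);
    }

    // Builds a fresh tag list for every cell: rows * columns creations.
    static int perCell(int rows, int columns) {
        tagListsCreated = 0;
        for (int r = 0; r < rows; r++) {
            for (int c = 0; c < columns; c++) {
                createTags("secret", 1000L);
            }
        }
        return tagListsCreated;
    }

    // Builds the tag list once per line and shares it across that line's cells.
    static int perLine(int rows, int columns) {
        tagListsCreated = 0;
        for (int r = 0; r < rows; r++) {
            List<String> shared = createTags("secret", 1000L);
            for (int c = 0; c < columns; c++) {
                // each cell would reference 'shared' here instead of a new list
                if (shared.isEmpty()) {
                    throw new IllegalStateException("tags missing");
                }
            }
        }
        return tagListsCreated;
    }
}
```

Sharing is only safe if cells never modify the list, which is why the sketch returns an unmodifiable one.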



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14268) Improve KeyLocker

2015-10-06 Thread Yu Li (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945398#comment-14945398
 ] 

Yu Li commented on HBASE-14268:
---

[~jingcheng...@intel.com] I guess we're discussing lock fairness here? If 
so, it seems to me the original implementation also uses an unfair lock, so an early 
waiting thread might starve there too. Maybe another point to improve, though.

> Improve KeyLocker
> -
>
> Key: HBASE-14268
> URL: https://issues.apache.org/jira/browse/HBASE-14268
> Project: HBase
>  Issue Type: Improvement
>  Components: util
>Reporter: Hiroshi Ikeda
>Assignee: Hiroshi Ikeda
>Priority: Minor
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 14268-V5.patch, HBASE-14268-V2.patch, 
> HBASE-14268-V3.patch, HBASE-14268-V4.patch, HBASE-14268-V5.patch, 
> HBASE-14268-V5.patch, HBASE-14268-V6.patch, HBASE-14268-V7.patch, 
> HBASE-14268-V7.patch, HBASE-14268-V7.patch, HBASE-14268-V7.patch, 
> HBASE-14268.patch, KeyLockerIncrKeysPerformance.java, 
> KeyLockerPerformance.java, ReferenceTestApp.java
>
>
> 1. The implementation of {{KeyLocker}} uses atomic variables inside a 
> synchronized block, which doesn't make sense. Moreover, the logic inside the 
> synchronized block is not trivial, so it performs poorly in a heavily 
> multi-threaded environment.
> 2. {{KeyLocker}} gives out an instance of {{ReentrantLock}} which is already 
> locked, but it doesn't follow the contract of {{ReentrantLock}} because you 
> are not allowed to freely invoke lock/unlock methods under that contract. 
> That introduces a potential risk: whenever you see a variable of the type 
> {{ReentrantLock}}, you should pay attention to where the included instance is 
> coming from.
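One way to address the second point is to hand callers a release-only handle rather than the raw lock. The sketch below is not the attached patch; KeyLockSketch and Handle are hypothetical names, and lock entries are never removed from the map, a simplification a real implementation would have to deal with.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical per-key locker that hides the underlying ReentrantLock.
class KeyLockSketch<K> {
    // Callers can only release; they cannot call lock() again on the raw lock.
    interface Handle {
        void release();
    }

    private final ConcurrentHashMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();

    // Blocks until the lock for this key is held by the calling thread.
    Handle acquire(K key) {
        ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
        lock.lock();
        return lock::unlock;
    }

    // True if some thread currently holds the lock for this key.
    boolean isLocked(K key) {
        ReentrantLock lock = locks.get(key);
        return lock != null && lock.isLocked();
    }
}
```

Because the handle exposes only release(), a reader of calling code no longer has to wonder where a ReentrantLock variable came from or whether it is safe to lock it again.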



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14535) Unit test for rpc connection concurrency / deadlock testing

2015-10-06 Thread Enis Soztutar (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945401#comment-14945401
 ] 

Enis Soztutar commented on HBASE-14535:
---

bq. I agree with Andrew (till I understand more). We've already got a fleet of 
'non-deterministic' tests, so many, our CI runs are of no value
A non-deterministic test in this context is different from a flaky test. We 
have a bunch of flaky tests that fail due to false negatives making jenkins 
runs useless. This test in particular will not fail for false negatives, but it 
might fail to catch deadlocks (false positive). If this test fails, I imagine 
we take a look at it rather than classify as a flaky test. 

> Unit test for rpc connection concurrency / deadlock testing 
> 
>
> Key: HBASE-14535
> URL: https://issues.apache.org/jira/browse/HBASE-14535
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: hbase-14535_v1.patch, hbase-14535_v2.patch
>
>
> As per parent jira and recent jiras  HBASE-14449 + HBASE-14241 and 
> HBASE-14313, we seem to be lacking some testing rpc connection concurrency 
> issues in a UT env. 
>  



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14529) Respond to SIGHUP to reload config

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945408#comment-14945408
 ] 

stack commented on HBASE-14529:
---

Ok on 2.0 then with fat release note. Where else should it go? 1.3. Want this 
for 1.2 [~busbey]?

> Respond to SIGHUP to reload config
> --
>
> Key: HBASE-14529
> URL: https://issues.apache.org/jira/browse/HBASE-14529
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 1.2.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-14529-v1.patch, HBASE-14529-v2.patch, 
> HBASE-14529.patch
>
>
> SIGHUP is the way everyone since the dawn of unix has done config reload.
> Lets not be a special unique snowflake.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

[jira] [Commented] (HBASE-14535) Unit test for rpc connection concurrency / deadlock testing

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14535?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945439#comment-14945439
 ] 

stack commented on HBASE-14535:
---

bq.  I imagine we take a look at it rather than classify as a flaky test.

Thanks for the explanation.

How to designate difference between a flakey and a test that might fail with 
'real' issue that needs looking at? An outsider like myself trying to cleanup 
test failures would need to be able to distinguish between the two. Devs trying 
to get a clean run against their patch would need to be able to look at results 
and see that the fail was not theirs but because the test is a 
'non-deterministic'.

> Unit test for rpc connection concurrency / deadlock testing 
> 
>
> Key: HBASE-14535
> URL: https://issues.apache.org/jira/browse/HBASE-14535
> Project: HBase
>  Issue Type: Sub-task
>  Components: rpc
>Reporter: Enis Soztutar
>Assignee: Enis Soztutar
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: hbase-14535_v1.patch, hbase-14535_v2.patch
>
>
> As per the parent jira and recent jiras HBASE-14449 + HBASE-14241 and 
> HBASE-14313, we seem to be lacking tests for rpc connection concurrency 
> issues in a UT env. 
>  




[jira] [Updated] (HBASE-14497) Reverse Scan threw StackOverflow caused by readPt checking

2015-10-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14497:
---
Attachment: 14497-branch-1-v6.patch

> Reverse Scan threw StackOverflow caused by readPt checking
> --
>
> Key: HBASE-14497
> URL: https://issues.apache.org/jira/browse/HBASE-14497
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 0.98.14, 1.3.0
>Reporter: Yerui Sun
>Assignee: Yerui Sun
> Fix For: 2.0.0
>
> Attachments: 14497-branch-1-v6.patch, 14497-master-v6.patch, 
> HBASE-14497-0.98.patch, HBASE-14497-branch-1-v2.patch, 
> HBASE-14497-branch-1-v3.patch, HBASE-14497-branch-1-v6.patch, 
> HBASE-14497-branch-1.patch, HBASE-14497-master-v2.patch, 
> HBASE-14497-master-v3.patch, HBASE-14497-master-v3.patch, 
> HBASE-14497-master-v4.patch, HBASE-14497-master-v5.patch, 
> HBASE-14497-master.patch
>
>
> I hit a stack overflow error in StoreFileScanner.seekToPreviousRow using a 
> reversed scan. I searched and found HBASE-14155, but it seems to have a 
> different cause.
> seekToPreviousRow fetches the closest row before the current one and compares 
> its mvcc to the readPt, which was acquired when the scanner was created. If 
> the row's mvcc is bigger than readPt, seekToPreviousRow is invoked 
> recursively to find the next closest preceding row.
> Consider a scanner created for a reversed scan, with some data in smaller 
> rows written and flushed before scanner next is called. When 
> seekToPreviousRow is invoked, it calls itself recursively until all rows 
> written after the scanner was created have been iterated. The depth of the 
> recursive call stack depends on the count of rows, so a stack overflow 
> error is thrown if the count of rows is large, like 1.
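
The failure mode described above can be sketched in a few lines. This is only an illustration under stated assumptions, not the StoreFileScanner code and not the attached patch: rows are modeled as an in-memory list of (key, mvcc) pairs, and every name here (Row, ReverseSeekSketch, seekRecursive, seekIterative) is hypothetical. The recursive form uses one stack frame per row skipped because its mvcc is above the read point; the loop form returns the same row at constant stack depth.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a stored row: a key plus the mvcc it was written at. */
final class Row {
    final int key;
    final long mvcc;

    Row(int key, long mvcc) {
        this.key = key;
        this.mvcc = mvcc;
    }
}

/** Hypothetical model of a reversed seek over rows sorted ascending by key. */
final class ReverseSeekSketch {
    private final List<Row> rows;
    private final long readPt; // read point captured when the scanner was created

    ReverseSeekSketch(List<Row> rows, long readPt) {
        this.rows = rows;
        this.readPt = readPt;
    }

    /** Recursive form: one stack frame per row skipped because mvcc > readPt. */
    Row seekRecursive(int beforeIndex) {
        int i = beforeIndex - 1;
        if (i < 0) {
            return null;
        }
        Row candidate = rows.get(i);
        if (candidate.mvcc > readPt) {
            return seekRecursive(i);
        }
        return candidate;
    }

    /** Iterative form: same result, constant stack depth. */
    Row seekIterative(int beforeIndex) {
        for (int i = beforeIndex - 1; i >= 0; i--) {
            Row candidate = rows.get(i);
            if (candidate.mvcc <= readPt) {
                return candidate;
            }
        }
        return null;
    }

    /** Builds one visible row (key 0) followed by n rows written after the read point. */
    static ReverseSeekSketch withInvisibleRows(int n) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row(0, 1L));
        for (int k = 1; k <= n; k++) {
            rows.add(new Row(k, 100L));
        }
        return new ReverseSeekSketch(rows, 10L);
    }
}
```

With a small number of invisible rows both forms return the row with key 0. With a large number, the recursive form may exhaust the thread stack while the loop does not.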




[jira] [Commented] (HBASE-14529) Respond to SIGHUP to reload config

2015-10-06 Thread Sean Busbey (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945481#comment-14945481
 ] 

Sean Busbey commented on HBASE-14529:
-

+1 for 1.2 with release note.

> Respond to SIGHUP to reload config
> --
>
> Key: HBASE-14529
> URL: https://issues.apache.org/jira/browse/HBASE-14529
> Project: HBase
>  Issue Type: New Feature
>Affects Versions: 1.2.0
>Reporter: Elliott Clark
>Assignee: Elliott Clark
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-14529-v1.patch, HBASE-14529-v2.patch, 
> HBASE-14529.patch
>
>
> SIGHUP is the way everyone since the dawn of Unix has done config reload.
> Let's not be a special unique snowflake.




[jira] [Created] (HBASE-14560) TestNamespacesInstanceModel#testToXML fails when JDK 1.8 is used

2015-10-06 Thread Ted Yu (JIRA)

Ted Yu created HBASE-14560:
--

 Summary: TestNamespacesInstanceModel#testToXML fails when JDK 1.8 
is used
 Key: HBASE-14560
 URL: https://issues.apache.org/jira/browse/HBASE-14560
 Project: HBase
  Issue Type: Test
Reporter: Ted Yu
Priority: Minor


From https://builds.apache.org/job/HBase-1.3/jdk=latest1.8,label=Hadoop/237/consoleFull :
{code}
org.apache.hadoop.hbase.rest.model.TestNamespacesInstanceModel
testToXML(org.apache.hadoop.hbase.rest.model.TestNamespacesInstanceModel)  Time elapsed: 0.017 sec  <<< FAILURE!
junit.framework.ComparisonFailure: expected:<...perties>[NAMEtestNamespaceKEY_2VALUE_2KEY_1VALUE_1] but was:<...perties>[KEY_1VALUE_1KEY_2VALUE_2NAMEtestNamespace]
at junit.framework.Assert.assertEquals(Assert.java:100)
at junit.framework.Assert.assertEquals(Assert.java:107)
at junit.framework.TestCase.assertEquals(TestCase.java:269)
at org.apache.hadoop.hbase.rest.model.TestModelBase.testToXML(TestModelBase.java:115)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
{code}
The above test failure can be reproduced locally.
It was likely caused by different JAXBContext behavior between JDK 1.7 and 
1.8.
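
The expected and actual strings in the failure above contain the same name and key/value pairs in a different order. The sketch below only illustrates that symptom and one way to make a comparison independent of ordering. It does not use JAXB, it is not the fix for this issue, and the root cause is unconfirmed in the report. The class and method names (OrderInsensitiveCompare, flatten, flattenSorted) are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper showing order-sensitive vs. order-normalized serialization. */
final class OrderInsensitiveCompare {

    /** Concatenates keys and values in the map's own iteration order. */
    static String flatten(Map<String, String> props) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : props.entrySet()) {
            sb.append(e.getKey()).append(e.getValue());
        }
        return sb.toString();
    }

    /** Concatenates keys and values in sorted key order, whatever the source order was. */
    static String flattenSorted(Map<String, String> props) {
        return flatten(new TreeMap<>(props));
    }
}
```

Two maps holding the same entries in different insertion orders produce different strings from flatten and identical strings from flattenSorted. An assertion built on the normalized form does not depend on the order in which a serializer happens to emit elements.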




[jira] [Commented] (HBASE-14432) Procedure V2 - enforce ACL on procedure admin tasks

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945514#comment-14945514
 ] 

Hudson commented on HBASE-14432:


FAILURE: Integrated in HBase-1.3 #238 (See 
[https://builds.apache.org/job/HBase-1.3/238/])
HBASE-14432 Procedure V2 - enforce ACL on procedure admin tasks (Stephen 
(syuanjiangdev: rev a6d90bcc97ea6e00d2d75381db0b598ab6c71026)
* hbase-server/src/main/java/org/apache/hadoop/hbase/security/access/AccessController.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DisableTableProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterAndRegionObserver.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestCatalogJanitor.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteColumnFamilyProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/EnableTableProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteNamespaceProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterCoprocessorHost.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/ipc/protobuf/generated/TestProcedureProtos.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyTableProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/HMaster.java
* hbase-server/src/test/java/org/apache/hadoop/hbase/coprocessor/TestMasterObserver.java
* hbase-server/pom.xml
* hbase-server/src/test/java/org/apache/hadoop/hbase/security/access/TestAccessController.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterServices.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyColumnFamilyProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/DeleteTableProcedure.java
* hbase-server/src/test/protobuf/TestProcedure.proto
* hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/ProcedureExecutor.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/ModifyNamespaceProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/MasterObserver.java
* hbase-procedure/src/main/java/org/apache/hadoop/hbase/procedure2/Procedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateTableProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/coprocessor/BaseMasterObserver.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/CreateNamespaceProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/AddColumnFamilyProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/procedure/TruncateTableProcedure.java
* hbase-server/src/main/java/org/apache/hadoop/hbase/master/MasterRpcServices.java
* hbase-common/src/main/java/org/apache/hadoop/hbase/ProcedureInfo.java


> Procedure V2 - enforce ACL on procedure admin tasks
> ---
>
> Key: HBASE-14432
> URL: https://issues.apache.org/jira/browse/HBASE-14432
> Project: HBase
>  Issue Type: Sub-task
>  Components: proc-v2
>Affects Versions: 2.0.0, 1.3.0
>Reporter: Stephen Yuan Jiang
>Assignee: Stephen Yuan Jiang
>  Labels: security
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-14432.v1-branch-1.patch, 
> HBASE-14432.v1-master.patch
>
>
> In the Procedure class, the owner field is never set. We need to set it so 
> that we can enforce ACLs on admin tasks such as whether a user has privilege 
> to abort a procedure.




[jira] [Updated] (HBASE-14386) Reset MutableHistogram's min/max/sum after snapshot

2015-10-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14386:
---
Status: Open  (was: Patch Available)

> Reset MutableHistogram's min/max/sum after snapshot
> ---
>
> Key: HBASE-14386
> URL: https://issues.apache.org/jira/browse/HBASE-14386
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: Oliver
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-14386.patch
>
>
> Currently MutableHistogram does not reset min/max/sum after a snapshot, so 
> the metrics are affected by historical data. For example, when I monitor 
> QueueCallTime_mean, I see that one host's QueueCallTime_mean metric is high, 
> but when I trace the host's regionserver log I see that QueueCallTime_mean 
> has dropped while the metric is still high.
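
The behavior asked for above can be sketched as a histogram whose count, sum, min and max describe only the interval since the last snapshot. This is a minimal sketch under stated assumptions, not the MutableHistogram implementation and not the attached patch. The class and method names (ResettingHistogramSketch, snapshotAndReset, mean) are hypothetical. It uses atomics rather than a lock, so the four fields are reset independently and an update racing with a snapshot can be split across two intervals.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical histogram that reports per-interval statistics. */
final class ResettingHistogramSketch {
    private final AtomicLong count = new AtomicLong();
    private final AtomicLong sum = new AtomicLong();
    private final AtomicLong min = new AtomicLong(Long.MAX_VALUE);
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    /** Records one observation without taking a lock. */
    void add(long value) {
        count.incrementAndGet();
        sum.addAndGet(value);
        min.accumulateAndGet(value, Math::min);
        max.accumulateAndGet(value, Math::max);
    }

    /** Returns {count, sum, min, max} for the current interval and starts a new one. */
    long[] snapshotAndReset() {
        long c = count.getAndSet(0);
        long s = sum.getAndSet(0);
        long lo = min.getAndSet(Long.MAX_VALUE);
        long hi = max.getAndSet(Long.MIN_VALUE);
        return new long[] { c, s, lo, hi };
    }

    /** Mean of a snapshot, or 0.0 when the interval saw no observations. */
    static double mean(long[] snapshot) {
        return snapshot[0] == 0 ? 0.0 : (double) snapshot[1] / snapshot[0];
    }
}
```

Because each snapshot starts from zero, a host whose queue call time has dropped reports the lower mean on the next snapshot instead of carrying the old sum forward.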




[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945602#comment-14945602
 ] 

stack commented on HBASE-14559:
---

These fixes here made it so branch-1 now passes on my internal rig.

I can't speak to whether we should remove HBASE-13375 completely. It would seem 
to explain why some of the tweaks here were necessary in branch-1 but not in 
master -- that  helps.

> branch-1 test tweeks; disable assert explicit region lands post-restart and 
> up a few handlers
> -
>
> Key: HBASE-14559
> URL: https://issues.apache.org/jira/browse/HBASE-14559
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14559.branch-1.txt, 14559.master.txt
>
>
> Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). 
> Small tweaks get tests to pass. Small one liners that up priority handler 
> count and disable assert that seems wrong -- that we'll always get an explicit 
> region to land on a newly started server.




[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945611#comment-14945611
 ] 

stack commented on HBASE-14420:
---

Master and branch-1 pass on my internal rig reliably without leaving zombies.

I'm now into a new phase of zombie stomping. From here on out I am just going 
to disable hangers if I can't find anything obvious inside a few minutes 
(I spent a good while on TestHFileOutputFormat2 yesterday).


> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: hangers.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.




[jira] [Commented] (HBASE-14386) Reset MutableHistogram's min/max/sum after snapshot

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945618#comment-14945618
 ] 

stack commented on HBASE-14386:
---

Any numbers? Do we need to lock? In the past, a mistake in how we made metrics 
cost tens of percent of read latency. Thanks.

> Reset MutableHistogram's min/max/sum after snapshot
> ---
>
> Key: HBASE-14386
> URL: https://issues.apache.org/jira/browse/HBASE-14386
> Project: HBase
>  Issue Type: Bug
>Reporter: binlijin
>Assignee: Oliver
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-14386.patch
>
>
> Currently MutableHistogram does not reset min/max/sum after a snapshot, so 
> the metrics are affected by historical data. For example, when I monitor 
> QueueCallTime_mean, I see that one host's QueueCallTime_mean metric is high, 
> but when I trace the host's regionserver log I see that QueueCallTime_mean 
> has dropped while the metric is still high.




[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread Mikhail Antonov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945623#comment-14945623
 ] 

Mikhail Antonov commented on HBASE-14559:
-

Thanks for clarifying [~stack]

Regarding tests - I remember we upped the number of high priority handlers 
used in all tests (and I admit to having suggested/supported that :). After 
finding several tests where the default number of threads wasn't enough 
following the changes made in HBASE-13375, bumping the default number of 
threads looked like an easy thing to do. Should we also lower 
DEFAULT_REGION_SERVER_HIGH_PRIORITY_HANDLER_COUNT back to 10 as the default 
(it's 20 now on master, as I see) and set it higher only in selected tests?

Regarding removing completely - I guess that's up to the judgement of folks 
running big clusters. The way we treat admin user requests _seems_ logical to 
me, but I definitely don't want to argue with production observations using 
logical conclusions :)

> branch-1 test tweeks; disable assert explicit region lands post-restart and 
> up a few handlers
> -
>
> Key: HBASE-14559
> URL: https://issues.apache.org/jira/browse/HBASE-14559
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14559.branch-1.txt, 14559.master.txt
>
>
> Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). 
> Small tweaks get tests to pass. Small one liners that up priority handler 
> count and disable assert that seems wrong -- that we'll always get an explicit 
> region to land on a newly started server.




[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945634#comment-14945634
 ] 

stack commented on HBASE-14559:
---

bq. Should we also lower DEFAULT_REGION_SERVER_HIGH_PRIORITY_HANDLER_COUNT back 
to 10 as the default (it's 20 now on master, as I see) and set it higher only 
in selected tests?

I'm just looking at tests at the moment. I saw priority handlers set to 40 in a 
few instances, which seemed excessive. Other tests with many regions had reams 
of handlers just sitting there doing nothing, clouding thread dumps where I was 
trying to figure out why the test was hung... 

bq. The way we treat admin user requests seems logical to me, but I definitely 
don't want to argue with production observations using logical conclusions

I'm with you. Let's get other input.

> branch-1 test tweeks; disable assert explicit region lands post-restart and 
> up a few handlers
> -
>
> Key: HBASE-14559
> URL: https://issues.apache.org/jira/browse/HBASE-14559
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14559.branch-1.txt, 14559.master.txt
>
>
> Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). 
> Small tweaks get tests to pass. Small one liners that up priority handler 
> count and disable assert that seems wrong -- that we'll always get an explicit 
> region to land on a newly started server.




[jira] [Created] (HBASE-14561) Disable zombie TestReplicationShell

2015-10-06 Thread stack (JIRA)

stack created HBASE-14561:
-

 Summary: Disable zombie TestReplicationShell
 Key: HBASE-14561
 URL: https://issues.apache.org/jira/browse/HBASE-14561
 Project: HBase
  Issue Type: Sub-task
Reporter: stack


It hung three times in last 40 test runs. Will file issue to reenable it when 
someone has chance to look at why it is hanging.




[jira] [Created] (HBASE-14562) Fix and reenable zombie TestReplicationShell

2015-10-06 Thread stack (JIRA)

stack created HBASE-14562:
-

 Summary: Fix and reenable zombie TestReplicationShell
 Key: HBASE-14562
 URL: https://issues.apache.org/jira/browse/HBASE-14562
 Project: HBase
  Issue Type: Bug
Reporter: stack


Was disabled over in HBASE-14561 because it hangs with some regularity.




[jira] [Updated] (HBASE-14561) Disable zombie TestReplicationShell

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14561:
--
Attachment: 14561.txt

This is what I pushed to master.

> Disable zombie TestReplicationShell
> ---
>
> Key: HBASE-14561
> URL: https://issues.apache.org/jira/browse/HBASE-14561
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
> Attachments: 14561.txt
>
>
> It hung three times in last 40 test runs. Will file issue to reenable it when 
> someone has chance to look at why it is hanging.




[jira] [Commented] (HBASE-14561) Disable zombie TestReplicationShell

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945663#comment-14945663
 ] 

stack commented on HBASE-14561:
---

Link to issue to reenable this test when fixed.

> Disable zombie TestReplicationShell
> ---
>
> Key: HBASE-14561
> URL: https://issues.apache.org/jira/browse/HBASE-14561
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
> Attachments: 14561.txt
>
>
> It hung three times in last 40 test runs. Will file issue to reenable it when 
> someone has chance to look at why it is hanging.




[jira] [Resolved] (HBASE-14561) Disable zombie TestReplicationShell

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-14561.
---
   Resolution: Fixed
 Assignee: stack
Fix Version/s: 2.0.0

Pushed to master.

> Disable zombie TestReplicationShell
> ---
>
> Key: HBASE-14561
> URL: https://issues.apache.org/jira/browse/HBASE-14561
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0
>
> Attachments: 14561.txt
>
>
> It hung three times in last 40 test runs. Will file issue to reenable it when 
> someone has chance to look at why it is hanging.




[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945672#comment-14945672
 ] 

stack commented on HBASE-14420:
---

Going over the last 40 patch builds:

TestReplicationShell hung three times. It was added to master only. HBASE-13084 
adds it by running all shell commands again plus the new 
replication_admin_test.rb command. I'm going to disable it for now in 
HBASE-14561.

TestHFileOutputFormat2 failed 5 times in the last 40 runs. I spent time on it 
yesterday. There seems to be a reliance on test order, but I was having 
networking issues which complicated diagnosis. It seems like an ambitious 
amount of work to get done in a unit test:

{code}
 * Simple test for {@link CellSortReducer} and {@link HFileOutputFormat2}.
 * Sets up and runs a mapreduce job that writes hfile output.
 * Creates a few inner classes to implement splits and an inputformat that
 * emits keys and values like those of {@link PerformanceEvaluation}.
{code}

Was added a good while ago, here:

commit e4f8a7419fb4bd0102eaf91e9747de6261e0b5c5
Author: jxiang 
Date:   Fri Feb 21 20:39:21 2014 +

HBASE-10526 Using Cell instead of KeyValue in HFileOutputFormat

git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1570702 
13f79535-47bb-0310-9956-ffa450edef68

I'm just going to disable it until someone wants to work on it.

Here is the list of all test failures and their counts:

   2 Hanging test : org.apache.hadoop.hbase.TestNodeHealthCheckChore
   1 Hanging test : org.apache.hadoop.hbase.TestPartialResultsFromClientSide
   2 Hanging test : org.apache.hadoop.hbase.client.TestFromClientSide
   1 Hanging test : org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor
   1 Hanging test : org.apache.hadoop.hbase.client.TestReplicasClient
   3 Hanging test : org.apache.hadoop.hbase.client.TestReplicationShell
   1 Hanging test : org.apache.hadoop.hbase.constraint.TestConstraint
   1 Hanging test : org.apache.hadoop.hbase.filter.TestFuzzyRowFilterEndToEnd
   1 Hanging test : org.apache.hadoop.hbase.io.hfile.TestCacheOnWrite
   2 Hanging test : org.apache.hadoop.hbase.mapreduce.TestCopyTable
   1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
   5 Hanging test : org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat2
   1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestMultiTableInputFormat
   1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestTableInputFormat
   1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestTableInputFormatScan2
   1 Hanging test : org.apache.hadoop.hbase.mapreduce.TestTableMapReduce
   1 Hanging test : org.apache.hadoop.hbase.replication.TestMasterReplication
   1 Hanging test : org.apache.hadoop.hbase.replication.TestReplicationKillMasterRSCompressed
   1 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint
   1 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpointNoMaster
   1 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestReplicationWALReaderManager
   1 Hanging test : org.apache.hadoop.hbase.security.access.TestAccessController
   1 Hanging test : org.apache.hadoop.hbase.security.access.TestCellACLs
   1 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelReplicationWithExpAsString
   1 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes
   1 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDistributedLogReplay
   1 Hanging test : org.apache.hadoop.hbase.snapshot.TestExportSnapshot
   1 Hanging test : org.apache.hadoop.hbase.snapshot.TestMobExportSnapshot
   1 Hanging test : org.apache.hadoop.hbase.snapshot.TestMobFlushSnapshotFromClient
   1 Hanging test : org.apache.hadoop.hbase.snapshot.TestMobSecureExportSnapshot
   1 Hanging test : org.apache.hadoop.hbase.snapshot.TestSecureExportSnapshot




> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: hangers.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies. 
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of t

[jira] [Created] (HBASE-14563) Disable zombie TestHFileOutputFormat2

2015-10-06 Thread stack (JIRA)

stack created HBASE-14563:
-

 Summary: Disable zombie TestHFileOutputFormat2
 Key: HBASE-14563
 URL: https://issues.apache.org/jira/browse/HBASE-14563
 Project: HBase
  Issue Type: Sub-task
Reporter: stack
Assignee: stack


Disabling until someone has a chance to look at it.

I watched it in jvisualvm for a while. It's starting and stopping clusters 
multiple times and then running MR jobs. Needs a rewrite at least and some 
shrinking of scope on what is tested.




[jira] [Created] (HBASE-14564) Fix and reenable TestHFileOutputFormat2

2015-10-06 Thread stack (JIRA)

stack created HBASE-14564:
-

 Summary: Fix and reenable TestHFileOutputFormat2
 Key: HBASE-14564
 URL: https://issues.apache.org/jira/browse/HBASE-14564
 Project: HBase
  Issue Type: Bug
Reporter: stack


Was disabled as part of the zombie stomping session over in HBASE-14420. Test 
needs a rewrite and/or being split up. Scope of the test needs to be shrunk and 
made more targeted. Currently it does everything.




[jira] [Commented] (HBASE-14563) Disable zombie TestHFileOutputFormat2

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945723#comment-14945723
 ] 

stack commented on HBASE-14563:
---

From parent issue:
{code}
TestHFileOutputFormat2 failed 5 times in last 40 runs. I spent time on it 
yesterday. Seems to be a reliance on test order but was having networking 
issues which complicated my being able to do diagnosis It seems like an 
ambitious amount of work to get done in a unit test:
 * Simple test for {@link CellSortReducer} and {@link HFileOutputFormat2}.
 * Sets up and runs a mapreduce job that writes hfile output.
 * Creates a few inner classes to implement splits and an inputformat that
 * emits keys and values like those of {@link PerformanceEvaluation}.
Was added a good while ago, here:
commit e4f8a7419fb4bd0102eaf91e9747de6261e0b5c5
Author: jxiang 
Date: Fri Feb 21 20:39:21 2014 +
HBASE-10526 Using Cell instead of KeyValue in HFileOutputFormat
git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1570702 
13f79535-47bb-0310-9956-ffa450edef68
{code}

The test stops and starts clusters a few times and then runs MR jobs. Needs 
shrinking in size and scope. Needs to be more focused on testing a particular 
issue. 

HBASE-14564 is issue to reenable.

> Disable zombie TestHFileOutputFormat2
> -
>
> Key: HBASE-14563
> URL: https://issues.apache.org/jira/browse/HBASE-14563
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
>
> Disabling until someone has a chance to look at it.
> I watched it in jvisualvm for a while. It's starting and stopping clusters 
> multiple times and then running MR jobs. Needs a rewrite at least and some 
> shrinking of scope on what is tested.




[jira] [Commented] (HBASE-14563) Disable zombie TestHFileOutputFormat2

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945733#comment-14945733
 ] 

stack commented on HBASE-14563:
---

Looks like some of the tests were disabled in this suite already.

> Disable zombie TestHFileOutputFormat2
> -
>
> Key: HBASE-14563
> URL: https://issues.apache.org/jira/browse/HBASE-14563
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
>
> Disabling until someone has a chance to look at it.
> I watched it in jvisualvm for a while. It's starting and stopping clusters 
> multiple times and then running MR jobs. Needs a rewrite at least and some 
> shrinking of scope on what is tested.




[jira] [Updated] (HBASE-14563) Disable zombie TestHFileOutputFormat2

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14563:
--
Attachment: 14563.txt

What I pushed to master, branch-1, and branch-1.2.

> Disable zombie TestHFileOutputFormat2
> -
>
> Key: HBASE-14563
> URL: https://issues.apache.org/jira/browse/HBASE-14563
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Attachments: 14563.txt
>
>
> Disabling until someone has a chance to look at it.
> I watched it in jvisualvm for a while. It's starting and stopping clusters 
> multiple times and then running MR jobs. Needs a rewrite at least and some 
> shrinking of scope on what is tested.




[jira] [Commented] (HBASE-14497) Reverse Scan threw StackOverflow caused by readPt checking

2015-10-06 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945745#comment-14945745
 ] 

Hadoop QA commented on HBASE-14497:
---

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  
http://issues.apache.org/jira/secure/attachment/12765206/14497-branch-1-v6.patch
  against branch-1 branch at commit 23079c02bf40c318fff4f77fa9182ebdfb230e90.
  ATTACHMENT ID: 12765206

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 4 new 
or modified tests.

{color:green}+1 hadoop versions{color}. The patch compiles with all 
supported hadoop versions (2.4.0 2.4.1 2.5.0 2.5.1 2.5.2 2.6.0 2.6.1 2.7.0 
2.7.1)

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 protoc{color}.  The applied patch does not increase the 
total number of protoc compiler warnings.

{color:green}+1 javadoc{color}.  The javadoc tool did not generate any 
warning messages.

{color:green}+1 checkstyle{color}.  The applied patch does not increase the 
total number of checkstyle errors

{color:green}+1 findbugs{color}.  The patch does not introduce any  new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:green}+1 lineLengths{color}.  The patch does not introduce lines 
longer than 100

  {color:green}+1 site{color}.  The mvn post-site goal succeeds with this patch.

{color:green}+1 core tests{color}.  The patch passed unit tests in .

 {color:red}-1 core zombie tests{color}.  There are 1 zombie test(s): 

Test results: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15885//testReport/
Release Findbugs (version 2.0.3)warnings: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15885//artifact/patchprocess/newFindbugsWarnings.html
Checkstyle Errors: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15885//artifact/patchprocess/checkstyle-aggregate.html

  Console output: 
https://builds.apache.org/job/PreCommit-HBASE-Build/15885//console

This message is automatically generated.

> Reverse Scan threw StackOverflow caused by readPt checking
> --
>
> Key: HBASE-14497
> URL: https://issues.apache.org/jira/browse/HBASE-14497
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 0.98.14, 1.3.0
>Reporter: Yerui Sun
>Assignee: Yerui Sun
> Fix For: 2.0.0
>
> Attachments: 14497-branch-1-v6.patch, 14497-master-v6.patch, 
> HBASE-14497-0.98.patch, HBASE-14497-branch-1-v2.patch, 
> HBASE-14497-branch-1-v3.patch, HBASE-14497-branch-1-v6.patch, 
> HBASE-14497-branch-1.patch, HBASE-14497-master-v2.patch, 
> HBASE-14497-master-v3.patch, HBASE-14497-master-v3.patch, 
> HBASE-14497-master-v4.patch, HBASE-14497-master-v5.patch, 
> HBASE-14497-master.patch
>
>
> I hit a stack overflow error in StoreFileScanner.seekToPreviousRow while
> using a reversed scan. I searched and found HBASE-14155, but it seems to
> have a different cause.
> seekToPreviousRow fetches the closest preceding row and compares its mvcc
> to the readPt, which is acquired when the scanner is created. If the row's
> mvcc is bigger than readPt, seekToPreviousRow calls itself recursively to
> find the next closest preceding row.
> Consider a scanner created for a reversed scan, with some data in smaller
> rows written and flushed before scanner next is called. When
> seekToPreviousRow is invoked, it calls itself recursively until all rows
> written after the scanner was created have been iterated. The depth of the
> recursive call stack depends on the count of those rows, so a stack
> overflow error is thrown if the count of rows is large, like 1.
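
The recursion described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not HBase code and not necessarily what the attached patches do: rows are reduced to an array of mvcc values in ascending row order, and the class and method names (ReverseSeekSketch, seekPreviousRecursive, seekPreviousIterative) are hypothetical. It shows why the recursive form needs one stack frame per skipped row, and how a loop gives the same answer at constant stack depth.

```java
/** Toy model of skipping rows newer than a scanner's read point. Not HBase code. */
final class ReverseSeekSketch {

    /**
     * Recursive form: every row whose mvcc is newer than readPt costs one
     * stack frame, so the depth grows with the number of such rows.
     * Returns the index of the closest visible row at or before index, or -1.
     */
    static int seekPreviousRecursive(long[] mvccByRow, int index, long readPt) {
        if (index < 0) {
            return -1; // ran off the front: no visible row
        }
        if (mvccByRow[index] <= readPt) {
            return index; // row is visible to this scanner
        }
        return seekPreviousRecursive(mvccByRow, index - 1, readPt);
    }

    /** Iterative form: same result, but the stack depth stays constant. */
    static int seekPreviousIterative(long[] mvccByRow, int index, long readPt) {
        while (index >= 0 && mvccByRow[index] > readPt) {
            index--; // skip a row written after the scanner was created
        }
        return index;
    }

    public static void main(String[] args) {
        long[] mvccByRow = {1L, 2L, 9L, 9L, 9L}; // last three rows are newer than readPt
        long readPt = 5L;
        System.out.println(seekPreviousRecursive(mvccByRow, mvccByRow.length - 1, readPt));
        System.out.println(seekPreviousIterative(mvccByRow, mvccByRow.length - 1, readPt));
    }
}
```

Both calls in main land on index 1, the last row whose mvcc is not newer than readPt. With a very large run of newer rows, only the iterative form is safe from stack exhaustion.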




[jira] [Resolved] (HBASE-14563) Disable zombie TestHFileOutputFormat2

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-14563.
---
   Resolution: Fixed
Fix Version/s: 1.3.0
   1.2.0
   2.0.0

Pushed to branch-1.2+

> Disable zombie TestHFileOutputFormat2
> -
>
> Key: HBASE-14563
> URL: https://issues.apache.org/jira/browse/HBASE-14563
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14563.txt
>
>
> Disabling until someone has a chance to look at it.
> I watched it in jvisualvm a while. It's starting and stopping clusters
> multiple times and then running mr jobs. Needs a rewrite at least and some 
> shrinking of scope on what is tested.




[jira] [Updated] (HBASE-14420) Zombie Stomping Session

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14420:
--
Status: Patch Available  (was: Open)

Submitting a non-patch

> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: hangers.txt, none_fix.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies.
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.




[jira] [Updated] (HBASE-14420) Zombie Stomping Session

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14420:
--
Attachment: none_fix.txt

A non-fix just to see how patch build is doing. It is currently quiet.

> Zombie Stomping Session
> ---
>
> Key: HBASE-14420
> URL: https://issues.apache.org/jira/browse/HBASE-14420
> Project: HBase
>  Issue Type: Umbrella
>  Components: test
>Reporter: stack
>Assignee: stack
>Priority: Critical
> Attachments: hangers.txt, none_fix.txt
>
>
> Patch builds are now failing most of the time because we are dropping zombies.
> I confirm we are doing this on non-apache build boxes too.
> Left-over zombies consume resources on build boxes (OOME cannot create native 
> threads). Having to do multiple test runs in the hope that we can get a 
> non-zombie-making build or making (arbitrary) rulings that the zombies are 
> 'not related' is a productivity sink. And so on...
> This is an umbrella issue for a zombie stomping session that started earlier 
> this week. Will hang sub-issues of this one. Am running builds back-to-back 
> on little cluster to turn out the monsters.




[jira] [Updated] (HBASE-14519) Purge TestFavoredNodeAssignmentHelper, a test for an abandoned feature that can hang

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14519:
--
Resolution: Fixed
Status: Resolved  (was: Patch Available)

This has been pushed. Resolving.

> Purge TestFavoredNodeAssignmentHelper, a test for an abandoned feature that 
> can hang
> 
>
> Key: HBASE-14519
> URL: https://issues.apache.org/jira/browse/HBASE-14519
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 0.98.16
>
> Attachments: 14519.txt, 14519v2.txt
>
>
> It came in here:
> commit 7a7ab8b8da795177f42e434b1ab1b468e5cd035a
> Author: Devaraj Das 
> Date:   Sun May 12 06:47:39 2013 +
> HBASE-7932. Introduces Favored Nodes for region files. Adds a balancer 
> called FavoredNodeLoadBalancer that will honor favored nodes in the process 
> of balancing but the balance operation is currently a no-op (Devaraj Das)
> git-svn-id: https://svn.apache.org/repos/asf/hbase/trunk@1481476 
> 13f79535-47bb-0310-9956-ffa450edef68
> I've already purged the other test that came in on this patch...  over in 
> HBASE-14486
> The test hung here:
> https://builds.apache.org/job/PreCommit-HBASE-Build/15823//console
> ... though we seemed to have exited abnormally.
> Will let this issue hang around a while in case someone  disagrees on removal.




[jira] [Commented] (HBASE-14497) Reverse Scan threw StackOverflow caused by readPt checking

2015-10-06 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945767#comment-14945767
 ] 

Ted Yu commented on HBASE-14497:


Test suite passed:
{code}
Fetching https://builds.apache.org/job/PreCommit-HBASE-Build/15885/consoleFull
Building remotely on H0 (Hadoop 
Tez) in workspace /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build
Testing patch for HBASE-14497.
Testing patch on branch branch-1.
[INFO] Apache HBase .. SUCCESS [2.722s]
[INFO] Apache HBase - Checkstyle . SUCCESS [0.509s]
[INFO] Apache HBase - Resource Bundle  SUCCESS [0.162s]
[INFO] Apache HBase - Annotations  SUCCESS [0.911s]
[INFO] Apache HBase - Protocol ... SUCCESS [11.037s]
[INFO] Apache HBase - Common . SUCCESS [1:28.402s]
[INFO] Apache HBase - Procedure .. SUCCESS [1:52.825s]
[INFO] Apache HBase - Client . SUCCESS [1:20.748s]
[INFO] Apache HBase - Hadoop Compatibility ... SUCCESS [7.393s]
[INFO] Apache HBase - Hadoop Two Compatibility ... SUCCESS [7.066s]
[INFO] Apache HBase - Prefix Tree  SUCCESS [9.676s]
[INFO] Apache HBase - Server . SUCCESS 
[1:36:58.528s]
[INFO] Apache HBase - Testing Util ... SUCCESS [1.222s]
[INFO] Apache HBase - Thrift . SUCCESS [3:20.890s]
[INFO] Apache HBase - Rest ... SUCCESS [9:11.530s]
[INFO] Apache HBase - Shell .. SUCCESS [5:26.924s]
[INFO] Apache HBase - Integration Tests .. SUCCESS [1.363s]
[INFO] Apache HBase - Examples ... SUCCESS [8.626s]
[INFO] Apache HBase - External Block Cache ... SUCCESS [0.606s]
[INFO] Apache HBase - Assembly ... SUCCESS [1.394s]
[INFO] Apache HBase - Shaded . SUCCESS [0.083s]
[INFO] Apache HBase - Shaded - Client  SUCCESS [0.359s]
[INFO] Apache HBase - Shaded - Server  SUCCESS [0.483s]
Printing hanging tests
Printing Failing tests
{code}

> Reverse Scan threw StackOverflow caused by readPt checking
> --
>
> Key: HBASE-14497
> URL: https://issues.apache.org/jira/browse/HBASE-14497
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 0.98.14, 1.3.0
>Reporter: Yerui Sun
>Assignee: Yerui Sun
> Fix For: 2.0.0
>
> Attachments: 14497-branch-1-v6.patch, 14497-master-v6.patch, 
> HBASE-14497-0.98.patch, HBASE-14497-branch-1-v2.patch, 
> HBASE-14497-branch-1-v3.patch, HBASE-14497-branch-1-v6.patch, 
> HBASE-14497-branch-1.patch, HBASE-14497-master-v2.patch, 
> HBASE-14497-master-v3.patch, HBASE-14497-master-v3.patch, 
> HBASE-14497-master-v4.patch, HBASE-14497-master-v5.patch, 
> HBASE-14497-master.patch
>
>
> I hit a stack overflow error in StoreFileScanner.seekToPreviousRow while
> using a reversed scan. I searched and found HBASE-14155, but it seems to
> have a different cause.
> seekToPreviousRow fetches the closest preceding row and compares its mvcc
> to the readPt, which is acquired when the scanner is created. If the row's
> mvcc is bigger than readPt, seekToPreviousRow calls itself recursively to
> find the next closest preceding row.
> Consider a scanner created for a reversed scan, with some data in smaller
> rows written and flushed before scanner next is called. When
> seekToPreviousRow is invoked, it calls itself recursively until all rows
> written after the scanner was created have been iterated. The depth of the
> recursive call stack depends on the count of those rows, so a stack
> overflow error is thrown if the count of rows is large, like 1.




[jira] [Commented] (HBASE-14511) StoreFile.Writer Meta Plugin

2015-10-06 Thread Devaraj Das (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945771#comment-14945771
 ] 

Devaraj Das commented on HBASE-14511:
-

[~vladrodionov] when you say it doesn't work with MOB, could you say what's not
working? Is there some test to repro the failure?

> StoreFile.Writer Meta Plugin
> 
>
> Key: HBASE-14511
> URL: https://issues.apache.org/jira/browse/HBASE-14511
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Attachments: HBASE-14511.v1.patch, HBASE-14511.v2.patch
>
>
> During my work on new compaction policies (HBASE-14468, HBASE-14477) I had
> to modify the existing code of StoreFile.Writer to add additional meta-info
> required by these new policies. I think that it should be done by means of a
> new Plugin framework, because this seems to be a general capability/feature.
> As a future enhancement this can become a part of a more general
> StoreFileWriter/Reader plugin architecture. But I need only the Meta section
> of a store file.
> This could be used, for example, to collect rowkey distribution information
> during hfile creation. This info can be used later to find the optimal region
> split key or to create an optimal set of sub-regions for M/R jobs or other
> jobs which can operate on a sub-region level.




[jira] [Updated] (HBASE-14497) Reverse Scan threw StackOverflow caused by readPt checking

2015-10-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14497:
---
Fix Version/s: 1.3.0

> Reverse Scan threw StackOverflow caused by readPt checking
> --
>
> Key: HBASE-14497
> URL: https://issues.apache.org/jira/browse/HBASE-14497
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 0.98.14, 1.3.0
>Reporter: Yerui Sun
>Assignee: Yerui Sun
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 14497-branch-1-v6.patch, 14497-master-v6.patch, 
> HBASE-14497-0.98.patch, HBASE-14497-branch-1-v2.patch, 
> HBASE-14497-branch-1-v3.patch, HBASE-14497-branch-1-v6.patch, 
> HBASE-14497-branch-1.patch, HBASE-14497-master-v2.patch, 
> HBASE-14497-master-v3.patch, HBASE-14497-master-v3.patch, 
> HBASE-14497-master-v4.patch, HBASE-14497-master-v5.patch, 
> HBASE-14497-master.patch
>
>
> I hit a stack overflow error in StoreFileScanner.seekToPreviousRow while
> using a reversed scan. I searched and found HBASE-14155, but it seems to
> have a different cause.
> seekToPreviousRow fetches the closest preceding row and compares its mvcc
> to the readPt, which is acquired when the scanner is created. If the row's
> mvcc is bigger than readPt, seekToPreviousRow calls itself recursively to
> find the next closest preceding row.
> Consider a scanner created for a reversed scan, with some data in smaller
> rows written and flushed before scanner next is called. When
> seekToPreviousRow is invoked, it calls itself recursively until all rows
> written after the scanner was created have been iterated. The depth of the
> recursive call stack depends on the count of those rows, so a stack
> overflow error is thrown if the count of rows is large, like 1.




[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread Mikhail Antonov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945789#comment-14945789
 ] 

Mikhail Antonov commented on HBASE-14559:
-

[~stack] when you traced the tests using an excessive number of threads, did they
time out because they run slower with a lower number of threads, or did they
deadlock?

I think there're still bugs lurking around in the implementation :( If we have
3 handler threads rather than 40, I might expect things to run
noticeably slower, but not deadlock?

> branch-1 test tweeks; disable assert explicit region lands post-restart and 
> up a few handlers
> -
>
> Key: HBASE-14559
> URL: https://issues.apache.org/jira/browse/HBASE-14559
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14559.branch-1.txt, 14559.master.txt
>
>
> Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). 
> Small tweaks get tests to pass. Small one liners that up priority handler 
> count and disable assert that seems wrong -- that we'll always get an explicit
> region to land on a newly started server.




[jira] [Resolved] (HBASE-14436) HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create new Configuration

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack resolved HBASE-14436.
---
   Resolution: Fixed
 Assignee: stack
 Hadoop Flags: Reviewed
Fix Version/s: 0.98.16
   1.1.3
   1.0.3
   1.3.0
   1.2.0
   2.0.0

Pushed to 0.98+

> HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create 
> new Configuration
> ---
>
> Key: HBASE-14436
> URL: https://issues.apache.org/jira/browse/HBASE-14436
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 1.2.1
>Reporter: Jianwei Cui
>Assignee: stack
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: HBASE-14436-trunk-v1.patch, HBASE-14436-trunk-v2.patch
>
>
> HTableDescriptor#addCoprocessor sets the coprocessor value in the following
> format:
> {code}
>  public HTableDescriptor addCoprocessor(String className, Path jarFilePath,
>  int priority, final Map kvs)
>   throws IOException {
>   ...
>   String value = ((jarFilePath == null)? "" : jarFilePath.toString()) +
> "|" + className + "|" + Integer.toString(priority) + "|" +
> kvString.toString();
>   ...
> }
> {code}
> If 'jarFilePath' is null, the 'value' will always have the format
> '|className|priority|' even if 'kvs' is null, which means no extra arguments
> for the coprocessor. Then, on the server side,
> RegionCoprocessorHost#getTableCoprocessorAttrsFromSchema will load the table
> coprocessors as:
> {code}
>   static List 
> getTableCoprocessorAttrsFromSchema(Configuration conf,
>   HTableDescriptor htd) {
> ...
> try {
>   cfgSpec = matcher.group(4); // => cfgSpec will be '|' for the format '|className|priority|'
> } catch (IndexOutOfBoundsException ex) {
>   // ignore
> }
> Configuration ourConf;
> if (cfgSpec != null) {  // => cfgSpec will be '|' for the format '|className|priority|'
>   ourConf = new Configuration(false);
>   HBaseConfiguration.merge(ourConf, conf);
> }
> ...
> }
> {code}
> The 'cfgSpec' will be '|' for a coprocessor formatted as
> '|className|priority|', so a new Configuration is always created.
> In our production, there are a lot of tables having table-level coprocessors,
> so the region server will create a new Configuration for each region of
> such tables, which consumes a certain amount of memory when we have many
> such regions.
> To fix the problem, we can make HTableDescriptor not append the '|' if there
> are no extra arguments for the coprocessor, or check the 'cfgSpec' more
> strictly on the server side, which would also avoid creating new
> Configurations for existing regions after they are reopened. Discussions and
> suggestions are welcome.
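
The two fixes proposed above can be sketched in plain Java. This is a minimal illustration, not the committed patch: the class CoprocessorSpecSketch and its methods buildSpec and hasArguments are hypothetical names, and the "key=value,key=value" layout of the argument section is an assumption made for the example.

```java
import java.util.Map;

/** Sketch of both proposed fixes for the trailing '|' problem. Not HBase code. */
final class CoprocessorSpecSketch {

    /** Client side: append the '|' and the argument section only when there are arguments. */
    static String buildSpec(String jarPath, String className, int priority,
                            Map<String, String> kvs) {
        StringBuilder sb = new StringBuilder();
        sb.append(jarPath == null ? "" : jarPath)
          .append('|').append(className)
          .append('|').append(priority);
        if (kvs != null && !kvs.isEmpty()) {
            sb.append('|');
            boolean first = true;
            for (Map.Entry<String, String> e : kvs.entrySet()) {
                if (!first) {
                    sb.append(',');
                }
                sb.append(e.getKey()).append('=').append(e.getValue());
                first = false;
            }
        }
        return sb.toString();
    }

    /**
     * Server side: treat a missing, empty, or bare "|" section as "no arguments",
     * so specs already stored with a trailing '|' do not trigger a new Configuration.
     */
    static boolean hasArguments(String cfgSpec) {
        if (cfgSpec == null) {
            return false;
        }
        String trimmed = cfgSpec.trim();
        return !trimmed.isEmpty() && !trimmed.equals("|");
    }

    public static void main(String[] args) {
        System.out.println(buildSpec(null, "com.example.MyObserver", 1001, null));
        System.out.println(hasArguments("|"));
    }
}
```

The stricter server-side check matters for tables whose descriptors were written before any client-side change, since their stored values keep the trailing '|'.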




[jira] [Created] (HBASE-14565) Make ZK connection timeout configurable in MiniZooKeeperCluster

2015-10-06 Thread Ted Yu (JIRA)

Ted Yu created HBASE-14565:
--

 Summary: Make ZK connection timeout configurable in 
MiniZooKeeperCluster
 Key: HBASE-14565
 URL: https://issues.apache.org/jira/browse/HBASE-14565
 Project: HBase
  Issue Type: Improvement
Reporter: Ted Yu
Assignee: Ted Yu


This request was made by [~swagle] who works on Ambari Metrics System.

Currently a hardcoded timeout of 30s is used by MiniZooKeeperCluster

This affects operation of Ambari Metrics System in standalone mode.

This JIRA is to make the connection timeout configurable.
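
As a rough sketch of what "configurable" could look like, the snippet below reads a timeout from a configuration object and falls back to the current 30s value. Everything here is an assumption for illustration: java.util.Properties stands in for the real configuration class, and the key name example.minizk.connection.timeout.ms is hypothetical, not a property defined by HBase or by the attached patch.

```java
import java.util.Properties;

/** Sketch: read a connection timeout from configuration, defaulting to 30s. */
final class ZkTimeoutSketch {

    /** Matches the currently hardcoded 30s timeout. */
    static final int DEFAULT_CONNECTION_TIMEOUT_MS = 30_000;

    /** Hypothetical key name, chosen only for this example. */
    static final String TIMEOUT_KEY = "example.minizk.connection.timeout.ms";

    static int connectionTimeoutMs(Properties conf) {
        String raw = conf.getProperty(TIMEOUT_KEY);
        if (raw == null || raw.trim().isEmpty()) {
            return DEFAULT_CONNECTION_TIMEOUT_MS; // behave as before when unset
        }
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(connectionTimeoutMs(conf)); // unset: falls back to the default
        conf.setProperty(TIMEOUT_KEY, "5000");
        System.out.println(connectionTimeoutMs(conf)); // overridden by configuration
    }
}
```

Keeping the default equal to the old hardcoded value means existing users see no behavior change unless they set the key.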




[jira] [Updated] (HBASE-14565) Make ZK connection timeout configurable in MiniZooKeeperCluster

2015-10-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14565:
---
Status: Patch Available  (was: Open)

> Make ZK connection timeout configurable in MiniZooKeeperCluster
> ---
>
> Key: HBASE-14565
> URL: https://issues.apache.org/jira/browse/HBASE-14565
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 14565-v1.txt
>
>
> This request was made by [~swagle] who works on Ambari Metrics System.
> Currently a hardcoded timeout of 30s is used by MiniZooKeeperCluster
> This affects operation of Ambari Metrics System in standalone mode.
> This JIRA is to make the connection timeout configurable.




[jira] [Updated] (HBASE-14565) Make ZK connection timeout configurable in MiniZooKeeperCluster

2015-10-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14565:
---
Attachment: 14565-v1.txt

> Make ZK connection timeout configurable in MiniZooKeeperCluster
> ---
>
> Key: HBASE-14565
> URL: https://issues.apache.org/jira/browse/HBASE-14565
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 14565-v1.txt
>
>
> This request was made by [~swagle] who works on Ambari Metrics System.
> Currently a hardcoded timeout of 30s is used by MiniZooKeeperCluster
> This affects operation of Ambari Metrics System in standalone mode.
> This JIRA is to make the connection timeout configurable.




[jira] [Commented] (HBASE-14563) Disable zombie TestHFileOutputFormat2

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945844#comment-14945844
 ] 

Hudson commented on HBASE-14563:


SUCCESS: Integrated in HBase-1.3-IT #214 (See 
[https://builds.apache.org/job/HBase-1.3-IT/214/])
HBASE-14563 Disable zombie TestHFileOutputFormat2 (stack: rev 
aeb3a624590be8bd276e58bba9d4debfb3e7759f)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java


> Disable zombie TestHFileOutputFormat2
> -
>
> Key: HBASE-14563
> URL: https://issues.apache.org/jira/browse/HBASE-14563
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14563.txt
>
>
> Disabling until someone has a chance to look at it.
> I watched it in jvisualvm a while. It's starting and stopping clusters
> multiple times and then running mr jobs. Needs a rewrite at least and some 
> shrinking of scope on what is tested.




[jira] [Commented] (HBASE-14511) StoreFile.Writer Meta Plugin

2015-10-06 Thread Vladimir Rodionov (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14511?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945848#comment-14945848
 ] 

Vladimir Rodionov commented on HBASE-14511:
---

Yes, two or three MOB tests fail constantly if there is additional data in the
meta section of a store file. I think it is a MOB issue.

> StoreFile.Writer Meta Plugin
> 
>
> Key: HBASE-14511
> URL: https://issues.apache.org/jira/browse/HBASE-14511
> Project: HBase
>  Issue Type: New Feature
>Reporter: Vladimir Rodionov
>Assignee: Vladimir Rodionov
> Attachments: HBASE-14511.v1.patch, HBASE-14511.v2.patch
>
>
> During my work on new compaction policies (HBASE-14468, HBASE-14477) I had
> to modify the existing code of StoreFile.Writer to add additional meta-info
> required by these new policies. I think that it should be done by means of a
> new Plugin framework, because this seems to be a general capability/feature.
> As a future enhancement this can become a part of a more general
> StoreFileWriter/Reader plugin architecture. But I need only the Meta section
> of a store file.
> This could be used, for example, to collect rowkey distribution information
> during hfile creation. This info can be used later to find the optimal region
> split key or to create an optimal set of sub-regions for M/R jobs or other
> jobs which can operate on a sub-region level.




[jira] [Commented] (HBASE-14561) Disable zombie TestReplicationShell

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945862#comment-14945862
 ] 

Hudson commented on HBASE-14561:


FAILURE: Integrated in HBase-TRUNK #6878 (See 
[https://builds.apache.org/job/HBase-TRUNK/6878/])
HBASE-14561 Disable zombie TestReplicationShell (stack: rev 
fd6acbbf51998a964b6dc0c7d3ee471399a03baa)
* 
hbase-shell/src/test/java/org/apache/hadoop/hbase/client/TestReplicationShell.java


> Disable zombie TestReplicationShell
> ---
>
> Key: HBASE-14561
> URL: https://issues.apache.org/jira/browse/HBASE-14561
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0
>
> Attachments: 14561.txt
>
>
> It hung three times in last 40 test runs. Will file issue to reenable it when 
> someone has chance to look at why it is hanging.




[jira] [Commented] (HBASE-14559) branch-1 test tweeks; disable assert explicit region lands post-restart and up a few handlers

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945864#comment-14945864
 ] 

stack commented on HBASE-14559:
---

bq. stack when you traced the tests using an excessive number of threads, did they
time out because they run slower with a lower number of threads, or did they
deadlock?

At the extreme, the test would not run (OOME could not create native thread). I 
cut the thread count down and then they would not complete (smile).

bq. I think there're still bugs lurking around in the implementation  If we 
have number of thread handlers 3 rather than 40, I might expect things running 
noticeably slower, but not the deadlock?

The deadlock was handlers all occupied.. not enough for the test to complete.

> branch-1 test tweeks; disable assert explicit region lands post-restart and 
> up a few handlers
> -
>
> Key: HBASE-14559
> URL: https://issues.apache.org/jira/browse/HBASE-14559
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14559.branch-1.txt, 14559.master.txt
>
>
> Running branch-1 on internal rig (trying to stabilize branch-1/branch-1.2). 
> Small tweaks get tests to pass. Small one liners that up priority handler 
> count and disable assert that seems wrong -- that we'll always get an explicit
> region to land on a newly started server.




[jira] [Updated] (HBASE-13819) Make RPC layer CellBlock buffer a DirectByteBuffer

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-13819?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-13819:
--
Attachment: HBASE-13819_branch-1.patch

Retry

Commit I'd say [~anoop.hbase]  Needs a release note sir.

> Make RPC layer CellBlock buffer a DirectByteBuffer
> --
>
> Key: HBASE-13819
> URL: https://issues.apache.org/jira/browse/HBASE-13819
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: Anoop Sam John
>Assignee: Anoop Sam John
> Fix For: 2.0.0, 1.3.0
>
> Attachments: HBASE-13819.patch, HBASE-13819_branch-1.patch, 
> HBASE-13819_branch-1.patch, HBASE-13819_branch-1.patch
>
>
> In the RPC layer, when we make a cellBlock to put as the RPC payload, we make
> an on-heap byte buffer (via BoundedByteBufferPool). The pool will keep up to a
> certain number of buffers. This jira aims at testing the possibility of making
> these buffers off-heap ones (DBB). The advantages:
> 1. Unsafe-based writes to off heap are faster than those to on heap. Now we
> are not using unsafe-based writes at all. Even if we add them, DBB will be
> better.
> 2. When Cells are backed by off heap (HBASE-11425), off-heap to off-heap
> writes will be better.
> 3. Checking the code in the SocketChannel impl, if we pass a HeapByteBuffer
> to the socket channel, it will create a temp DBB and copy the data there, and
> only DBBs will be moved to Sockets. If we make a DBB first-hand, we can
> avoid this extra level of copying.
> Will do different perf testing with the change and report back.
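
For readers less familiar with the two buffer kinds discussed above, here is a small standalone sketch using only java.nio. It is not the BoundedByteBufferPool code; the class DirectBufferSketch and its helper names are hypothetical. It shows how an on-heap buffer and a direct (off-heap) buffer are allocated and filled through the same ByteBuffer API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Sketch contrasting heap and direct ByteBuffers. Not HBase code. */
final class DirectBufferSketch {

    /** Allocates either an on-heap buffer or a direct (off-heap) buffer. */
    static ByteBuffer allocate(int capacity, boolean offHeap) {
        return offHeap ? ByteBuffer.allocateDirect(capacity) : ByteBuffer.allocate(capacity);
    }

    /** Writes the payload and flips the buffer so it is ready to be read or sent. */
    static ByteBuffer fill(ByteBuffer buf, String payload) {
        buf.clear();
        buf.put(payload.getBytes(StandardCharsets.UTF_8));
        buf.flip();
        return buf;
    }

    public static void main(String[] args) {
        ByteBuffer heap = fill(allocate(64, false), "cells");
        ByteBuffer direct = fill(allocate(64, true), "cells");
        System.out.println(heap.isDirect() + " " + heap.remaining());
        System.out.println(direct.isDirect() + " " + direct.remaining());
    }
}
```

The calling code is identical for both kinds, which is what makes swapping the pooled buffers for direct ones a contained change to test.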




[jira] [Commented] (HBASE-12615) Document GC conserving guidelines for contributors

2015-10-06 Thread Jonathan Hsieh (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12615?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945889#comment-14945889
 ] 

Jonathan Hsieh commented on HBASE-12615:


Wow that is a lot of trailing space removal.

caught this in one of them:
nit : This will output smt like: -> This will output something like:

lgtm +1.

> Document GC conserving guidelines for contributors
> --
>
> Key: HBASE-12615
> URL: https://issues.apache.org/jira/browse/HBASE-12615
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Reporter: Andrew Purtell
>Assignee: Misty Stanley-Jones
> Attachments: HBASE-12615.patch
>
>
> LinkedIn put up a blog post with a nice concise list of GC conserving 
> techniques we should document for contributors. Additionally, when we're at a 
> point our build supports custom error-prone plugins, we can develop warnings 
> for some of them. 
> Source: 
> http://engineering.linkedin.com/performance/linkedin-feed-faster-less-jvm-garbage
> - Be careful with Iterators
> - Estimate the size of a collection when initializing
> - Defer expression evaluation
> - Compile the regex patterns in advance
> - Cache it if you can
> - String Interns are useful but dangerous
> All good advice and practice that I know we aim for. 
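
Two of the listed techniques (compile regex patterns in advance, estimate collection size when initializing) can be shown in a few lines. This is an illustrative sketch only; the class GcFriendlySketch and its method splitFields are hypothetical and are not taken from the blog post or the attached patch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Sketch of two GC-conserving habits: precompiled regex and presized collections. */
final class GcFriendlySketch {

    /** Compiled once, instead of on every call as String.split(regex) may do. */
    private static final Pattern COMMA = Pattern.compile(",");

    static List<String> splitFields(String line) {
        String[] parts = COMMA.split(line);
        // Size the list up front so it never has to grow and copy its backing array.
        List<String> fields = new ArrayList<>(parts.length);
        for (String part : parts) {
            fields.add(part.trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        System.out.println(splitFields("a, b,c"));
    }
}
```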




[jira] [Commented] (HBASE-12983) HBase book mentions hadoop.ssl.enabled when it should be hbase.ssl.enabled

2015-10-06 Thread Jonathan Hsieh (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945891#comment-14945891
 ] 

Jonathan Hsieh commented on HBASE-12983:


+1 lgtm.

> HBase book mentions hadoop.ssl.enabled when it should be hbase.ssl.enabled
> --
>
> Key: HBASE-12983
> URL: https://issues.apache.org/jira/browse/HBASE-12983
> Project: HBase
>  Issue Type: Bug
>  Components: documentation
>Reporter: Esteban Gutierrez
>Assignee: Misty Stanley-Jones
> Attachments: HBASE-12983.patch
>
>
> In the HBase book we say the following:
> {quote}
> A default HBase install uses insecure HTTP connections for web UIs for the 
> master and region servers. To enable secure HTTP (HTTPS) connections instead, 
> set *hadoop.ssl.enabled* to true in hbase-site.xml. This does not change the 
> port used by the Web UI. To change the port for the web UI for a given HBase 
> component, configure that port’s setting in hbase-site.xml. These settings 
> are:
> {quote}
> The property should be *hbase.ssl.enabled* instead. 




[jira] [Commented] (HBASE-13425) Documentation nit in REST Gateway impersonation section

2015-10-06 Thread Jonathan Hsieh (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13425?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945902#comment-14945902
 ] 

Jonathan Hsieh commented on HBASE-13425:


+1 lgtm.

> Documentation nit in REST Gateway impersonation section
> ---
>
> Key: HBASE-13425
> URL: https://issues.apache.org/jira/browse/HBASE-13425
> Project: HBase
>  Issue Type: Improvement
>  Components: documentation
>Affects Versions: 2.0.0
>Reporter: Jeremie Gomez
>Assignee: Misty Stanley-Jones
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-13425.patch
>
>
> In section "55.8. REST Gateway Impersonation Configuration", there is another 
> property that needs to be set (and thus documented).
> After this sentence ("To enable REST gateway impersonation, add the following 
> to the hbase-site.xml file for every REST gateway."), we should add :
> <property>
>   <name>hbase.rest.support.proxyuser</name>
>   <value>true</value>
> </property>
> If not set, a curl call on the REST gateway gives the error "support 
> for proxyuser is not configured".




[jira] [Commented] (HBASE-13478) Document the change of default master ports being used .

2015-10-06 Thread Jonathan Hsieh (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13478?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945903#comment-14945903
 ] 

Jonathan Hsieh commented on HBASE-13478:


+1 lgtm.

> Document the change of default master ports being used .
> 
>
> Key: HBASE-13478
> URL: https://issues.apache.org/jira/browse/HBASE-13478
> Project: HBase
>  Issue Type: Sub-task
>  Components: documentation
>Reporter: Srikanth Srungarapu
>Assignee: Misty Stanley-Jones
>Priority: Minor
> Attachments: HBASE-13478.patch
>
>
> In 1.0.x, the master by default binds to the region server ports. But in the 1.1 and 
> 2.0 branches, we have undone this change and brought back the old 
> master ports to make the migration from 0.98 -> 1.1 hassle-free. Please see 
> the parent JIRA for more background.




[jira] [Commented] (HBASE-14424) Document that DisabledRegionSplitPolicy blocks manual splits

2015-10-06 Thread Jonathan Hsieh (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945907#comment-14945907
 ] 

Jonathan Hsieh commented on HBASE-14424:


Change 
'DelimitedKeyPrefixRegionSplitPolicy, KeyPrefixRegionSplitPolicy etc. 
DisabledRegionSplitPolicy'
to
'DelimitedKeyPrefixRegionSplitPolicy, KeyPrefixRegionSplitPolicy, 
DisabledRegionSplitPolicy, etc.'
then +1 lgtm.

> Document that DisabledRegionSplitPolicy blocks manual splits
> 
>
> Key: HBASE-14424
> URL: https://issues.apache.org/jira/browse/HBASE-14424
> Project: HBase
>  Issue Type: Task
>  Components: documentation
>Reporter: Misty Stanley-Jones
>Assignee: Misty Stanley-Jones
>Priority: Minor
> Attachments: HBASE-14424.patch
>
>





[jira] [Commented] (HBASE-13082) Coarsen StoreScanner locks to RegionScanner

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-13082?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945926#comment-14945926
 ] 

stack commented on HBASE-13082:
---

bq. But in case of bulk loaded files, currently in between a scan if a new file 
is bulk loaded it gets included, so after this it will not be. is that 
behavioral change fine?

Sorry, say more [~ram_krish]. So, bulk load won't show mid-scan... you have to 
get to the end? That would be fine.

On the patch, can we get more of Lars's comments in on what is going on? 
Could we get rid of some of these getReaderLocks too (in hstorefile, in 
hstore, etc.)? It would be good to not let this stuff out if we can.



> Coarsen StoreScanner locks to RegionScanner
> ---
>
> Key: HBASE-13082
> URL: https://issues.apache.org/jira/browse/HBASE-13082
> Project: HBase
>  Issue Type: Bug
>Reporter: Lars Hofhansl
>Assignee: ramkrishna.s.vasudevan
> Attachments: 13082-test.txt, 13082-v2.txt, 13082-v3.txt, 
> 13082-v4.txt, 13082.txt, 13082.txt, gc.png, gc.png, gc.png, hits.png, 
> next.png, next.png
>
>
> Continuing where HBASE-10015 left off.
> We can avoid locking (and memory fencing) inside StoreScanner by deferring to 
> the lock already held by the RegionScanner.
> In tests this shows quite a scan improvement and reduced CPU (the fences make 
> the cores wait for memory fetches).
> There are some drawbacks too:
> * All calls to RegionScanner need to remain synchronized
> * Implementors of coprocessors need to be diligent in following the locking 
> contract. For example, Phoenix does not lock RegionScanner.nextRaw() as 
> required in the documentation (not picking on Phoenix, this one is my fault 
> as I told them it's OK)
> * Possible starving of flushes and compactions under heavy read load. 
> RegionScanner operations would keep getting the locks and the 
> flushes/compactions would not be able to finalize the set of files.
> I'll have a patch soon.
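
The locking trade-off described in the issue can be illustrated with a minimal sketch. The class names here (UnsyncedStoreScanner, CoarseLockedRegionScanner) are hypothetical and are not HBase's actual classes; the sketch only assumes that the inner scanner does no locking of its own and that every entry point of the outer scanner takes one lock.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical inner scanner: no locking of its own, it relies on the caller's lock. */
final class UnsyncedStoreScanner {
    private final List<String> cells;
    private int pos = 0;

    UnsyncedStoreScanner(List<String> cells) {
        this.cells = cells;
    }

    /** Not thread-safe by design: the caller must hold the outer lock. */
    String next() {
        return pos < cells.size() ? cells.get(pos++) : null;
    }
}

/** Hypothetical outer scanner: the single place where the lock is taken. */
final class CoarseLockedRegionScanner {
    private final ReentrantLock lock = new ReentrantLock();
    private final UnsyncedStoreScanner store;

    CoarseLockedRegionScanner(UnsyncedStoreScanner store) {
        this.store = store;
    }

    /** Every public entry point has to take the lock (the first drawback listed above). */
    List<String> nextBatch(int limit) {
        lock.lock();
        try {
            List<String> out = new ArrayList<>();
            String cell;
            // Check the limit first so that no cell is consumed and then dropped.
            while (out.size() < limit && (cell = store.next()) != null) {
                out.add(cell);
            }
            return out;
        } finally {
            lock.unlock();
        }
    }
}
```

A caller that bypasses nextBatch and touches the inner scanner directly would break the contract, which is the coprocessor concern raised in the second bullet.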




[jira] [Updated] (HBASE-14565) Make ZK connection timeout configurable in MiniZooKeeperCluster

2015-10-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14565:
---
Attachment: 14565-v1.txt

> Make ZK connection timeout configurable in MiniZooKeeperCluster
> ---
>
> Key: HBASE-14565
> URL: https://issues.apache.org/jira/browse/HBASE-14565
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 14565-v1.txt
>
>
> This request was made by [~swagle] who works on Ambari Metrics System.
> Currently a hardcoded timeout of 30s is used by MiniZooKeeperCluster.
> This affects operation of Ambari Metrics System in standalone mode.
> This JIRA is to make the connection timeout configurable.
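
As a rough illustration of the request, the sketch below replaces a hardcoded 30s value with a lookup that falls back to 30s when nothing is configured. The class name and the property key are invented for this example; they are not the real MiniZooKeeperCluster API or an actual HBase configuration key.

```java
import java.util.Properties;

/** Hypothetical stand-in; not the real MiniZooKeeperCluster API. */
final class TimeoutConfigSketch {
    /** The previously hardcoded value from the issue description: 30s. */
    static final int DEFAULT_CONNECTION_TIMEOUT_MS = 30_000;

    /** Hypothetical property key, invented for this illustration. */
    static final String TIMEOUT_KEY = "example.minizk.connection.timeout.ms";

    /** Returns the configured timeout, or the old default when the key is absent. */
    static int connectionTimeoutMs(Properties conf) {
        String raw = conf.getProperty(TIMEOUT_KEY);
        if (raw == null) {
            return DEFAULT_CONNECTION_TIMEOUT_MS;
        }
        int value = Integer.parseInt(raw.trim());
        if (value <= 0) {
            throw new IllegalArgumentException("timeout must be positive: " + value);
        }
        return value;
    }
}
```

Keeping the old value as the default means existing users see no change unless they set the key.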




[jira] [Updated] (HBASE-14565) Make ZK connection timeout configurable in MiniZooKeeperCluster

2015-10-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14565?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14565:
---
Attachment: (was: 14565-v1.txt)

> Make ZK connection timeout configurable in MiniZooKeeperCluster
> ---
>
> Key: HBASE-14565
> URL: https://issues.apache.org/jira/browse/HBASE-14565
> Project: HBase
>  Issue Type: Improvement
>Reporter: Ted Yu
>Assignee: Ted Yu
> Attachments: 14565-v1.txt
>
>
> This request was made by [~swagle] who works on Ambari Metrics System.
> Currently a hardcoded timeout of 30s is used by MiniZooKeeperCluster.
> This affects operation of Ambari Metrics System in standalone mode.
> This JIRA is to make the connection timeout configurable.




[jira] [Commented] (HBASE-14420) Zombie Stomping Session

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945948#comment-14945948
 ] 

stack commented on HBASE-14420:
---

Looking at trunk builds, I see these failures in the last twenty builds:

{code}
   2 Hanging test : org.apache.hadoop.hbase.TestMovedRegionsCleaner
   2 Hanging test : org.apache.hadoop.hbase.TestMultiVersions
   2 Hanging test : org.apache.hadoop.hbase.TestPartialResultsFromClientSide
   2 Hanging test : org.apache.hadoop.hbase.backup.TestHFileArchiving
  10 Hanging test : org.apache.hadoop.hbase.client.TestFromClientSide
   8 Hanging test : org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor
   2 Hanging test : org.apache.hadoop.hbase.client.TestReplicasClient
   2 Hanging test : org.apache.hadoop.hbase.http.TestHttpServer
   2 Hanging test : org.apache.hadoop.hbase.mapred.TestMultiTableSnapshotInputFormat
   4 Hanging test : org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat
   2 Hanging test : org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat
   4 Hanging test : org.apache.hadoop.hbase.mapreduce.TestHFileOutputFormat2
   2 Hanging test : org.apache.hadoop.hbase.mapreduce.TestImportExport
   2 Hanging test : org.apache.hadoop.hbase.mapreduce.TestSecureLoadIncrementalHFilesSplitRecovery
   2 Hanging test : org.apache.hadoop.hbase.master.TestDistributedLogSplitting
   4 Hanging test : org.apache.hadoop.hbase.master.TestSplitLogManager
   2 Hanging test : org.apache.hadoop.hbase.master.TestTableLockManager
   2 Hanging test : org.apache.hadoop.hbase.master.TestWarmupRegion
   2 Hanging test : org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer
   2 Hanging test : org.apache.hadoop.hbase.master.balancer.TestStochasticLoadBalancer2
   4 Hanging test : org.apache.hadoop.hbase.master.procedure.TestMasterFailoverWithProcedures
   1 Hanging test : org.apache.hadoop.hbase.regionserver.TestDefaultCompactSelection
   2 Hanging test : org.apache.hadoop.hbase.regionserver.TestMobStoreScanner
   2 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint
   2 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestReplicationWALReaderManager
   2 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes
   2 Hanging test : org.apache.hadoop.hbase.util.TestHBaseFsck
   2 Hanging test : org.apache.hadoop.hbase.util.TestMiniClusterLoadEncoded
   2 Hanging test : org.apache.hadoop.hbase.util.TestMiniClusterLoadParallel
   2 Hanging test : org.apache.hadoop.hbase.util.TestRegionSplitter
   2 Hanging test : org.apache.hadoop.hbase.wal.TestWALFiltering
   2 Hanging test : org.apache.hadoop.hbase.wal.TestWALSplit
   2 Hanging test : org.apache.hadoop.hbase.zookeeper.TestHQuorumPeer
{code}

Here are the branch-1.1 builds:

{code}
   1 Hanging test : org.apache.hadoop.hbase.TestPartialResultsFromClientSide
   1 Hanging test : org.apache.hadoop.hbase.client.TestAdmin1
   1 Hanging test : org.apache.hadoop.hbase.client.TestCloneSnapshotFromClient
   1 Hanging test : org.apache.hadoop.hbase.client.TestFromClientSideWithCoprocessor
   1 Hanging test : org.apache.hadoop.hbase.client.TestHCM
   1 Hanging test : org.apache.hadoop.hbase.client.TestRestoreSnapshotFromClient
   1 Hanging test : org.apache.hadoop.hbase.mapred.TestTableSnapshotInputFormat
   1 Hanging test : org.apache.hadoop.hbase.quotas.TestQuotaAdmin
   1 Hanging test : org.apache.hadoop.hbase.quotas.TestQuotaThrottle
   1 Hanging test : org.apache.hadoop.hbase.regionserver.TestJoinedScanners
   1 Hanging test : org.apache.hadoop.hbase.regionserver.TestTags
   1 Hanging test : org.apache.hadoop.hbase.regionserver.TestZKLessSplitOnCluster
   1 Hanging test : org.apache.hadoop.hbase.regionserver.wal.TestLogRolling
   1 Hanging test : org.apache.hadoop.hbase.replication.TestMasterReplication
   1 Hanging test : org.apache.hadoop.hbase.replication.TestReplicationSmallTests
   1 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestRegionReplicaReplicationEndpoint
   1 Hanging test : org.apache.hadoop.hbase.replication.regionserver.TestReplicationWALReaderManager
   1 Hanging test : org.apache.hadoop.hbase.security.access.TestAccessController
   1 Hanging test : org.apache.hadoop.hbase.security.access.TestAccessController2
   1 Hanging test : org.apache.hadoop.hbase.security.access.TestTablePermissions
   1 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsReplication
   1 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDefaultVisLabelService
   2 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDeletes
   1 Hanging test : org.apache.hadoop.hbase.security.visibility.TestVisibilityLabelsWithDistributedLogReplay
{code}



> Zombie Stomping Session
> ---
>
>

[jira] [Updated] (HBASE-12911) Client-side metrics

2015-10-06 Thread Nick Dimiduk (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Nick Dimiduk updated HBASE-12911:
-
Attachment: 12911.yammer.v03.branch-1.patch

Here's a backport of yammer.v03 to branch-1.

[~busbey] I think we're close here. Are you interested in this for 1.2?

> Client-side metrics
> ---
>
> Key: HBASE-12911
> URL: https://issues.apache.org/jira/browse/HBASE-12911
> Project: HBase
>  Issue Type: New Feature
>  Components: Client, Operability, Performance
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 12911-0.98.00.patch, 
> 12911-branch-1.00.patch, 12911.yammer.jpg, 12911.yammer.v00.patch, 
> 12911.yammer.v01.patch, 12911.yammer.v02.patch, 12911.yammer.v02.patch, 
> 12911.yammer.v03.branch-1.patch, 12911.yammer.v03.patch, 
> 12911.yammer.v03.patch, am.jpg, client metrics RS-Master.jpg, client metrics 
> client.jpg, conn_agg.jpg, connection attributes.jpg, ltt.jpg, standalone.jpg
>
>
> There's very little visibility into the hbase client. Folks who care to add 
> some kind of metrics collection end up wrapping Table method invocations with 
> {{System.currentTimeMillis()}}. For a crude example of this, have a look at 
> what I did in {{PerformanceEvaluation}} for exposing request latencies up to 
> {{IntegrationTestRegionReplicaPerf}}. The client is quite complex; there's a 
> lot going on under the hood that is impossible to see right now without a 
> profiler. Since the client is a crucial part of the performance of this 
> distributed system, we should have deeper visibility into its function.
> I'm not sure that wiring into the hadoop metrics system is the right choice 
> because the client is often embedded as a library in a user's application. We 
> should have integration with our metrics tools so that, e.g., a client 
> embedded in a coprocessor can report metrics through the usual RS channels, 
> or a client used in a MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out 
> of the box we'd include a hadoop-metrics implementation and one other, 
> possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?
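
The "interface-based system with pluggable implementations" proposed above could look roughly like the following sketch. All names (ClientMetricsSink, CountingSink, TimedCalls) are hypothetical and invented for illustration; this is not the API of the attached patches, hadoop-metrics, or dropwizard/metrics.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.Supplier;

/** Hypothetical pluggable sink: one implementation per metrics backend. */
interface ClientMetricsSink {
    void recordLatency(String operation, long nanos);
}

/** Trivial in-memory implementation, standing in for a real backend. */
final class CountingSink implements ClientMetricsSink {
    final AtomicLong calls = new AtomicLong();
    final AtomicLong totalNanos = new AtomicLong();

    @Override
    public void recordLatency(String operation, long nanos) {
        calls.incrementAndGet();
        totalNanos.addAndGet(nanos);
    }
}

/** Wraps an invocation with timing so that callers do not hand-roll it. */
final class TimedCalls {
    static <T> T timed(ClientMetricsSink sink, String operation, Supplier<T> call) {
        long start = System.nanoTime();
        try {
            return call.get();
        } finally {
            // Recorded even when the call throws, so failures are still measured.
            sink.recordLatency(operation, System.nanoTime() - start);
        }
    }
}
```

With this shape, an embedding application chooses the sink, so a client running inside a coprocessor or an MR job could report through whatever channel its host already uses.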




[jira] [Commented] (HBASE-12593) Tags and Tag dictionary to work with BB

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12593?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945953#comment-14945953
 ] 

stack commented on HBASE-12593:
---

What needs to be done here?

> Tags and Tag dictionary to work with BB
> ---
>
> Key: HBASE-12593
> URL: https://issues.apache.org/jira/browse/HBASE-12593
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: ramkrishna.s.vasudevan
>Assignee: Anoop Sam John
>
> Adding the subtask so that we don't forget it. Came up while reviewing the 
> items required for this parent task.




[jira] [Commented] (HBASE-14398) Create the fake keys required in the scan path to avoid copy to byte[]

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945962#comment-14945962
 ] 

stack commented on HBASE-14398:
---

bq. This was a long discussion that we had before finalising. 

Yeah. I remember that one. Seems to be about a topic that is a little different 
to the question here.

Why does ByteBufferedCell have to have getFamilyPositionInByteBuffer at all? 
Why can't I just call getFamilyOffset on the ByteBufferedCell implementation 
and it returns me an offset that makes sense on the ByteBuffer returned out of 
getFamilyByteBuffer? (A Cell can't be simultaneously onheap and offheap at the same 
time, right?)

> Create the fake keys required in the scan path to avoid copy to byte[]
> --
>
> Key: HBASE-14398
> URL: https://issues.apache.org/jira/browse/HBASE-14398
> Project: HBase
>  Issue Type: Sub-task
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: HBASE-14398.patch, HBASE-14398_1.patch
>
>
> We have already created some fake keys for the ByteBufferedCells so that we 
> can avoid the copy required to create fake keys. This JIRA aims to cover 
> all such places so that the offheap BBs are not copied to onheap byte[].




[jira] [Commented] (HBASE-14221) Reduce the number of time row comparison is done in a Scan

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945966#comment-14945966
 ] 

stack commented on HBASE-14221:
---

[~lhofhansl] For you...

> Reduce the number of time row comparison is done in a Scan
> --
>
> Key: HBASE-14221
> URL: https://issues.apache.org/jira/browse/HBASE-14221
> Project: HBase
>  Issue Type: Sub-task
>  Components: Scanners
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
> Fix For: 2.0.0
>
> Attachments: HBASE-14221.patch, HBASE-14221_1.patch, 
> HBASE-14221_1.patch, HBASE-14221_6.patch, withmatchingRowspatch.png, 
> withoutmatchingRowspatch.png
>
>
> When we did some profiling with the PE tool, we found this.
> Currently we do row comparisons in 3 places in a simple Scan case.
> 1) ScanQueryMatcher
> {code}
> int ret = this.rowComparator.compareRows(curCell, cell);
> if (!this.isReversed) {
>   if (ret <= -1) {
>     return MatchCode.DONE;
>   } else if (ret >= 1) {
>     // could optimize this, if necessary?
>     // Could also be called SEEK_TO_CURRENT_ROW, but this
>     // should be rare/never happens.
>     return MatchCode.SEEK_NEXT_ROW;
>   }
> } else {
>   if (ret <= -1) {
>     return MatchCode.SEEK_NEXT_ROW;
>   } else if (ret >= 1) {
>     return MatchCode.DONE;
>   }
> }
> {code}
> 2) In StoreScanner next() while starting to scan the row
> {code}
> if (!scannerContext.hasAnyLimit(LimitScope.BETWEEN_CELLS) ||
>     matcher.curCell == null ||
>     isNewRow || !CellUtil.matchingRow(peeked, matcher.curCell)) {
>   this.countPerRow = 0;
>   matcher.setToNewRow(peeked);
> }
> {code}
> Particularly to see if we are in a new row.
> 3) In HRegion
> {code}
> scannerContext.setKeepProgress(true);
> heap.next(results, scannerContext);
> scannerContext.setKeepProgress(tmpKeepProgress);
> nextKv = heap.peek();
> moreCellsInRow = moreCellsInRow(nextKv, currentRowCell);
> {code}
> Here again there are cases where we need to be careful in the MultiCF case. I was 
> trying to solve this for the MultiCF case, but there are a lot of cases to 
> solve. But at least for a single CF case I think these comparisons can be 
> reduced.
> So for a single CF case in the SQM we are able to find if we have crossed a 
> row using the code pasted above in SQM. That comparison is definitely needed.
> Now in case of a single CF the HRegion is going to have only one element in 
> the heap and so the 3rd comparison can surely be avoided if the 
> StoreScanner.next() was over due to MatchCode.DONE caused by SQM.
> Coming to the 2nd compareRows that we do in StoreScanner.next(): even that 
> can be avoided if we know that the previous next() call was over due to a new 
> row. Doing all this, I found that compareRows in the profiler, which was at 
> 19%, got reduced to 13%. Initially we can solve for the single CF case, which 
> can be extended to MultiCF cases.
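
The idea of skipping the second comparison when the previous next() call already ended on a row boundary can be sketched as below. RowBoundaryTracker and its methods are hypothetical names made up for this illustration; rows are modelled as plain strings and the counter exists only to make the saved comparison visible. It is not the code from the attached patches.

```java
/** Hypothetical helper that remembers whether the last next() ended on a new row. */
final class RowBoundaryTracker {
    /** Number of row comparisons actually performed. */
    int comparisons = 0;

    private boolean previousNextEndedOnNewRow = false;
    private String currentRow = null;

    /** Called when the matcher reported that the row was finished (e.g. a DONE code). */
    void markEndedOnNewRow() {
        previousNextEndedOnNewRow = true;
    }

    /** Returns true if the given row starts a new row, comparing only when needed. */
    boolean startsNewRow(String row) {
        if (previousNextEndedOnNewRow) {
            // The boundary is already known from the previous call: no comparison.
            previousNextEndedOnNewRow = false;
            currentRow = row;
            return true;
        }
        comparisons++;
        boolean isNew = currentRow == null || !currentRow.equals(row);
        currentRow = row;
        return isNew;
    }
}
```

In this model, a scan that crosses many rows pays one comparison per cell inside a row, but none for the first cell after a boundary the matcher has already detected.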




[jira] [Commented] (HBASE-14117) Check DBEs where fields are being read from Bytebuffers but unused.

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945968#comment-14945968
 ] 

stack commented on HBASE-14117:
---

What else is to be done here? If it is speculative benefit, move it out as 
subtask of the parent issue?

> Check DBEs where fields are being read from Bytebuffers but unused.
> ---
>
> Key: HBASE-14117
> URL: https://issues.apache.org/jira/browse/HBASE-14117
> Project: HBase
>  Issue Type: Sub-task
>Reporter: ramkrishna.s.vasudevan
>Assignee: Jingcheng Du
>
> {code}
> public Cell getFirstKeyCellInBlock(ByteBuff block) {
>   block.mark();
>   block.position(Bytes.SIZEOF_INT);
>   int keyLength = ByteBuff.readCompressedInt(block);
>   // TODO : See if we can avoid these reads as the read values are not getting used
>   ByteBuff.readCompressedInt(block);
> {code}
> In the DBEs there are many places where we read integers just to skip them. This JIRA is to 
> see if we can avoid this and rather go position based, as per a review 
> comment in HBASE-12213.
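
The difference between reading a value only to discard it and moving the position directly can be shown with java.nio.ByteBuffer. This is a simplified sketch: it assumes fixed-width 4-byte ints, whereas the quoted code uses compressed (variable-length) ints, for which the width would still have to be determined before skipping. The class and method names are hypothetical.

```java
import java.nio.ByteBuffer;

final class SkipByPositionSketch {
    /** Decodes the unused field and throws the value away, as in the quoted snippet. */
    static int[] readAndDiscard(ByteBuffer block) {
        ByteBuffer b = block.duplicate();   // independent position, shared content
        b.position(Integer.BYTES);          // skip the leading int
        int keyLength = b.getInt();
        b.getInt();                         // read only to advance past the field
        return new int[] {keyLength, b.position()};
    }

    /** Position-based variant: advances past the unused field without decoding it. */
    static int[] skipByPosition(ByteBuffer block) {
        ByteBuffer b = block.duplicate();
        b.position(Integer.BYTES);
        int keyLength = b.getInt();
        b.position(b.position() + Integer.BYTES);
        return new int[] {keyLength, b.position()};
    }
}
```

Both methods return the same key length and end at the same position; only the second avoids decoding a value nobody uses.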




[jira] [Commented] (HBASE-12291) Create Read only buffers where ever possible

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945970#comment-14945970
 ] 

stack commented on HBASE-12291:
---

That'd be nice. Does it have to be part of the parent issue given it is 
speculative?

> Create Read only buffers where ever possible
> 
>
> Key: HBASE-12291
> URL: https://issues.apache.org/jira/browse/HBASE-12291
> Project: HBase
>  Issue Type: Sub-task
>  Components: regionserver, Scanners
>Reporter: ramkrishna.s.vasudevan
>Assignee: ramkrishna.s.vasudevan
>
> This issue is to see if we can really create a read-only buffer in the read 
> path. Later we can see if this needs to be BR or our own BB impl.
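
For reference, the JDK already offers read-only views through ByteBuffer.asReadOnlyBuffer(). The sketch below shows only that standard behaviour; whether such a view is suitable for the HBase read path is exactly the open question of this issue. The wrapper class name is hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ReadOnlyBufferException;

final class ReadOnlyViewSketch {
    /** Returns a view that shares content with the source but rejects writes. */
    static ByteBuffer readOnlyView(ByteBuffer source) {
        return source.asReadOnlyBuffer();
    }

    /** True if an absolute put on the buffer is rejected; otherwise index 0 is set to 1. */
    static boolean rejectsWrites(ByteBuffer buf) {
        try {
            buf.put(0, (byte) 1);
            return false;
        } catch (ReadOnlyBufferException e) {
            return true;
        }
    }
}
```

Note that the view is not a copy: a write through the original buffer is visible through the read-only view.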




[jira] [Updated] (HBASE-14497) Reverse Scan threw StackOverflow caused by readPt checking

2015-10-06 Thread Ted Yu (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ted Yu updated HBASE-14497:
---
Resolution: Fixed
Status: Resolved  (was: Patch Available)

> Reverse Scan threw StackOverflow caused by readPt checking
> --
>
> Key: HBASE-14497
> URL: https://issues.apache.org/jira/browse/HBASE-14497
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 2.0.0, 0.98.14, 1.3.0
>Reporter: Yerui Sun
>Assignee: Yerui Sun
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 14497-branch-1-v6.patch, 14497-master-v6.patch, 
> HBASE-14497-0.98.patch, HBASE-14497-branch-1-v2.patch, 
> HBASE-14497-branch-1-v3.patch, HBASE-14497-branch-1-v6.patch, 
> HBASE-14497-branch-1.patch, HBASE-14497-master-v2.patch, 
> HBASE-14497-master-v3.patch, HBASE-14497-master-v3.patch, 
> HBASE-14497-master-v4.patch, HBASE-14497-master-v5.patch, 
> HBASE-14497-master.patch
>
>
> I hit a stack overflow error in StoreFileScanner.seekToPreviousRow when using a 
> reversed scan. I searched and found HBASE-14155, but it seems to have a 
> different cause.
> seekToPreviousRow fetches the closest preceding row and compares its 
> mvcc to the readPt, which is acquired when the scanner is created. If the row's mvcc is 
> bigger than readPt, a recursive call of seekToPreviousRow is invoked to 
> find the next closest preceding row.
> Consider that we created a scanner for a reversed scan, and some data with 
> smaller rows was written and flushed before calling next on the scanner. When 
> seekToPreviousRow is invoked, it calls itself recursively until all 
> rows written after the scanner was created have been iterated. The depth of the 
> recursive call stack depends on the count of rows; a stack overflow 
> error will be thrown if the count of rows is large, like 1.
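
The failure mode can be reproduced in miniature. In the sketch below, rows are a plain list in key order and each carries an mvcc number; the recursive variant uses one stack frame per row that is newer than the read point, while the iterative variant does the same search in a loop. All names are hypothetical, and the loop is only one possible way to avoid the deep recursion, not necessarily what the committed patch does.

```java
import java.util.List;

final class SeekPreviousSketch {
    /** Hypothetical row: list order is key order, mvcc is the write sequence number. */
    static final class Row {
        final String key;
        final long mvcc;

        Row(String key, long mvcc) {
            this.key = key;
            this.mvcc = mvcc;
        }
    }

    /** Recursive form: one stack frame per skipped row, so depth grows with row count. */
    static int seekPreviousRecursive(List<Row> rows, int from, long readPt) {
        int idx = from - 1;
        if (idx < 0) {
            return -1;
        }
        if (rows.get(idx).mvcc > readPt) {
            return seekPreviousRecursive(rows, idx, readPt);
        }
        return idx;
    }

    /** Iterative form: same result, constant stack depth. */
    static int seekPreviousIterative(List<Row> rows, int from, long readPt) {
        for (int idx = from - 1; idx >= 0; idx--) {
            if (rows.get(idx).mvcc <= readPt) {
                return idx;
            }
        }
        return -1;
    }
}
```

Both methods return the index of the closest preceding row visible at readPt, or -1 if there is none.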




[jira] [Commented] (HBASE-14436) HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create new Configuration

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14945999#comment-14945999
 ] 

Hudson commented on HBASE-14436:


FAILURE: Integrated in HBase-1.0 #1073 (See 
[https://builds.apache.org/job/HBase-1.0/1073/])
HBASE-14436 HTableDescriptor#addCoprocessor will always make (stack: rev 
c1890b5b15a3cb3ed9c00f4326e4eb6b583c55a6)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java


> HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create 
> new Configuration
> ---
>
> Key: HBASE-14436
> URL: https://issues.apache.org/jira/browse/HBASE-14436
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 1.2.1
>Reporter: Jianwei Cui
>Assignee: stack
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: HBASE-14436-trunk-v1.patch, HBASE-14436-trunk-v2.patch
>
>
> HTableDescriptor#addCoprocessor will set the coprocessor value in the following 
> format:
> {code}
>  public HTableDescriptor addCoprocessor(String className, Path jarFilePath,
>  int priority, final Map kvs)
>   throws IOException {
>   ...
>   String value = ((jarFilePath == null)? "" : jarFilePath.toString()) +
> "|" + className + "|" + Integer.toString(priority) + "|" +
> kvString.toString();
>   ...
> }
> {code}
> If the 'jarFilePath' is null, the 'value' will always have the format 
> '|className|priority|' even if 'kvs' is null, which means no extra arguments 
> for the coprocessor. Then, on the server side, 
> RegionCoprocessorHost#getTableCoprocessorAttrsFromSchema will load the table 
> coprocessors as:
> {code}
>   static List 
> getTableCoprocessorAttrsFromSchema(Configuration conf,
>   HTableDescriptor htd) {
> ...
> try {
>   cfgSpec = matcher.group(4); // => cfgSpec will be '|' for the format '|className|priority|'
> } catch (IndexOutOfBoundsException ex) {
>   // ignore
> }
> Configuration ourConf;
> if (cfgSpec != null) {  // => cfgSpec will be '|' for the format '|className|priority|'
>   ourConf = new Configuration(false);
>   HBaseConfiguration.merge(ourConf, conf);
> }
> ...
> }
> {code}
> The 'cfgSpec' will be '|' for a coprocessor formatted as 
> '|className|priority|', so a new Configuration is always created.
> In our production cluster, there are a lot of tables with table-level coprocessors, 
> so the region server will create a new Configuration for each region of 
> such tables; this consumes a certain amount of memory when we have many 
> such regions.
> To fix the problem, we can make HTableDescriptor not append the '|' if there are no 
> extra arguments for the coprocessor, or check the 'cfgSpec' more strictly on the 
> server side, which would also avoid creating new Configurations for existing 
> regions after they are reopened. Discussions and suggestions are welcome.
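
The second option (a stricter server-side check) can be sketched without HBase classes. The code below is an assumption-laden illustration: it splits the spec on '|' itself instead of using the regex from RegionCoprocessorHost, and the class and method names are hypothetical. It treats the arguments field as present only when it is non-empty.

```java
final class CoprocessorSpecSketch {
    /** Builds a spec in the path|class|priority|args layout described above. */
    static String buildSpec(String jarPath, String className, int priority, String kvString) {
        return (jarPath == null ? "" : jarPath) + "|" + className + "|" + priority + "|"
            + (kvString == null ? "" : kvString);
    }

    /** True only when the fourth field carries real arguments. */
    static boolean hasExtraArguments(String spec) {
        // A limit of -1 keeps trailing empty fields, so "|c|1|" yields four parts.
        String[] parts = spec.split("\\|", -1);
        return parts.length >= 4 && !parts[3].trim().isEmpty();
    }
}
```

Under this check, a spec such as '|className|priority|' would not trigger a per-region Configuration copy, while a spec with key/value arguments still would.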




[jira] [Commented] (HBASE-14517) Show regionserver's version in master status page

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14517?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946000#comment-14946000
 ] 

stack commented on HBASE-14517:
---

This looks really nice [~liushaohui]. Operators will like it. Why did you move the 
VersionInfo from RPC to HBase protos? Will that break anyone? (I don't think 
so, since you do not change the pb data structure.)

> Show regionserver's version in master status page
> -
>
> Key: HBASE-14517
> URL: https://issues.apache.org/jira/browse/HBASE-14517
> Project: HBase
>  Issue Type: Improvement
>  Components: monitoring
>Reporter: Liu Shaohui
>Assignee: Liu Shaohui
>Priority: Minor
> Fix For: 2.0.0
>
> Attachments: HBASE-14517-v1.diff
>
>
> In a production env, regionservers may be removed from the cluster for hardware 
> problems and rejoin the cluster after the repair. There is a potential risk 
> that the version of a rejoined regionserver may differ from the others because the 
> cluster has been upgraded through many versions. 
> To solve this, we can show every regionserver's version in the server list 
> of the master's status page, and highlight a regionserver when its version is 
> different from the master's version, similar to HDFS-3245.
> Suggestions are welcome~
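
The highlighting rule proposed above amounts to a simple comparison against the master's version. A minimal sketch, with hypothetical names and with server versions supplied as a plain map rather than taken from any real HBase status API:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

final class VersionMismatchSketch {
    /** Returns the servers whose version differs from the master's, sorted by name. */
    static List<String> serversToHighlight(String masterVersion, Map<String, String> serverVersions) {
        List<String> out = new ArrayList<>();
        // TreeMap gives a stable, name-sorted order for display.
        for (Map.Entry<String, String> e : new TreeMap<>(serverVersions).entrySet()) {
            if (!masterVersion.equals(e.getValue())) {
                out.add(e.getKey());
            }
        }
        return out;
    }
}
```

A server with an unknown (null) version is also highlighted in this sketch, since it cannot be confirmed to match.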




[jira] [Commented] (HBASE-14493) Upgrade the jamon-runtime dependency to the newer version MPL 2.0

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14493?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946001#comment-14946001
 ] 

stack commented on HBASE-14493:
---

+1 from me [~apurtell]

Yeah, you good w/ this [~busbey]?

> Upgrade the jamon-runtime dependency to the newer version MPL 2.0
> -
>
> Key: HBASE-14493
> URL: https://issues.apache.org/jira/browse/HBASE-14493
> Project: HBase
>  Issue Type: Task
>Affects Versions: 1.1.1
>Reporter: Newton Alex
>Assignee: Andrew Purtell
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0, 0.98.16
>
> Attachments: HBASE-14493-0.98.patch, HBASE-14493-branch-1.patch, 
> HBASE-14493.patch, HBASE-14493.patch
>
>
> The current version of HBase uses a jamon-runtime released under MPL 1.1, which 
> has legal restrictions. Newer versions of jamon-runtime appear to be MPL 2.0. 
> HBase should upgrade to a more safely licensed version of jamon.
> 2.4.0 is MPL 1.1 : 
> http://grepcode.com/snapshot/repo1.maven.org/maven2/org.jamon/jamon-runtime/2.4.0
> 2.4.1 is MPL 2.0 : 
> http://grepcode.com/snapshot/repo1.maven.org/maven2/org.jamon/jamon-runtime/2.4.1
> Here’s a comparison of the equivalent sections of the respective licenses 
> dealing w/ Termination:
> MPL 1.1 - Section 8 (Termination) Subsection 2:
> 8.2. If You initiate litigation by asserting a patent infringement claim 
> (excluding declatory judgment actions) against Initial Developer or a 
> Contributor (the Initial Developer or Contributor against whom You file such 
> action is referred to as "Participant") alleging that:
> such Participant's Contributor Version directly or indirectly infringes any 
> patent, then any and all rights granted by such Participant to You under 
> Sections 2.1 and/or 2.2 of this License shall, upon 60 days notice from 
> Participant terminate prospectively, unless if within 60 days after receipt 
> of notice You either: (i) agree in writing to pay Participant a mutually 
> agreeable reasonable royalty for Your past and future use of Modifications 
> made by such Participant, or (ii) withdraw Your litigation claim with respect 
> to the Contributor Version against such Participant. If within 60 days of 
> notice, a reasonable royalty and payment arrangement are not mutually agreed 
> upon in writing by the parties or the litigation claim is not withdrawn, the 
> rights granted by Participant to You under Sections 2.1 and/or 2.2 
> automatically terminate at the expiration of the 60 day notice period 
> specified above.
> any software, hardware, or device, other than such Participant's Contributor 
> Version, directly or indirectly infringes any patent, then any rights granted 
> to You by such Participant under Sections 2.1(b) and 2.2(b) are revoked 
> effective as of the date You first made, used, sold, distributed, or had 
> made, Modifications made by that Participant.
> MPL 2.0 - Section 5 (Termination) Subsection 2:
> 5.2. If You initiate litigation against any entity by asserting a patent 
> infringement claim (excluding declaratory judgment actions, counter-claims, 
> and cross-claims) alleging that a Contributor Version directly or indirectly 
> infringes any patent, then the rights granted to You by any and all 
> Contributors for the Covered Software under Section 2.1 of this License shall 
> terminate.




[jira] [Commented] (HBASE-14563) Disable zombie TestHFileOutputFormat2

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946013#comment-14946013
 ] 

Hudson commented on HBASE-14563:


FAILURE: Integrated in HBase-1.2 #230 (See 
[https://builds.apache.org/job/HBase-1.2/230/])
HBASE-14563 Disable zombie TestHFileOutputFormat2 (stack: rev 
22c87d9644c600788a0df5456333464cba969c49)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java


> Disable zombie TestHFileOutputFormat2
> -
>
> Key: HBASE-14563
> URL: https://issues.apache.org/jira/browse/HBASE-14563
> Project: HBase
>  Issue Type: Sub-task
>  Components: test
>Reporter: stack
>Assignee: stack
> Fix For: 2.0.0, 1.2.0, 1.3.0
>
> Attachments: 14563.txt
>
>
> Disabling until someone has a chance to look at it.
> I watched it in jvisualvm for a while. It's starting and stopping clusters 
> multiple times and then running MR jobs. It needs a rewrite at least, and some 
> shrinking of scope on what is tested.




[jira] [Updated] (HBASE-14268) Improve KeyLocker

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14268:
--
Attachment: HBASE-14268-V7.patch

Reattach.

[~ikeda] That is interesting. Weak references will be collected by GC if in new 
gen but not if it makes it up into old gen (You should do a blog post on your 
findings here).

> Improve KeyLocker
> -
>
> Key: HBASE-14268
> URL: https://issues.apache.org/jira/browse/HBASE-14268
> Project: HBase
>  Issue Type: Improvement
>  Components: util
>Reporter: Hiroshi Ikeda
>Assignee: Hiroshi Ikeda
>Priority: Minor
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 14268-V5.patch, HBASE-14268-V2.patch, 
> HBASE-14268-V3.patch, HBASE-14268-V4.patch, HBASE-14268-V5.patch, 
> HBASE-14268-V5.patch, HBASE-14268-V6.patch, HBASE-14268-V7.patch, 
> HBASE-14268-V7.patch, HBASE-14268-V7.patch, HBASE-14268-V7.patch, 
> HBASE-14268-V7.patch, HBASE-14268.patch, KeyLockerIncrKeysPerformance.java, 
> KeyLockerPerformance.java, ReferenceTestApp.java
>
>
> 1. The implementation of {{KeyLocker}} uses atomic variables inside a 
> synchronized block, which doesn't make sense. Moreover, the logic inside the 
> synchronized block is not trivial, so it degrades performance in heavily 
> multi-threaded environments.
> 2. {{KeyLocker}} hands out an instance of {{ReentrantLock}} that is already 
> locked, but it doesn't follow the contract of {{ReentrantLock}} because you 
> are not allowed to freely invoke lock/unlock methods under that contract. 
> That introduces a potential risk: whenever you see a variable of type 
> {{ReentrantLock}}, you have to pay attention to where the instance comes 
> from.
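
One way to avoid the second concern is to never expose the lock itself and hand back a release-only handle instead. The following is a minimal sketch under that assumption, not the attached patch: `SimpleKeyLocker`, `Held`, `acquire` and `isLocked` are hypothetical names, and unlike the real {{KeyLocker}} this sketch never removes unused entries from its map.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical per-key locker; illustration only, entries are never evicted. */
final class SimpleKeyLocker<K> {
  /** Release-only handle: callers cannot call lock()/unlock() freely. */
  interface Held extends AutoCloseable {
    @Override
    void close();
  }

  private final ConcurrentHashMap<K, ReentrantLock> locks = new ConcurrentHashMap<>();

  /** Blocks until the lock for the key is held, then returns a handle that releases it. */
  Held acquire(K key) {
    ReentrantLock lock = locks.computeIfAbsent(key, k -> new ReentrantLock());
    lock.lock();
    // The handle must be closed by the same thread that acquired it.
    return lock::unlock;
  }

  /** True if some thread currently holds the lock for the key. */
  boolean isLocked(K key) {
    ReentrantLock lock = locks.get(key);
    return lock != null && lock.isLocked();
  }

  public static void main(String[] args) {
    SimpleKeyLocker<String> locker = new SimpleKeyLocker<>();
    try (Held held = locker.acquire("row1")) {
      System.out.println("locked: " + locker.isLocked("row1"));
    }
    System.out.println("locked: " + locker.isLocked("row1"));
  }
}
```

Because the handle is an `AutoCloseable`, try-with-resources releases the lock even when the guarded code throws.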




[jira] [Commented] (HBASE-14436) HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create new Configuration

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946028#comment-14946028
 ] 

Hudson commented on HBASE-14436:


FAILURE: Integrated in HBase-1.1 #697 (See 
[https://builds.apache.org/job/HBase-1.1/697/])
HBASE-14436 HTableDescriptor#addCoprocessor will always make (stack: rev 
2c662898037b6ad9e17399f0c7914bc785622202)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java


> HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create 
> new Configuration
> ---
>
> Key: HBASE-14436
> URL: https://issues.apache.org/jira/browse/HBASE-14436
> Project: HBase
>  Issue Type: Improvement
>  Components: Coprocessors
>Affects Versions: 1.2.1
>Reporter: Jianwei Cui
>Assignee: stack
>Priority: Minor
> Fix For: 2.0.0, 1.2.0, 1.3.0, 1.0.3, 1.1.3, 0.98.16
>
> Attachments: HBASE-14436-trunk-v1.patch, HBASE-14436-trunk-v2.patch
>
>
> HTableDescriptor#addCoprocessor sets the coprocessor value in the following 
> format:
> {code}
>  public HTableDescriptor addCoprocessor(String className, Path jarFilePath,
>  int priority, final Map<String, String> kvs)
>   throws IOException {
>   ...
>   String value = ((jarFilePath == null)? "" : jarFilePath.toString()) +
> "|" + className + "|" + Integer.toString(priority) + "|" +
> kvString.toString();
>   ...
> }
> {code}
> If 'jarFilePath' is null, the 'value' will always have the format 
> '|className|priority|' even if 'kvs' is null, which means there are no extra 
> arguments for the coprocessor. Then, on the server side, 
> RegionCoprocessorHost#getTableCoprocessorAttrsFromSchema loads the table 
> coprocessors as:
> {code}
>   static List<TableCoprocessorAttribute> getTableCoprocessorAttrsFromSchema(
>       Configuration conf, HTableDescriptor htd) {
>     ...
>     try {
>       // => cfgSpec will be '|' for the format '|className|priority|'
>       cfgSpec = matcher.group(4);
>     } catch (IndexOutOfBoundsException ex) {
>       // ignore
>     }
>     Configuration ourConf;
>     // => cfgSpec will be '|' for the format '|className|priority|'
>     if (cfgSpec != null) {
>       ourConf = new Configuration(false);
>       HBaseConfiguration.merge(ourConf, conf);
>     }
>     ...
>   }
> {code}
> The 'cfgSpec' will be '|' for a coprocessor formatted as 
> '|className|priority|', so a new Configuration is always created.
> In our production environment, many tables have table-level coprocessors, 
> so the region server creates a new Configuration for each region of 
> such tables, which consumes a certain amount of memory when there are many 
> such regions.
> To fix the problem, we can make HTableDescriptor not append the '|' when there 
> are no extra arguments for the coprocessor, or check 'cfgSpec' more strictly on 
> the server side, which would also avoid creating new Configurations for 
> existing regions once they are reopened. Discussion and suggestions are welcome.
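
The second option (a stricter server-side check) could look roughly like the sketch below. This is an illustration under stated assumptions, not the committed change: `CfgSpecCheck` and `hasExtraArgs` are hypothetical names, and the sketch assumes, as described above, that an argument-less coprocessor yields a 'cfgSpec' of just '|'.

```java
/** Hypothetical helper; not part of RegionCoprocessorHost. */
final class CfgSpecCheck {
  private CfgSpecCheck() {}

  /**
   * Returns true only when the spec has content beyond the leading '|' separator.
   * A null spec, a blank spec, and a bare "|" all count as "no extra arguments".
   */
  static boolean hasExtraArgs(String cfgSpec) {
    if (cfgSpec == null) {
      return false;
    }
    String trimmed = cfgSpec.trim();
    // Drop the separator that is appended even when there are no arguments.
    if (trimmed.startsWith("|")) {
      trimmed = trimmed.substring(1).trim();
    }
    return !trimmed.isEmpty();
  }

  public static void main(String[] args) {
    System.out.println(hasExtraArgs("|"));
    System.out.println(hasExtraArgs("|k1=v1,k2=v2"));
  }
}
```

With such a check in place of `cfgSpec != null`, the server would only build a per-region Configuration when there is something to merge into it, including for table descriptors that were already written with the trailing '|'.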




[jira] [Commented] (HBASE-12911) Client-side metrics

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946029#comment-14946029
 ] 

stack commented on HBASE-12911:
---

I'm good w/ use of pb (there is no unpacking that I saw...) +1'd the patch.

How does an operator use this stuff? They'd have to look for client jmx 
footprint on a machine? Needs a bit of doc in the release note. Nice addition.

> Client-side metrics
> ---
>
> Key: HBASE-12911
> URL: https://issues.apache.org/jira/browse/HBASE-12911
> Project: HBase
>  Issue Type: New Feature
>  Components: Client, Operability, Performance
>Reporter: Nick Dimiduk
>Assignee: Nick Dimiduk
> Fix For: 2.0.0, 1.3.0
>
> Attachments: 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 
> 0001-HBASE-12911-Client-side-metrics.patch, 12911-0.98.00.patch, 
> 12911-branch-1.00.patch, 12911.yammer.jpg, 12911.yammer.v00.patch, 
> 12911.yammer.v01.patch, 12911.yammer.v02.patch, 12911.yammer.v02.patch, 
> 12911.yammer.v03.branch-1.patch, 12911.yammer.v03.patch, 
> 12911.yammer.v03.patch, am.jpg, client metrics RS-Master.jpg, client metrics 
> client.jpg, conn_agg.jpg, connection attributes.jpg, ltt.jpg, standalone.jpg
>
>
> There's very little visibility into the hbase client. Folks who care to add 
> some kind of metrics collection end up wrapping Table method invocations with 
> {{System.currentTimeMillis()}}. For a crude example of this, have a look at 
> what I did in {{PerformanceEvaluation}} for exposing request latencies up to 
> {{IntegrationTestRegionReplicaPerf}}. The client is quite complex; there's a 
> lot going on under the hood that is impossible to see right now without a 
> profiler. Since the client is a crucial part of the performance of this 
> distributed system, we should have deeper visibility into its function.
> I'm not sure that wiring into the hadoop metrics system is the right choice 
> because the client is often embedded as a library in a user's application. We 
> should have integration with our metrics tools so that, e.g., a client 
> embedded in a coprocessor can report metrics through the usual RS channels, 
> or a client used in a MR job can do the same.
> I would propose an interface-based system with pluggable implementations. Out 
> of the box we'd include a hadoop-metrics implementation and one other, 
> possibly [dropwizard/metrics|https://github.com/dropwizard/metrics].
> Thoughts?
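
To make the "interface-based system with pluggable implementations" idea concrete, here is a small sketch. All names in it (`ClientMetricsSink`, `InMemorySink`, `TimedCalls`) are hypothetical and are not the API of the attached patches; a hadoop-metrics or dropwizard backend would simply be another implementation of the interface.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

/** Hypothetical pluggable sink; each metrics backend would implement this. */
interface ClientMetricsSink {
  void recordLatency(String operation, long millis);
}

/** Trivial in-memory implementation, used here only for illustration. */
final class InMemorySink implements ClientMetricsSink {
  final Map<String, LongAdder> callCounts = new ConcurrentHashMap<>();
  final Map<String, LongAdder> totalMillis = new ConcurrentHashMap<>();

  @Override
  public void recordLatency(String operation, long millis) {
    callCounts.computeIfAbsent(operation, k -> new LongAdder()).increment();
    totalMillis.computeIfAbsent(operation, k -> new LongAdder()).add(millis);
  }
}

/** Replaces ad-hoc System.currentTimeMillis() wrapping with one shared helper. */
final class TimedCalls {
  private TimedCalls() {}

  static <T> T timed(ClientMetricsSink sink, String operation, Supplier<T> call) {
    long start = System.nanoTime();
    try {
      return call.get();
    } finally {
      // Recorded even when the call throws, so failures are counted too.
      sink.recordLatency(operation, (System.nanoTime() - start) / 1_000_000L);
    }
  }
}
```

A caller would wrap an operation as `TimedCalls.timed(sink, "get", () -> doGet())`, where `doGet` stands for whatever client call is being measured.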




[jira] [Updated] (HBASE-14451) Move on to htrace-4.0.1 (from htrace-3.2.0)

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14451:
--
Attachment: 14451.v10.txt

Retry. Rebase.

> Move on to htrace-4.0.1 (from htrace-3.2.0)
> ---
>
> Key: HBASE-14451
> URL: https://issues.apache.org/jira/browse/HBASE-14451
> Project: HBase
>  Issue Type: Task
>Reporter: stack
>Assignee: stack
> Attachments: 14451.txt, 14451.v10.txt, 14451v2.txt, 14451v3.txt, 
> 14451v4.txt, 14451v5.txt, 14451v6.txt, 14451v7.txt, 14451v8.txt, 14451v9.txt
>
>
> htrace-4.0.0 was just released with a new API. Get up on it.




[jira] [Updated] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14479:
--
Attachment: HBASE-14479-V2.patch

Retry.

> Apply the Leader/Followers pattern to RpcServer's Reader
> 
>
> Key: HBASE-14479
> URL: https://issues.apache.org/jira/browse/HBASE-14479
> Project: HBase
>  Issue Type: Improvement
>  Components: IPC/RPC, Performance
>Reporter: Hiroshi Ikeda
>Assignee: Hiroshi Ikeda
>Priority: Minor
> Attachments: HBASE-14479-V2.patch, HBASE-14479-V2.patch, 
> HBASE-14479.patch
>
>
> {{RpcServer}} uses multiple selectors to read data for load distribution, but 
> the distribution is just done by round-robin. It is uncertain, especially over 
> a long run, whether load is equally divided and resources are used without 
> being wasted.
> Moreover, multiple selectors may cause excessive context switches, which favor 
> low latency (while we just add the requests to queues), and this may 
> reduce the throughput of the whole server.
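
For readers unfamiliar with the pattern, the sketch below shows the core of Leader/Followers in isolation. It is not the attached patch: `LeaderFollowersDemo` is a hypothetical class, and a `BlockingQueue` stands in for the single selector that the leader would wait on in {{RpcServer}}. Only the thread holding the leader token waits on the event source; once it has an event it releases the token (promoting a follower) and processes the event itself.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical, simplified Leader/Followers pool; the queue stands in for a selector. */
final class LeaderFollowersDemo {
  private static final Runnable STOP = () -> {};

  private final BlockingQueue<Runnable> eventSource = new LinkedBlockingQueue<>();
  private final ReentrantLock leaderToken = new ReentrantLock();
  private final Thread[] workers;

  LeaderFollowersDemo(int threads) {
    workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(this::workerLoop, "lf-worker-" + i);
      workers[i].start();
    }
  }

  private void workerLoop() {
    while (true) {
      Runnable event;
      leaderToken.lock();            // followers wait here to become the leader
      try {
        event = eventSource.take();  // only the leader waits on the event source
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      } finally {
        leaderToken.unlock();        // promote a follower before processing
      }
      if (event == STOP) {
        return;
      }
      event.run();                   // the former leader processes its own event
    }
  }

  void submit(Runnable event) {
    eventSource.add(event);
  }

  /** Sends one stop marker per worker and waits for all of them to exit. */
  void shutdown() throws InterruptedException {
    for (int i = 0; i < workers.length; i++) {
      eventSource.add(STOP);
    }
    for (Thread worker : workers) {
      worker.join();
    }
  }

  /** Runs the given number of counting events through the pool and returns how many ran. */
  static int runDemo(int threads, int events) {
    AtomicInteger handled = new AtomicInteger();
    LeaderFollowersDemo pool = new LeaderFollowersDemo(threads);
    for (int i = 0; i < events; i++) {
      pool.submit(handled::incrementAndGet);
    }
    try {
      pool.shutdown();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return handled.get();
  }
}
```

Compared with round-robin over several selectors, whichever thread is free takes the next event, so no explicit load distribution is needed.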




[jira] [Commented] (HBASE-14479) Apply the Leader/Followers pattern to RpcServer's Reader

2015-10-06 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14479?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946052#comment-14946052
 ] 

stack commented on HBASE-14479:
---

Here's a link: http://www.kircher-schwanninger.de/michael/publications/lf.pdf
I like the explanation here too: 
http://stackoverflow.com/questions/3058272/explain-leader-follower-pattern

Patch seems good. You tried it, [~ikeda]? (If you'd messed up, unit tests would 
be failing...) Any way we could figure out if there is a benefit? I can try 
running it on a cluster and see. Thanks [~ikeda]





[jira] [Updated] (HBASE-14458) AsyncRpcClient#createRpcChannel() should check and remove dead channel before creating new one to same server

2015-10-06 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-14458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-14458:
--
Attachment: HBASE-14458 (1).patch

Retry

> AsyncRpcClient#createRpcChannel() should check and remove dead channel before 
> creating new one to same server
> -
>
> Key: HBASE-14458
> URL: https://issues.apache.org/jira/browse/HBASE-14458
> Project: HBase
>  Issue Type: Bug
>  Components: IPC/RPC
>Affects Versions: 2.0.0, 1.2.0, 1.3.0, 1.1.3
>Reporter: Samir Ahmic
>Assignee: Samir Ahmic
>Priority: Critical
> Fix For: 2.0.0
>
> Attachments: HBASE-14458 (1).patch, HBASE-14458.patch, 
> HBASE-14458.patch
>
>
> I noticed this issue while testing the master branch in distributed mode. 
> Reproduction steps:
> 1. Write some data with hbase ltt 
> 2. While ltt is writing, execute $graceful_stop.sh --restart --reload [rs] 
> 3. Wait until the script starts to reload regions onto the restarted server. At 
> that moment ltt will stop writing and eventually fail. 
> After some digging I noticed that while ltt is working correctly there is a 
> single connection per regionserver (lsof output for a single connection; 27109 
> is the ltt PID):
> {code}
> java  27109   hbase  143u  210579579  0t0  TCP hnode1:40423->hnode5:16020 (ESTABLISHED)
> {code}  
> and when, in this example, the hnode5 server is restarted and the script starts 
> to reload regions onto it, ltt starts creating thousands of new TCP 
> connections to this server:
> {code}
> java  27109   hbase *623u  210674415  0t0  TCP hnode1:52948->hnode5:16020 (ESTABLISHED)
> java  27109   hbase *624u  210674416  0t0  TCP hnode1:52949->hnode5:16020 (ESTABLISHED)
> java  27109   hbase *625u  210674417  0t0  TCP hnode1:52950->hnode5:16020 (ESTABLISHED)
> java  27109   hbase *627u  210674419  0t0  TCP hnode1:52952->hnode5:16020 (ESTABLISHED)
> java  27109   hbase *628u  210674420  0t0  TCP hnode1:52953->hnode5:16020 (ESTABLISHED)
> java  27109   hbase *633u  210674425  0t0  TCP hnode1:52958->hnode5:16020 (ESTABLISHED)
> ...
> {code}
> So here is what happened, based on some additional logging and debugging:
> - AsyncRpcClient never detected that the regionserver was restarted, because 
> regions were moved, there were no write/read requests to this server, and 
> there is no heartbeat mechanism implemented
> - because of the above, the dead {{AsyncRpcChannel}} stayed in 
> {{PoolMap connections}}
> - when ltt detected that regions had been moved back to hnode5, it tried to 
> reconnect to hnode5, leading to this issue
> I was able to resolve this issue by adding the following to 
> AsyncRpcClient#createRpcChannel():
> {code}
> synchronized (connections) {
>   if (closed) {
>     throw new StoppedRpcClientException();
>   }
>   rpcChannel = connections.get(hashCode);
> + if (rpcChannel != null && !rpcChannel.isAlive()) {
> +   LOG.debug("Removing dead channel from " + rpcChannel.address.toString());
> +   connections.remove(hashCode);
> + }
>   if (rpcChannel == null || !rpcChannel.isAlive()) {
>     rpcChannel = new AsyncRpcChannel(this.bootstrap, this, ticket, serviceName, location);
>     connections.put(hashCode, rpcChannel);
> {code}
>  I will attach a patch after some more testing.
>  
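
The check-and-remove step can be tried out in isolation with the self-contained sketch below. It is only a model of the idea, not HBase code: `ChannelPool` and its nested `Channel` are hypothetical stand-ins for the client's connection map and {{AsyncRpcChannel}}, and the `alive` flag is flipped by hand to simulate a restarted server.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of a connection cache that evicts dead channels before reuse. */
final class ChannelPool {
  /** Stand-in for AsyncRpcChannel; 'alive' is toggled manually in this sketch. */
  static final class Channel {
    final String address;
    volatile boolean alive = true;

    Channel(String address) {
      this.address = address;
    }

    boolean isAlive() {
      return alive;
    }
  }

  private final Map<Integer, Channel> connections = new HashMap<>();

  /** Returns the cached live channel for the key, replacing a dead one if present. */
  Channel getOrCreate(int key, String address) {
    synchronized (connections) {
      Channel channel = connections.get(key);
      if (channel != null && !channel.isAlive()) {
        // Evict the dead entry so it cannot be handed out again.
        connections.remove(key);
        channel = null;
      }
      if (channel == null) {
        channel = new Channel(address);
        connections.put(key, channel);
      }
      return channel;
    }
  }

  int size() {
    synchronized (connections) {
      return connections.size();
    }
  }
}
```

The point of the sketch is that the map never holds more than one entry per key: a dead channel is replaced rather than left in place next to new ones.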




[jira] [Commented] (HBASE-12911) Client-side metrics

2015-10-06 Thread Nick Dimiduk (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-12911?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946058#comment-14946058
 ] 

Nick Dimiduk commented on HBASE-12911:
--

bq. there is no unpacking that I saw...

Unpacking in that I'm reading into the PB Method and switching on the index of 
the entry; it's based on the generated code so I assume it's an implementation 
detail that could change in the future. See {{MetricsConnection#updateRpc}}.

bq. How does an operator use this stuff?

Let me add a release note. Right now they have to look at the JMX of the 
machine running the client. After HBASE-14381 we'll be exposing the metrics 
programmatically. Do we want another follow-on to allow changing the reporter? 
This version of yammer also ships with a {{ConsoleReporter}} that allows 
reporting to System.out. What about disabling client-side metrics collection 
entirely?





[jira] [Commented] (HBASE-14563) Disable zombie TestHFileOutputFormat2

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946064#comment-14946064
 ] 

Hudson commented on HBASE-14563:


FAILURE: Integrated in HBase-TRUNK #6879 (See 
[https://builds.apache.org/job/HBase-TRUNK/6879/])
HBASE-14563 Disable zombie TestHFileOutputFormat2 (stack: rev 
8fcc8155042766121cb4e99433f23affe2d9ae2d)
* 
hbase-server/src/test/java/org/apache/hadoop/hbase/mapreduce/TestHFileOutputFormat2.java






[jira] [Commented] (HBASE-14436) HTableDescriptor#addCoprocessor will always make RegionCoprocessorHost create new Configuration

2015-10-06 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-14436?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14946065#comment-14946065
 ] 

Hudson commented on HBASE-14436:


FAILURE: Integrated in HBase-TRUNK #6879 (See 
[https://builds.apache.org/job/HBase-TRUNK/6879/])
HBASE-14436 HTableDescriptor#addCoprocessor will always make (stack: rev 
0ea1f8122709302ee19279aaa438b37dac30c25b)
* 
hbase-server/src/main/java/org/apache/hadoop/hbase/regionserver/RegionCoprocessorHost.java





