[jira] [Commented] (HBASE-20506) Add doc and test for unused RetryCounter, useful-looking utility

2018-10-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639271#comment-16639271
 ] 

Hudson commented on HBASE-20506:


Results for branch branch-2.0
[build #905 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/905/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/905//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/905//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/905//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> Add doc and test for unused RetryCounter, useful-looking utility
> 
>
> Key: HBASE-20506
> URL: https://issues.apache.org/jira/browse/HBASE-20506
> Project: HBase
> Issue Type: Bug
> Reporter: stack
> Assignee: stack
> Priority: Minor
> Fix For: 3.0.0, 2.1.0, 2.0.3
>
> Attachments: 20506.txt, HBASE-20506.master.001.patch, 
> HBASE-20506.master.002.patch
>
>
> I thought I could use RetryCounter, an old facility added years ago, for doing 
> backoff calculations. In the end, it didn't work for me because it is lacking 
> pb serialization. While trying to use it, I added a bit of doc and a test. 
> Might help the next dev that trips along this way.
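
The kind of backoff calculation the description mentions can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the HBase RetryCounter API: the class name BackoffSketch, its constructor parameters, and the exponential-with-cap policy are all hypothetical choices made for the example.

```java
// Hypothetical illustration only; this is not the HBase RetryCounter class.
final class BackoffSketch {
  private final int maxAttempts;
  private final long baseSleepMs;
  private final long maxSleepMs;
  private int attempts = 0;

  BackoffSketch(int maxAttempts, long baseSleepMs, long maxSleepMs) {
    this.maxAttempts = maxAttempts;
    this.baseSleepMs = baseSleepMs;
    this.maxSleepMs = maxSleepMs;
  }

  /** True while fewer than maxAttempts backoffs have been handed out. */
  boolean shouldRetry() {
    return attempts < maxAttempts;
  }

  /** Returns base * 2^attempts capped at maxSleepMs, then counts the attempt. */
  long nextBackoffMs() {
    int shift = Math.min(attempts, 30); // bound the shift so it stays well-defined
    long sleep = baseSleepMs > (Long.MAX_VALUE >> shift)
        ? maxSleepMs // the doubled value would overflow, so fall back to the cap
        : Math.min(maxSleepMs, baseSleepMs << shift);
    attempts++;
    return sleep;
  }
}
```

A caller would loop while shouldRetry() holds and sleep for nextBackoffMs() between attempts. Persisting the counter across restarts (the pb serialization the description says is missing) is outside this sketch.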



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21223) [amv2] Remove abort_procedure from shell

2018-10-04 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639327#comment-16639327
 ] 

stack commented on HBASE-21223:
---

Pushed this on branch-2.0. Patch deprecates API, doesn't remove it. We do 
remove the abort_procedure shell command. This command can do damage to a 
cluster. Better it is gone.

> [amv2] Remove abort_procedure from shell
> 
>
> Key: HBASE-21223
> URL: https://issues.apache.org/jira/browse/HBASE-21223
> Project: HBase
> Issue Type: Bug
> Components: amv2, hbck2, shell
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21223.branch-2.1.001.patch, 
> HBASE-21223.branch-2.1.002.patch
>
>
> Remove this command. It will cause more damage than it could ever solve. If it 
> should exist, it should be out in hbck2, not here in user-space.





[jira] [Commented] (HBASE-21242) [amv2] Miscellaneous minor log and assign procedure create improvements

2018-10-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639326#comment-16639326
 ] 

Hudson commented on HBASE-21242:


Results for branch branch-2.1
[build #419 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> [amv2] Miscellaneous minor log and assign procedure create improvements
> ---
>
> Key: HBASE-21242
> URL: https://issues.apache.org/jira/browse/HBASE-21242
> Project: HBase
> Issue Type: Bug
> Components: amv2, Operability
> Reporter: stack
> Assignee: stack
> Priority: Minor
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21242.branch-2.0.001.patch, 
> HBASE-21242.branch-2.001.patch, HBASE-21242.branch-2.1.001.patch, 
> HBASE-21242.branch-2.1.001.patch, HBASE-21242.branch-2.1.001.patch, 
> HBASE-21242.branch-2.1.002.patch
>
>
> Some minor fixups:
> {code}
> For RIT Duration, do better than print ms/seconds. Remove redundant UI
> column dedicated to duration when we log it in the status field too.
> Make bypass log at INFO level -- when DEBUG we can miss important
> fixup detail like why we failed.
> Make it so on complete of subprocedure, we note count of outstanding
> siblings so we have a clue how much further the parent has to go before
> it is done (Helpful when hundreds of servers doing SCP).
> Have the SCP run the AP preflight check before creating an AP; saves
> creation of hundreds of thousands of APs during fixup of this big cluster
> of mine.
> Don't log tablename three times when reporting remote call failed.
> If lock is held already, note who has it. Also log after we get lock
> or if we have to wait rather than log on entrance though we may
> later have to wait (or we may have just picked up the lock).
> {code}
> Posting patch in a sec but let me try it on cluster too.
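
As an illustration of the first fixup (printing a duration as something friendlier than raw ms/seconds), here is a minimal sketch. The class and method names are hypothetical and this is not the code from the patch; the output style simply mimics strings such as "2mins, 7.618sec" and "473msec" that appear in HBase procedure logs.

```java
// Hypothetical helper, not taken from the HBASE-21242 patch.
final class DurationSketch {
  /** Formats a millisecond duration as e.g. "473msec" or "2mins, 7.618sec". */
  static String humanReadable(long ms) {
    if (ms < 1000) {
      return ms + "msec";
    }
    long hours = ms / 3_600_000L;
    long mins = (ms % 3_600_000L) / 60_000L;
    long secs = (ms % 60_000L) / 1000L;
    long millis = ms % 1000L;
    StringBuilder sb = new StringBuilder();
    if (hours > 0) {
      sb.append(hours).append("hrs, ");
    }
    if (hours > 0 || mins > 0) {
      sb.append(mins).append("mins, ");
    }
    // Integer arithmetic avoids floating-point rounding in the fractional part.
    sb.append(secs).append('.').append(String.format("%03d", millis)).append("sec");
    return sb.toString();
  }
}
```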





[jira] [Commented] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign

2018-10-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639325#comment-16639325
 ] 

Hudson commented on HBASE-21213:


Results for branch branch-2.1
[build #419 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> [hbck2] bypass leaves behind state in RegionStates when assign/unassign
> ---
>
> Key: HBASE-21213
> URL: https://issues.apache.org/jira/browse/HBASE-21213
> Project: HBase
> Issue Type: Bug
> Components: amv2, hbck2
> Reporter: stack
> Assignee: stack
> Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-21213.branch-2.1.001.patch, 
> HBASE-21213.branch-2.1.002.patch, HBASE-21213.branch-2.1.003.patch, 
> HBASE-21213.branch-2.1.004.patch, HBASE-21213.branch-2.1.005.patch, 
> HBASE-21213.branch-2.1.006.patch, HBASE-21213.branch-2.1.007.patch, 
> HBASE-21213.branch-2.1.007.patch, HBASE-21213.branch-2.1.008.patch, 
> HBASE-21213.branch-2.1.009.patch, HBASE-21213.branch-2.1.010.patch, 
> HBASE-21213.branch-2.1.011.patch
>
>
> This is a follow-on from HBASE-21083 which added the 'bypass' functionality. 
> On bypass, there is more state to be cleared if we are to allow new Procedures 
> to be scheduled.
> For example, here is a bypass:
> {code}
> 2018-09-20 05:45:43,722 INFO org.apache.hadoop.hbase.procedure2.Procedure: 
> pid=100449, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true, 
> bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, 
> region=37cc206fe9c4bc1c0a46a34c5f523d16, 
> server=ve1233.halxg.cloudera.com,22101,1537397961664 bypassed, returning null 
> to finish it
> 2018-09-20 05:45:44,022 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=100449, 
> state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, 
> region=37cc206fe9c4bc1c0a46a34c5f523d16, 
> server=ve1233.halxg.cloudera.com,22101,1537397961664 in 2mins, 7.618sec
> {code}
> ... but then when I try to assign the bypassed region later, I get this:
> {code}
> 2018-09-20 05:46:31,435 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: There is 
> already another procedure running on this region this=pid=100450, 
> state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 
> owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, 
> server=ve1233.halxg.cloudera.com,22101,1537397961664 pid=100450, 
> state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16; rit=OPENING, 
> location=ve1233.halxg.cloudera.com,22101,1537397961664
> 2018-09-20 05:46:31,510 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Rolled back pid=100450, 
> state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
> AssignProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: 
> There is already another procedure running on this region this=pid=100450, 
> state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 
> owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, 
> server=ve1233.halxg.cloudera.com,22101,1537397961664; AssignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 
> exec-time=473msec
> {code}
> ... which is a long-winded way of saying the Unassign Procedure still exists 
> in RegionStateNodes.
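
The failure mode in the logs can be modelled with a small sketch: a per-region node remembers which procedure owns it, and a bypassed procedure that finishes without releasing that ownership blocks every later procedure. The class and method names below are hypothetical and greatly simplified; they are not the actual RegionStates code or the fix in the patch.

```java
// Hypothetical, simplified model of per-region procedure ownership.
final class RegionNodeSketch {
  private Long owningPid; // null when no procedure owns the region

  /** Attaches a procedure; fails if a different procedure is still registered. */
  boolean setProcedure(long pid) {
    if (owningPid != null && owningPid != pid) {
      return false; // "There is already another procedure running on this region"
    }
    owningPid = pid;
    return true;
  }

  /** Releases ownership, but only if pid is the current owner. */
  void unsetProcedure(long pid) {
    if (owningPid != null && owningPid == pid) {
      owningPid = null;
    }
  }

  /** What a bypass would need to do on completion: clear the leftover state. */
  void finishBypassed(long pid) {
    unsetProcedure(pid);
  }
}
```

In this model, skipping finishBypassed for pid=100449 leaves the node owned, so attaching pid=100450 fails, which mirrors the rollback in the second log excerpt.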





[jira] [Updated] (HBASE-21271) [amv2] Don't throw UnsupportedOperationException when rollback called on Assign/Unassign; spiral of death

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21271:
--
Status: Patch Available  (was: Open)

.001 Just log, don't throw. First attempt. Testing to see how this does and 
will report back.

> [amv2] Don't throw UnsupportedOperationException when rollback called on 
> Assign/Unassign; spiral of death
> -
>
> Key: HBASE-21271
> URL: https://issues.apache.org/jira/browse/HBASE-21271
> Project: HBase
> Issue Type: Bug
> Components: amv2
> Reporter: stack
> Assignee: stack
> Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21271.branch-2.1.001.patch
>
>
> I can't repro reliably but if an AssignProcedure or UnassignProcedure is a 
> subprocedure of an Enable/Disable and for whatever reason the parent decides 
> it needs to rollback -- can't get an entity lock -- it will ask the 
> subprocedures to rollback. UP and AP don't support rollback on all steps. For 
> steps where not supported, we have been throwing an 
> UnsupportedOperationException. The Framework reschedules the rollback, and 
> so on, filling logs and Procedure WALs.
> Instead just note no rollback supported and intervention may be needed (until 
> we get to 2.2 when AP/UP go away).
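
The "just log, don't throw" idea can be sketched as below. This is an assumption-laden illustration, not the attached patch: the class, the Step enum, and the choice of which step supports rollback are hypothetical, and java.util.logging stands in for whatever logger the real code uses.

```java
import java.util.logging.Logger;

// Hypothetical sketch; step names and the rollback policy are assumptions.
final class RollbackSketch {
  private static final Logger LOG = Logger.getLogger("RollbackSketch");

  enum Step { QUEUE, DISPATCH, FINISH }

  /** Assumption for the example: only the first step can be undone. */
  static boolean supportsRollback(Step step) {
    return step == Step.QUEUE;
  }

  /**
   * Returns true if the step was rolled back, false if rollback is unsupported.
   * The unsupported case logs a warning instead of throwing, so a framework
   * that retries on exceptions does not reschedule the rollback forever.
   */
  static boolean rollback(Step step) {
    if (!supportsRollback(step)) {
      LOG.warning("Rollback not supported at step " + step
          + "; operator intervention may be needed");
      return false;
    }
    // A real implementation would undo the step's work here.
    return true;
  }
}
```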





[jira] [Updated] (HBASE-21223) [amv2] Remove abort_procedure from shell

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21223:
--
Fix Version/s: 2.0.3

> [amv2] Remove abort_procedure from shell
> 
>
> Key: HBASE-21223
> URL: https://issues.apache.org/jira/browse/HBASE-21223
> Project: HBase
> Issue Type: Bug
> Components: amv2, hbck2, shell
> Reporter: stack
> Assignee: stack
> Priority: Critical
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21223.branch-2.1.001.patch, 
> HBASE-21223.branch-2.1.002.patch
>
>
> Remove this command. It will cause more damage than it could ever solve. If it 
> should exist, it should be out in hbck2, not here in user-space.





[jira] [Commented] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes

2018-10-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639324#comment-16639324
 ] 

Hudson commented on HBASE-21185:


Results for branch branch-2.1
[build #419 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419//General_Nightly_Build_Report/]




(/) {color:green}+1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/419//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> WALPrettyPrinter: Additional useful info to be printed by wal printer tool, 
> for debugability purposes
> -
>
> Key: HBASE-21185
> URL: https://issues.apache.org/jira/browse/HBASE-21185
> Project: HBase
> Issue Type: Improvement
> Components: Operability
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Trivial
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21185.master.001.patch, 
> HBASE-21185.master.002.patch, HBASE-21185.master.003.patch
>
>
> *WALPrettyPrinter* is very useful for troubleshooting WAL issues, such as 
> faulty replication sinks. A useful piece of information one might want to track is 
> the size of a single WAL entry edit, as well as the size of each edit cell. I am 
> proposing a patch that adds calculations for these two, as well as an option to 
> seek straight to a given position on the WAL file being analysed.
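
The two size figures described here relate in a simple way: the size of an edit is the sum of the sizes of its cells. The sketch below shows only that arithmetic. It is not WALPrettyPrinter code; the class name is hypothetical, and each cell is modelled as a plain byte array, which ignores whatever per-cell overhead the real tool accounts for.

```java
import java.util.List;

// Hypothetical sketch; cells are modelled as raw byte arrays.
final class WalSizeSketch {
  /** Size of each cell in bytes, in the order given. */
  static long[] cellSizes(List<byte[]> cells) {
    long[] sizes = new long[cells.size()];
    for (int i = 0; i < sizes.length; i++) {
      sizes[i] = cells.get(i).length;
    }
    return sizes;
  }

  /** Size of the whole edit: the sum of its cell sizes. */
  static long editSize(List<byte[]> cells) {
    long total = 0;
    for (long size : cellSizes(cells)) {
      total += size;
    }
    return total;
  }
}
```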





[jira] [Commented] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes

2018-10-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639315#comment-16639315
 ] 

Hudson commented on HBASE-21185:


Results for branch branch-2
[build #1343 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1343/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1343//console].




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1343//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1343//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> WALPrettyPrinter: Additional useful info to be printed by wal printer tool, 
> for debugability purposes
> -
>
> Key: HBASE-21185
> URL: https://issues.apache.org/jira/browse/HBASE-21185
> Project: HBase
> Issue Type: Improvement
> Components: Operability
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Trivial
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21185.master.001.patch, 
> HBASE-21185.master.002.patch, HBASE-21185.master.003.patch
>
>
> *WALPrettyPrinter* is very useful for troubleshooting WAL issues, such as 
> faulty replication sinks. A useful piece of information one might want to track is 
> the size of a single WAL entry edit, as well as the size of each edit cell. I am 
> proposing a patch that adds calculations for these two, as well as an option to 
> seek straight to a given position on the WAL file being analysed.





[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639316#comment-16639316
 ] 

Hudson commented on HBASE-21265:


Results for branch branch-2
[build #1343 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1343/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(x) {color:red}-1 general checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1343//console].




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1343//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1343//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
> Issue Type: Task
> Components: rsgroup, test
> Affects Versions: 1.4.8
> Reporter: Andrew Purtell
> Assignee: Andrew Purtell
> Priority: Minor
> Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265-branch-2.patch, HBASE-21265.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Commented] (HBASE-21242) [amv2] Miscellaneous minor log and assign procedure create improvements

2018-10-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639272#comment-16639272
 ] 

Hadoop QA commented on HBASE-21242:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
14s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  
0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  
0m  0s{color} | {color:orange} The patch doesn't appear to include any new or 
modified tests. Please justify why no new tests are needed for this patch. Also 
please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
23s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
22s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
51s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  2m 
 1s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
41s{color} | {color:green} branch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
28s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
2s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 
14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 
 6s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  2m 
51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  2m 
51s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 
12s{color} | {color:red} hbase-server: The patch generated 1 new + 7 unchanged 
- 0 fixed = 8 total (was 7) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 
43s{color} | {color:green} patch has no errors when building our shaded 
downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  
8m 42s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 
or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  3m 
41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
59s{color} | {color:green} hbase-client in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
50s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}209m 50s{color} 
| {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 
 2s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}262m  1s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.master.assignment.TestAssignmentManager |
|   | hadoop.hbase.master.TestRestartCluster |
|   | hadoop.hbase.regionserver.TestMutateRowsRecovery |
|   | hadoop.hbase.regionserver.wal.TestWALReplayBoundedLogWriterCreation |
|   | 
hadoop.hbase.replication.multiwal.TestReplicationKillMasterRSCompressedWithMultipleWAL
 |
|   | 
hadoop.hbase.replication.multiwal.TestReplicationSyncUpToolWithMult

[jira] [Commented] (HBASE-21200) Memstore flush doesn't finish because of seekToPreviousRow() in memstore scanner.

2018-10-04 Thread ramkrishna.s.vasudevan (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639292#comment-16639292
 ] 

ramkrishna.s.vasudevan commented on HBASE-21200:


[~brfrn169]
So before this patch the same test runs successfully, is that right? I believe the 
cells greater than the readPt were getting skipped inside the seek() call.
Even there we were doing getIterator(), iterating through the elements, and 
comparing the seqId with the readPt. 
So how exactly does the speed-up happen here?
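
For readers following the question, the iterate-and-compare step being discussed looks roughly like this. It is a hypothetical, stripped-down sketch in which each cell is reduced to its sequence id; it is not the DefaultMemStore scanner code.

```java
import java.util.Iterator;

// Hypothetical sketch; a cell is represented only by its sequence id.
final class ReadPointSketch {
  /**
   * Advances the iterator past cells whose seqId is newer than readPt and
   * returns the first visible seqId, or null if none remains. The cost is
   * linear in the number of skipped cells.
   */
  static Long nextVisible(Iterator<Long> seqIds, long readPt) {
    while (seqIds.hasNext()) {
      long seqId = seqIds.next();
      if (seqId <= readPt) {
        return seqId;
      }
    }
    return null;
  }
}
```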

> Memstore flush doesn't finish because of seekToPreviousRow() in memstore 
> scanner.
> -
>
> Key: HBASE-21200
> URL: https://issues.apache.org/jira/browse/HBASE-21200
> Project: HBase
> Issue Type: Bug
> Components: Scanners
> Reporter: dongjin2193.jeon
> Assignee: Toshihiro Suzuki
> Priority: Major
> Attachments: HBASE-21200-UT.patch, HBASE-21200.master.001.patch, 
> RegionServerJstack.log
>
>
> The issue of delayed memstore flush still occurs after backporting HBASE-15871.
> A reverse scan takes a long time to seek to the previous row in a memstore full of 
> deleted cells.
>  
> jstack :
> "MemStoreFlusher.0" #114 prio=5 os_prio=0 tid=0x7fa3d0729000 nid=0x486a 
> waiting on condition [0x7fa3b9b6b000]
>    java.lang.Thread.State: WAITING (parking)
>     at sun.misc.Unsafe.park(Native Method)
>     - parking to wait for  <0xa465fe60> (a 
> java.util.concurrent.locks.ReentrantLock$NonfairSync)
>     at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireQueued(AbstractQueuedSynchronizer.java:870)
>     at 
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:1199)
>     at 
> java.util.concurrent.locks.ReentrantLock$NonfairSync.lock(ReentrantLock.java:209)
>     at 
> java.util.concurrent.locks.ReentrantLock.lock(ReentrantLock.java:285)
>     at 
> org.apache.hadoop.hbase.regionserver.*StoreScanner.updateReaders(StoreScanner.java:695)*
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.notifyChangedReadersObservers(HStore.java:1127)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.updateStorefiles(HStore.java:1106)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore.access$600(HStore.java:130)
>     at 
> org.apache.hadoop.hbase.regionserver.HStore$StoreFlusherImpl.commit(HStore.java:2455)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushCacheAndCommit(HRegion.java:2519)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2256)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.internalFlushcache(HRegion.java:2218)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.flushcache(HRegion.java:2110)
>     at 
> org.apache.hadoop.hbase.regionserver.HRegion.flush(HRegion.java:2036)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:501)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.flushRegion(MemStoreFlusher.java:471)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher.access$800(MemStoreFlusher.java:75)
>     at 
> org.apache.hadoop.hbase.regionserver.MemStoreFlusher$FlushHandler.run(MemStoreFlusher.java:259)
>     at java.lang.Thread.run(Thread.java:748)
>  
> "RpcServer.FifoWFPBQ.default.handler=27,queue=0,port=16020" #65 daemon prio=5 
> os_prio=0 tid=0x7fa3e628 nid=0x4801 runnable [0x7fa3bd29a000]
>    java.lang.Thread.State: RUNNABLE
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.getNext(DefaultMemStore.java:780)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seekInSubLists(DefaultMemStore.java:826)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seek(DefaultMemStore.java:818)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner.seekToPreviousRow(DefaultMemStore.java:1000)
>     - locked <0xb45aa5b8> (a 
> org.apache.hadoop.hbase.regionserver.DefaultMemStore$MemStoreScanner)
>     at 
> org.apache.hadoop.hbase.regionserver.ReversedKeyValueHeap.next(ReversedKeyValueHeap.java:136)
>     at 
> org.apache.hadoop.hbase.regionser

[jira] [Commented] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes

2018-10-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639270#comment-16639270
 ] 

Hudson commented on HBASE-21185:


Results for branch branch-2.0
[build #905 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/905/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/905//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/905//JDK8_Nightly_Build_Report_(Hadoop2)/]


(/) {color:green}+1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2.0/905//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


> WALPrettyPrinter: Additional useful info to be printed by wal printer tool, 
> for debugability purposes
> -
>
> Key: HBASE-21185
> URL: https://issues.apache.org/jira/browse/HBASE-21185
> Project: HBase
> Issue Type: Improvement
> Components: Operability
> Reporter: Wellington Chevreuil
> Assignee: Wellington Chevreuil
> Priority: Trivial
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21185.master.001.patch, 
> HBASE-21185.master.002.patch, HBASE-21185.master.003.patch
>
>
> *WALPrettyPrinter* is very useful for troubleshooting WAL issues, such as 
> faulty replication sinks. A useful piece of information one might want to track is 
> the size of a single WAL entry edit, as well as the size of each edit cell. I am 
> proposing a patch that adds calculations for these two, as well as an option to 
> seek straight to a given position on the WAL file being analysed.





[jira] [Commented] (HBASE-21268) Backport to branch-2.0 " HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"

2018-10-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639261#comment-16639261
 ] 

Hadoop QA commented on HBASE-21268:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 29s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 7 new or modified test files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 41s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 10s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 16m 46s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  3m  0s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m  5s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  7m 26s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  2m 45s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 50s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 20m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 20m  5s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 20m  5s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 40s{color} | {color:red} hbase-client: The patch generated 1 new + 112 unchanged - 0 fixed = 113 total (was 112) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 18s{color} | {color:red} hbase-procedure: The patch generated 1 new + 20 unchanged - 0 fixed = 21 total (was 20) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 28s{color} | {color:red} hbase-server: The patch generated 4 new + 319 unchanged - 0 fixed = 323 total (was 319) {color} |
| {color:red}-1{color} | {color:red} rubocop {color} | {color:red}  0m  7s{color} | {color:red} The patch generated 2 new + 23 unchanged - 1 fixed = 25 total (was 24) {color} |
| {color:green}+1{color} | {color:green} ruby-lint {color} | {color:green}  0m  4s{color} | {color:green} There were no new ruby-lint issues. {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  5m 27s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 51s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  3m 20s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  9m  2s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javadoc {color} | {color:red}  0m 16s{color} | {color:red} hbase-procedure generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 37s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 
51s{col

[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639233#comment-16639233
 ] 

Andrew Purtell commented on HBASE-21265:


File an issue for that. I’d prefer to remove it as last I checked it was 
broken. 

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265-branch-2.patch, HBASE-21265.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Commented] (HBASE-20952) Re-visit the WAL API

2018-10-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639235#comment-16639235
 ] 

Hudson commented on HBASE-20952:


Results for branch HBASE-20952
[build #8 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/8/]: 
(x) *{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/8//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- For more information [see jdk8 (hadoop2) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/8//JDK8_Nightly_Build_Report_(Hadoop2)/]


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/HBASE-20952/8//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Re-visit the WAL API
> 
>
> Key: HBASE-20952
> URL: https://issues.apache.org/jira/browse/HBASE-20952
> Project: HBase
>  Issue Type: Improvement
>  Components: wal
>Reporter: Josh Elser
>Priority: Major
> Attachments: 20952.v1.txt
>
>
> Take a step back from the current WAL implementations and think about what an 
> HBase WAL API should look like. What are the primitive calls that we require 
> to guarantee durability of writes with a high degree of performance?
> The API needs to take the current implementations into consideration. We 
> should also have a mind for what is happening in the Ratis LogService (but 
> the LogService should not dictate what HBase's WAL API looks like RATIS-272).
> Other "systems" inside of HBase that use WALs are replication and 
> backup&restore. Replication has the use-case for "tail"'ing the WAL which we 
> should provide via our new API. B&R doesn't do anything fancy (IIRC). We 
> should make sure all consumers are generally going to be OK with the API we 
> create.
> The API may be "OK" (or OK in a part). We need to also consider other methods 
> which were "bolted" on such as {{AbstractFSWAL}} and 
> {{WALFileLengthProvider}}. Other corners of "WAL use" (like the 
> {{WALSplitter}} should also be looked at to use WAL-APIs only).
> We also need to make sure that adequate interface audience and stability 
> annotations are chosen.





[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639225#comment-16639225
 ] 

Ted Yu commented on HBASE-21265:


{code}
public class IntegrationTestRSGroup extends TestRSGroupsBase {
{code}
Now that the RS Group tests are broken out of TestRSGroupsBase, should 
IntegrationTestRSGroup get similar changes?

Thanks

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265-branch-2.patch, HBASE-21265.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Commented] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3

2018-10-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639181#comment-16639181
 ] 

Hadoop QA commented on HBASE-21098:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 18s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:blue}0{color} | {color:blue} findbugs {color} | {color:blue}  0m  1s{color} | {color:blue} Findbugs executables are not available. {color} |
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 8 new or modified test files. {color} |
|| || || || {color:brown} branch-1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  5m 51s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  3m 27s{color} | {color:green} branch-1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  2s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 59s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 53s{color} | {color:green} branch-1 passed {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  2m 59s{color} | {color:blue} branch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 44s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 44s{color} | {color:green} branch-1 passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 55s{color} | {color:green} branch-1 passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 42s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 51s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  0m 51s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  0s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  0s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  1m 27s{color} | {color:red} hbase-server: The patch generated 2 new + 434 unchanged - 1 fixed = 436 total (was 435) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:red}-1{color} | {color:red} xml {color} | {color:red}  0m  1s{color} | {color:red} The patch has 1 ill-formed XML file(s). {color} |
| {color:blue}0{color} | {color:blue} refguide {color} | {color:blue}  2m 43s{color} | {color:blue} patch has no errors when building the reference guide. See footer for rendered docs, which you should manually inspect. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  2m 43s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  1m 39s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4. {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 41s{color} | {color:green} the patch passed with JDK v1.8.0_181 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 53s{color} | {color:green} the patch passed with JDK v1.7.0_191 {color} |
|| || || || {color:brown} Other Tests {color

[jira] [Created] (HBASE-21271) [amv2] Don't throw UnsupportedOperationException when rollback called on Assign/Unassign; spiral of death

2018-10-04 Thread stack (JIRA)
stack created HBASE-21271:
-

 Summary: [amv2] Don't throw UnsupportedOperationException when 
rollback called on Assign/Unassign; spiral of death
 Key: HBASE-21271
 URL: https://issues.apache.org/jira/browse/HBASE-21271
 Project: HBase
  Issue Type: Bug
  Components: amv2
Reporter: stack
Assignee: stack
 Fix For: 2.2.0, 2.1.1, 2.0.3








[jira] [Updated] (HBASE-21271) [amv2] Don't throw UnsupportedOperationException when rollback called on Assign/Unassign; spiral of death

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21271:
--
Description: 
I can't repro reliably, but if an AssignProcedure or UnassignProcedure is a 
subprocedure of an Enable/Disable and for whatever reason the parent decides it 
needs to rollback -- can't get an entity lock -- it will ask the subprocedures 
to rollback. UP and AP don't support rollback on all steps. For steps where it 
is not supported, we have been throwing an UnsupportedOperationException. The 
Framework reschedules the rollback. And so on, filling logs and Procedure WALs.

Instead just note no rollback supported and intervention may be needed (until 
we get to 2.2 when AP/UP go away).
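
A toy sketch of the loop described above, not HBase code: {{RollbackSketch}}, {{Step}} and {{rollbackWithRetries}} are hypothetical names, and the retry cap exists only so the example terminates (the description implies the real framework keeps rescheduling). A step that throws consumes every attempt and adds a log line each time; a step that only notes the lack of rollback support finishes on the first attempt.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; these names do not correspond to HBase classes.
public class RollbackSketch {
    public interface Step {
        void rollback();
    }

    // Calls rollback until it succeeds or maxAttempts is reached.
    // Returns the number of attempts made; each failure adds one log line.
    public static int rollbackWithRetries(Step step, int maxAttempts, List<String> log) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                step.rollback();
                return attempt;
            } catch (UnsupportedOperationException e) {
                log.add("rollback failed, rescheduling: " + e.getMessage());
            }
        }
        return maxAttempts;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        // Old behaviour in this model: every rollback attempt throws.
        Step throwing = () -> {
            throw new UnsupportedOperationException("rollback not supported");
        };
        // Proposed behaviour in this model: note it once and return normally.
        Step noting = () -> log.add("rollback not supported; intervention may be needed");

        System.out.println("throwing step attempts: " + rollbackWithRetries(throwing, 5, log));
        System.out.println("noting step attempts: " + rollbackWithRetries(noting, 5, log));
        System.out.println("log lines: " + log.size());
    }
}
```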

> [amv2] Don't throw UnsupportedOperationException when rollback called on 
> Assign/Unassign; spiral of death
> -
>
> Key: HBASE-21271
> URL: https://issues.apache.org/jira/browse/HBASE-21271
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
>
> I can't repro reliably, but if an AssignProcedure or UnassignProcedure is a 
> subprocedure of an Enable/Disable and for whatever reason the parent decides 
> it needs to rollback -- can't get an entity lock -- it will ask the 
> subprocedures to rollback. UP and AP don't support rollback on all steps. For 
> steps where it is not supported, we have been throwing an 
> UnsupportedOperationException. The Framework reschedules the rollback. And 
> so on, filling logs and Procedure WALs.
> Instead just note no rollback supported and intervention may be needed (until 
> we get to 2.2 when AP/UP go away).





[jira] [Updated] (HBASE-21271) [amv2] Don't throw UnsupportedOperationException when rollback called on Assign/Unassign; spiral of death

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21271?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21271:
--
Attachment: HBASE-21271.branch-2.1.001.patch

> [amv2] Don't throw UnsupportedOperationException when rollback called on 
> Assign/Unassign; spiral of death
> -
>
> Key: HBASE-21271
> URL: https://issues.apache.org/jira/browse/HBASE-21271
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21271.branch-2.1.001.patch
>
>
> I can't repro reliably, but if an AssignProcedure or UnassignProcedure is a 
> subprocedure of an Enable/Disable and for whatever reason the parent decides 
> it needs to rollback -- can't get an entity lock -- it will ask the 
> subprocedures to rollback. UP and AP don't support rollback on all steps. For 
> steps where it is not supported, we have been throwing an 
> UnsupportedOperationException. The Framework reschedules the rollback. And 
> so on, filling logs and Procedure WALs.
> Instead just note no rollback supported and intervention may be needed (until 
> we get to 2.2 when AP/UP go away).





[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-04 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639109#comment-16639109
 ] 

Xu Cang commented on HBASE-21266:
-

the patch LGTM. +1

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.





[jira] [Commented] (HBASE-21270) [amv2] Let go of Procedure entity lock on CODE-BUG or UnsupportedOperationException

2018-10-04 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639114#comment-16639114
 ] 

stack commented on HBASE-21270:
---

My one concern is that we catch too much, but only unchecked exceptions should 
be bubbling up at the questionable point, so releasing the lock seems appropriate.

.002 added log.

> [amv2] Let go of Procedure entity lock on CODE-BUG or 
> UnsupportedOperationException
> ---
>
> Key: HBASE-21270
> URL: https://issues.apache.org/jira/browse/HBASE-21270
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21270.branch-2.1.001.patch, 
> HBASE-21270.branch-2.1.002.patch
>
>
> Small patch but helps when we run into ugly bugs... Let go of the entity lock.





[jira] [Updated] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-04 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21266:
---
Attachment: HBASE-21266-branch-1.patch

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.





[jira] [Updated] (HBASE-21270) [amv2] Let go of Procedure entity lock on CODE-BUG or UnsupportedOperationException

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21270:
--
Attachment: HBASE-21270.branch-2.1.002.patch

> [amv2] Let go of Procedure entity lock on CODE-BUG or 
> UnsupportedOperationException
> ---
>
> Key: HBASE-21270
> URL: https://issues.apache.org/jira/browse/HBASE-21270
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21270.branch-2.1.001.patch, 
> HBASE-21270.branch-2.1.002.patch
>
>
> Small patch but helps when we run into ugly bugs... Let go of the entity lock.





[jira] [Assigned] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-04 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell reassigned HBASE-21266:
--

Assignee: Andrew Purtell

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.





[jira] [Updated] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-04 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21266:
---
Attachment: (was: HBASE-21266-branch-1.patch)

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.





[jira] [Updated] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-04 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21266:
---
Attachment: HBASE-21266-branch-1.patch

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
> Attachments: HBASE-21266-branch-1.patch
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.





[jira] [Commented] (HBASE-21270) [amv2] Let go of Procedure entity lock on CODE-BUG or UnsupportedOperationException

2018-10-04 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639107#comment-16639107
 ] 

stack commented on HBASE-21270:
---

.001 Small patch. Wrap execute in a sort of try/finally.
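
For readers unfamiliar with the pattern, a minimal sketch of the try/finally idea follows. It is not the attached patch: {{LockedStep}} and {{runStep}} are hypothetical names, and a plain {{ReentrantLock}} stands in for the Procedure entity lock. The point is only that the lock is released even when the step throws an unchecked exception.

```java
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical illustration only; not HBase's procedure executor code.
public class LockedStep {
    // Stand-in for the Procedure entity lock.
    public static final ReentrantLock ENTITY_LOCK = new ReentrantLock();

    // Runs the step while holding the lock and always releases the lock,
    // even if the step throws an unchecked exception.
    public static void runStep(Runnable step) {
        ENTITY_LOCK.lock();
        try {
            step.run();
        } finally {
            ENTITY_LOCK.unlock();
        }
    }

    public static void main(String[] args) {
        try {
            runStep(() -> {
                throw new UnsupportedOperationException("unsupported step");
            });
        } catch (UnsupportedOperationException e) {
            // The exception still propagates to the caller; the finally
            // block has already released the lock by the time we get here.
            System.out.println("lock held after failure: " + ENTITY_LOCK.isLocked());
        }
    }
}
```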

> [amv2] Let go of Procedure entity lock on CODE-BUG or 
> UnsupportedOperationException
> ---
>
> Key: HBASE-21270
> URL: https://issues.apache.org/jira/browse/HBASE-21270
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21270.branch-2.1.001.patch
>
>
> Small patch but helps when we run into ugly bugs... Let go of the entity lock.





[jira] [Created] (HBASE-21270) [amv2] Let go of Procedure entity lock on CODE-BUG or UnsupportedOperationException

2018-10-04 Thread stack (JIRA)
stack created HBASE-21270:
-

 Summary: [amv2] Let go of Procedure entity lock on CODE-BUG or 
UnsupportedOperationException
 Key: HBASE-21270
 URL: https://issues.apache.org/jira/browse/HBASE-21270
 Project: HBase
  Issue Type: Bug
  Components: amv2
Reporter: stack
 Fix For: 2.2.0, 2.1.1, 2.0.3


Small patch but helps when we run into ugly bugs... Let go of the entity lock.





[jira] [Updated] (HBASE-21270) [amv2] Let go of Procedure entity lock on CODE-BUG or UnsupportedOperationException

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21270:
--
Attachment: HBASE-21270.branch-2.1.001.patch

> [amv2] Let go of Procedure entity lock on CODE-BUG or 
> UnsupportedOperationException
> ---
>
> Key: HBASE-21270
> URL: https://issues.apache.org/jira/browse/HBASE-21270
> Project: HBase
>  Issue Type: Bug
>  Components: amv2
>Reporter: stack
>Priority: Major
> Fix For: 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21270.branch-2.1.001.patch
>
>
> Small patch but helps when we run into ugly bugs... Let go of the entity lock.





[jira] [Commented] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639104#comment-16639104
 ] 

Andrew Purtell commented on HBASE-21098:


Who should this issue be assigned to? Let's give it proper attribution. 

> Improve Snapshot Performance with Temporary Snapshot Directory when rootDir 
> on S3
> -
>
> Key: HBASE-21098
> URL: https://issues.apache.org/jira/browse/HBASE-21098
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.4.8, 2.1.1
>Reporter: Tyler Mi
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21098.branch-1.001.patch, 
> HBASE-21098.master.001.patch, HBASE-21098.master.002.patch, 
> HBASE-21098.master.003.patch, HBASE-21098.master.004.patch, 
> HBASE-21098.master.005.patch, HBASE-21098.master.006.patch, 
> HBASE-21098.master.007.patch, HBASE-21098.master.008.patch, 
> HBASE-21098.master.009.patch, HBASE-21098.master.010.patch, 
> HBASE-21098.master.011.patch, HBASE-21098.master.012.patch, 
> HBASE-21098.master.013.patch
>
>
> When using Apache HBase, the snapshot feature can be used for point-in-time 
> recovery. To do this, HBase creates a manifest of all the files in all 
> of the Regions so that those files can be referenced again when a user 
> restores a snapshot. With HBase's S3 storage mode, developers can store their 
> data off-cluster on Amazon S3. However, using S3 as a file system is 
> inefficient for some operations, namely renames. Most Hadoop ecosystem 
> applications use an atomic rename as a method of committing data. However, 
> with S3, a rename is a separate copy and then a delete of every file, which is 
> no longer atomic and, in fact, quite costly. In addition, puts and deletes on 
> S3 have latency issues that traditional filesystems do not encounter when 
> manipulating the region snapshots to consolidate them into a single manifest. 
> When HBase on S3 users have a large number of regions, puts, deletes, and 
> renames (the final commit stage of the snapshot) become the bottleneck, 
> causing snapshots to take many minutes or even hours to complete.
> The purpose of this patch is to improve the overall performance of snapshots 
> for HBase on S3 by using a temporary directory for the 
> snapshots that exists on a traditional filesystem like HDFS to circumvent the 
> bottlenecks.
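
The rename cost described above can be illustrated with a small sketch. It is a toy model on the local filesystem, not HBase or S3 code: `object_store_rename`, `filesystem_rename`, and `demo` are hypothetical names, and the per-file copy-then-delete loop only stands in for how an object store without an atomic rename has to commit a directory.

```python
import os
import shutil
import tempfile


def object_store_rename(src_dir, dst_dir):
    """Emulate a directory rename on an object store.

    There is no atomic rename, so every file is copied and then deleted.
    Returns the number of per-file operations issued.
    """
    ops = 0
    os.makedirs(dst_dir, exist_ok=True)
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        shutil.copyfile(src, os.path.join(dst_dir, name))  # copy
        os.remove(src)                                      # delete
        ops += 2
    os.rmdir(src_dir)
    return ops


def filesystem_rename(src_dir, dst_dir):
    """Rename on a traditional filesystem: one metadata operation."""
    os.rename(src_dir, dst_dir)
    return 1


def demo(num_regions=3):
    """Build two identical working directories and commit each one."""
    root = tempfile.mkdtemp()
    try:
        counts = {}
        for label, rename in (("object_store", object_store_rename),
                              ("filesystem", filesystem_rename)):
            working = os.path.join(root, label, "working")
            os.makedirs(working)
            for i in range(num_regions):
                path = os.path.join(working, f"region-{i}.manifest")
                with open(path, "w") as f:
                    f.write(f"files for region {i}\n")
            completed = os.path.join(root, label, "completed")
            counts[label] = rename(working, completed)
            assert len(os.listdir(completed)) == num_regions
        return counts
    finally:
        shutil.rmtree(root)
```

In this model the object-store commit grows with the number of files (two operations each), while the filesystem commit stays at one operation. That is the motivation for keeping the working directory on a filesystem with a cheap rename.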





[jira] [Updated] (HBASE-21268) Backport to branch-2.0 " HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21268:
--
Attachment: HBASE-21268.branch-2.0.001.patch

> Backport to branch-2.0 " HBASE-21213 [hbck2] bypass leaves behind state in 
> RegionStates when assign/unassign"
> -
>
> Key: HBASE-21268
> URL: https://issues.apache.org/jira/browse/HBASE-21268
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.0.2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.3
>
> Attachments: HBASE-21268.branch-2.0.001.patch
>
>






[jira] [Commented] (HBASE-21242) [amv2] Miscellaneous minor log and assign procedure create improvements

2018-10-04 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639101#comment-16639101
 ] 

stack commented on HBASE-21242:
---

I pushed to branch-2.1.

I put up patches for branch-2.0 and branch-2. Will wait on hadoopqa builds 
before pushing.

> [amv2] Miscellaneous minor log and assign procedure create improvements
> ---
>
> Key: HBASE-21242
> URL: https://issues.apache.org/jira/browse/HBASE-21242
> Project: HBase
>  Issue Type: Bug
>  Components: amv2, Operability
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21242.branch-2.0.001.patch, 
> HBASE-21242.branch-2.001.patch, HBASE-21242.branch-2.1.001.patch, 
> HBASE-21242.branch-2.1.001.patch, HBASE-21242.branch-2.1.001.patch, 
> HBASE-21242.branch-2.1.002.patch
>
>
> Some minor fixups:
> {code}
> For RIT Duration, do better than print ms/seconds. Remove redundant UI
> column dedicated to duration when we log it in the status field too.
> Make bypass log at INFO level -- when DEBUG we can miss important
> fixup detail like why we failed.
> Make it so on complete of subprocedure, we note count of outstanding
> siblings so we have a clue how much further the parent has to go before
> it is done (Helpful when hundreds of servers doing SCP).
> Have the SCP run the AP preflight check before creating an AP; saves
> creation of hundreds of thousands of APs during fixup of this big cluster
> of mine.
> Don't log tablename three times when reporting remote call failed.
> If lock is held already, note who has it. Also log after we get lock
> or if we have to wait rather than log on entrance though we may
> later have to wait (or we may have just picked up the lock).
> {code}
> Posting patch in a sec but let me try it on cluster too.
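
The first fixup asks for RIT duration to be printed in something friendlier than raw ms/seconds. A minimal sketch of such formatting follows, assuming the target is a style such as "2mins, 7.618sec" or "473msec"; `human_time_diff` is a hypothetical name and this is not the code from the patch.

```python
def human_time_diff(ms):
    """Format a duration given in milliseconds as a short readable string."""
    if ms < 1000:
        return f"{ms}msec"
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds = rem / 1000.0
    parts = []
    if hours:
        parts.append(f"{hours}hrs")
    if minutes:
        parts.append(f"{minutes}mins")
    # Seconds are always shown, with millisecond precision.
    parts.append(f"{seconds:.3f}sec")
    return ", ".join(parts)
```

For example, `human_time_diff(127618)` gives `2mins, 7.618sec` and `human_time_diff(473)` gives `473msec`.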





[jira] [Updated] (HBASE-21242) [amv2] Miscellaneous minor log and assign procedure create improvements

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21242:
--
Attachment: HBASE-21242.branch-2.001.patch

> [amv2] Miscellaneous minor log and assign procedure create improvements
> ---
>
> Key: HBASE-21242
> URL: https://issues.apache.org/jira/browse/HBASE-21242
> Project: HBase
>  Issue Type: Bug
>  Components: amv2, Operability
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21242.branch-2.0.001.patch, 
> HBASE-21242.branch-2.001.patch, HBASE-21242.branch-2.1.001.patch, 
> HBASE-21242.branch-2.1.001.patch, HBASE-21242.branch-2.1.001.patch, 
> HBASE-21242.branch-2.1.002.patch
>
>
> Some minor fixups:
> {code}
> For RIT Duration, do better than print ms/seconds. Remove redundant UI
> column dedicated to duration when we log it in the status field too.
> Make bypass log at INFO level -- when DEBUG we can miss important
> fixup detail like why we failed.
> Make it so on complete of subprocedure, we note count of outstanding
> siblings so we have a clue how much further the parent has to go before
> it is done (Helpful when hundreds of servers doing SCP).
> Have the SCP run the AP preflight check before creating an AP; saves
> creation of hundreds of thousands of APs during fixup of this big cluster
> of mine.
> Don't log tablename three times when reporting remote call failed.
> If lock is held already, note who has it. Also log after we get lock
> or if we have to wait rather than log on entrance though we may
> later have to wait (or we may have just picked up the lock).
> {code}
> Posting patch in a sec but let me try it on cluster too.





[jira] [Updated] (HBASE-21242) [amv2] Miscellaneous minor log and assign procedure create improvements

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21242?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21242:
--
Attachment: HBASE-21242.branch-2.0.001.patch

> [amv2] Miscellaneous minor log and assign procedure create improvements
> ---
>
> Key: HBASE-21242
> URL: https://issues.apache.org/jira/browse/HBASE-21242
> Project: HBase
>  Issue Type: Bug
>  Components: amv2, Operability
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21242.branch-2.0.001.patch, 
> HBASE-21242.branch-2.1.001.patch, HBASE-21242.branch-2.1.001.patch, 
> HBASE-21242.branch-2.1.001.patch, HBASE-21242.branch-2.1.002.patch
>
>
> Some minor fixups:
> {code}
> For RIT Duration, do better than print ms/seconds. Remove redundant UI
> column dedicated to duration when we log it in the status field too.
> Make bypass log at INFO level -- when DEBUG we can miss important
> fixup detail like why we failed.
> Make it so on complete of subprocedure, we note count of outstanding
> siblings so we have a clue how much further the parent has to go before
> it is done (Helpful when hundreds of servers doing SCP).
> Have the SCP run the AP preflight check before creating an AP; saves
> creation of hundreds of thousands of APs during fixup of this big cluster
> of mine.
> Don't log tablename three times when reporting remote call failed.
> If lock is held already, note who has it. Also log after we get lock
> or if we have to wait rather than log on entrance though we may
> later have to wait (or we may have just picked up the lock).
> {code}
> Posting patch in a sec but let me try it on cluster too.





[jira] [Updated] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21265:
---
  Resolution: Fixed
Hadoop Flags: Reviewed
  Status: Resolved  (was: Patch Available)

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265-branch-2.patch, HBASE-21265.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Updated] (HBASE-21269) Forward-port to branch-2 " HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21269:
--
Attachment: HBASE-21269.branch-2.001.patch

> Forward-port to branch-2 " HBASE-21213 [hbck2] bypass leaves behind state 
> in RegionStates when assign/unassign"
> ---
>
> Key: HBASE-21269
> URL: https://issues.apache.org/jira/browse/HBASE-21269
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Reporter: stack
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21269.branch-2.001.patch
>
>
> A bunch of this patch does not apply to branch-2 and master now that we don't 
> have AP or UP anymore. Need to figure out whether we need override in branch-2 
> and master. Let me upload the forward-port done so far. Can finish this during 
> the move-to-branch-2.2 exercise. FYI [~Apache9]





[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639080#comment-16639080
 ] 

Andrew Purtell commented on HBASE-21265:


I looped TestRSGroupsKillRS on branch-2 10 times. It passed on each iteration. 
At any rate I only moved the code. If this test, now broken out into its own 
unit, eventually proves flaky, we can file an issue to look at it.

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265-branch-2.patch, HBASE-21265.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Updated] (HBASE-21268) Backport to branch-2.0 " HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21268:
--
Status: Patch Available  (was: Open)

.001 Backport

> Backport to branch-2.0 " HBASE-21213 [hbck2] bypass leaves behind state in 
> RegionStates when assign/unassign"
> -
>
> Key: HBASE-21268
> URL: https://issues.apache.org/jira/browse/HBASE-21268
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.0.2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.3
>
> Attachments: HBASE-21268.branch-2.0.001.patch
>
>






[jira] [Created] (HBASE-21269) Forward-port to branch-2 " HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"

2018-10-04 Thread stack (JIRA)
stack created HBASE-21269:
-

 Summary: Forward-port to branch-2 " HBASE-21213 [hbck2] bypass 
leaves behind state in RegionStates when assign/unassign"
 Key: HBASE-21269
 URL: https://issues.apache.org/jira/browse/HBASE-21269
 Project: HBase
  Issue Type: Sub-task
  Components: amv2
Reporter: stack
 Fix For: 3.0.0, 2.2.0


A bunch of this patch does not apply to branch-2 and master now that we don't 
have AP or UP anymore. Need to figure out whether we need override in branch-2 
and master. Let me upload the forward-port done so far. Can finish this during 
the move-to-branch-2.2 exercise. FYI [~Apache9]





[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639051#comment-16639051
 ] 

Andrew Purtell commented on HBASE-21265:


Test results on old branch-2 not valid

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265-branch-2.patch, HBASE-21265.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Comment Edited] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639064#comment-16639064
 ] 

Andrew Purtell edited comment on HBASE-21098 at 10/4/18 11:55 PM:
--

I spot a couple minor grammatical nits, but since it's already been committed 
no need to change for only branch-1. I skimmed the patch, mostly looked at the 
tests. Interface changes confined to private classes. +1 if all tests pass, new 
units plus all others. 
[~zyork]


was (Author: apurtell):
I spot a couple minor grammatical nits, but since it's already been committed 
no need to change for only branch-1. I skimmed the patch, mostly looked at the 
tests. Interface changes confined to private classes. +1 if all tests pass, new 
units plus all others. 

> Improve Snapshot Performance with Temporary Snapshot Directory when rootDir 
> on S3
> -
>
> Key: HBASE-21098
> URL: https://issues.apache.org/jira/browse/HBASE-21098
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.4.8, 2.1.1
>Reporter: Tyler Mi
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21098.branch-1.001.patch, 
> HBASE-21098.master.001.patch, HBASE-21098.master.002.patch, 
> HBASE-21098.master.003.patch, HBASE-21098.master.004.patch, 
> HBASE-21098.master.005.patch, HBASE-21098.master.006.patch, 
> HBASE-21098.master.007.patch, HBASE-21098.master.008.patch, 
> HBASE-21098.master.009.patch, HBASE-21098.master.010.patch, 
> HBASE-21098.master.011.patch, HBASE-21098.master.012.patch, 
> HBASE-21098.master.013.patch
>
>
> When using Apache HBase, the snapshot feature can be used for point-in-time 
> recovery. To do this, HBase creates a manifest of all the files in all 
> of the Regions so that those files can be referenced again when a user 
> restores a snapshot. With HBase's S3 storage mode, developers can store their 
> data off-cluster on Amazon S3. However, using S3 as a file system is 
> inefficient for some operations, namely renames. Most Hadoop ecosystem 
> applications use an atomic rename as a method of committing data. However, 
> with S3, a rename is a separate copy and then a delete of every file, which is 
> no longer atomic and, in fact, quite costly. In addition, puts and deletes on 
> S3 have latency issues that traditional filesystems do not encounter when 
> manipulating the region snapshots to consolidate them into a single manifest. 
> When HBase on S3 users have a large number of regions, puts, deletes, and 
> renames (the final commit stage of the snapshot) become the bottleneck, 
> causing snapshots to take many minutes or even hours to complete.
> The purpose of this patch is to improve the overall performance of snapshots 
> for HBase on S3 by using a temporary directory for the 
> snapshots that exists on a traditional filesystem like HDFS to circumvent the 
> bottlenecks.





[jira] [Commented] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639064#comment-16639064
 ] 

Andrew Purtell commented on HBASE-21098:


I spot a couple minor grammatical nits, but since it's already been committed 
no need to change for only branch-1. I skimmed the patch, mostly looked at the 
tests. Interface changes confined to private classes. +1 if all tests pass, new 
units plus all others. 

> Improve Snapshot Performance with Temporary Snapshot Directory when rootDir 
> on S3
> -
>
> Key: HBASE-21098
> URL: https://issues.apache.org/jira/browse/HBASE-21098
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.4.8, 2.1.1
>Reporter: Tyler Mi
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21098.branch-1.001.patch, 
> HBASE-21098.master.001.patch, HBASE-21098.master.002.patch, 
> HBASE-21098.master.003.patch, HBASE-21098.master.004.patch, 
> HBASE-21098.master.005.patch, HBASE-21098.master.006.patch, 
> HBASE-21098.master.007.patch, HBASE-21098.master.008.patch, 
> HBASE-21098.master.009.patch, HBASE-21098.master.010.patch, 
> HBASE-21098.master.011.patch, HBASE-21098.master.012.patch, 
> HBASE-21098.master.013.patch
>
>
> When using Apache HBase, the snapshot feature can be used for point-in-time 
> recovery. To do this, HBase creates a manifest of all the files in all 
> of the Regions so that those files can be referenced again when a user 
> restores a snapshot. With HBase's S3 storage mode, developers can store their 
> data off-cluster on Amazon S3. However, using S3 as a file system is 
> inefficient for some operations, namely renames. Most Hadoop ecosystem 
> applications use an atomic rename as a method of committing data. However, 
> with S3, a rename is a separate copy and then a delete of every file, which is 
> no longer atomic and, in fact, quite costly. In addition, puts and deletes on 
> S3 have latency issues that traditional filesystems do not encounter when 
> manipulating the region snapshots to consolidate them into a single manifest. 
> When HBase on S3 users have a large number of regions, puts, deletes, and 
> renames (the final commit stage of the snapshot) become the bottleneck, 
> causing snapshots to take many minutes or even hours to complete.
> The purpose of this patch is to improve the overall performance of snapshots 
> for HBase on S3 by using a temporary directory for the 
> snapshots that exists on a traditional filesystem like HDFS to circumvent the 
> bottlenecks.





[jira] [Created] (HBASE-21268) Backport to branch-2.0 " HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"

2018-10-04 Thread stack (JIRA)
stack created HBASE-21268:
-

 Summary: Backport to branch-2.0 " HBASE-21213 [hbck2] bypass 
leaves behind state in RegionStates when assign/unassign"
 Key: HBASE-21268
 URL: https://issues.apache.org/jira/browse/HBASE-21268
 Project: HBase
  Issue Type: Sub-task
  Components: amv2
Affects Versions: 2.0.2
Reporter: stack
Assignee: stack
 Fix For: 2.0.3








[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639044#comment-16639044
 ] 

Hadoop QA commented on HBASE-21265:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 23s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 9 new or modified test files. {color} |
|| || || || {color:brown} branch-2 Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  6m 40s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 40s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 16s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 38s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 47s{color} | {color:green} branch-2 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 27s{color} | {color:green} branch-2 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 12s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 34s{color} | {color:red} hbase-rsgroup generated 1 new + 102 unchanged - 4 fixed = 103 total (was 106) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 12s{color} | {color:red} hbase-rsgroup: The patch generated 15 new + 0 unchanged - 0 fixed = 15 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 39s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  8m 47s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 17s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 14m 22s{color} | {color:red} hbase-rsgroup in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 13s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 46m 20s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.rsgroup.TestRSGroupsKillRS |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:42ca976 |
| JIRA Issue | HBASE-21265 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12942472/HBASE-21265-branch-2.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 8cb2c4fe15b2 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | branch-2 / 9e3f3fdc1f |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| javac | https://builds.apache.org/job/PreCommit-HBASE-Build/14571/artifact/patchprocess/diff-compile-javac-hbase-rsgroup.txt |
| checkstyle | 
https://builds.apache.org/job/PreCommit-H

[jira] [Commented] (HBASE-21268) Backport to branch-2.0 " HBASE-21213 [hbck2] bypass leaves behind state in RegionStates when assign/unassign"

2018-10-04 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639053#comment-16639053
 ] 

stack commented on HBASE-21268:
---

Patch is a bit different in branch-2.0. It required "HBASE-20506 Add doc and 
test for unused RetryCounter, useful-looking utility". The backport includes 
most of HBASE-21156, to add the hbck2 API which this backport then modifies.

> Backport to branch-2.0 " HBASE-21213 [hbck2] bypass leaves behind state in 
> RegionStates when assign/unassign"
> -
>
> Key: HBASE-21268
> URL: https://issues.apache.org/jira/browse/HBASE-21268
> Project: HBase
>  Issue Type: Sub-task
>  Components: amv2
>Affects Versions: 2.0.2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.0.3
>
>






[jira] [Comment Edited] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639051#comment-16639051
 ] 

Andrew Purtell edited comment on HBASE-21265 at 10/4/18 11:46 PM:
--

Test results on old branch-2 patch not valid


was (Author: apurtell):
Test results on old branch-2 not valid

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265-branch-2.patch, HBASE-21265.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Updated] (HBASE-21213) [hbck2] bypass leaves behind state in RegionStates when assign/unassign

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21213:
--
  Resolution: Fixed
Hadoop Flags: Reviewed
Release Note: 
Adds an override option to assigns and unassigns. Renames the bypass 'force' 
param to 'override' so the naming aligns.

The new override allows 'overriding' a previous Procedure owner's hold on the 
Procedured entity (the "Region").

Used by hbck2.

  was:.001 add override of bypass in RegionTransitionProcedure so can do 
cleanup of outstanding state left-behind when finish step is not called. Needs 
different patch for branch-2/master.

  Status: Resolved  (was: Patch Available)

Resolving. Opening subtasks to back and forward port.

> [hbck2] bypass leaves behind state in RegionStates when assign/unassign
> ---
>
> Key: HBASE-21213
> URL: https://issues.apache.org/jira/browse/HBASE-21213
> Project: HBase
>  Issue Type: Bug
>  Components: amv2, hbck2
>Reporter: stack
>Assignee: stack
>Priority: Major
> Fix For: 2.1.1
>
> Attachments: HBASE-21213.branch-2.1.001.patch, 
> HBASE-21213.branch-2.1.002.patch, HBASE-21213.branch-2.1.003.patch, 
> HBASE-21213.branch-2.1.004.patch, HBASE-21213.branch-2.1.005.patch, 
> HBASE-21213.branch-2.1.006.patch, HBASE-21213.branch-2.1.007.patch, 
> HBASE-21213.branch-2.1.007.patch, HBASE-21213.branch-2.1.008.patch, 
> HBASE-21213.branch-2.1.009.patch, HBASE-21213.branch-2.1.010.patch, 
> HBASE-21213.branch-2.1.011.patch
>
>
> This is a follow-on from HBASE-21083 which added the 'bypass' functionality. 
> On bypass, there is more state to be cleared if we are to allow new Procedures 
> to be scheduled.
> For example, here is a bypass:
> {code}
> 2018-09-20 05:45:43,722 INFO org.apache.hadoop.hbase.procedure2.Procedure: 
> pid=100449, state=RUNNABLE:REGION_TRANSITION_DISPATCH, locked=true, 
> bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, 
> region=37cc206fe9c4bc1c0a46a34c5f523d16, 
> server=ve1233.halxg.cloudera.com,22101,1537397961664 bypassed, returning null 
> to finish it
> 2018-09-20 05:45:44,022 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Finished pid=100449, 
> state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure table=hbase:namespace, 
> region=37cc206fe9c4bc1c0a46a34c5f523d16, 
> server=ve1233.halxg.cloudera.com,22101,1537397961664 in 2mins, 7.618sec
> {code}
> ... but then when I try to assign the bypassed region later, I get this:
> {code}
> 2018-09-20 05:46:31,435 WARN 
> org.apache.hadoop.hbase.master.assignment.RegionTransitionProcedure: There is 
> already another procedure running on this region this=pid=100450, 
> state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 
> owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, 
> server=ve1233.halxg.cloudera.com,22101,1537397961664 pid=100450, 
> state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16; rit=OPENING, 
> location=ve1233.halxg.cloudera.com,22101,1537397961664
> 2018-09-20 05:46:31,510 INFO 
> org.apache.hadoop.hbase.procedure2.ProcedureExecutor: Rolled back pid=100450, 
> state=ROLLEDBACK, 
> exception=org.apache.hadoop.hbase.procedure2.ProcedureAbortedException via 
> AssignProcedure:org.apache.hadoop.hbase.procedure2.ProcedureAbortedException: 
> There is already another procedure running on this region this=pid=100450, 
> state=RUNNABLE:REGION_TRANSITION_QUEUE, locked=true; AssignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 
> owner=pid=100449, state=SUCCESS, bypass=LOG-REDACTED UnassignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16, 
> server=ve1233.halxg.cloudera.com,22101,1537397961664; AssignProcedure 
> table=hbase:namespace, region=37cc206fe9c4bc1c0a46a34c5f523d16 
> exec-time=473msec
> {code}
> ... which is a long-winded way of saying the Unassign Procedure still exists 
> in RegionStateNodes.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3

2018-10-04 Thread Zach York (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639048#comment-16639048
 ] 

Zach York commented on HBASE-21098:
---

I needed to change a bit of code for branch-1 for it to pass tests (courtesy of 
[~taklwu]). If someone could give the branch-1 patch a quick review, I'd 
appreciate it. [~apurtell] [~liuml07]

> Improve Snapshot Performance with Temporary Snapshot Directory when rootDir 
> on S3
> -
>
> Key: HBASE-21098
> URL: https://issues.apache.org/jira/browse/HBASE-21098
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.4.8, 2.1.1
>Reporter: Tyler Mi
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21098.branch-1.001.patch, 
> HBASE-21098.master.001.patch, HBASE-21098.master.002.patch, 
> HBASE-21098.master.003.patch, HBASE-21098.master.004.patch, 
> HBASE-21098.master.005.patch, HBASE-21098.master.006.patch, 
> HBASE-21098.master.007.patch, HBASE-21098.master.008.patch, 
> HBASE-21098.master.009.patch, HBASE-21098.master.010.patch, 
> HBASE-21098.master.011.patch, HBASE-21098.master.012.patch, 
> HBASE-21098.master.013.patch
>
>
> When using Apache HBase, the snapshot feature can be used for point-in-time 
> recovery. To do this, HBase creates a manifest of all the files in all 
> of the Regions so that those files can be referenced again when a user 
> restores a snapshot. With HBase's S3 storage mode, developers can store their 
> data off-cluster on Amazon S3. However, using S3 as a file system is 
> inefficient for some operations, namely renames. Most Hadoop ecosystem 
> applications use an atomic rename as a method of committing data. However, 
> with S3, a rename is a separate copy and then a delete of every file, which is 
> no longer atomic and, in fact, quite costly. In addition, puts and deletes on 
> S3 have latency issues that traditional filesystems do not encounter when 
> manipulating the region snapshots to consolidate them into a single manifest. 
> When HBase on S3 users have a large number of regions, puts, deletes, and 
> renames (the final commit stage of the snapshot) become the bottleneck, 
> causing snapshots to take many minutes or even hours to complete.
> The purpose of this patch is to increase the overall performance of snapshots 
> while utilizing HBase on S3 through the use of a temporary directory for the 
> snapshots that exists on a traditional filesystem like HDFS to circumvent the 
> bottlenecks.
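
A rough sketch of the idea described above, using java.nio.file on a local filesystem as a stand-in. This is an assumption for illustration only: the actual patch works against the Hadoop FileSystem API and is not reproduced here. The class SnapshotCommitSketch and its methods are hypothetical names. The sketch contrasts a per-file copy-then-delete "rename", as on S3, with building the snapshot in a working directory and committing it with one move.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical illustration; not code from the HBASE-21098 patch.
class SnapshotCommitSketch {
  // Emulates an object-store "rename": every file is copied, then deleted.
  // Returns the number of per-file operations performed (copies + deletes).
  static int renameByCopyAndDelete(Path srcDir, Path dstDir) throws IOException {
    Files.createDirectories(dstDir);
    List<Path> files;
    try (Stream<Path> listing = Files.list(srcDir)) {
      files = listing.filter(Files::isRegularFile).collect(Collectors.toList());
    }
    int ops = 0;
    for (Path f : files) {
      Files.copy(f, dstDir.resolve(f.getFileName()), StandardCopyOption.REPLACE_EXISTING);
      ops++;
      Files.delete(f);
      ops++;
    }
    Files.delete(srcDir); // source directory is empty at this point
    return ops;
  }

  // Commit on a filesystem with real renames: a single directory move.
  static void commitSnapshot(Path workingDir, Path finalDir) throws IOException {
    Files.move(workingDir, finalDir, StandardCopyOption.ATOMIC_MOVE);
  }

  // Builds a 3-file "snapshot", commits it with one move, then repeats the
  // relocation the object-store way and reports the per-file operation count.
  static int demo() {
    try {
      Path root = Files.createTempDirectory("snapshot-sketch");
      Path working = Files.createDirectory(root.resolve("working"));
      for (int i = 0; i < 3; i++) {
        Files.write(working.resolve("region-" + i + ".manifest"), new byte[] {1});
      }
      Path committed = root.resolve("committed");
      commitSnapshot(working, committed); // one operation, regardless of file count
      return renameByCopyAndDelete(committed, root.resolve("renamed")); // 2 ops per file
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

The single move costs one operation however many region manifests exist, while the copy-then-delete path grows with the file count, which is the bottleneck the description attributes to S3.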





[jira] [Updated] (HBASE-21098) Improve Snapshot Performance with Temporary Snapshot Directory when rootDir on S3

2018-10-04 Thread Zach York (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zach York updated HBASE-21098:
--
Attachment: HBASE-21098.branch-1.001.patch

> Improve Snapshot Performance with Temporary Snapshot Directory when rootDir 
> on S3
> -
>
> Key: HBASE-21098
> URL: https://issues.apache.org/jira/browse/HBASE-21098
> Project: HBase
>  Issue Type: Improvement
>Affects Versions: 3.0.0, 1.4.8, 2.1.1
>Reporter: Tyler Mi
>Priority: Major
> Fix For: 3.0.0, 2.2.0
>
> Attachments: HBASE-21098.branch-1.001.patch, 
> HBASE-21098.master.001.patch, HBASE-21098.master.002.patch, 
> HBASE-21098.master.003.patch, HBASE-21098.master.004.patch, 
> HBASE-21098.master.005.patch, HBASE-21098.master.006.patch, 
> HBASE-21098.master.007.patch, HBASE-21098.master.008.patch, 
> HBASE-21098.master.009.patch, HBASE-21098.master.010.patch, 
> HBASE-21098.master.011.patch, HBASE-21098.master.012.patch, 
> HBASE-21098.master.013.patch
>
>
> When using Apache HBase, the snapshot feature can be used for point-in-time 
> recovery. To do this, HBase creates a manifest of all the files in all 
> of the Regions so that those files can be referenced again when a user 
> restores a snapshot. With HBase's S3 storage mode, developers can store their 
> data off-cluster on Amazon S3. However, using S3 as a file system is 
> inefficient for some operations, namely renames. Most Hadoop ecosystem 
> applications use an atomic rename as a method of committing data. However, 
> with S3, a rename is a separate copy and then a delete of every file, which is 
> no longer atomic and, in fact, quite costly. In addition, puts and deletes on 
> S3 have latency issues that traditional filesystems do not encounter when 
> manipulating the region snapshots to consolidate them into a single manifest. 
> When HBase on S3 users have a large number of regions, puts, deletes, and 
> renames (the final commit stage of the snapshot) become the bottleneck, 
> causing snapshots to take many minutes or even hours to complete.
> The purpose of this patch is to increase the overall performance of snapshots 
> while utilizing HBase on S3 through the use of a temporary directory for the 
> snapshots that exists on a traditional filesystem like HDFS to circumvent the 
> bottlenecks.





[jira] [Commented] (HBASE-20506) Add doc and test for unused RetryCounter, useful-looking utility

2018-10-04 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639041#comment-16639041
 ] 

stack commented on HBASE-20506:
---

Backported to branch-2.0 too. Needed by HBASE-21213 backport.

> Add doc and test for unused RetryCounter, useful-looking utility
> 
>
> Key: HBASE-20506
> URL: https://issues.apache.org/jira/browse/HBASE-20506
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 2.0.3
>
> Attachments: 20506.txt, HBASE-20506.master.001.patch, 
> HBASE-20506.master.002.patch
>
>
> I thought I could use RetryCounter, old facility added years ago, for doing 
> backoff calculations. In the end, it didn't work for me because it is lacking 
> pb serialization. While trying to use it, I added a bit of doc and a test. 
> Might help the next dev that trips along this way.
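
For readers unfamiliar with the "backoff calculations" mentioned above, here is a minimal sketch of capped exponential backoff. It is not the HBase RetryCounter API: the class BackoffSketch, its constructor parameters, and its method names are all hypothetical and chosen only for illustration.

```java
// Hypothetical illustration of capped exponential backoff;
// this is NOT the API of org.apache.hadoop.hbase.util.RetryCounter.
class BackoffSketch {
  private final long baseSleepMs;
  private final long maxSleepMs;
  private final int maxAttempts;
  private int attempts = 0;

  BackoffSketch(long baseSleepMs, long maxSleepMs, int maxAttempts) {
    this.baseSleepMs = baseSleepMs;
    this.maxSleepMs = maxSleepMs;
    this.maxAttempts = maxAttempts;
  }

  // True while the caller is still allowed another attempt.
  boolean shouldRetry() {
    return attempts < maxAttempts;
  }

  // Sleep time before the next attempt: base * 2^attempts, capped at maxSleepMs.
  // Each call counts as one attempt.
  long nextBackoffMs() {
    // Bound the shift so the multiplication stays in range for realistic bases.
    int shift = Math.min(attempts, 30);
    long sleep = baseSleepMs * (1L << shift);
    attempts++;
    return Math.min(sleep, maxSleepMs);
  }
}
```

With a base of 100 ms and a cap of 1000 ms, successive calls return 100, 200, 400, 800, and then 1000 for every later attempt.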





[jira] [Updated] (HBASE-20506) Add doc and test for unused RetryCounter, useful-looking utility

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-20506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-20506:
--
Fix Version/s: 2.0.3

> Add doc and test for unused RetryCounter, useful-looking utility
> 
>
> Key: HBASE-20506
> URL: https://issues.apache.org/jira/browse/HBASE-20506
> Project: HBase
>  Issue Type: Bug
>Reporter: stack
>Assignee: stack
>Priority: Minor
> Fix For: 3.0.0, 2.1.0, 2.0.3
>
> Attachments: 20506.txt, HBASE-20506.master.001.patch, 
> HBASE-20506.master.002.patch
>
>
> I thought I could use RetryCounter, old facility added years ago, for doing 
> backoff calculations. In the end, it didn't work for me because it is lacking 
> pb serialization. While trying to use it, I added a bit of doc and a test. 
> Might help the next dev that trips along this way.





[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639040#comment-16639040
 ] 

Andrew Purtell commented on HBASE-21265:


Normal concurrent "second part" testing looks better. Here's a typical master 
run; I did 5 of them:
{noformat}
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running 
org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancerWithStochasticLoadBalancerAsInternal
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.352 s 
- in 
org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancerWithStochasticLoadBalancerAsInternal
[INFO] Running 
org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancer
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 1.873 s 
- in org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancer
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0
[INFO] 
[INFO] 
[INFO] --- maven-surefire-plugin:2.21.0:test (secondPartTestsExecution) @ 
hbase-rsgroup ---
[INFO] 
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin1
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsWithACL
[INFO] Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.703 s 
- in org.apache.hadoop.hbase.rsgroup.TestRSGroupsWithACL
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin2
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 63.344 
s - in org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin1
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 54.882 s 
- in org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin2
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsOfflineMode
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsBasics
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 25.859 s 
- in org.apache.hadoop.hbase.rsgroup.TestRSGroupsOfflineMode
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestEnableRSGroups
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 12.505 s 
- in org.apache.hadoop.hbase.rsgroup.TestEnableRSGroups
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsBalance
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 78.043 
s - in org.apache.hadoop.hbase.rsgroup.TestRSGroupsBasics
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 29.499 s 
- in org.apache.hadoop.hbase.rsgroup.TestRSGroupsBalance
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsKillRS
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 13.085 s 
- in org.apache.hadoop.hbase.rsgroup.TestRSGroupsKillRS
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 46, Failures: 0, Errors: 0, Skipped: 0
{noformat} 

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265-branch-2.patch, HBASE-21265.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639031#comment-16639031
 ] 

Andrew Purtell commented on HBASE-21265:


Checkstyle and javac warning fixes

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265-branch-2.patch, HBASE-21265.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Updated] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21265:
---
Attachment: HBASE-21265.patch
HBASE-21265-branch-2.patch

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265-branch-2.patch, HBASE-21265.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639011#comment-16639011
 ] 

Andrew Purtell commented on HBASE-21265:


Results from the branch-2 version

{noformat}
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running 
org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancerWithStochasticLoadBalancerAsInternal
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.79 s - 
in 
org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancerWithStochasticLoadBalancerAsInternal
[INFO] Running 
org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancer
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.392 s 
- in org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancer
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsWithACL
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.948 
s - in org.apache.hadoop.hbase.rsgroup.TestRSGroupsWithACL
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin1
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 78.922 s 
- in org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin1
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin2
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 64.299 s 
- in org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin2
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsOfflineMode
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 40.237 s 
- in org.apache.hadoop.hbase.rsgroup.TestRSGroupsOfflineMode
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsBasics
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 110.056 
s - in org.apache.hadoop.hbase.rsgroup.TestRSGroupsBasics
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsBalance
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 41.421 s 
- in org.apache.hadoop.hbase.rsgroup.TestRSGroupsBalance
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsKillRS
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 24.293 s 
- in org.apache.hadoop.hbase.rsgroup.TestRSGroupsKillRS
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 45, Failures: 0, Errors: 0, Skipped: 0
{noformat}


> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639014#comment-16639014
 ] 

Andrew Purtell commented on HBASE-21265:


I see the checkstyle nits, fixing

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16639010#comment-16639010
 ] 

Hadoop QA commented on HBASE-21265:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 9 new or modified test files. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 17s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 36s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 13s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 21s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 37s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 18s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 17s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  0m 36s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  0m 36s{color} | {color:red} hbase-rsgroup generated 1 new + 102 unchanged - 4 fixed = 103 total (was 106) {color} |
| {color:red}-1{color} | {color:red} checkstyle {color} | {color:red}  0m 12s{color} | {color:red} hbase-rsgroup: The patch generated 15 new + 0 unchanged - 0 fixed = 15 total (was 0) {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 21s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 11m  5s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  0m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 19s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 46s{color} | {color:green} hbase-rsgroup in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 10s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 37m 38s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21265 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12942466/HBASE-21265.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 3977898d62bf 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 4508f670b1 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| javac | https://builds.apache.org/job/PreCommit-HBASE-Build/14570/artifact/patchprocess/diff-compile-javac-hbase-rsgroup.txt |
| checkstyle | https://builds.apache.org/job/PreCommit-HBASE-Build/14570/artifact/patchprocess/diff-checkstyle-hbase-rsgroup.txt |
|  Test Results | 
https://builds

[jira] [Updated] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21265:
---
Attachment: HBASE-21265-branch-2.patch

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Updated] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21265:
---
Attachment: (was: HBASE-21265-branch-2.patch)

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Commented] (HBASE-21246) Introduce WALIdentity interface

2018-10-04 Thread Ted Yu (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21246?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638994#comment-16638994
 ] 

Ted Yu commented on HBASE-21246:


Now that the design doc is back in review mode, please allow some more time 
until I address the above comment.

> Introduce WALIdentity interface
> ---
>
> Key: HBASE-21246
> URL: https://issues.apache.org/jira/browse/HBASE-21246
> Project: HBase
>  Issue Type: Sub-task
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Major
> Attachments: 21246.HBASE-20952.001.patch
>
>
> We are introducing the WALIdentity interface so that the WAL representation can 
> be decoupled from the distributed filesystem.
> The interface provides a getName method whose return value can represent the 
> filename in a distributed filesystem environment or the name of the stream 
> when the WAL is backed by a log stream.
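
A minimal sketch of how such an interface could look, going only on what the description states: an interface named WALIdentity with a getName method. The two implementing classes, FileWALIdentity and StreamWALIdentity, are hypothetical and exist only to show the two cases mentioned; the real patch may define the interface differently or with more methods.

```java
// Sketch based on the description above; only the interface name and
// getName come from the issue text.
interface WALIdentity {
  // File name on a distributed filesystem, or stream name for a log stream.
  String getName();
}

// Hypothetical implementation for a WAL stored as a file.
class FileWALIdentity implements WALIdentity {
  private final String path;

  FileWALIdentity(String path) {
    this.path = path;
  }

  @Override
  public String getName() {
    // Last path component, e.g. "/a/b/wal.1" yields "wal.1".
    int slash = path.lastIndexOf('/');
    return slash >= 0 ? path.substring(slash + 1) : path;
  }
}

// Hypothetical implementation for a WAL backed by a log stream.
class StreamWALIdentity implements WALIdentity {
  private final String streamName;

  StreamWALIdentity(String streamName) {
    this.streamName = streamName;
  }

  @Override
  public String getName() {
    return streamName;
  }
}
```

Callers that only need a name for logging or bookkeeping can then accept a WALIdentity without knowing whether a file or a stream sits behind it.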





[jira] [Updated] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21265:
---
Attachment: HBASE-21265-branch-2.patch

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265-branch-2.patch, HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Commented] (HBASE-19995) Current Jetty 9 version in HBase master branch can memory leak under high traffic

2018-10-04 Thread Ben Lau (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-19995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638939#comment-16638939
 ] 

Ben Lau commented on HBASE-19995:
-

[~mdrob] Thoughts?  [~stack] Is there a reason we changed the affected version 
to 0.16.0 instead of 2.0?  Not sure I understand.  Is the REST component 
identified by a different version scheme?

> Current Jetty 9 version in HBase master branch can memory leak under high 
> traffic
> -
>
> Key: HBASE-19995
> URL: https://issues.apache.org/jira/browse/HBASE-19995
> Project: HBase
>  Issue Type: Bug
>  Components: REST
>Affects Versions: 0.16.0
>Reporter: Ben Lau
>Priority: Major
> Attachments: HBASE-19995-master.patch
>
>
> There is a memory leak in Jetty 9 that manifests whenever you hit the call 
> queue limit in HBase REST.  It permanently leaks both on-heap and off-heap 
> objects.  It happens because whenever the call queue for the Jetty 
> server overflows, the task that is rejected runs a 'reject' method, if it is a 
> Rejectable, to do any cleanup. This cleanup is necessary to, for example, close 
> the connection, deallocate any buffers, etc. Unfortunately, in Jetty 9, they 
> implemented the 'reject' / cleanup method of the SelectChannelEndpoint as a 
> non-blocking call that is not guaranteed to run.  This was fixed in 
> Jetty 9.4 and later backported; however, the version of Jetty 9 pulled into 
> HBase for REST predates this fix.  See 
> [https://github.com/eclipse/jetty.project/issues/1804] and 
> [https://github.com/apache/hbase/blob/master/pom.xml#L1416].
> If we want to stay on 9.3.X we could update to 
> [9.3.22.v20171030|https://mvnrepository.com/artifact/org.eclipse.jetty/jetty-server/9.3.22.v20171030]
>  which is the latest version of 9.3.  Thoughts?
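
To make the mechanism concrete, here is a small sketch using only java.util.concurrent. It is not Jetty code: RejectableTask and RejectCleanupSketch are hypothetical names. It shows a saturated pool whose rejection handler runs each overflowing task's cleanup synchronously, so the cleanup cannot be skipped. The leak described above corresponds to the case where that cleanup is instead scheduled in a way that may never run.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for a task that owns resources (connection, buffers).
interface RejectableTask extends Runnable {
  void reject(); // must release resources when the task never gets to run
}

class RejectCleanupSketch {
  // Submits 5 tasks to a pool with 1 worker and a queue of capacity 1,
  // so 3 of them overflow. Returns how many were cleaned up via reject().
  static int countRejectedCleanups() {
    AtomicInteger rejectedCleanups = new AtomicInteger();
    CountDownLatch gate = new CountDownLatch(1);
    ThreadPoolExecutor pool = new ThreadPoolExecutor(
        1, 1, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(1),
        (task, executor) -> {
          // Run the cleanup on the submitting thread so it is guaranteed
          // to happen even though the pool is saturated.
          if (task instanceof RejectableTask) {
            ((RejectableTask) task).reject();
          }
        });
    for (int i = 0; i < 5; i++) {
      pool.execute(new RejectableTask() {
        @Override
        public void run() {
          try {
            gate.await(); // keep the worker busy so the queue overflows
          } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
          }
        }

        @Override
        public void reject() {
          // A real task would close its connection and free its buffers here.
          rejectedCleanups.incrementAndGet();
        }
      });
    }
    gate.countDown(); // let the running and queued tasks finish
    pool.shutdown();
    try {
      pool.awaitTermination(5, TimeUnit.SECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
    return rejectedCleanups.get();
  }
}
```

The first task occupies the single worker, the second fills the queue, and the remaining three are rejected and cleaned up immediately.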





[jira] [Commented] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638978#comment-16638978
 ] 

Andrew Purtell commented on HBASE-21265:


Attached the same refactor to master branch. Not a port of the branch-1 patch, 
too many differences, but I made the same changes and have the tests ordered in 
the respective files in the same order.

{noformat}
[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancerWithStochasticLoadBalancerAsInternal
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.992 s - in org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancerWithStochasticLoadBalancerAsInternal
[INFO] Running org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancer
[INFO] Tests run: 5, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 3.381 s - in org.apache.hadoop.hbase.master.balancer.TestRSGroupBasedLoadBalancer
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsWithACL
[INFO] Tests run: 11, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.113 s - in org.apache.hadoop.hbase.rsgroup.TestRSGroupsWithACL
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin1
[INFO] Tests run: 14, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 74.433 s - in org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin1
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin2
[INFO] Tests run: 6, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 61.359 s - in org.apache.hadoop.hbase.rsgroup.TestRSGroupsAdmin2
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsOfflineMode
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 30.528 s - in org.apache.hadoop.hbase.rsgroup.TestRSGroupsOfflineMode
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsBasics
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 102.458 s - in org.apache.hadoop.hbase.rsgroup.TestRSGroupsBasics
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsBalance
[INFO] Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 40.808 s - in org.apache.hadoop.hbase.rsgroup.TestRSGroupsBalance
[INFO] Running org.apache.hadoop.hbase.rsgroup.TestRSGroupsKillRS
[INFO] Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 23.907 s - in org.apache.hadoop.hbase.rsgroup.TestRSGroupsKillRS
[INFO] 
[INFO] Results:
[INFO] 
[INFO] Tests run: 51, Failures: 0, Errors: 0, Skipped: 0
{noformat}

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Updated] (HBASE-21265) Split up TestRSGroups

2018-10-04 Thread Andrew Purtell (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21265?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Andrew Purtell updated HBASE-21265:
---
Attachment: HBASE-21265.patch
HBASE-21265-branch-1.patch

> Split up TestRSGroups
> -
>
> Key: HBASE-21265
> URL: https://issues.apache.org/jira/browse/HBASE-21265
> Project: HBase
>  Issue Type: Task
>  Components: rsgroup, test
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Assignee: Andrew Purtell
>Priority: Minor
>  Labels: beginner, beginners
> Fix For: 3.0.0, 1.5.0, 2.2.0, 1.4.9
>
> Attachments: HBASE-21265-branch-1.patch, HBASE-21265-branch-1.patch, 
> HBASE-21265.patch
>
>
> TestRSGroups is flaky. It is stable when run in isolation but when run as 
> part of the suite with concurrent executors it can fail. The current running 
> time of this unit on my dev box is ~240 seconds (4 minutes), which is far too 
> much time. This unit should be broken up 5 to 8 ways, grouped by 
> functionality under test. 





[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638704#comment-16638704
 ] 

Andrew Purtell commented on HBASE-21266:


Agreed.


> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.





[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-04 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638671#comment-16638671
 ] 

Xu Cang commented on HBASE-21266:
-

Yes, I had the same doubt; 'processing' is redundant. 
Also, should we make 'numProcessing' an AtomicInteger? numProcessing++ and 
numProcessing-- are not thread-safe, and they can be interleaved between 
#notifyServer and #finish.
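
A minimal sketch of the AtomicInteger suggestion, for illustration only. The class below is a hypothetical stand-in, not the actual DeadServer code; only the names numProcessing, notifyServer, and finish come from this thread. If the real mutators are already all synchronized on the same monitor, a plain int would be safe too, so treat this as one option rather than a required fix.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the counter discussed above; not the real DeadServer class.
class ProcessingCounterSketch {
  // Atomic counter: increments and decrements from different threads cannot be lost.
  private final AtomicInteger numProcessing = new AtomicInteger();

  void notifyServer() {
    numProcessing.incrementAndGet();
  }

  void finish() {
    numProcessing.decrementAndGet();
  }

  int inProgress() {
    return numProcessing.get();
  }

  public static void main(String[] args) throws InterruptedException {
    ProcessingCounterSketch counter = new ProcessingCounterSketch();
    // Two threads interleave matched notifyServer/finish pairs.
    Runnable work = () -> {
      for (int i = 0; i < 100_000; i++) {
        counter.notifyServer();
        counter.finish();
      }
    };
    Thread a = new Thread(work);
    Thread b = new Thread(work);
    a.start();
    b.start();
    a.join();
    b.join();
    // Every increment has a matching decrement, so the count returns to zero.
    System.out.println(counter.inProgress());
  }
}
```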

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.





[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638614#comment-16638614
 ] 

Andrew Purtell commented on HBASE-21266:


Let me make the above proposed changes and run another test with 'stressAM' and 
'serverKilling' chaos policies. 

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.





[jira] [Comment Edited] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638611#comment-16638611
 ] 

Andrew Purtell edited comment on HBASE-21266 at 10/4/18 5:48 PM:
-

bq.  "Number of dead servers in processing should always be non-negative"

You are looking at that assert in DeadServer#finish, right? Asserts aren't 
evaluated unless the JVM is started with the -ea command line flag, which I 
didn't do. 

We can see from the log line I did see that the dead server map was empty at 
the time so I agree we should look at accounting in DeadServer.java.

"Not running balancer because processing dead regionserver(s)" is printed from 
HMaster.java:1846 based on the result from 
ServerManager#areDeadServersInProgress, which passes through the result from 
DeadServer#areDeadServersInProgress, which is simply

{code}
  public synchronized boolean areDeadServersInProgress() { return processing; }
{code}

This boolean is cleared in DeadServer#finish when
{code}
if (numProcessing == 0) { processing = false; }
{code}

So the first question I have is why do we even need this boolean field? It can 
easily be derived cheaply from other state. In areDeadServersInProgress just 
return the result of {{!(numProcessing == 0)}}. 

That assert you observed should be replaced by use of Preconditions so we will 
get a RuntimeException that will get noticed. 
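
A rough sketch of both proposals, under stated assumptions. The class is a hypothetical stand-in for DeadServer; only numProcessing, finish, notifyServer, and areDeadServersInProgress are names from this thread, and the placement of the check before the decrement is my assumption. To stay dependency-free it throws IllegalStateException directly, which is the same exception Guava's Preconditions.checkState raises.

```java
// Hypothetical stand-in for DeadServer accounting; not the actual HBase code.
class DeadServerAccountingSketch {
  private int numProcessing = 0;

  synchronized void notifyServer() {
    numProcessing++;
  }

  synchronized void finish() {
    // Explicit check instead of `assert`, so it fires even when the JVM runs without -ea.
    // With Guava this could be written as Preconditions.checkState(numProcessing > 0, msg).
    if (numProcessing <= 0) {
      throw new IllegalStateException(
          "Number of dead servers in processing should always be non-negative");
    }
    numProcessing--;
  }

  // Derived from the counter; no separate boolean field to keep in sync.
  synchronized boolean areDeadServersInProgress() {
    return numProcessing != 0;
  }

  public static void main(String[] args) {
    DeadServerAccountingSketch ds = new DeadServerAccountingSketch();
    ds.notifyServer();
    System.out.println(ds.areDeadServersInProgress()); // true
    ds.finish();
    System.out.println(ds.areDeadServersInProgress()); // false
  }
}
```

With the flag derived this way, the balancer check cannot stay stuck on a stale boolean, and an unbalanced finish call surfaces as an exception rather than a silently skipped assert.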


was (Author: apurtell):
bq.  "Number of dead servers in processing should always be non-negative"

You are looking at that assert in DeadServer#finish, right? Asserts aren't 
evaluated unless the JVM is started with the -ea command line flag, which I 
didn't do. 

We can see from the log line I did see that the dead server map was empty at 
the time so I agree we should look at accounting in DeadServer.java.

"Not running balancer because processing dead regionserver(s)" is printed from 
HMaster.java:1846 based on the result from 
ServerManager#areDeadServersInProgress, which passes through the result from 
DeadServer#areDeadServersInProgress, which is simply

{code}
  public synchronized boolean areDeadServersInProgress() { return processing; }
{code}

This boolean is cleared in DeadServer#finish when
{code}
if (numProcessing == 0) { processing = false; }
{code}

So the first question I have is why do we even need this boolean field? It can 
easily be derived cheaply from other state. In areDeadServersInProgress just 
return the result of {{numProcessing == 0}}. 

That assert you observed should be replaced by use of Preconditions so we will 
get a RuntimeException that will get noticed. 

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.





[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-04 Thread Andrew Purtell (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638611#comment-16638611
 ] 

Andrew Purtell commented on HBASE-21266:


bq.  "Number of dead servers in processing should always be non-negative"

You are looking at that assert in DeadServer#finish, right? Asserts aren't 
evaluated unless the JVM is started with the -ea command line flag, which I 
didn't do. 

We can see from the log line I did see that the dead server map was empty at 
the time so I agree we should look at accounting in DeadServer.java.

"Not running balancer because processing dead regionserver(s)" is printed from 
HMaster.java:1846 based on the result from 
ServerManager#areDeadServersInProgress, which passes through the result from 
DeadServer#areDeadServersInProgress, which is simply

{code}
  public synchronized boolean areDeadServersInProgress() { return processing; }
{code}

This boolean is cleared in DeadServer#finish when
{code}
if (numProcessing == 0) { processing = false; }
{code}

So the first question I have is why do we even need this boolean field? It can 
easily be derived cheaply from other state. In areDeadServersInProgress just 
return the result of {{!(numProcessing == 0)}}. 

That assert you observed should be replaced by use of Preconditions so we will 
get a RuntimeException that will get noticed. 

> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.





[jira] [Commented] (HBASE-21263) Mention compression algorithm along with other storefile details

2018-10-04 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638518#comment-16638518
 ] 

Hadoop QA commented on HBASE-21263:
---

| (/) *{color:green}+1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 11s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:orange}-0{color} | {color:orange} test4tests {color} | {color:orange}  0m  0s{color} | {color:orange} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
|| || || || {color:brown} master Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 23s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 48s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 15s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 20s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  7s{color} | {color:green} master passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 29s{color} | {color:green} master passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 55s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m 44s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 10s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  4m 13s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green} 10m 30s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m  4s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  0m 29s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}126m 14s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 22s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}167m 38s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21263 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12942363/HBASE-21263.master.001.patch |
| Optional Tests |  dupname  asflicense  javac  javadoc  unit  findbugs  shadedjars  hadoopcheck  hbaseanti  checkstyle  compile  |
| uname | Linux 2be6dc9d9cbe 3.13.0-139-generic #188-Ubuntu SMP Tue Jan 9 14:43:09 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 4508f670b1 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
|  Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14569/testReport/ |
| Max. process+thread count | 5255 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | 
https://builds.apache.org/job/PreCommit-HBAS

[jira] [Commented] (HBASE-21221) Ineffective assertion in TestFromClientSide3#testMultiRowMutations

2018-10-04 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16638126#comment-16638126
 ] 

Hudson commented on HBASE-21221:


Results for branch master
[build #525 on 
builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/525/]: (x) 
*{color:red}-1 overall{color}*

details (if available):

(/) {color:green}+1 general checks{color}
-- For more information [see general 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/525//General_Nightly_Build_Report/]




(x) {color:red}-1 jdk8 hadoop2 checks{color}
-- Something went wrong running this stage, please [check relevant console 
output|https://builds.apache.org/job/HBase%20Nightly/job/master/525//console].


(x) {color:red}-1 jdk8 hadoop3 checks{color}
-- For more information [see jdk8 (hadoop3) 
report|https://builds.apache.org/job/HBase%20Nightly/job/master/525//JDK8_Nightly_Build_Report_(Hadoop3)/]


(/) {color:green}+1 source release artifact{color}
-- See build output for details.


(/) {color:green}+1 client integration test{color}


> Ineffective assertion in TestFromClientSide3#testMultiRowMutations
> --
>
> Key: HBASE-21221
> URL: https://issues.apache.org/jira/browse/HBASE-21221
> Project: HBase
>  Issue Type: Test
>Reporter: Ted Yu
>Assignee: Ted Yu
>Priority: Minor
> Fix For: 3.0.0
>
> Attachments: 21221.addendum.txt, 21221.v10.txt, 21221.v11.txt, 
> 21221.v12.txt, 21221.v7.txt, 21221.v8.txt, 21221.v9.txt
>
>
> Observed the following in 
> org.apache.hadoop.hbase.util.TestFromClientSide3WoUnsafe-output.txt :
> {code}
> Caused by: org.apache.hadoop.hbase.ipc.RemoteWithExtrasException(java.io.IOException): java.io.IOException: Timed out waiting for lock for row: ROW-1 in region 089bdfa75f44d88e596479038a6da18b
>   at org.apache.hadoop.hbase.regionserver.HRegion.getRowLockInternal(HRegion.java:5816)
>   at org.apache.hadoop.hbase.regionserver.HRegion$4.lockRowsAndBuildMiniBatch(HRegion.java:7432)
>   at org.apache.hadoop.hbase.regionserver.HRegion.doMiniBatchMutate(HRegion.java:4008)
>   at org.apache.hadoop.hbase.regionserver.HRegion.batchMutate(HRegion.java:3982)
>   at org.apache.hadoop.hbase.regionserver.HRegion.mutateRowsWithLocks(HRegion.java:7424)
>   at org.apache.hadoop.hbase.coprocessor.MultiRowMutationEndpoint.mutateRows(MultiRowMutationEndpoint.java:116)
>   at org.apache.hadoop.hbase.protobuf.generated.MultiRowMutationProtos$MultiRowMutationService.callMethod(MultiRowMutationProtos.java:2266)
>   at org.apache.hadoop.hbase.regionserver.HRegion.execService(HRegion.java:8182)
>   at org.apache.hadoop.hbase.regionserver.RSRpcServices.execServiceOnRegion(RSRpcServices.java:2481)
>   at org.apache.hadoop.hbase.regionserver.RSRpcServices.execService(RSRpcServices.java:2463)
> ...
> Exception in thread "pool-678-thread-1" java.lang.AssertionError: This cp should fail because the target lock is blocked by previous put
>   at org.junit.Assert.fail(Assert.java:88)
>   at org.apache.hadoop.hbase.client.TestFromClientSide3.lambda$testMultiRowMutations$7(TestFromClientSide3.java:861)
>   at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>   at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> {code}
> Here is related code:
> {code}
>   cpService.execute(() -> {
> ...
> if (!threw) {
>   // Can't call fail() earlier because the catch would eat it.
>   fail("This cp should fail because the target lock is blocked by previous put");
> }
> {code}
> Since the fail() call is executed by the cpService, the assertion had no 
> bearing on the outcome of the test.
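
To illustrate the point in the description above: an AssertionError thrown inside an executor task stays on the worker thread unless the test thread explicitly collects it. The sketch below is one generic way to surface it, by submitting the task and calling Future.get(). It is not taken from the attached patches; the class and method names are hypothetical.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper; shows how a worker-thread assertion can reach the calling thread.
class ExecutorAssertionSketch {
  /** Runs the task on a worker thread and rethrows any AssertionError on the caller's thread. */
  static void runAndPropagate(Runnable task) {
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      // submit() captures anything the task throws inside the returned Future.
      Future<?> future = pool.submit(task);
      // get() rethrows the captured failure wrapped in an ExecutionException.
      future.get();
    } catch (ExecutionException e) {
      if (e.getCause() instanceof AssertionError) {
        throw (AssertionError) e.getCause();
      }
      throw new RuntimeException(e.getCause());
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new RuntimeException(e);
    } finally {
      pool.shutdownNow();
    }
  }

  public static void main(String[] args) {
    try {
      runAndPropagate(() -> {
        throw new AssertionError("worker assertion failed");
      });
      System.out.println("not propagated");
    } catch (AssertionError e) {
      System.out.println("propagated: " + e.getMessage());
    }
  }
}
```

In a JUnit test, the rethrown AssertionError on the test thread is what actually fails the test; with a bare execute() call the error only reaches the worker thread's uncaught-exception handler, as in the stack trace quoted above.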





[jira] [Updated] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21185:
--
Component/s: Operability

> WALPrettyPrinter: Additional useful info to be printed by wal printer tool, 
> for debugability purposes
> -
>
> Key: HBASE-21185
> URL: https://issues.apache.org/jira/browse/HBASE-21185
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Trivial
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21185.master.001.patch, 
> HBASE-21185.master.002.patch, HBASE-21185.master.003.patch
>
>
> *WALPrettyPrinter* is very useful for troubleshooting WAL issues, such as 
> faulty replication sinks. One useful piece of information to track is the 
> size of a single WAL entry edit, as well as the size of each edit cell. I am 
> proposing a patch that adds calculations for these two, as well as an option 
> to seek straight to a given position in the WAL file being analysed.





[jira] [Updated] (HBASE-21185) WALPrettyPrinter: Additional useful info to be printed by wal printer tool, for debugability purposes

2018-10-04 Thread stack (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

stack updated HBASE-21185:
--
   Resolution: Fixed
 Hadoop Flags: Reviewed
Fix Version/s: 2.0.3
   2.1.1
   2.2.0
   3.0.0
   Status: Resolved  (was: Patch Available)

I pushed this to branch-2.0+ [~wchevreuil]. I tried a backport but it failed. 
If you want to put up a patch for branch-1 in the next few days, I'll commit 
it. Please add a release note on your nice additions here, sir. Thanks.

> WALPrettyPrinter: Additional useful info to be printed by wal printer tool, 
> for debugability purposes
> -
>
> Key: HBASE-21185
> URL: https://issues.apache.org/jira/browse/HBASE-21185
> Project: HBase
>  Issue Type: Improvement
>  Components: Operability
>Reporter: Wellington Chevreuil
>Assignee: Wellington Chevreuil
>Priority: Trivial
> Fix For: 3.0.0, 2.2.0, 2.1.1, 2.0.3
>
> Attachments: HBASE-21185.master.001.patch, 
> HBASE-21185.master.002.patch, HBASE-21185.master.003.patch
>
>
> *WALPrettyPrinter* is very useful for troubleshooting WAL issues, such as 
> faulty replication sinks. One useful piece of information to track is the 
> size of a single WAL entry edit, as well as the size of each edit cell. I am 
> proposing a patch that adds calculations for these two, as well as an option 
> to seek straight to a given position in the WAL file being analysed.





[jira] [Updated] (HBASE-21263) Mention compression algorithm along with other storefile details

2018-10-04 Thread Subrat Mishra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subrat Mishra updated HBASE-21263:
--
Attachment: HBASE-21263.master.001.patch
Status: Patch Available  (was: Open)

Hi Andrew,
I have attached the patch for master branch. Can you please review?
 

> Mention compression algorithm along with other storefile details
> 
>
> Key: HBASE-21263
> URL: https://issues.apache.org/jira/browse/HBASE-21263
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Subrat Mishra
>Priority: Minor
>  Labels: beginner, beginners
> Attachments: HBASE-21263.master.001.patch
>
>
> Where we log storefile details we should also log the compression algorithm 
> used to compress blocks on disk, if any. 
> For example, here's a log line out of compaction:
> 2018-10-02 21:59:47,594 DEBUG 
> [regionserver/host/1.1.1.1:8120-longCompactions-1538517461152] 
> compactions.Compactor: Compacting 
> hdfs://namenode:8020/hbase/data/default/TestTable/86037c19117a46b5b8148439ea55753b/i/3d04a7c28d6343ceb773737dbb192533,
>  keycount=3335873, bloomtype=ROW, size=107.5 M, encoding=ROW_INDEX_V1, 
> seqNum=154199, earliestPutTs=1538516084915
> Aside from bloom type, block encoding, and filename, it would be good to know 
> compression type in this type of DEBUG or INFO level logging. A minor 
> omission of information that could be helpful during debugging. 





[jira] [Assigned] (HBASE-21263) Mention compression algorithm along with other storefile details

2018-10-04 Thread Subrat Mishra (JIRA)


 [ 
https://issues.apache.org/jira/browse/HBASE-21263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Subrat Mishra reassigned HBASE-21263:
-

Assignee: Subrat Mishra

> Mention compression algorithm along with other storefile details
> 
>
> Key: HBASE-21263
> URL: https://issues.apache.org/jira/browse/HBASE-21263
> Project: HBase
>  Issue Type: Improvement
>Reporter: Andrew Purtell
>Assignee: Subrat Mishra
>Priority: Minor
>  Labels: beginner, beginners
>
> Where we log storefile details we should also log the compression algorithm 
> used to compress blocks on disk, if any. 
> For example, here's a log line out of compaction:
> 2018-10-02 21:59:47,594 DEBUG 
> [regionserver/host/1.1.1.1:8120-longCompactions-1538517461152] 
> compactions.Compactor: Compacting 
> hdfs://namenode:8020/hbase/data/default/TestTable/86037c19117a46b5b8148439ea55753b/i/3d04a7c28d6343ceb773737dbb192533,
>  keycount=3335873, bloomtype=ROW, size=107.5 M, encoding=ROW_INDEX_V1, 
> seqNum=154199, earliestPutTs=1538516084915
> Aside from bloom type, block encoding, and filename, it would be good to know 
> compression type in this type of DEBUG or INFO level logging. A minor 
> omission of information that could be helpful during debugging. 





[jira] [Commented] (HBASE-21266) Not running balancer because processing dead regionservers, but empty dead rs list

2018-10-04 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21266?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16637939#comment-16637939
 ] 

Xu Cang commented on HBASE-21266:
-

I took a quick look at the related code and have one question: [~apurtell], did 
you see this line in the log: "Number of dead servers in processing should 
always be non-negative"?
If so, it could be a race condition on the int 'numProcessing' or the HashMap 
'deadServers'.




> Not running balancer because processing dead regionservers, but empty dead rs 
> list
> --
>
> Key: HBASE-21266
> URL: https://issues.apache.org/jira/browse/HBASE-21266
> Project: HBase
>  Issue Type: Bug
>Affects Versions: 1.4.8
>Reporter: Andrew Purtell
>Priority: Major
> Fix For: 1.5.0, 1.4.9
>
>
> Found during ITBLL testing. AM in master gets into a state where manual 
> attempts from the shell to run the balancer always return false and this is 
> printed in the master log:
> 2018-10-03 19:17:14,892 DEBUG 
> [RpcServer.default.FPBQ.Fifo.handler=21,queue=0,port=8100] master.HMaster: 
> Not running balancer because processing dead regionserver(s): 
> Note the empty list. 
> This errant state did not recover without intervention by way of master 
> restart, but the test environment was chaotic so needs investigation.


