[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-11-29 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704316#comment-16704316
 ] 

Allan Yang commented on HBASE-21083:


Besides, bypassing a procedure midway without fixing the resulting 
inconsistency is dangerous. We don't want to expose this method to users themselves.

> Introduce a mechanism to bypass the execution of a stuck procedure
> --
>
> Key: HBASE-21083
> URL: https://issues.apache.org/jira/browse/HBASE-21083
> Project: HBase
> Issue Type: Sub-task
> Components: amv2
> Affects Versions: 2.1.0, 2.0.1
> Reporter: Allan Yang
> Assignee: Allan Yang
> Priority: Major
> Fix For: 3.0.0, 2.1.1, 2.0.2
>
> Attachments: HBASE-21083.branch-2.0.001.patch, 
> HBASE-21083.branch-2.0.002.patch, HBASE-21083.branch-2.0.003.patch, 
> HBASE-21083.branch-2.0.003.patch, HBASE-21083.branch-2.0.003.patch, 
> HBASE-21083.branch-2.1.001.patch
>
>
> Discussed offline with [~stack] and [~Apache9]. We all agreed that we need to 
> introduce a mechanism to 'force complete' a stuck procedure, so that AMv2 can 
> continue running.
> We still have some undiscovered bugs hiding in our AMv2 and procedureV2 system, 
> so we need a way to intervene in stuck procedures before HBCK2 can work. 
> This is crucial for a production-ready system.
> For now, we have few ways to intervene in running procedures. Aborting 
> them is not a good choice, since some procedures are not abortable, and some 
> procedures may have overridden the abort() method so that it ignores the abort 
> request.
> So, here, I will introduce a mechanism to bypass the execution of a stuck 
> procedure.
> Basically, I added a field called 'bypass' to the Procedure class. If we set this 
> field to true, all the logic in execute/rollback will be skipped, letting 
> this procedure and its ancestors complete normally and finally release their 
> lock resources.
> Note that bypassing a procedure may leave the cluster in an intermediate state, 
> e.g. a region not assigned, or some HDFS files left behind. 
> Operators need to know the side effects of bypassing and recover the 
> inconsistent state of the cluster themselves, for example by issuing new 
> procedures to assign the regions.
> A patch will be uploaded and a review board request will be opened. For now, only 
> APIs in ProcedureExecutor are provided. If everything is fine, I will add it to the 
> master service and add a shell command to bypass a procedure. Or maybe we can use 
> dynamically compiled JSPs to execute those APIs, as mentioned in HBASE-20679.
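
The 'bypass' flag described in the quoted issue could be sketched roughly as below. This is only an illustration under stated assumptions, not the code from the attached patches: SketchProcedure, SketchExecutor, runOnce, and every field name are hypothetical, and marking the ancestors as bypassed is an assumption drawn from the phrase "this procedure and its ancestors complete normally".

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for a procedure; NOT the real HBase Procedure class.
class SketchProcedure {
    final String name;
    final SketchProcedure parent;        // null for a root procedure
    final List<String> workDone = new ArrayList<>();
    private volatile boolean bypass;     // the 'bypass' flag from the description
    boolean finished;
    boolean lockHeld = true;             // assume the lock was taken earlier

    SketchProcedure(String name, SketchProcedure parent) {
        this.name = name;
        this.parent = parent;
    }

    void setBypass(boolean bypass) { this.bypass = bypass; }
    boolean isBypass() { return bypass; }

    // Placeholder for the real work that a stuck procedure can never finish.
    void execute() { workDone.add("executed " + name); }
}

// Hypothetical executor fragment; names are illustrative only.
class SketchExecutor {
    // Assumption: bypassing marks the procedure and all of its ancestors.
    static void bypass(SketchProcedure proc) {
        for (SketchProcedure p = proc; p != null; p = p.parent) {
            p.setBypass(true);
        }
    }

    // One scheduling step: skip execute() when bypassed, then complete and release the lock.
    static void runOnce(SketchProcedure proc) {
        if (!proc.isBypass()) {
            proc.execute();
        }
        proc.finished = true;
        proc.lockHeld = false;
    }
}
```

In this sketch a bypassed procedure finishes and drops its lock without doing any work, which is exactly why the cluster may be left in an intermediate state that the operator has to repair afterwards.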



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-11-29 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704314#comment-16704314
 ] 

Allan Yang commented on HBASE-21083:


[~xucang], it should not be very hard to backport this patch, but in 2.x this 
feature is used only by HBCK2. It does not provide an Admin API (over RPC) or 
anything similar, so it cannot be called directly from the shell or the HBaseAdmin class. 



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-11-29 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704195#comment-16704195
 ] 

Xu Cang commented on HBASE-21083:
-

[~allan163] thanks for the response.

The reason I am asking is that branch-1 seems to have no reliable way to 
unblock or bypass stuck procedures. This feature could potentially be applied 
to branch-1 to reduce the engineering burden of, for example, manually 
operating on WAL files.

I briefly skimmed your patch, and most of the code change is related to 
ProcedureV2, not AMv2. Since branch-1 has ProcedureV2, I was asking how 
hard it would be to port this high-level logic to branch-1. 

 



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-11-29 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704171#comment-16704171
 ] 

Allan Yang commented on HBASE-21083:


[~xucang] this feature is mostly used by HBCK2. How are you planning to use it 
in 1.x?



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-11-28 Thread Xu Cang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16702299#comment-16702299
 ] 

Xu Cang commented on HBASE-21083:
-

[~allan163]

Do you see value in backporting this to branch-1? 

We have seen a MasterProcedure get stuck in a production cluster and have very 
few ways to safely resolve it. 

(I am not very familiar with AMv2, but it seems AMv2 is available in branch-1 
too? I can see the procedure2.Procedure class and so on.) Thanks.



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-09-20 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16622046#comment-16622046
 ] 

stack commented on HBASE-21083:
---

FYI [~allan163] HBASE-21213



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16598608#comment-16598608
 ] 

Hudson commented on HBASE-21083:


Results for branch branch-2.1, build #257 on builds.a.o
(https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/257/): -1 overall

Details (if available):

-1 general checks
-- Something went wrong running this stage, please check the relevant console output:
https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/257//console

-1 jdk8 hadoop2 checks
-- Something went wrong running this stage, please check the relevant console output:
https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/257//console

+1 jdk8 hadoop3 checks
-- For more information see the jdk8 (hadoop3) report:
https://builds.apache.org/job/HBase%20Nightly/job/branch-2.1/257//JDK8_Nightly_Build_Report_(Hadoop3)/

+1 source release artifact
-- See build output for details.

+1 client integration test




[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16598582#comment-16598582
 ] 

Hudson commented on HBASE-21083:


Results for branch branch-2, build #1179 on builds.a.o
(https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1179/): -1 overall

Details (if available):

-1 general checks
-- For more information see the general report:
https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1179//General_Nightly_Build_Report/

-1 jdk8 hadoop2 checks
-- For more information see the jdk8 (hadoop2) report:
https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1179//JDK8_Nightly_Build_Report_(Hadoop2)/

-1 jdk8 hadoop3 checks
-- For more information see the jdk8 (hadoop3) report:
https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1179//JDK8_Nightly_Build_Report_(Hadoop3)/

+1 source release artifact
-- See build output for details.

+1 client integration test




[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-31 Thread Hudson (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16598488#comment-16598488
 ] 

Hudson commented on HBASE-21083:


Results for branch master, build #464 on builds.a.o
(https://builds.apache.org/job/HBase%20Nightly/job/master/464/): -1 overall

Details (if available):

-1 general checks
-- Something went wrong running this stage, please check the relevant console output:
https://builds.apache.org/job/HBase%20Nightly/job/master/464//console

-1 jdk8 hadoop2 checks
-- Something went wrong running this stage, please check the relevant console output:
https://builds.apache.org/job/HBase%20Nightly/job/master/464//console

-1 jdk8 hadoop3 checks
-- Something went wrong running this stage, please check the relevant console output:
https://builds.apache.org/job/HBase%20Nightly/job/master/464//console

+1 source release artifact
-- See build output for details.

+1 client integration test




[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-30 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597818#comment-16597818
 ] 

stack commented on HBASE-21083:
---

[~uagashe] My fault. Was local only. I had not pushed it. Done.



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-30 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16597814#comment-16597814
 ] 

Umesh Agashe commented on HBASE-21083:
--

@stack, can this be committed to master as well?



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595790#comment-16595790
 ] 

Hadoop QA commented on HBASE-21083:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 39s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 26s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  5m 38s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 47s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 50s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 49s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 18s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 10s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 14s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 46s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 35s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 12m 35s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 10m  1s{color} | {color:red} hbase-protocol-shaded generated 2 new + 98 unchanged - 2 fixed = 100 total (was 100) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 49s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 51s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  9m 22s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  1m 33s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 41s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  6s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 28s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 31s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  3s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}175m  8s{color} | {color:green} hbase-server in the patch passed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 21s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}254m 20s{color} | {color:black} {color} |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:6f01af0 |
| JIRA Issue | HBASE-21083 |
| JIRA P

[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595520#comment-16595520
 ] 

Hadoop QA commented on HBASE-21083:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 41s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  1s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} branch-2.0 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m  5s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 53s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 20s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 52s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 39s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m  8s{color} | {color:green} branch-2.0 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 14s{color} | {color:green} branch-2.0 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 16s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 36s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 11s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 12m 11s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red}  9m 44s{color} | {color:red} hbase-protocol-shaded generated 2 new + 98 unchanged - 2 fixed = 100 total (was 100) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 48s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 38s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  8m 21s{color} | {color:green} Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 40s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  8s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 29s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 35s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 59s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}181m  3s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m  0s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}257m 31s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestSplitOrMergeStatus |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.

[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595503#comment-16595503
 ] 

stack commented on HBASE-21083:
---

Retry.



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-28 Thread Umesh Agashe (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595494#comment-16595494
 ] 

Umesh Agashe commented on HBASE-21083:
--

Thanks for addressing the review comments, [~stack]! Thanks [~allan163] for the 
changes!



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-28 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595483#comment-16595483
 ] 

Hadoop QA commented on HBASE-21083:
---

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 20s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} hbaseanti {color} | {color:green}  0m  0s{color} | {color:green} Patch does not have any anti-patterns. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m  0s{color} | {color:green} The patch appears to include 2 new or modified test files. {color} |
|| || || || {color:brown} branch-2.1 Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  1m 10s{color} | {color:blue} Maven dependency ordering for branch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  4m 45s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 42s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 54s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 25s{color} | {color:green} branch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 13s{color} | {color:green} branch-2.1 passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  6s{color} | {color:green} branch-2.1 passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:blue}0{color} | {color:blue} mvndep {color} | {color:blue}  0m 15s{color} | {color:blue} Maven dependency ordering for patch {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  2m 37s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 12m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} cc {color} | {color:green} 12m 34s{color} | {color:green} the patch passed {color} |
| {color:red}-1{color} | {color:red} javac {color} | {color:red} 10m  4s{color} | {color:red} hbase-protocol-shaded generated 2 new + 98 unchanged - 2 fixed = 100 total (was 100) {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  1m 43s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m  0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedjars {color} | {color:green}  3m 22s{color} | {color:green} patch has no errors when building our shaded downstream artifacts. {color} |
| {color:green}+1{color} | {color:green} hadoopcheck {color} | {color:green}  6m 45s{color} | {color:green} Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. {color} |
| {color:green}+1{color} | {color:green} hbaseprotoc {color} | {color:green}  1m 34s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  5m 30s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m 10s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  0m 31s{color} | {color:green} hbase-protocol-shaded in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  2m 39s{color} | {color:green} hbase-common in the patch passed. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green}  3m  0s{color} | {color:green} hbase-procedure in the patch passed. {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red}194m 37s{color} | {color:red} hbase-server in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  1m 52s{color} | {color:green} The patch does not generate ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black}270m  9s{color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.client.TestAsyncTableGetMultiThreaded |
|   | hadoop.hbase.regionserver.throttle.TestFlushWithThroug

[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595173#comment-16595173
 ] 

stack commented on HBASE-21083:
---

Thanks [~allan163].



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-28 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595166#comment-16595166
 ] 

Allan Yang commented on HBASE-21083:


Uploaded a HBASE-21083.branch-2.0.003 patch based on [~stack]'s .001 against 
branch-2.1, and uploaded it to review board too. Thanks all for reviewing. The 
most significant change in this patch is that, according to [~Apache9]'s review 
comment, we need to count the wait time in tryLockEntry(long id, long time) 
ourselves, since the JVM may wake the waiting thread even if the time is not up.
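
To illustrate the point about counting the wait time ourselves, here is a minimal sketch of a timed lock that survives early wakeups. It is not the code from the patch: the class TimedWaitLock and its methods are hypothetical names made up for this example, and it guards a single entry rather than taking an id as tryLockEntry(long id, long time) does. The idea is to fix a deadline once and recompute the remaining time after every wakeup, so an early return from wait() neither shortens nor extends the total timeout.

```java
import java.util.concurrent.TimeUnit;

/** Hypothetical single-entry lock for illustration; not HBase's IdLock. */
final class TimedWaitLock {
  private boolean locked = false;

  /**
   * Tries to take the lock, waiting at most timeoutMs milliseconds in total.
   * Returns true if the lock was acquired, false if the timeout elapsed first.
   */
  public synchronized boolean tryLock(long timeoutMs) throws InterruptedException {
    // Fix the deadline once, up front, so every wakeup is measured against it.
    final long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (locked) {
      long remainingNs = deadline - System.nanoTime();
      if (remainingNs <= 0) {
        return false; // the full timeout has really elapsed
      }
      // wait() may return before the timeout (spurious wakeup, or a notify
      // that another waiter won), so loop and recompute the remaining time.
      TimeUnit.NANOSECONDS.timedWait(this, remainingNs);
    }
    locked = true;
    return true;
  }

  /** Releases the lock and wakes all waiters so they can re-check it. */
  public synchronized void unlock() {
    locked = false;
    notifyAll();
  }
}
```

Using System.nanoTime() rather than System.currentTimeMillis() for the deadline is a design choice in this sketch: it is monotonic, so a wall-clock adjustment cannot stretch or cut the wait.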



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595162#comment-16595162
 ] 

stack commented on HBASE-21083:
---

[~allan163] no worries. I figured you were busy. Nothing special about 2.1.  
What is in your .003 patch?



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-28 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595100#comment-16595100
 ] 

Allan Yang commented on HBASE-21083:


Does branch-2.1 differ enough that we need another patch and another 
review?



[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-28 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595097#comment-16595097
 ] 

Allan Yang commented on HBASE-21083:


[~stack], sorry, boss, kind of busy these days, will catch up.


[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-28 Thread stack (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16595080#comment-16595080
 ] 

stack commented on HBASE-21083:
---

.001 against branch-2.1 is [~allan163]'s patch with [~uagashe]'s review
comments addressed and some checkstyle fixups.


[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-23 Thread Hadoop QA (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16590221#comment-16590221
 ] 

Hadoop QA commented on HBASE-21083:
---

| (x) *-1 overall* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 2 new or modified test files. |
|| || || || branch-2.0 Compile Tests ||
| 0 | mvndep | 0m 17s | Maven dependency ordering for branch |
| +1 | mvninstall | 2m 59s | branch-2.0 passed |
| +1 | compile | 15m 9s | branch-2.0 passed |
| +1 | checkstyle | 1m 59s | branch-2.0 passed |
| +1 | shadedjars | 4m 4s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 5m 27s | branch-2.0 passed |
| +1 | javadoc | 1m 9s | branch-2.0 passed |
|| || || || Patch Compile Tests ||
| 0 | mvndep | 0m 13s | Maven dependency ordering for patch |
| +1 | mvninstall | 2m 41s | the patch passed |
| +1 | compile | 14m 58s | the patch passed |
| +1 | cc | 14m 58s | the patch passed |
| -1 | javac | 12m 24s | hbase-protocol-shaded generated 2 new + 98 unchanged - 2 fixed = 100 total (was 100) |
| -1 | checkstyle | 0m 22s | hbase-common: The patch generated 1 new + 0 unchanged - 0 fixed = 1 total (was 0) |
| -1 | checkstyle | 0m 16s | hbase-procedure: The patch generated 8 new + 21 unchanged - 0 fixed = 29 total (was 21) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 3m 57s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 10m 33s | Patch does not cause any errors with Hadoop 2.6.5 2.7.4 or 3.0.0. |
| +1 | hbaseprotoc | 2m 6s | the patch passed |
| +1 | findbugs | 6m 55s | the patch passed |
| +1 | javadoc | 1m 29s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 0m 36s | hbase-protocol-shaded in the patch passed. |
| +1 | unit | 2m 42s | hbase-common in the patch passed. |
| +1 | unit | 3m 9s | hbase-procedure in the patch passed. |
| +1 | unit | 180m 29s | hbase-server in the patch passed. |
| +1 | asflicense | 1m 19s | The patch does not generate ASF License warnings. |

[jira] [Commented] (HBASE-21083) Introduce a mechanism to bypass the execution of a stuck procedure

2018-08-23 Thread Allan Yang (JIRA)


[ 
https://issues.apache.org/jira/browse/HBASE-21083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16589916#comment-16589916
 ] 

Allan Yang commented on HBASE-21083:


[~stack], I uploaded a new patch for review.
