[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711438#comment-16711438 ]

Pankaj Kumar commented on HBASE-21217:
--------------------------------------

Ping [~allan163], any plan to backport this fix to the 2.0/2.1 branches?

> Revisit the executeProcedure method for open/close region
> ---------------------------------------------------------
>
>                 Key: HBASE-21217
>                 URL: https://issues.apache.org/jira/browse/HBASE-21217
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2, proc-v2
>            Reporter: Duo Zhang
>            Assignee: Duo Zhang
>            Priority: Critical
>             Fix For: 3.0.0, 2.2.0
>
>         Attachments: HBASE-21217-v1.patch, HBASE-21217-v2.patch, HBASE-21217.patch
>
>
> Currently we just call openRegion and closeRegion directly, which is a bit
> buggy. For example, in order to not fail all the open region requests when
> there is only one failure, we catch the exception and set a flag in the
> return value. But for the executeProcedures call, the return value is
> ignored, and we expect that the openRegion method will always call
> reportRegionStateTransition to report the failure, but in fact it does not...
> And after HBASE-20881, we can confirm that the race could happen, where we
> send a close request to a region which is opening (HBASE-21199), and vice
> versa. So I think here we need to revisit the implementation of
> executeProcedures to make it more stable.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
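Editorial sketch: the issue description above says that the open path catches the exception and sets a flag in the return value, while the executeProcedures call ignores that value. The following is a minimal Java model of that failure mode, not HBase code. Every name in it (`IgnoredReturnValueSketch`, `OpenState`, `openRegions`, the "bad" region naming rule) is a hypothetical stand-in chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of a bulk open call; none of these names are HBase APIs. */
class IgnoredReturnValueSketch {

  /** Per-region outcome of a bulk open request. */
  enum OpenState { OPENED, FAILED_OPENING }

  /** Stand-in for the real open logic; fails for names starting with "bad". */
  static void open(String region) {
    if (region.startsWith("bad")) {
      throw new IllegalStateException("cannot open " + region);
    }
  }

  /** Opens each region, recording a failure flag instead of throwing. */
  static List<OpenState> openRegions(List<String> regions) {
    List<OpenState> result = new ArrayList<>();
    for (String region : regions) {
      try {
        open(region);
        result.add(OpenState.OPENED);
      } catch (IllegalStateException e) {
        // One bad region must not fail the whole batch, so only set a flag.
        result.add(OpenState.FAILED_OPENING);
      }
    }
    return result;
  }

  /** Caller that drops the result, so it never learns about the failure. */
  static long failuresSeenWhenResultIgnored(List<String> regions) {
    openRegions(regions); // return value discarded
    return 0;
  }

  /** Caller that inspects the flags and could report each failure. */
  static long failuresSeenWhenResultChecked(List<String> regions) {
    return openRegions(regions).stream()
        .filter(state -> state == OpenState.FAILED_OPENING)
        .count();
  }

  public static void main(String[] args) {
    List<String> regions = Arrays.asList("r1", "bad-r2", "r3");
    System.out.println("ignored: " + failuresSeenWhenResultIgnored(regions));
    System.out.println("checked: " + failuresSeenWhenResultChecked(regions));
  }
}
```

In this model the failure is only visible to a caller that reads the returned flags. A caller that discards them depends on the open path reporting the failure some other way, which is the gap the description points out.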
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711441#comment-16711441 ]

Duo Zhang commented on HBASE-21217:
-----------------------------------

For branch-2.1/branch-2.0, we will just call openRegion and closeRegion directly, IIRC.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711455#comment-16711455 ]

Pankaj Kumar commented on HBASE-21217:
--------------------------------------

Do you mean this bug does not apply to branch-2.0/2.1? Pardon me if I am wrong.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711498#comment-16711498 ]

Allan Yang commented on HBASE-21217:
------------------------------------

We have HBASE-21237 for branch-2.0/2.1. [~pankaj2461]
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16711632#comment-16711632 ]

Pankaj Kumar commented on HBASE-21217:
--------------------------------------

Got it [~allan163]... thanks for the Jira pointer.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623026#comment-16623026 ]

Duo Zhang commented on HBASE-21217:
-----------------------------------

In general, maybe we should keep the old code as is, since it is there to stay compatible with the 1.x master, and implement new openRegion/closeRegion methods for 2.x.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623038#comment-16623038 ]

Allan Yang commented on HBASE-21217:
------------------------------------

ExecuteProcedures is problematic, since it groups all the open/close operations into one call and executes them sequentially on the target RS. If one operation fails, all the operations are marked as failed. In fact, some of the operations (like open region) are already executing in the open region handler thread, but the master thinks these operations failed and reassigns the regions to another RS. So when the previous RS reports to the master that the region is online, the master kills that RS, since it has already assigned the region to another RS.

In our internal version, I have already dropped the ExecuteProcedures method and use the CompatRemoteProcedureResolver to send the open/close requests one by one. Should we fall back to CompatRemoteProcedureResolver until we find a way to resolve this? [~Apache9], [~stack].
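Editorial sketch: Allan Yang's comment contrasts one grouped call, where a single failure marks every operation as failed, with sending the requests one by one. The Java model below illustrates only that contrast. It is not the ExecuteProcedures or CompatRemoteProcedureResolver code, and all names (`BatchFailureSketch`, `submitOpen`, the `started` list, the "bad" naming rule) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical model of batched versus one-by-one dispatch; not HBase code. */
class BatchFailureSketch {

  /** Stand-in for handing a region to an open handler; "bad" names fail. */
  static void submitOpen(String region, List<String> started) {
    if (region.startsWith("bad")) {
      throw new IllegalStateException("cannot open " + region);
    }
    started.add(region); // the open is now running on the server side
  }

  /** One call for the whole batch: the first exception fails the call. */
  static List<String> failedAfterBatchedCall(List<String> regions, List<String> started) {
    try {
      for (String region : regions) {
        submitOpen(region, started);
      }
      return new ArrayList<>();
    } catch (IllegalStateException e) {
      // The caller only sees "the call failed", so it treats every region as failed,
      // including regions whose open was already started.
      return new ArrayList<>(regions);
    }
  }

  /** One call per region: only the region that threw is treated as failed. */
  static List<String> failedAfterSeparateCalls(List<String> regions, List<String> started) {
    List<String> failed = new ArrayList<>();
    for (String region : regions) {
      try {
        submitOpen(region, started);
      } catch (IllegalStateException e) {
        failed.add(region);
      }
    }
    return failed;
  }

  public static void main(String[] args) {
    List<String> regions = Arrays.asList("r1", "bad-r2", "r3");

    List<String> startedBatch = new ArrayList<>();
    List<String> failedBatch = failedAfterBatchedCall(regions, startedBatch);
    System.out.println("batched:  started=" + startedBatch + " failed=" + failedBatch);

    List<String> startedSeparate = new ArrayList<>();
    List<String> failedSeparate = failedAfterSeparateCalls(regions, startedSeparate);
    System.out.println("separate: started=" + startedSeparate + " failed=" + failedSeparate);
  }
}
```

In the batched case a region can appear in both lists: its open was started, yet the caller considers it failed and may assign it elsewhere. That overlap is the double-assignment hazard the comment describes.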
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623049#comment-16623049 ]

Duo Zhang commented on HBASE-21217:
-----------------------------------

You can see how I implemented the general remote procedure. The executeProcedures method is not the problem; the problem is that we just call the openRegion/closeRegion methods directly, and they can fail. In general, the executeProcedures method should not fail; it should just schedule the procedures on an executor, and it is the procedure's duty to report back to the master. Especially for opening/closing a region, we should use reportRegionStateTransition to tell the master the result, instead of throwing exceptions and letting the dispatcher handle the problem.
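Editorial sketch: Duo Zhang's comment proposes that executeProcedures only schedules work on an executor and that each procedure reports its own result back to the master. The Java model below shows that shape under stated assumptions. It is not the patch attached to this issue. `MasterReporter` and `reportTransition` are hypothetical stand-ins for the reportRegionStateTransition channel, and the thread pool size is arbitrary.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of "schedule, then report back"; not HBase code. */
class ScheduleAndReportSketch {

  /** Stand-in for the channel a region server uses to report to the master. */
  interface MasterReporter {
    void reportTransition(String region, boolean opened);
  }

  /** Stand-in for the real open logic; fails for names starting with "bad". */
  static void open(String region) {
    if (region.startsWith("bad")) {
      throw new IllegalStateException("cannot open " + region);
    }
  }

  /** Only schedules work; each task reports the outcome of its own region. */
  static void executeProcedures(
      List<String> regions, ExecutorService executor, MasterReporter reporter) {
    for (String region : regions) {
      executor.execute(() -> {
        try {
          open(region);
          reporter.reportTransition(region, true);
        } catch (RuntimeException e) {
          // Report the failure instead of throwing it back to the dispatcher.
          reporter.reportTransition(region, false);
        }
      });
    }
  }

  /** Runs the sketch to completion and returns what the master was told. */
  static Map<String, Boolean> runAndCollect(List<String> regions) {
    Map<String, Boolean> reports = new ConcurrentHashMap<>();
    ExecutorService executor = Executors.newFixedThreadPool(2);
    executeProcedures(regions, executor, (region, opened) -> reports.put(region, opened));
    executor.shutdown();
    try {
      if (!executor.awaitTermination(10, TimeUnit.SECONDS)) {
        throw new IllegalStateException("tasks did not finish in time");
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return reports;
  }

  public static void main(String[] args) {
    System.out.println(runAndCollect(Arrays.asList("r1", "bad-r2", "r3")));
  }
}
```

Here a failing region produces a failure report for that region only, and the scheduling call itself returns normally regardless of individual outcomes.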
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623053#comment-16623053 ]

Duo Zhang commented on HBASE-21217:
-----------------------------------

But anyway, I think we can change back to CompatRemoteProcedureResolver for 2.1 and 2.0, as the implementation of executeProcedures on these branches has critical problems: not only the one you described above, but also the one I described in the description here. A bulk assign does not throw an exception but uses the return value to indicate that there was a failure, and executeProcedures ignores the return value...
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623082#comment-16623082 ]

Allan Yang commented on HBASE-21217:
------------------------------------

So we should simply change back to CompatRemoteProcedureResolver for 2.1 and 2.0; I will open another issue for it. As for 2.2, I think we can do some refactoring here so that ExecuteProcedures does not return any error but reports the error back to the master.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623090#comment-16623090 ]

Duo Zhang commented on HBASE-21217:
-----------------------------------

Agree. Let's do this.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624658#comment-16624658 ]

Duo Zhang commented on HBASE-21217:
-----------------------------------

Review board link: https://reviews.apache.org/r/68811/
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624669#comment-16624669 ]

Hadoop QA commented on HBASE-21217:
-----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 5m 32s | master passed |
| +1 | compile | 1m 46s | master passed |
| +1 | checkstyle | 1m 17s | master passed |
| +1 | shadedjars | 4m 10s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 6s | master passed |
| +1 | javadoc | 0m 30s | master passed |
|| || || || Patch Compile Tests ||
| -1 | mvninstall | 2m 48s | root in the patch failed. |
| +1 | compile | 1m 44s | the patch passed |
| -1 | javac | 1m 44s | hbase-server generated 1 new + 187 unchanged - 1 fixed = 188 total (was 188) |
| -1 | checkstyle | 1m 16s | hbase-server: The patch generated 1 new + 317 unchanged - 4 fixed = 318 total (was 321) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| -1 | shadedjars | 3m 13s | patch has 11 errors when building our shaded downstream artifacts. |
| -1 | hadoopcheck | 1m 46s | The patch causes 11 errors with Hadoop v2.7.4. |
| -1 | hadoopcheck | 3m 50s | The patch causes 11 errors with Hadoop v3.0.0. |
| +1 | findbugs | 2m 6s | the patch passed |
| +1 | javadoc | 0m 30s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 0m 44s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 11s | The patch does not generate ASF License warnings. |
| | | 32m 30s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21217 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940911/HBASE-21217.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 3bb66c2a281d 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 7ab77518a2 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| mvninstall | https://builds.apache.org/job/PreCommit-HBASE-Build/14477/artifact/patchprocess/patch-mvninstall-root.txt |
| javac | https://builds.apache.org/job/PreCommit-HBASE-
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16624955#comment-16624955 ]

Hadoop QA commented on HBASE-21217:
-----------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 12s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 4m 57s | master passed |
| +1 | compile | 1m 46s | master passed |
| +1 | checkstyle | 1m 16s | master passed |
| +1 | shadedjars | 4m 13s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 2m 0s | master passed |
| +1 | javadoc | 0m 30s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 4m 56s | the patch passed |
| +1 | compile | 1m 46s | the patch passed |
| -1 | javac | 1m 45s | hbase-server generated 1 new + 187 unchanged - 1 fixed = 188 total (was 188) |
| +1 | checkstyle | 1m 17s | hbase-server: The patch generated 0 new + 316 unchanged - 5 fixed = 316 total (was 321) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 10s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 10m 51s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 14s | the patch passed |
| +1 | javadoc | 0m 32s | the patch passed |
|| || || || Other Tests ||
| -1 | unit | 144m 59s | hbase-server in the patch failed. |
| +1 | asflicense | 0m 23s | The patch does not generate ASF License warnings. |
| | | 186m 33s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hbase.replication.regionserver.TestDrainReplicationQueuesForStandBy |
| | hadoop.hbase.regionserver.TestRegionOpen |
| | hadoop.hbase.client.TestTableFavoredNodes |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21217 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12940937/HBASE-21217-v1.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 52b9f3706a0d 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 7ab77518a2 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| javac | https://builds.apache.org/job/PreCommit-HBASE-Build/14480/artif
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626047#comment-16626047 ]

stack commented on HBASE-21217:
-------------------------------

Ugh. Just saw this. Thanks lads.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16626122#comment-16626122 ] Hadoop QA commented on HBASE-21217: ---

| (/) *+1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 11s | Docker mode activated. |
|| || || || Prechecks ||
| +1 | hbaseanti | 0m 0s | Patch does not have any anti-patterns. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 3 new or modified test files. |
|| || || || master Compile Tests ||
| +1 | mvninstall | 5m 4s | master passed |
| +1 | compile | 1m 46s | master passed |
| +1 | checkstyle | 1m 17s | master passed |
| +1 | shadedjars | 4m 16s | branch has no errors when building our shaded downstream artifacts. |
| +1 | findbugs | 1m 59s | master passed |
| +1 | javadoc | 0m 29s | master passed |
|| || || || Patch Compile Tests ||
| +1 | mvninstall | 5m 15s | the patch passed |
| +1 | compile | 1m 46s | the patch passed |
| +1 | javac | 1m 46s | the patch passed |
| +1 | checkstyle | 1m 19s | hbase-server: The patch generated 0 new + 315 unchanged - 6 fixed = 315 total (was 321) |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | shadedjars | 4m 15s | patch has no errors when building our shaded downstream artifacts. |
| +1 | hadoopcheck | 10m 29s | Patch does not cause any errors with Hadoop 2.7.4 or 3.0.0. |
| +1 | findbugs | 2m 7s | the patch passed |
| +1 | javadoc | 0m 31s | the patch passed |
|| || || || Other Tests ||
| +1 | unit | 120m 28s | hbase-server in the patch passed. |
| +1 | asflicense | 0m 23s | The patch does not generate ASF License warnings. |
| | | 162m 7s | |

|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hbase:b002b0b |
| JIRA Issue | HBASE-21217 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12941052/HBASE-21217-v2.patch |
| Optional Tests | asflicense javac javadoc unit findbugs shadedjars hadoopcheck hbaseanti checkstyle compile |
| uname | Linux 85dae2e34597 3.13.0-143-generic #192-Ubuntu SMP Tue Feb 27 10:45:36 UTC 2018 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /home/jenkins/jenkins-slave/workspace/PreCommit-HBASE-Build/component/dev-support/hbase-personality.sh |
| git revision | master / 7ab77518a2 |
| maven | version: Apache Maven 3.5.4 (1edded0938998edf8bf061f1ceb3cfdeccf443fe; 2018-06-17T18:33:14Z) |
| Default Java | 1.8.0_181 |
| findbugs | v3.1.0-RC3 |
| Test Results | https://builds.apache.org/job/PreCommit-HBASE-Build/14485/testReport/ |
| Max. process+thread count | 5184 (vs. ulimit of 1) |
| modules | C: hbase-server U: hbase-server |
| Console output | https://builds.apache.org/job/PreCommit-HBASE-Build/14485/console |
| Powered by | Apache Yetus 0.7.0 http://yetus.apache.org |
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627095#comment-16627095 ] Duo Zhang commented on HBASE-21217: --- Let me commit.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627167#comment-16627167 ] Allan Yang commented on HBASE-21217: {quote} There is no FAILED_CLOSE state, so we cannot tell the master that the close failed. I think the only way to hit the null region is this: we have already sent the request to the RS and the RS has finished closing, but a retried RPC call has already been sent and is only scheduled after everything has finished, so it finds a null region. For this case I think it is fine to just ignore it? Not sure if there are other possible ways to get here; if so, I think there will be bugs... {quote} Quote from Review Board. Yes, I have encountered this in ITBLL; that's why I want to change back to CompatRemoteProcedureResolver in branch-2.0 and branch-2.1. But it is definitely another bug that needs to be uncovered and fixed.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627269#comment-16627269 ] Duo Zhang commented on HBASE-21217: --- Yes, when calling closeRegion we could throw a NotServingRegionException, so the master would know that the close is useless. But we still need to find out why the master issues the useless close region request...
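The two options discussed in this exchange (silently ignoring a close for a region that is no longer online, or rejecting it so the master can tell the close was useless) can be sketched as follows. This is an illustration only, not HBase code: `CloseRegionSketch` and its methods are hypothetical, and the nested `NotServingRegionException` is a local stand-in for the HBase exception of the same name so that the example compiles on its own. The sketch takes the second option.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

public class CloseRegionSketch {
    /** Local stand-in for HBase's NotServingRegionException (assumption: checked exception). */
    static final class NotServingRegionException extends Exception {
        NotServingRegionException(String message) {
            super(message);
        }
    }

    /** Hypothetical registry of regions currently online on this server. */
    private final Set<String> onlineRegions = ConcurrentHashMap.newKeySet();

    void open(String region) {
        onlineRegions.add(region);
    }

    /**
     * Closes a region. A retried or stale request for a region that is not
     * online is rejected rather than ignored, so the caller can distinguish
     * "closed now" from "nothing to close".
     */
    void closeRegion(String region) throws NotServingRegionException {
        if (!onlineRegions.remove(region)) {
            throw new NotServingRegionException(region + " is not online on this server");
        }
    }

    public static void main(String[] args) {
        CloseRegionSketch rs = new CloseRegionSketch();
        rs.open("r1");
        try {
            rs.closeRegion("r1"); // first request closes the region
            rs.closeRegion("r1"); // simulated retry arrives after the close finished
        } catch (NotServingRegionException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Rejecting the stale request does not explain why it was sent, which is the open question raised in the comment above; it only makes the situation visible to the caller.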
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627591#comment-16627591 ] stack commented on HBASE-21217: --- Is there an issue for reenabling CompatRemoteProcedureResolver in branch-2.0 and branch-2.1? Thanks.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16627665#comment-16627665 ] Hudson commented on HBASE-21217: Results for branch branch-2 [build #1300 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1300/]: (x) *-1 overall*
details (if available):
(/) +1 general checks -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1300//General_Nightly_Build_Report/]
(x) -1 jdk8 hadoop2 checks -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1300//JDK8_Nightly_Build_Report_(Hadoop2)/]
(/) +1 jdk8 hadoop3 checks -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/branch-2/1300//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) +1 source release artifact -- See build output for details.
(/) +1 client integration test
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628157#comment-16628157 ] Duo Zhang commented on HBASE-21217: --- Ping [~allan163], mind opening the issue to upload your patch for branch-2.0 & branch-2.1? Thanks.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628205#comment-16628205 ] Allan Yang commented on HBASE-21217: {quote} Ping Allan Yang, mind opening the issue to upload your patch for branch-2.0 & branch-2.1? Thanks. {quote} Sure, will open an issue later today.
[jira] [Commented] (HBASE-21217) Revisit the executeProcedure method for open/close region
[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16628645#comment-16628645 ] Hudson commented on HBASE-21217: Results for branch master [build #511 on builds.a.o|https://builds.apache.org/job/HBase%20Nightly/job/master/511/]: (x) *-1 overall*
details (if available):
(/) +1 general checks -- For more information [see general report|https://builds.apache.org/job/HBase%20Nightly/job/master/511//General_Nightly_Build_Report/]
(x) -1 jdk8 hadoop2 checks -- For more information [see jdk8 (hadoop2) report|https://builds.apache.org/job/HBase%20Nightly/job/master/511//JDK8_Nightly_Build_Report_(Hadoop2)/]
(x) -1 jdk8 hadoop3 checks -- For more information [see jdk8 (hadoop3) report|https://builds.apache.org/job/HBase%20Nightly/job/master/511//JDK8_Nightly_Build_Report_(Hadoop3)/]
(/) +1 source release artifact -- See build output for details.
(/) +1 client integration test