[ https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623038#comment-16623038 ]
Allan Yang commented on HBASE-21217:
------------------------------------

ExecuteProcedures is problematic because it groups all the open/close operations into one call and executes them sequentially on the target RS. If one operation fails, all of the operations are marked as failed. In fact, some of those operations (such as open region) are already executing in the open region handler thread, but the master thinks they failed and reassigns the regions to another RS. When the previous RS then reports to the master that the region is online, the master kills that RS because it has already assigned the region to another RS.

In our internal version, I have already dropped the ExecuteProcedures method and use CompatRemoteProcedureResolver to send the open/close requests one by one. Should we fall back to CompatRemoteProcedureResolver until we find a way to resolve this? [~Apache9], [~stack]

> Revisit the executeProcedure method for open/close region
> ---------------------------------------------------------
>
>                 Key: HBASE-21217
>                 URL: https://issues.apache.org/jira/browse/HBASE-21217
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2, proc-v2
>            Reporter: Duo Zhang
>            Priority: Critical
>             Fix For: 3.0.0, 2.2.0
>
>
> Currently we just call openRegion and closeRegion directly, which is a bit
> buggy. For example, in order not to fail all the open region requests when
> there is only one failure, we catch the exception and set a flag in the
> return value. But for the executeProcedures call, the return value is
> ignored, and we expect that the openRegion method will always call
> reportRegionStateTransition to report the failure, but in fact it does not...
> And after HBASE-20881, we can confirm that the race can happen, where we
> send a close request to a region which is opening (HBASE-21199), and vice
> versa. So I think we need to revisit the implementation of
> executeProcedures here to make it more stable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
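
The difference between the two dispatch styles discussed in the comment above can be illustrated with a toy model. This is only a sketch under stated assumptions: `BatchFailureSketch`, `RegionOp`, `Outcome`, `executeAsBatch` and `executeOneByOne` are hypothetical names invented for illustration and are not HBase classes or methods, and the model ignores the asynchronous open region handler entirely. It only shows how a single failure in a grouped call can cause operations that actually succeeded to be recorded as failed, whereas sending requests one by one keeps each outcome separate.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class BatchFailureSketch {

    /** Hypothetical stand-in for a single open/close region request. */
    interface RegionOp {
        String region();
        void run() throws Exception;
    }

    enum Outcome { SUCCEEDED, FAILED }

    /** Builds a toy operation that either completes or throws. */
    static RegionOp op(final String region, final boolean fails) {
        return new RegionOp() {
            @Override public String region() { return region; }
            @Override public void run() throws Exception {
                if (fails) {
                    throw new Exception("operation failed on " + region);
                }
            }
        };
    }

    /**
     * Grouped model: operations run sequentially inside one call, and one
     * exception fails the whole call, so the caller records every operation
     * in the group as failed, including those that had already completed.
     */
    static Map<String, Outcome> executeAsBatch(List<RegionOp> ops) {
        boolean callFailed = false;
        try {
            for (RegionOp op : ops) {
                op.run();
            }
        } catch (Exception e) {
            callFailed = true;
        }
        Map<String, Outcome> outcomes = new LinkedHashMap<>();
        for (RegionOp op : ops) {
            outcomes.put(op.region(), callFailed ? Outcome.FAILED : Outcome.SUCCEEDED);
        }
        return outcomes;
    }

    /** One-by-one model: each request is sent and judged on its own. */
    static Map<String, Outcome> executeOneByOne(List<RegionOp> ops) {
        Map<String, Outcome> outcomes = new LinkedHashMap<>();
        for (RegionOp op : ops) {
            try {
                op.run();
                outcomes.put(op.region(), Outcome.SUCCEEDED);
            } catch (Exception e) {
                outcomes.put(op.region(), Outcome.FAILED);
            }
        }
        return outcomes;
    }

    public static void main(String[] args) {
        // Region names are made up; only the second operation fails.
        List<RegionOp> ops = Arrays.asList(
            op("region-a", false), op("region-b", true), op("region-c", false));
        System.out.println("grouped:    " + executeAsBatch(ops));
        System.out.println("one by one: " + executeOneByOne(ops));
    }
}
```

In the grouped model, `region-a` completed before the failure but is still reported as failed, which corresponds to the situation where the master reassigns a region that the original RS later reports as online.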