[ 
https://issues.apache.org/jira/browse/HBASE-21217?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16623038#comment-16623038
 ] 

Allan Yang commented on HBASE-21217:
------------------------------------

ExecuteProcedures is problematic, since it groups all the open/close 
operations in one call and executes them sequentially on the target RS. If one 
operation fails, all the operations are marked as failed. In fact, some of 
the operations (like open region) are already executing in the open region 
handler thread. But the master thinks these operations failed and reassigns the 
regions to another RS. So when the previous RS reports to the master that the 
region is online, the master will kill that RS, since it has already assigned 
the region to another RS.
In our internal version, I have already discarded the ExecuteProcedures method 
and use the CompatRemoteProcedureResolver to send the open/close requests one 
by one.
Should we fall back to CompatRemoteProcedureResolver until we find a way to 
resolve this? [~Apache9], [~stack].
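
To make the failure mode concrete, here is a minimal, self-contained sketch in 
plain Java. It is not HBase code: RegionOp, Outcome, submit, dispatchBatched 
and dispatchOneByOne are hypothetical names invented for illustration, and the 
"running" list merely stands in for work already handed to the open region 
handler thread on the RS.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class DispatchSketch {
    /** What the master believes happened to a request (hypothetical type). */
    enum Outcome { SUBMITTED, FAILED }

    /** Hypothetical stand-in for one open/close region request. */
    static final class RegionOp {
        final String region;
        final boolean failsOnSubmit;

        RegionOp(String region, boolean failsOnSubmit) {
            this.region = region;
            this.failsOnSubmit = failsOnSubmit;
        }
    }

    /** Simulates the RS side: hand the op to a handler, or throw. */
    static void submit(RegionOp op, List<String> running) {
        if (op.failsOnSubmit) {
            throw new IllegalStateException("cannot submit " + op.region);
        }
        running.add(op.region);
    }

    /** Batched call: a single failure marks every op in the batch as failed. */
    static Map<String, Outcome> dispatchBatched(List<RegionOp> ops, List<String> running) {
        Outcome outcome = Outcome.SUBMITTED;
        try {
            for (RegionOp op : ops) {
                submit(op, running);
            }
        } catch (IllegalStateException e) {
            outcome = Outcome.FAILED;
        }
        Map<String, Outcome> masterView = new LinkedHashMap<>();
        for (RegionOp op : ops) {
            masterView.put(op.region, outcome);
        }
        return masterView;
    }

    /** One request per call: each op gets its own outcome. */
    static Map<String, Outcome> dispatchOneByOne(List<RegionOp> ops, List<String> running) {
        Map<String, Outcome> masterView = new LinkedHashMap<>();
        for (RegionOp op : ops) {
            try {
                submit(op, running);
                masterView.put(op.region, Outcome.SUBMITTED);
            } catch (IllegalStateException e) {
                masterView.put(op.region, Outcome.FAILED);
            }
        }
        return masterView;
    }

    /** Three requests; only the second one fails to submit. */
    static List<RegionOp> sampleOps() {
        return Arrays.asList(
            new RegionOp("r1", false),
            new RegionOp("r2", true),
            new RegionOp("r3", false));
    }

    public static void main(String[] args) {
        List<String> running = new ArrayList<>();
        System.out.println("batched:    master=" + dispatchBatched(sampleOps(), running)
            + " running=" + running);
        running = new ArrayList<>();
        System.out.println("one by one: master=" + dispatchOneByOne(sampleOps(), running)
            + " running=" + running);
    }
}
```

In the batched case r1 is already running on the RS while the master records 
it as failed, which is the situation that leads to the reassignment described 
above. In the one-by-one case only r2 is recorded as failed.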

> Revisit the executeProcedure method for open/close region
> ---------------------------------------------------------
>
>                 Key: HBASE-21217
>                 URL: https://issues.apache.org/jira/browse/HBASE-21217
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2, proc-v2
>            Reporter: Duo Zhang
>            Priority: Critical
>             Fix For: 3.0.0, 2.2.0
>
>
> Currently we just call openRegion and closeRegion directly, which is a bit 
> buggy. For example, in order not to fail all the open region requests when 
> there is only one failure, we catch the exception and set a flag in the 
> return value. But for the executeProcedures call, the return value is 
> ignored, and we expect that the openRegion method will always call 
> reportRegionStateTransition to report the failure, but in fact it does not...
> And after HBASE-20881, we can confirm that the race could happen, where we 
> send a close request to a region which is opening (HBASE-21199), and vice 
> versa. So I think here we need to revisit the implementation of 
> executeProcedures to make it more stable.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
