[ https://issues.apache.org/jira/browse/HBASE-20990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16564712#comment-16564712 ]
Allan Yang commented on HBASE-20990:
------------------------------------

{quote}
I prefer not returning anything when calling executeProcedure, instead, using reportRegionTransition and reportProcedureResult to send back the response...
{quote}
Then you need to record the exceptions in memory and send them back to the master when reporting. The sync RPC call becomes an async one, so what happens if the RS restarts before sending this info? The procedure in the master would not even know whether the open/close procedure is executing, or whether an RPC retry is needed.

> One operation in procedure batch throws an exception will cause all
> RegionTransitionProcedures receive the same exception
> -------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-20990
>                 URL: https://issues.apache.org/jira/browse/HBASE-20990
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2
>    Affects Versions: 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>
> In AMv2, we batch open/close region operations and call the RS with the
> executeProcedures API. But in this API, if one region's operation throws an
> exception, all the operations in the batch receive the same exception, even
> though some of the operations in the batch are actually executing normally
> on the RS.
> I think we should catch exceptions for each operation separately, and call
> remoteCallFailed or remoteCallCompleted on the corresponding
> RegionTransitionProcedure.
> Otherwise, there will be some very strange behavior.
> Such as this one:
> {code}
> 2018-07-18 02:56:18,506 WARN [RSProcedureDispatcher-pool3-t1] assignment.RegionTransitionProcedure(226): Remote call failed e010125048016.bja,60020,1531848989401; pid=8362, ppid=8272, state=RUNNABLE:REGION_TRANSITION_DISPATCH; AssignProcedure table=IntegrationTestBigLinkedList, region=0beb8ea4e2f239fc082be7cefede1427, target=e010125048016.bja,60020,1531848989401; rit=OPENING, location=e010125048016.bja,60020,1531848989401; exception=NotServingRegionException
> {code}
> The AssignProcedure failed with a NotServingRegionException, which is very
> strange. In fact, the AssignProcedure succeeded on the RS; the exception came
> from a separate CloseRegion operation that failed in the same batch.
> To correct this, we need to modify the response of the executeProcedures API,
> which is the ExecuteProceduresResponse proto, to return info (status,
> exceptions) per operation.
> This issue alone won't cause much trouble, so there is no hurry to change the
> behavior here, but we do need to consider it when we refactor AMv2.
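
The per-operation handling proposed in the description can be sketched roughly as follows. This is a minimal, self-contained Java illustration and not HBase code: `Operation`, `Result`, and `executeBatch` are hypothetical names, pid 8363 is made up for the example, and an `IllegalStateException` stands in for the `NotServingRegionException` seen in the log. The real change would go through the ExecuteProceduresResponse proto, which is not modeled here.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PerOperationResultSketch {

    /** Hypothetical stand-in for the work of one open/close region request. */
    @FunctionalInterface
    public interface RegionAction {
        void run() throws Exception;
    }

    /** One entry of the batch, keyed by the procedure id that dispatched it. */
    public static final class Operation {
        public final long procId;
        public final RegionAction action;

        public Operation(long procId, RegionAction action) {
            this.procId = procId;
            this.action = action;
        }
    }

    /** Outcome of a single operation; error is null on success. */
    public static final class Result {
        public final boolean succeeded;
        public final Exception error;

        private Result(boolean succeeded, Exception error) {
            this.succeeded = succeeded;
            this.error = error;
        }

        public static Result ok() {
            return new Result(true, null);
        }

        public static Result failed(Exception e) {
            return new Result(false, e);
        }
    }

    /**
     * Runs every operation in the batch and records its outcome separately,
     * so a failure in one operation is not attributed to the others.
     */
    public static Map<Long, Result> executeBatch(List<Operation> batch) {
        Map<Long, Result> results = new LinkedHashMap<>();
        for (Operation op : batch) {
            try {
                op.action.run();
                results.put(op.procId, Result.ok());
            } catch (Exception e) {
                results.put(op.procId, Result.failed(e));
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<Operation> batch = new ArrayList<>();
        // Assign-like operation that succeeds.
        batch.add(new Operation(8362L, () -> { }));
        // Close-like operation that fails (hypothetical pid).
        batch.add(new Operation(8363L, () -> {
            throw new IllegalStateException("region not online");
        }));

        Map<Long, Result> results = executeBatch(batch);
        results.forEach((pid, r) ->
            System.out.println("pid=" + pid + " succeeded=" + r.succeeded));
    }
}
```

With results keyed by procedure id, the caller on the master side could invoke the equivalent of remoteCallCompleted for pid 8362 and remoteCallFailed only for the operation that actually failed.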