[ 
https://issues.apache.org/jira/browse/HBASE-21050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16581702#comment-16581702
 ] 

stack commented on HBASE-21050:
-------------------------------

Ok. Testing this is hard. Everything happens down inside the load procedures. 
The lock is restored post-crash and the child is marked completed because it 
is 'finished', so it is not rescheduled. When I try to test this, the child 
procedure evaporates before I can get hold of it. I made various attempts at 
an 'entity' lock with a lifecycle independent of Procedure, but what is wanted 
is to exercise the locking we do inside the MasterProcedureScheduler, where 
it, an independent entity, has a special mechanism for keeping up region 
locks. Building a test case that has MasterProcedureScheduler at its core 
with Region entities would be a good bit of work. I'm passing on it for now.

Let me commit this patch. At the very least, after being in here a while, the 
patch makes even more sense.

> Exclusive lock may be held by a SUCCESS state procedure forever
> ---------------------------------------------------------------
>
>                 Key: HBASE-21050
>                 URL: https://issues.apache.org/jira/browse/HBASE-21050
>             Project: HBase
>          Issue Type: Sub-task
>          Components: amv2
>    Affects Versions: 2.1.0, 2.0.1
>            Reporter: Allan Yang
>            Assignee: Allan Yang
>            Priority: Major
>         Attachments: HBASE-21050.branch-2.0.001.patch
>
>
> After HBASE-20846, we restore lock info for procedures. But there is a case 
> where the lock can be held by a procedure that has already succeeded. Since 
> the procedure won't execute again, the lock will be held by it forever.
> 1. All children of pid=1208 had finished, but before procedure 1208 woke up, 
> the master was killed:
> {code}
> 2018-08-05 02:20:14,465 INFO  [PEWorker-8] 
> procedure2.ProcedureExecutor(1659): Finished subprocedure(s) of pid=1208, 
> ppid=1206, state=RUNNABLE, hasLock=true; MoveRegionProcedure 
> hri=c2a23a735f16df57299dba6fd4599f2f, 
> source=e010125050127.bja,60020,1533403109034, 
> destination=e010125050127.bja,60020,1533403109034; resume parent processing.
> 2018-08-05 02:20:14,466 INFO  [PEWorker-8] 
> procedure2.ProcedureExecutor(1296): Finished pid=1232, ppid=1208, 
> state=SUCCESS, hasLock=false; AssignProcedure 
> table=IntegrationTestBigLinkedList, 
> region=c2a23a735f16df57299dba6fd4599f2f, 
> target=e010125050127.bja,60020,1533403109034 in 1.5060sec
> {code}
> 2. The master restarts. Since procedure 1208 held the lock before the 
> restart, the lock was restored for it:
> {code}
> 2018-08-05 02:20:30,803 DEBUG [Thread-15] procedure2.ProcedureExecutor(456): 
> Loading pid=1208, ppid=1206, state=SUCCESS, hasLock=false; 
> MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, 
> source=e010125050127.bja,60020,1533403109034, 
> destination=e010125050127.bja,60020,1533403109034
> 2018-08-05 02:20:30,818 DEBUG [Thread-15] procedure2.Procedure(898): 
> pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure 
> hri=c2a23a735f16df57299dba6fd4599f2f, 
> source=e010125050127.bja,60020,1533403109034, 
> destination=e010125050127.bja,60020,1533403109034 held 
> the lock before restarting, call acquireLock to restore it.
> 2018-08-05 02:20:30,818 INFO  [Thread-15] 
> procedure.MasterProcedureScheduler(631): pid=1208, ppid=1206, state=SUCCESS, 
> hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, 
> source=e010125050127.bja,60020,1533403109034, 
> destination=e010125050127.bja,60020,1533403109034 checking lock on 
> c2a23a735f16df57299dba6fd4599f2f
> {code}
> 3. Since procedure 1208 is already in the SUCCESS state, it won't execute 
> again, so the lock will be held by it forever.
> We need to check the state of the procedure before restoring locks: if the 
> procedure is already finished (success or rollback), we do not need to 
> acquire the lock for it.
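
The check proposed in the description can be sketched roughly as below. This is a minimal, self-contained model and not the actual patch or the real procedure2 API: the LockRestoreSketch class, its State enum, the Proc fields, and restoreLocks are all hypothetical names invented for illustration. It only shows the idea that a finished procedure is skipped when locks are restored after a restart.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical, simplified model; these are NOT the real HBase classes.
class LockRestoreSketch {

  // Assumed subset of procedure states, for illustration only.
  enum State { RUNNABLE, WAITING, SUCCESS, ROLLEDBACK }

  static final class Proc {
    final long pid;
    final State state;
    final boolean heldLockBeforeRestart; // lock info persisted before the crash
    final String region;

    Proc(long pid, State state, boolean heldLockBeforeRestart, String region) {
      this.pid = pid;
      this.state = state;
      this.heldLockBeforeRestart = heldLockBeforeRestart;
      this.region = region;
    }

    // A finished procedure will never be scheduled again.
    boolean isFinished() {
      return state == State.SUCCESS || state == State.ROLLEDBACK;
    }
  }

  // Returns region -> owning pid for the locks restored after a restart.
  static Map<String, Long> restoreLocks(List<Proc> loaded) {
    Map<String, Long> owners = new HashMap<>();
    for (Proc p : loaded) {
      if (!p.heldLockBeforeRestart) {
        continue; // nothing to restore
      }
      if (p.isFinished()) {
        // Restoring here would leak the lock: the procedure never runs
        // again, so nothing would ever release it.
        continue;
      }
      owners.put(p.region, p.pid);
    }
    return owners;
  }
}
```

With this model, a procedure loaded in the SUCCESS state (like pid=1208 in the logs above) gets no lock back, while a still-runnable procedure that held a lock before the restart does.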



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
