Allan Yang created HBASE-21050:
----------------------------------

             Summary: Exclusive lock may be held by a SUCCESS state procedure forever
                 Key: HBASE-21050
                 URL: https://issues.apache.org/jira/browse/HBASE-21050
             Project: HBase
          Issue Type: Sub-task
          Components: amv2
    Affects Versions: 2.0.1, 2.1.0
            Reporter: Allan Yang
            Assignee: Allan Yang


After HBASE-20846, we restore lock info for procedures. However, there is a 
case where the lock can be held by a procedure that has already succeeded. 
Since that procedure won't execute again, the lock will be held by it forever.

1. All children of pid=1208 had finished, but the master was killed before 
procedure 1208 woke up
{code}
2018-08-05 02:20:14,465 INFO  [PEWorker-8] procedure2.ProcedureExecutor(1659): Finished subprocedure(s) of pid=1208, ppid=1206, state=RUNNABLE, hasLock=true; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034; resume parent processing.

2018-08-05 02:20:14,466 INFO  [PEWorker-8] procedure2.ProcedureExecutor(1296): Finished pid=1232, ppid=1208, state=SUCCESS, hasLock=false; AssignProcedure table=IntegrationTestBigLinkedList, region=c2a23a735f16df57299dba6fd4599f2f, target=e010125050127.bja,60020,1533403109034 in 1.5060sec
{code}

2. The master restarted. Since procedure 1208 held the lock before the 
restart, the lock was restored for it
{code}
2018-08-05 02:20:30,803 DEBUG [Thread-15] procedure2.ProcedureExecutor(456): Loading pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034

2018-08-05 02:20:30,818 DEBUG [Thread-15] procedure2.Procedure(898): pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034 held the lock before restarting, call acquireLock to restore it.

2018-08-05 02:20:30,818 INFO  [Thread-15] procedure.MasterProcedureScheduler(631): pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034 checking lock on c2a23a735f16df57299dba6fd4599f2f
{code}

3. Since procedure 1208 is already in the SUCCESS state, it won't execute 
again, so the lock will be held by it forever

We need to check the state of the procedure before restoring locks. If the 
procedure is already finished (success or rollback), we do not need to acquire 
the lock for it.
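
The proposed check could look roughly like the sketch below. It is a 
self-contained illustration only, not the actual HBase code: the class 
LockRestoreSketch, the State enum, the Proc type, its lockedWhenLoading field, 
and restoreLocks are all hypothetical stand-ins, and pid=2000 is made-up demo 
data (1208 and 1232 mirror the states seen in the logs above).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class LockRestoreSketch {
    // Hypothetical, simplified procedure states (not the real HBase enum).
    enum State { RUNNABLE, WAITING, SUCCESS, ROLLEDBACK }

    // Hypothetical stand-in for a procedure loaded from the procedure store.
    static final class Proc {
        final long pid;
        final State state;
        final boolean lockedWhenLoading; // held the lock before the restart

        Proc(long pid, State state, boolean lockedWhenLoading) {
            this.pid = pid;
            this.state = state;
            this.lockedWhenLoading = lockedWhenLoading;
        }

        // Finished procedures will never be executed again.
        boolean isFinished() {
            return state == State.SUCCESS || state == State.ROLLEDBACK;
        }
    }

    // Returns the pids whose locks should be restored after a restart.
    static List<Long> restoreLocks(List<Proc> loaded) {
        List<Long> restored = new ArrayList<>();
        for (Proc p : loaded) {
            if (!p.lockedWhenLoading) {
                continue; // never held the lock, nothing to restore
            }
            if (p.isFinished()) {
                continue; // the proposed check: do not re-acquire for finished procedures
            }
            restored.add(p.pid); // a real implementation would acquire the lock here
        }
        return restored;
    }

    public static void main(String[] args) {
        List<Proc> loaded = Arrays.asList(
            new Proc(2000, State.RUNNABLE, true),  // hypothetical unfinished procedure
            new Proc(1208, State.SUCCESS, true),   // the case from this issue
            new Proc(1232, State.SUCCESS, false)); // finished child, no lock
        System.out.println(restoreLocks(loaded));
    }
}
```

With this check, pid=1208 is skipped during lock restoration, so the region 
lock is not re-acquired by a procedure that will never run and release it.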



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
