Allan Yang created HBASE-21050:
----------------------------------
Summary: Exclusive lock may be held by a SUCCESS state procedure
forever
Key: HBASE-21050
URL: https://issues.apache.org/jira/browse/HBASE-21050
Project: HBase
Issue Type: Sub-task
Components: amv2
Affects Versions: 2.0.1, 2.1.0
Reporter: Allan Yang
Assignee: Allan Yang
After HBASE-20846, we restore lock info for procedures. But there is a case
where the lock can be held by an already successful procedure. Since the
procedure won't execute again, the lock will be held by it forever.
1. All children of pid=1208 had finished, but the master was killed before
procedure 1208 was woken up
{code}
2018-08-05 02:20:14,465 INFO [PEWorker-8] procedure2.ProcedureExecutor(1659): Finished subprocedure(s) of pid=1208, ppid=1206, state=RUNNABLE, hasLock=true; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034; resume parent processing.
2018-08-05 02:20:14,466 INFO [PEWorker-8] procedure2.ProcedureExecutor(1296): Finished pid=1232, ppid=1208, state=SUCCESS, hasLock=false; AssignProcedure table=IntegrationTestBigLinkedList, region=c2a23a735f16df57299dba6fd4599f2f, target=e010125050127.bja,60020,1533403109034 in 1.5060sec
{code}
2. The master restarts. Since procedure 1208 held the lock before the restart,
the lock was restored for it
{code}
2018-08-05 02:20:30,803 DEBUG [Thread-15] procedure2.ProcedureExecutor(456): Loading pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034
2018-08-05 02:20:30,818 DEBUG [Thread-15] procedure2.Procedure(898): pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034 held the lock before restarting, call acquireLock to restore it.
2018-08-05 02:20:30,818 INFO [Thread-15] procedure.MasterProcedureScheduler(631): pid=1208, ppid=1206, state=SUCCESS, hasLock=false; MoveRegionProcedure hri=c2a23a735f16df57299dba6fd4599f2f, source=e010125050127.bja,60020,1533403109034, destination=e010125050127.bja,60020,1533403109034 checking lock on c2a23a735f16df57299dba6fd4599f2f
{code}
3. Since procedure 1208 is already in the SUCCESS state, it won't execute
again, so the lock will be held by it forever
We need to check the state of the procedure before restoring locks. If the
procedure is already finished (success or rollback), we do not need to acquire
the lock for it.
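
To make the proposed check concrete, here is a minimal toy sketch of lock restoration at load time, with and without the state check. It is not the actual patch and uses no HBase classes: ProcState, ToyProcedure, LockRestorer and the checkState flag are hypothetical names invented for illustration, and the lock table is just a map from region name to owner pid.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical toy types, NOT HBase classes.
enum ProcState { RUNNABLE, SUCCESS, ROLLEDBACK }

final class ToyProcedure {
    final long pid;
    final ProcState state;
    final boolean lockedWhenLoading; // persisted "held the lock before restart" flag
    final String region;

    ToyProcedure(long pid, ProcState state, boolean lockedWhenLoading, String region) {
        this.pid = pid;
        this.state = state;
        this.lockedWhenLoading = lockedWhenLoading;
        this.region = region;
    }

    // A finished procedure will never be scheduled again, so it can never release a lock.
    boolean isFinished() {
        return state == ProcState.SUCCESS || state == ProcState.ROLLEDBACK;
    }
}

final class LockRestorer {
    // region name -> pid of the procedure holding the exclusive lock
    private final Map<String, Long> exclusiveLocks = new HashMap<>();

    // checkState=false models the behaviour described in this issue;
    // checkState=true models the proposed fix.
    void restoreLocks(List<ToyProcedure> loaded, boolean checkState) {
        for (ToyProcedure p : loaded) {
            if (!p.lockedWhenLoading) {
                continue; // did not hold a lock before the restart
            }
            if (checkState && p.isFinished()) {
                continue; // already success or rolled back: do not re-acquire
            }
            exclusiveLocks.put(p.region, p.pid);
        }
    }

    Long ownerOf(String region) {
        return exclusiveLocks.get(region);
    }
}

final class LockRestoreDemo {
    public static void main(String[] args) {
        String region = "c2a23a735f16df57299dba6fd4599f2f";
        List<ToyProcedure> loaded =
            Arrays.asList(new ToyProcedure(1208L, ProcState.SUCCESS, true, region));

        LockRestorer noCheck = new LockRestorer();
        noCheck.restoreLocks(loaded, false);
        System.out.println("without check, owner=" + noCheck.ownerOf(region));

        LockRestorer withCheck = new LockRestorer();
        withCheck.restoreLocks(loaded, true);
        System.out.println("with check, owner=" + withCheck.ownerOf(region));
    }
}
```

In this toy model, pid=1208 ends up owning the region lock when the check is off, and nothing ever releases it. With the check on, the lock stays free while unfinished procedures still get their locks back.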
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)