Allan Yang created HBASE-21364:
----------------------------------

             Summary: Procedure holds the lock should put to front of the queue 
after restart
                 Key: HBASE-21364
                 URL: https://issues.apache.org/jira/browse/HBASE-21364
             Project: HBase
          Issue Type: Sub-task
    Affects Versions: 2.0.2, 2.1.0
            Reporter: Allan Yang
            Assignee: Allan Yang


After restoring the procedures from the Procedure WALs, we put the runnable 
procedures back into the queue to execute. The order was not a problem before 
HBASE-20846, since the first procedure to execute would acquire the lock 
itself. But since HBASE-20846 the locks are restored as well. If a procedure 
without the lock is executed before a procedure that holds the lock in the 
same queue, there is a race condition in which none of the procedures in that 
queue may ever be executed.
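
As a rough illustration of the ordering the summary asks for, the sketch 
below re-queues restored procedures so that those already holding a lock come 
first. It is not the patch: RestoreOrderSketch, Proc, and requeue are 
hypothetical names, and the real scheduler keeps per-table, per-server, and 
other queues that are not modeled here.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

public class RestoreOrderSketch {
    // Hypothetical stand-in for a restored procedure; not an HBase class.
    static final class Proc {
        final String name;
        final boolean holdsLock;

        Proc(String name, boolean holdsLock) {
            this.name = name;
            this.holdsLock = holdsLock;
        }
    }

    // Re-queue restored procedures: lock holders first, everything else
    // after them. Relative order inside each group is preserved.
    static Deque<Proc> requeue(List<Proc> restored) {
        Deque<Proc> queue = new ArrayDeque<>();
        for (Proc p : restored) {
            if (p.holdsLock) {
                queue.addLast(p);
            }
        }
        for (Proc p : restored) {
            if (!p.holdsLock) {
                queue.addLast(p);
            }
        }
        return queue;
    }

    public static void main(String[] args) {
        Deque<Proc> queue = requeue(List.of(
            new Proc("table-proc-waiting-for-lock", false),
            new Proc("region-proc-holding-lock", true)));
        // The lock holder is now at the head of the queue.
        System.out.println(queue.peekFirst().name);
    }
}
```
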
The race condition is:
1. A procedure that needs the table's exclusive lock is put into the table's 
queue, while the table's shared lock is held by a Region Procedure. Since no 
one holds the exclusive lock, the queue is put into the run queue. But soon 
the worker thread sees that the procedure can't execute because it doesn't 
hold the lock, so it stops executing and removes the queue from the run queue.
2. At the same time, the Region Procedure, which holds the table's shared lock 
and the region's exclusive lock, is put into the table's queue. But since the 
queue has already been added to the run queue, it is not added again.
3. Because of step 1, the table's queue is removed from the run queue.
4. After that, no one puts the table's queue back, so no worker will execute 
the procedures inside it.
A test case in the patch shows how.
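
The four steps can also be replayed in a small single-threaded model. This is 
only a sketch under simplifying assumptions, not the test case from the patch 
and not HBase scheduler code: StuckQueueSketch and replay are hypothetical 
names, the table queue is a plain deque, the run queue is reduced to one 
boolean, and the interleaving is fixed instead of being raced by real threads.

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class StuckQueueSketch {
    // Returns how many procedures are left unexecuted in the table queue.
    static int replay(boolean lockHolderFirst) {
        Deque<String> tableQueue = new ArrayDeque<>();
        // Restored after restart: REGION already owns the table's shared lock.
        boolean sharedLockHeld = true;

        // Steps 1 and 2: both procedures enter the table queue. The queue is
        // scheduled once; the second add finds it already scheduled.
        if (lockHolderFirst) {
            tableQueue.addLast("REGION");
            tableQueue.addLast("TABLE");
        } else {
            tableQueue.addLast("TABLE");
            tableQueue.addLast("REGION");
        }
        boolean inRunQueue = true;

        // Worker loop.
        while (inRunQueue && !tableQueue.isEmpty()) {
            String head = tableQueue.peekFirst();
            if (head.equals("TABLE") && sharedLockHeld) {
                // Step 3: TABLE cannot take the exclusive lock, so the worker
                // drops the table queue from the run queue.
                inRunQueue = false;
            } else {
                tableQueue.pollFirst();
                if (head.equals("REGION")) {
                    sharedLockHeld = false; // REGION finishes and releases
                }
            }
        }
        // Step 4: nothing re-adds the queue, so leftovers stay stuck.
        return tableQueue.size();
    }

    public static void main(String[] args) {
        System.out.println(replay(false)); // prints 2: both procedures stuck
        System.out.println(replay(true));  // prints 0: lock holder ran first
    }
}
```

With the lock holder at the head of the queue, the Region Procedure runs and 
releases the shared lock before the table procedure is examined, so the queue 
drains in this model.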



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
