Andrew Purtell created HBASE-20445:
--------------------------------------

             Summary: Defer work when a row lock is busy
                 Key: HBASE-20445
                 URL: https://issues.apache.org/jira/browse/HBASE-20445
             Project: HBase
          Issue Type: Improvement
            Reporter: Andrew Purtell


Instead of blocking on row locks, defer the call and make the call runner 
available so it can service other activity. Have runners pick up deferred calls 
in the background after servicing the other request. 

Spin briefly on tryLock() wherever we are now using lock() to acquire a row 
lock. Introduce two new configuration parameters: one for the amount of time to 
wait between lock acquisition attempts, and another for the total number of 
times we wait before deferring the work. If the lock cannot be acquired, put 
the call back into the call queue. Call queues should therefore be priority 
queues sorted by deadline. Currently they are implemented with 
LinkedBlockingQueue (which is not one), or AdaptiveLifoCoDelCallQueue (which 
is) if the CoDel scheduler is enabled. Perhaps we could just require use of 
AdaptiveLifoCoDelCallQueue. Runners will be picking up work from the head of 
the queues as long as they are not empty, so deferred calls will be serviced 
again, or dropped if the deadline has passed.
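
A minimal sketch of the spin-then-defer step described above. The class name, 
method name, and the two constructor parameters are hypothetical stand-ins for 
the proposed configuration parameters; none of them are existing HBase API or 
configuration keys.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Lock;

// Hypothetical helper: class, method, and parameter names are illustrative,
// not existing HBase API or configuration keys.
final class RowLockSpinner {
    private final long waitMillis;  // assumed setting: pause between attempts
    private final int maxAttempts;  // assumed setting: attempts before deferring

    RowLockSpinner(long waitMillis, int maxAttempts) {
        this.waitMillis = waitMillis;
        this.maxAttempts = maxAttempts;
    }

    // Returns true if the lock was acquired (the caller must unlock it).
    // Returns false if the caller should defer the call instead of blocking.
    boolean tryAcquire(Lock rowLock) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (rowLock.tryLock()) {
                return true;
            }
            if (attempt + 1 < maxAttempts) {
                try {
                    TimeUnit.MILLISECONDS.sleep(waitMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();  // preserve interrupt status
                    return false;
                }
            }
        }
        return false;
    }
}
```

On a false return the handler would put the call back into the call queue 
rather than continue to wait.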

Implementing continuations for simple operations should be straightforward. 

Batch mutations try to acquire as many row locks as they can, then do the 
partial batch over the successfully locked rows, then loop back to attempt the 
remaining work. This is a partial implementation of what we need so we can 
build on it. Rather than loop around, save the partial batch completion state 
and put a pointer to it along with the call back into the RPC queue.
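
A rough sketch of what the saved partial batch completion state might look 
like. BatchProgress and runPass are hypothetical names, rows are reduced to 
strings, and the lock lookup and mutation are passed in as functions; the real 
batch mutation code path is considerably more involved.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.function.Consumer;
import java.util.function.Function;

// Hypothetical container for partial batch completion state.
final class BatchProgress {
    private final List<String> rows;  // row keys in the batch (simplified)
    private final BitSet done;        // bit i is set once rows.get(i) is applied

    BatchProgress(List<String> rows) {
        this.rows = new ArrayList<>(rows);
        this.done = new BitSet(rows.size());
    }

    boolean isComplete() {
        return done.cardinality() == rows.size();
    }

    // One pass: lock as many remaining rows as possible without blocking,
    // apply the mutation to those rows, then release the locks.
    // Returns the number of rows applied in this pass.
    int runPass(Function<String, Lock> lockFor, Consumer<String> apply) {
        List<Integer> locked = new ArrayList<>();
        List<Lock> held = new ArrayList<>();
        try {
            for (int i = done.nextClearBit(0); i < rows.size();
                 i = done.nextClearBit(i + 1)) {
                Lock lock = lockFor.apply(rows.get(i));
                if (lock.tryLock()) {
                    held.add(lock);
                    locked.add(i);
                }
            }
            for (int i : locked) {
                apply.accept(rows.get(i));
                done.set(i);
            }
        } finally {
            for (Lock lock : held) {
                lock.unlock();
            }
        }
        return locked.size();
    }
}
```

If isComplete() is still false after a pass, the handler would requeue the 
call together with a reference to this object instead of looping.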

For scans where allowPartialResults has been set to true we can simply complete 
the call at the point we fail to acquire a row lock. The client will handle the 
rest. For scans where allowPartialResults is false we have to save the scanner 
state and partial results, and put a pointer to this state along with the call 
back into the queue. 
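
The two scan cases can be sketched as follows. Everything here is a 
hypothetical simplification: rows are strings, the rowAvailable predicate 
stands in for a non-blocking row lock attempt, and ScanOutcome is an invented 
type, not an HBase class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical result of one scan attempt.
final class ScanOutcome {
    enum Kind { COMPLETE, PARTIAL_RETURNED, DEFERRED }

    final Kind kind;
    final List<String> results;  // rows gathered so far
    final int nextIndex;         // where a later attempt would resume

    ScanOutcome(Kind kind, List<String> results, int nextIndex) {
        this.kind = kind;
        this.results = results;
        this.nextIndex = nextIndex;
    }
}

final class ScanSketch {
    // Scans rows from startIndex, carrying over results from an earlier attempt.
    // rowAvailable stands in for a non-blocking row lock attempt (assumption).
    static ScanOutcome scan(List<String> rows, int startIndex, List<String> carried,
                            boolean allowPartialResults, Predicate<String> rowAvailable) {
        List<String> results = new ArrayList<>(carried);
        for (int i = startIndex; i < rows.size(); i++) {
            if (!rowAvailable.test(rows.get(i))) {
                // Partial results allowed: finish the call now, client continues.
                // Otherwise: keep the state so the requeued call can resume.
                ScanOutcome.Kind kind = allowPartialResults
                        ? ScanOutcome.Kind.PARTIAL_RETURNED
                        : ScanOutcome.Kind.DEFERRED;
                return new ScanOutcome(kind, results, i);
            }
            results.add(rows.get(i));
        }
        return new ScanOutcome(ScanOutcome.Kind.COMPLETE, results, rows.size());
    }
}
```

A DEFERRED outcome carries exactly the scanner position and partial results 
that would travel with the call back into the queue.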

We could approach this in phases:

Phase 0 - Sort out the call queuing details. Do we require 
AdaptiveLifoCoDelCallQueue? Certainly we can make use of it. Can we also have 
RWQueueRpcExecutor create queues as PriorityBlockingQueue instead of 
LinkedBlockingQueue? There must be a reason why this is not already done.

Phase 1 - Implement deferral of simple ops only. (Batch mutations and scans 
will still block on row locks.)

Phase 2 - Implement deferral of batch mutations. (Scans will still block on 
row locks.)

Phase 3 - Implement deferral of scans where allowPartialResults is false.
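
For the Phase 0 question, here is a minimal sketch of a call queue built on 
PriorityBlockingQueue and ordered by deadline, with expired calls dropped at 
pickup. DeferredCall and DeadlineCallQueue are hypothetical names, and this 
does not reflect how RWQueueRpcExecutor actually constructs its queues.

```java
import java.util.Comparator;
import java.util.concurrent.PriorityBlockingQueue;

// Hypothetical stand-in for a queued RPC call.
final class DeferredCall {
    final String name;
    final long deadlineMillis;

    DeferredCall(String name, long deadlineMillis) {
        this.name = name;
        this.deadlineMillis = deadlineMillis;
    }
}

// Hypothetical queue wrapper: earliest deadline is served first.
final class DeadlineCallQueue {
    private final PriorityBlockingQueue<DeferredCall> queue =
            new PriorityBlockingQueue<>(
                    11, Comparator.comparingLong((DeferredCall c) -> c.deadlineMillis));

    void put(DeferredCall call) {
        queue.put(call);  // unbounded queue, so this never blocks
    }

    // Returns the next call whose deadline has not passed, silently dropping
    // expired calls along the way. Returns null if nothing is left.
    DeferredCall pollLive(long nowMillis) {
        DeferredCall call;
        while ((call = queue.poll()) != null) {
            if (call.deadlineMillis > nowMillis) {
                return call;
            }
        }
        return null;
    }
}
```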



