Sergey Shelukhin created HBASE-21788:
----------------------------------------

             Summary: OpenRegionProcedure (after recovery?) is unreliable and 
needs to be improved
                 Key: HBASE-21788
                 URL: https://issues.apache.org/jira/browse/HBASE-21788
             Project: HBase
          Issue Type: Bug
    Affects Versions: 3.0.0
            Reporter: Sergey Shelukhin


Not much to go on for this one yet.
I repeatedly see cases where a region is stuck in OPENING; after a master 
restart the RIT is recovered and stays WAITING, while its OpenRegionProcedure 
(also recovered) is stuck in Runnable and does nothing for hours. I cannot 
find any logs on the target server indicating that it tried to do anything 
after the master restart.

At the very least, this procedure needs to log what it is trying to do, and 
maybe it needs a timeout so that it fails unconditionally after a 
configurable period (1 hour?).
I may also investigate why it doesn't do anything and file a separate bug. I 
suspect it is somehow related to the region status check, but this is just a 
hunch.
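
To make the logging and timeout suggestion concrete, here is a rough sketch of 
the kind of check I have in mind. It is not HBase code: OpenRegionWatchdog, 
Verdict, and check() are hypothetical names made up for illustration, and the 
timeout value is assumed to come from some configuration setting that does not 
exist yet.

```java
import java.util.logging.Logger;

// Hypothetical helper for illustration only; this is not an HBase class.
final class OpenRegionWatchdog {
    enum Verdict { KEEP_WAITING, FAIL }

    private static final Logger LOG =
        Logger.getLogger(OpenRegionWatchdog.class.getName());

    private final long timeoutMillis; // assumed configurable, e.g. 1 hour
    private final long startMillis;   // when the open attempt began

    OpenRegionWatchdog(long timeoutMillis, long startMillis) {
        this.timeoutMillis = timeoutMillis;
        this.startMillis = startMillis;
    }

    // Logs what is being waited on; reports FAIL once the timeout has elapsed.
    Verdict check(String region, String targetServer, long nowMillis) {
        long elapsed = nowMillis - startMillis;
        if (elapsed >= timeoutMillis) {
            LOG.warning(String.format(
                "Open of region %s on %s timed out after %d ms; failing",
                region, targetServer, elapsed));
            return Verdict.FAIL;
        }
        LOG.info(String.format(
            "Still waiting for %s to open region %s (%d of %d ms elapsed)",
            targetServer, region, elapsed, timeoutMillis));
        return Verdict.KEEP_WAITING;
    }
}
```

Whatever drives the procedure would call check() periodically and fail the 
procedure on FAIL, so a stuck open at least leaves a trail in the master log.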



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)