[ 
https://issues.apache.org/jira/browse/HBASE-21611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin updated HBASE-21611:
-------------------------------------
    Summary: REGION_STATE_TRANSITION_CONFIRM_CLOSED should interact better with 
crash procedure  (was: REGION_STATE_TRANSITION_CONFIRM_CLOSED should interact 
better with crash procedure.)

> REGION_STATE_TRANSITION_CONFIRM_CLOSED should interact better with crash 
> procedure
> ----------------------------------------------------------------------------------
>
>                 Key: HBASE-21611
>                 URL: https://issues.apache.org/jira/browse/HBASE-21611
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Sergey Shelukhin
>            Priority: Major
>
> 1) Not a bug per se, since HDFS is not supposed to lose files, but the 
> behavior is fragile.
> When a dead server's WAL directory is deleted (through manual intervention 
> or some issue with HDFS) while some regions are in CLOSING state on that 
> server, those regions get stuck forever in a loop: 
> REGION_STATE_TRANSITION_CONFIRM_CLOSED, then REGION_STATE_TRANSITION_CLOSE, 
> then "give up and mark the procedure as complete, the parent procedure will 
> take care of this". There is no crash procedure for the server, so nothing 
> ever takes care of those regions.
> 2) Under normal circumstances, when a large WAL is being split, the same 
> loop keeps spamming the logs and wasting resources until the crash 
> procedure completes. There is no reason for it to retry; it should just 
> wait for the crash procedure.
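
The two cases above can be pictured with a small toy model. This is only a sketch of the decision the description asks for, not HBase code: the `Step` enum, the `next_step` function and its three boolean inputs are hypothetical names made up for illustration, and the ORPHANED outcome is an assumption about how case 1 might be surfaced, since the description does not say what should happen there.

```python
from enum import Enum


class Step(Enum):
    # Hypothetical outcomes for illustration; these are not HBase states.
    DONE = "region confirmed closed"
    RETRY_CLOSE = "go back to REGION_STATE_TRANSITION_CLOSE"
    WAIT = "suspend until the crash procedure finishes"
    ORPHANED = "dead server with no crash procedure"


def next_step(region_closed, server_dead, crash_procedure_running):
    """Pick what to do from REGION_STATE_TRANSITION_CONFIRM_CLOSED (toy model)."""
    if region_closed:
        return Step.DONE
    if not server_dead:
        # The server is alive, so asking it to close the region again can work.
        return Step.RETRY_CLOSE
    if crash_procedure_running:
        # Case 2: retrying cannot succeed while the WAL is still being split,
        # so wait for the crash procedure instead of looping.
        return Step.WAIT
    # Case 1: no crash procedure exists, so looping would never end.
    # Assumption: flag the region instead of retrying forever.
    return Step.ORPHANED
```

In this model the current behavior corresponds to returning RETRY_CLOSE in both dead-server branches, which is what produces the endless loop in case 1 and the log spam in case 2.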



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
