Andrew Purtell created HBASE-20087:
--------------------------------------

             Summary: Periodically attempt redeploy of regions in FAILED_OPEN state
                 Key: HBASE-20087
                 URL: https://issues.apache.org/jira/browse/HBASE-20087
             Project: HBase
          Issue Type: Improvement
          Components: master, Region Assignment
            Reporter: Andrew Purtell
            Assignee: Andrew Purtell
             Fix For: 2.0.0, 1.5.0


Because RSGroups can leave regions permanently in transition (RIT) in the 
FAILED_OPEN state, we added logic to the master portion of the RSGroups 
extension to enumerate RITs and retry assignment of regions in FAILED_OPEN 
state.

However, this strategy can be applied generally to reduce the need for 
operator involvement in cluster operations. Today an operator has to resolve 
FAILED_OPEN assignments manually, but there is little risk in automatically 
retrying them after a while. If the cause of the failure has not cleared, the 
assignment will simply fail again. If the cause has been resolved, the retry 
can succeed and operators don't have to do anything more for the cluster to 
fully heal.
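
The retry idea could be sketched roughly as follows. This is only an 
illustration under stated assumptions, not the actual patch and not HBase 
API: the class FailedOpenRetrySketch, its methods, the retry delay, and the 
tryAssign callback standing in for the master's assign call are all 
hypothetical names invented for this example. Each pass retries only regions 
whose last failure is at least the delay old; a retry that fails again just 
resets the timer.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical sketch; none of these names are HBase APIs.
class FailedOpenRetrySketch {
    // Region name -> time (ms) of its most recent failed open attempt.
    private final Map<String, Long> lastFailureMs = new LinkedHashMap<>();
    private final long retryDelayMs;
    // Stand-in for the master's assign call: returns true if the region opened.
    private final Predicate<String> tryAssign;

    FailedOpenRetrySketch(long retryDelayMs, Predicate<String> tryAssign) {
        this.retryDelayMs = retryDelayMs;
        this.tryAssign = tryAssign;
    }

    // Record that a region entered FAILED_OPEN at the given time.
    void markFailedOpen(String region, long nowMs) {
        lastFailureMs.put(region, nowMs);
    }

    int pendingCount() {
        return lastFailureMs.size();
    }

    // One pass of the periodic task; returns the regions that opened on retry.
    List<String> retryDue(long nowMs) {
        List<String> reopened = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = lastFailureMs.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            String region = entry.getKey();
            if (nowMs - entry.getValue() < retryDelayMs) {
                continue; // failed too recently, wait for a later pass
            }
            if (tryAssign.test(region)) {
                it.remove();            // healed without operator action
                reopened.add(region);
            } else {
                entry.setValue(nowMs);  // failed again, retry after another delay
            }
        }
        return reopened;
    }

    public static void main(String[] args) {
        Set<String> healthy = new HashSet<>();
        FailedOpenRetrySketch sketch = new FailedOpenRetrySketch(1000L, healthy::contains);
        sketch.markFailedOpen("region-a", 0L);
        System.out.println(sketch.retryDue(1000L)); // cause not cleared yet
        healthy.add("region-a");                    // underlying problem fixed
        System.out.println(sketch.retryDue(2000L)); // retried on the next due pass
    }
}
```

In a real master this pass would presumably be driven by a periodic 
background task, and the delay would be configurable; both are assumptions 
here rather than details taken from this issue.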

 



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
