Jimmy Xiang created HBASE-11197:
-----------------------------------

             Summary: Region could remain unassigned if regionserver crashes
                 Key: HBASE-11197
                 URL: https://issues.apache.org/jira/browse/HBASE-11197
             Project: HBase
          Issue Type: Bug
          Components: Region Assignment
            Reporter: Jimmy Xiang
            Assignee: Jimmy Xiang

While looking into the failure of the test testVisibilityLabelsOnKillingOfRSContainingLabelsTable, I found that this is what happened:

1. try to assign a region to a region server;
2. master creates a znode, and sends an openRegion request to the rs;
3. rs gets the request and sends back a response, then crashes;
4. try to assign the region again with forceNewPlan = true;
5. since the region is in transition, master tries to close it and gets a region server stopped exception;
6. master offlines the region and removes it from transition, but can't assign the region since the dead server has not been processed yet;
7. now SSH (ServerShutdownHandler) finally kicks in and tries to assign this region again;
8. SSH fails to assign it since the znode is already there.

We should clean up the znode when force-offlining a region.
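
The sequence above can be sketched as a toy model, to make it easier to see why the stale znode blocks the last assignment and how cleaning it up during force offline would help. This is only an illustration under simplifying assumptions: the class, the methods (`assign`, `forceOffline`, `simulate`) and the two sets are hypothetical stand-ins, not HBase or ZooKeeper APIs, and the real master code has many more states.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model of the scenario; none of these names are HBase APIs. */
public class StaleZnodeSketch {
    // Hypothetical stand-in for the region znodes kept in ZooKeeper.
    static final Set<String> znodes = new HashSet<>();
    // Hypothetical stand-in for the master's regions-in-transition set.
    static final Set<String> inTransition = new HashSet<>();

    /** Creates the znode and marks the region in transition; fails if the znode already exists. */
    static boolean assign(String region) {
        if (!znodes.add(region)) {
            return false; // step 8: the leftover znode blocks the assignment
        }
        inTransition.add(region);
        return true;
    }

    /** Step 6: offline the region; optionally delete its znode (the proposed fix). */
    static void forceOffline(String region, boolean cleanUpZnode) {
        inTransition.remove(region);
        if (cleanUpZnode) {
            znodes.remove(region);
        }
    }

    /** Replays steps 1-8 and reports whether the final assignment by SSH succeeds. */
    static boolean simulate(boolean cleanUpZnode) {
        znodes.clear();
        inTransition.clear();
        String region = "region-1";
        assign(region);                     // steps 1-3: znode created, then the rs crashes
        forceOffline(region, cleanUpZnode); // steps 4-6: close fails, region is offlined
        return assign(region);              // steps 7-8: SSH retries the assignment
    }

    public static void main(String[] args) {
        System.out.println("without cleanup: " + simulate(false)); // prints "without cleanup: false"
        System.out.println("with cleanup: " + simulate(true));     // prints "with cleanup: true"
    }
}
```

In this model the only difference between the two runs is whether `forceOffline` removes the znode, which mirrors the fix suggested above.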