[ https://issues.apache.org/jira/browse/HBASE-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhong Deyin resolved HBASE-7445.
--------------------------------

    Resolution: Fixed

Modified class org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: in the process method, the call this.services.getAssignmentManager().assignMeta() is replaced with assignMetaWithRetries(), so that META assignment is retried (10 retries by default) when the regionserver carrying META has crashed.
{code}
    // Carrying meta
    if (isCarryingMeta()) {
      LOG.info("Server " + serverName + " was carrying META. Trying to assign.");
      this.services.getAssignmentManager().
        regionOffline(HRegionInfo.FIRST_META_REGIONINFO);
      // Previously: this.services.getAssignmentManager().assignMeta();
      assignMetaWithRetries();
    }
{code}

The new method assignMetaWithRetries is as follows:
{code}
  private void assignMetaWithRetries() throws IOException {
    // Maximum number of retries after the first failed attempt.
    int iTimes = this.server.getConfiguration().getInt(
        "hbase.catalog.verification.retries", 10);
    // Pause between attempts, in milliseconds.
    long waitTime = this.server.getConfiguration().getLong(
        "hbase.catalog.verification.timeout", 1000);

    int iFlag = 0; // number of retries performed so far
    LOG.info("Assigning META with retries");
    while (true) {
      try {
        this.services.getAssignmentManager().assignMeta();
        break;
      } catch (Exception e) {
        if (iFlag >= iTimes) {
          // Out of retries: abort the master and surface the failure.
          this.server.abort("assignMeta failed after " + iTimes
              + " retries, aborting", e);
          throw new IOException("Aborting", e);
        }
        try {
          Thread.sleep(waitTime);
        } catch (InterruptedException e1) {
          LOG.warn("Interrupted while sleeping before retry", e1);
          Thread.currentThread().interrupt();
          throw new IOException("Interrupted", e1);
        }
        iFlag++;
      }
    }
    LOG.info("Finished assigning META");
  }
{code}

> Hbase cluster is unavailable while the regionserver that Meta table deployed crashed
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-7445
>                 URL: https://issues.apache.org/jira/browse/HBASE-7445
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment, regionserver
>    Affects Versions: 0.94.1
>         Environment: Hadoop 0.20.2-cdh3u3
>                      Hbase 0.94.1
>            Reporter: Zhong Deyin
>              Labels: patch
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> When the regionserver hosting the .META. table crashes, the .META. table
> is not reassigned to another available regionserver. Region splitting
> then cannot find the META table, and the whole cluster becomes unavailable.
> Code path: org.apache.hadoop.hbase.master.handler.ServerShutdownHandler
> {code}
>     // Carrying meta
>     if (isCarryingMeta()) {
>       LOG.info("Server " + serverName + " was carrying META. Trying to assign.");
>       this.services.getAssignmentManager().
>         regionOffline(HRegionInfo.FIRST_META_REGIONINFO);
>       this.services.getAssignmentManager().assignMeta();
>     }
> {code}
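
For anyone who wants to try the retry behaviour outside a running master, the loop in assignMetaWithRetries can be approximated by the standalone sketch below. RetrySketch and callWithRetries are hypothetical names used only for illustration and are not part of HBase. The sketch assumes the same semantics as the patch: one initial attempt plus up to maxRetries retries, with a fixed pause between attempts.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetrySketch {

    /**
     * Hypothetical helper (not an HBase API): runs the task, retrying on any
     * exception up to maxRetries times with a fixed pause between attempts.
     */
    public static <T> T callWithRetries(Callable<T> task, int maxRetries,
                                        long waitMillis) throws IOException {
        int retries = 0; // retries performed so far
        while (true) {
            try {
                return task.call();
            } catch (Exception e) {
                if (retries >= maxRetries) {
                    // Out of retries: report the last failure as the cause.
                    throw new IOException(
                        "Giving up after " + maxRetries + " retries", e);
                }
                try {
                    Thread.sleep(waitMillis);
                } catch (InterruptedException ie) {
                    // Restore the interrupt flag before bailing out.
                    Thread.currentThread().interrupt();
                    throw new IOException("Interrupted while waiting to retry", ie);
                }
                retries++;
            }
        }
    }

    public static void main(String[] args) throws IOException {
        int[] attempts = {0};
        // Stand-in for assignMeta(): fails twice, then succeeds.
        String result = callWithRetries(() -> {
            attempts[0]++;
            if (attempts[0] < 3) {
                throw new IllegalStateException("assignment not possible yet");
            }
            return "assigned";
        }, 10, 1L);
        System.out.println(result + " after " + attempts[0] + " attempts");
    }
}
```

With maxRetries set to 10, as in the patch's default, a task that never succeeds is attempted 11 times in total before the IOException is thrown.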