[ 
https://issues.apache.org/jira/browse/HBASE-7445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zhong Deyin resolved HBASE-7445.
--------------------------------

    Resolution: Fixed

Modify the class org.apache.hadoop.hbase.master.handler.ServerShutdownHandler: 
in the process method, replace the call 
this.services.getAssignmentManager().assignMeta() with 
assignMetaWithRetries(), so that META assignment is retried (10 times by 
default) when the regionserver carrying META crashes.
{code}
      // Carrying meta
      if (isCarryingMeta()) {
        LOG.info("Server " + serverName +
          " was carrying META. Trying to assign.");
        this.services.getAssignmentManager().
          regionOffline(HRegionInfo.FIRST_META_REGIONINFO);
        //this.services.getAssignmentManager().assignMeta();
        assignMetaWithRetries();
      }
{code}
Add the method assignMetaWithRetries, as follows:
{code}
  private void assignMetaWithRetries() throws IOException {
    // Number of retries and the wait between attempts, in milliseconds.
    int iTimes = this.server.getConfiguration().getInt(
        "hbase.catalog.verification.retries", 10);
    long waitTime = this.server.getConfiguration().getLong(
        "hbase.catalog.verification.timeout", 1000);

    int iFlag = 0;
    LOG.info("Assigning META with retries");
    while (true) {
      try {
        this.services.getAssignmentManager().assignMeta();
        break;
      } catch (Exception e) {
        // Give up and abort once the configured retries are exhausted.
        if (iFlag >= iTimes) {
          this.server.abort("assignMeta failed after " + iTimes
              + " retries, aborting", e);
          throw new IOException("Aborting", e);
        }
        try {
          Thread.sleep(waitTime);
        } catch (InterruptedException e1) {
          LOG.warn("Interrupted while sleeping before retry", e1);
          Thread.currentThread().interrupt();
          throw new IOException("Interrupted", e1);
        }
        iFlag++;
      }
    }
    LOG.info("META assignment finished");
  }
{code}
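
For readers without an HBase tree at hand, the retry pattern used above can be 
tried in isolation. The following is only a minimal standalone sketch, not 
HBase code: the class RetrySketch, the Action interface and the method 
runWithRetries are hypothetical names invented for illustration, and the 
failing operation is simulated. It assumes the same policy as the patch: one 
initial attempt, then up to maxRetries further attempts with a fixed wait 
between them.

```java
import java.io.IOException;

/** Standalone sketch of a bounded retry loop; hypothetical, not HBase code. */
public class RetrySketch {

  /** Hypothetical stand-in for an operation such as assignMeta(). */
  @FunctionalInterface
  public interface Action {
    void run() throws Exception;
  }

  /**
   * Runs the action once and, on failure, retries up to maxRetries more
   * times, sleeping waitMillis between attempts.
   *
   * @return the total number of attempts made, including the successful one
   */
  public static int runWithRetries(Action action, int maxRetries, long waitMillis)
      throws IOException {
    int attempts = 0;
    while (true) {
      attempts++;
      try {
        action.run();
        return attempts;
      } catch (Exception e) {
        // The first attempt is not a retry, so allow maxRetries + 1 attempts.
        if (attempts > maxRetries) {
          throw new IOException("Failed after " + maxRetries + " retries", e);
        }
        try {
          Thread.sleep(waitMillis);
        } catch (InterruptedException ie) {
          // Restore the interrupt flag before giving up.
          Thread.currentThread().interrupt();
          throw new IOException("Interrupted while waiting to retry", ie);
        }
      }
    }
  }

  public static void main(String[] args) throws IOException {
    int[] calls = {0};
    // Simulated operation that fails twice and then succeeds.
    int attempts = runWithRetries(() -> {
      if (++calls[0] < 3) {
        throw new IllegalStateException("simulated failure");
      }
    }, 10, 10L);
    System.out.println("Succeeded after " + attempts + " attempts");
  }
}
```

In this sketch the caller decides what to do when the retries run out; the 
patch above additionally aborts the server at that point, which the sketch 
leaves out.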
                
> Hbase cluster is unavailable while the regionserver that Meta table deployed 
> crashed
> ------------------------------------------------------------------------------------
>
>                 Key: HBASE-7445
>                 URL: https://issues.apache.org/jira/browse/HBASE-7445
>             Project: HBase
>          Issue Type: Bug
>          Components: Region Assignment, regionserver
>    Affects Versions: 0.94.1
>         Environment: Hadoop 0.20.2-cdh3u3
> Hbase 0.94.1
>            Reporter: Zhong Deyin
>              Labels: patch
>   Original Estimate: 336h
>  Remaining Estimate: 336h
>
> When the regionserver hosting the .META. table crashes, the .META. table is 
> not reassigned to another available regionserver. Region splitting then 
> cannot find the .META. table, which leaves the whole cluster unavailable.
> Code path: org.apache.hadoop.hbase.master.handler.ServerShutdownHandler
> {code}
>       // Carrying meta
>       if (isCarryingMeta()) {
>         LOG.info("Server " + serverName +
>           " was carrying META. Trying to assign.");
>         this.services.getAssignmentManager().
>           regionOffline(HRegionInfo.FIRST_META_REGIONINFO);
>         this.services.getAssignmentManager().assignMeta();
>       }
> {code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
