[jira] [Commented] (HBASE-5916) RS restart just before master intialization we make the cluster non operative

chunhui shen (JIRA) Fri, 25 May 2012 10:13:24 -0700

    [ 
https://issues.apache.org/jira/browse/HBASE-5916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13283605#comment-13283605
 ]


chunhui shen commented on HBASE-5916:
-------------------------------------

@ram
Thanks for writing up the case in such detail.
However, I don't think the case above can happen. Correct me if I'm wrong.

bq.At the same time as master initialization has already been done and so we 
are able to carry on assignment with SSH also. This will lead to double 
assignment
Why would it lead to double assignment? When SSH reassigns regions, it skips 
some of them, as follows:
{code}
if (processDeadRegion(e.getKey(), e.getValue(),
    this.services.getAssignmentManager(),
    this.server.getCatalogTracker())) {
  ServerName addressFromAM = this.services.getAssignmentManager()
      .getRegionServerOfRegion(e.getKey());
  if (rit != null && !rit.isClosing() && !rit.isPendingClose()) {
    // Skip regions that were in transition unless CLOSING or
    // PENDING_CLOSE
    LOG.info("Skip assigning region " + rit.toString());
  } else if (addressFromAM != null
      && !addressFromAM.equals(this.serverName)) {
    LOG.debug("Skip assigning region "
        + e.getKey().getRegionNameAsString()
        + " because it has been opened in "
        + addressFromAM.getServerName());
  } else {
    toAssignRegions.add(e.getKey());
  }
}
{code}
Region is in transition (and not CLOSING or PENDING_CLOSE; it cannot be in 
either of those two states in the case above) -> skip
Region is already online on another server -> skip
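
To make the two skip conditions concrete, here is a minimal standalone sketch of the same decision. It is not HBase code: SkipAssignSketch, RitState and shouldAssign are hypothetical names used only for illustration, and servers are represented as plain strings.

```java
// Minimal sketch of the SSH skip decision described above.
// All names here are hypothetical stand-ins, not HBase classes.
final class SkipAssignSketch {

    // Simplified region-in-transition state; NONE means "not in transition".
    enum RitState { NONE, OPENING, CLOSING, PENDING_CLOSE }

    /**
     * Returns true when the region should be added to the to-assign list.
     *
     * @param rit           transition state of the region
     * @param hostingServer server the region is currently online on, or null
     * @param deadServer    server being processed by the shutdown handler
     */
    static boolean shouldAssign(RitState rit, String hostingServer, String deadServer) {
        boolean closing = rit == RitState.CLOSING || rit == RitState.PENDING_CLOSE;
        if (rit != RitState.NONE && !closing) {
            // In transition and not CLOSING/PENDING_CLOSE: skip.
            return false;
        }
        if (hostingServer != null && !hostingServer.equals(deadServer)) {
            // Already online on another server: skip.
            return false;
        }
        return true;
    }

    public static void main(String[] args) {
        String dead = "rs1";
        System.out.println(shouldAssign(RitState.OPENING, null, dead));       // in transition
        System.out.println(shouldAssign(RitState.NONE, "rs2", dead));         // online elsewhere
        System.out.println(shouldAssign(RitState.NONE, "rs1", dead));         // still mapped to dead server
        System.out.println(shouldAssign(RitState.PENDING_CLOSE, null, dead)); // closing states are reassigned
    }
}
```

Only the last two calls return true, so a region that another code path is already assigning, or has already opened elsewhere, is never queued a second time.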

Finally, I think HBASE-5916_trunk_v7.patch is fine, and I agree that we should 
check in the patch for the current JIRA.
Thanks for helping clear up my doubt.
                
> RS restart just before master intialization we make the cluster non operative
> -----------------------------------------------------------------------------
>
>                 Key: HBASE-5916
>                 URL: https://issues.apache.org/jira/browse/HBASE-5916
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.92.1, 0.94.0
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>            Priority: Critical
>             Fix For: 0.94.1
>
>         Attachments: HBASE-5916_trunk.patch, HBASE-5916_trunk_1.patch, 
> HBASE-5916_trunk_1.patch, HBASE-5916_trunk_2.patch, HBASE-5916_trunk_3.patch, 
> HBASE-5916_trunk_4.patch, HBASE-5916_trunk_v5.patch, 
> HBASE-5916_trunk_v6.patch, HBASE-5916_trunk_v7.patch, HBASE-5916v8.patch
>
>
> Consider a case where the master is being restarted. An RS that was alive when 
> the master restart began is itself restarted before the master enables the 
> ServerShutdownHandler.
> {code}
> serverShutdownHandlerEnabled = true;
> {code}
> In this case, when the RS tries to register with the master, the master will 
> try to expire the server, but the server cannot be expired because the 
> serverShutdownHandler is not yet enabled.
> This can happen when only one RS is restarted or when all the RSs are 
> restarted at the same time (before assignRootandMeta).
> {code}
> LOG.info(message);
>       if (existingServer.getStartcode() < serverName.getStartcode()) {
>         LOG.info("Triggering server recovery; existingServer " +
>           existingServer + " looks stale, new server:" + serverName);
>         expireServer(existingServer);
>       }
> {code}
> If another RS is brought up, the cluster returns to normal.
> This may be a rare corner case.

