[ 
https://issues.apache.org/jira/browse/HBASE-5237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13190869#comment-13190869
 ] 

ramkrishna.s.vasudevan commented on HBASE-5237:
-----------------------------------------------

@Stack
Sorry if you feel the check-in is not correct.  Please find below the analysis 
and the scenario that explain why this fix is needed.
As per HBASE-4397, if no RS is alive after we prepare a region plan and send an 
RPC to open the region, we get an exception.  While handling it we try to get a 
new region plan, and if that is null we set
{code}
this.timeoutMonitor.setAllRegionServersOffline(true);
{code}

This patch does the same thing.  The difference is that it also covers the case 
where we find a null region plan even before sending an RPC, which likewise 
means that no servers are alive.
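
To illustrate the idea, here is a minimal, self-contained sketch.  It is not the 
actual AssignmentManager code: the class NullPlanSketch, the method choosePlan 
and the static flag are hypothetical stand-ins for getRegionPlan and 
timeoutMonitor.setAllRegionServersOffline(true).

```java
import java.util.List;

public class NullPlanSketch {
    // Hypothetical stand-in for the flag kept by the timeout monitor.
    public static boolean allRegionServersOffline = false;

    // Hypothetical planner: returns a destination server name,
    // or null when no region server is alive.
    public static String choosePlan(List<String> liveServers) {
        return liveServers.isEmpty() ? null : liveServers.get(0);
    }

    // Returns true if an open request would be sent, false if assign gives up.
    public static boolean assign(List<String> liveServers) {
        String plan = choosePlan(liveServers);
        if (plan == null) {
            // No plan before any RPC is sent: record that no servers are
            // alive so the monitor can retry early instead of waiting for
            // the full timeout.
            allRegionServersOffline = true;
            return false;
        }
        return true;
    }
}
```

In this sketch, calling assign with an empty server list returns false and leaves 
the flag set, which is the state the timeout monitor later checks.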

bq. And then in this case we set a flag up in TM. But TM only runs every 
30 minutes so the setting of this flag doesn't do much?
As per the patch in HBASE-4397, when the TM timeout has not yet elapsed there is 
an additional check that helps assign regions earlier.
{code}
if (regionState.getStamp() + timeout <= now) {
  actOnTimeOut(unassigns, assigns, regionState, regionInfo);
} else if (this.allRegionServersOffline && !allRSsOffline) {
  actOnTimeOut(unassigns, assigns, regionState, regionInfo);
}
{code}
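
The decision in the snippet above can be restated as a small standalone sketch.  
The class and method names here are hypothetical, and the two boolean 
parameters are my assumption about what this.allRegionServersOffline and 
allRSsOffline represent (the flag set earlier, and whether all servers are still 
offline right now).

```java
public class TimeoutCheckSketch {
    // Returns true when the monitor should act on a region in transition.
    //   stamp               time the region entered its current state
    //   timeout             configured timeout of the monitor
    //   now                 current time
    //   flaggedAllOffline   flag set earlier when no region server was alive
    //   allOfflineNow       whether all region servers are still offline
    public static boolean shouldAct(long stamp, long timeout, long now,
                                    boolean flaggedAllOffline,
                                    boolean allOfflineNow) {
        if (stamp + timeout <= now) {
            // Normal path: the region has timed out.
            return true;
        }
        // Early path: servers were all offline before, but at least one
        // is alive again, so there is no need to wait for the timeout.
        return flaggedAllOffline && !allOfflineNow;
    }
}
```

With these assumptions, a region that has not yet timed out is acted on only 
when the flag was set and at least one server has come back.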

Even when the offline-servers flag was set in the middle of an assign, the 
current code returned on a null plan, leaving the assign to wait for the TM.
That is now avoided: the assign will be tried again if any RS comes alive 
soon after this.
We hit this problem in our cluster, and I uploaded the patch after verifying 
it.  Please correct me if I am wrong.
                
> Addendum for HBASE-5160 and HBASE-4397
> --------------------------------------
>
>                 Key: HBASE-5237
>                 URL: https://issues.apache.org/jira/browse/HBASE-5237
>             Project: HBase
>          Issue Type: Bug
>    Affects Versions: 0.90.5
>            Reporter: ramkrishna.s.vasudevan
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.0, 0.90.6
>
>         Attachments: HBASE-5237_0.90.patch, HBASE-5237_trunk.patch
>
>
> As part of HBASE-4397 there is one more scenario where the patch has to be 
> applied.
> {code}
> RegionPlan plan = getRegionPlan(state, forceNewPlan);
>       if (plan == null) {
>         debugLog(state.getRegion(),
>             "Unable to determine a plan to assign " + state);
>         return; // Should get reassigned later when RIT times out.
>       }
> {code}
> I think in this scenario also 
> {code}
> this.timeoutMonitor.setAllRegionServersOffline(true);
> {code}
> this should be done.


        
