[ 
https://issues.apache.org/jira/browse/HBASE-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13090298#comment-13090298
 ] 

ramkrishna.s.vasudevan commented on HBASE-4124:
-----------------------------------------------

@Gao
Correct me if I am wrong.  I can understand the intention behind the logic. 
{code}
+          RegionTransitionData data = ZKAssign.getData(watcher, regionInfo.getEncodedName());
+          
+          //When zk node has been updated by a living server, we consider that this region server is handling it.
+          //So we should skip it and process it in processRegionsInTransition.
+          if (data != null && data.getServerName() != null &&
+            serverManager.isServerOnline(data.getServerName())){
+              LOG.info("The region " + regionInfo.getEncodedName() +
+                "is processing by " + data.getServerName());
+            continue;
+          }
{code}
But as part of rebuildUserRegions() the master finds a server to be dead and 
adds that RS to the dead servers, and you also said the master was killed.
How can we have a dead RS if we don't kill the RS? And if the master is also 
killed, how can the regions be assigned to some other RS (how can the state 
change in ZK for that region node)?
Maybe I am not understanding something.  If you can explain this, it will help me 
with the TimeoutMonitor. 
The rest looks fine.  
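
To check that I am reading the skip condition correctly, here is a minimal standalone sketch of it. It is only an illustration: SkipCheckSketch, regionsToProcess, and the plain String/Map/Set types are hypothetical stand-ins for RegionTransitionData, ZKAssign.getData() and serverManager.isServerOnline(), not the real HBase classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the check in the patch; not an HBase class.
final class SkipCheckSketch {

    /**
     * Returns the regions that would NOT be skipped by the check.
     *
     * znodeOwner maps an encoded region name to the server name last
     * written into its ZK node (assumption: no entry or null means the
     * node has no data or no server name).
     */
    static List<String> regionsToProcess(List<String> regions,
                                         Map<String, String> znodeOwner,
                                         Set<String> onlineServers) {
        List<String> result = new ArrayList<>();
        for (String region : regions) {
            String owner = znodeOwner.get(region);
            // Mirrors: data != null && data.getServerName() != null
            //          && serverManager.isServerOnline(data.getServerName())
            if (owner != null && onlineServers.contains(owner)) {
                // A live server updated the node, so leave the region
                // to processRegionsInTransition.
                continue;
            }
            result.add(region);
        }
        return result;
    }
}
```

With this reading, a region is skipped only when its ZK node names a server that the master currently sees as online; a node that names a dead server, or has no data, is still handled here. My question above is about how the first case can arise in the scenario you described.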

> ZK restarted while assigning a region, new active HM re-assign it but the RS 
> warned 'already online on this server'.
> --------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-4124
>                 URL: https://issues.apache.org/jira/browse/HBASE-4124
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>            Reporter: fulin wang
>            Assignee: gaojinchao
>             Fix For: 0.90.5
>
>         Attachments: HBASE-4124_Branch90V1_trial.patch, 
> HBASE-4124_Branch90V2.patch, HBASE-4124_Branch90V3.patch, log.txt
>
>   Original Estimate: 0.4h
>  Remaining Estimate: 0.4h
>
> ZK restarted while assigning a region, new active HM re-assign it but the RS 
> warned 'already online on this server'.
> Issue:
> The RS fails because of 'already online on this server' and returns; the HM 
> cannot receive the message and reports 'Regions in transition timed out'.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
