[ 
https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284026#comment-13284026
 ] 

ramkrishna.s.vasudevan commented on HBASE-6046:
-----------------------------------------------

When the master tries to come back up in 
{code}
private boolean tryRecoveringExpiredZKSession() throws InterruptedException,
      IOException, KeeperException, ExecutionException {
{code}
we just initialize the ZK trackers, assign root and meta, and finally 
assign any regions in transition.
But if an RS goes down during this window, we miss its callback entirely and 
its logs are never split.  Also, because the AM is reinitialized, we always 
treat the recovery as a clean cluster start.
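
To illustrate the missed-callback problem, here is a minimal sketch (not the 
actual HBase code) of one way a recovering master could reconcile server state: 
compare the region servers it knew before the session expired with those that 
are live afterwards, and treat the difference as servers whose logs still need 
splitting. The class and method names (ExpiredSessionReconciler, 
findServersLostDuringExpiry) are hypothetical and exist only for this example.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only; this is not an HBase API.
public class ExpiredSessionReconciler {

    /**
     * Returns the servers that were known before the ZK session expired
     * but are no longer live after recovery. These are the servers whose
     * expiry callbacks the master would have missed.
     */
    public static Set<String> findServersLostDuringExpiry(
            Set<String> knownBeforeExpiry, Set<String> liveAfterRecovery) {
        // Copy first so the caller's set is not modified.
        Set<String> lost = new TreeSet<>(knownBeforeExpiry);
        lost.removeAll(liveAfterRecovery);
        return lost;
    }

    public static void main(String[] args) {
        // Assumed server names, purely for the example.
        Set<String> before = new TreeSet<>(Arrays.asList("rs1", "rs2", "rs3"));
        Set<String> after = new TreeSet<>(Arrays.asList("rs1", "rs3"));

        for (String server : findServersLostDuringExpiry(before, after)) {
            System.out.println("needs log splitting: " + server);
        }
    }
}
```

With such a list in hand, the master could run its normal dead-server handling 
for each lost RS instead of assuming a clean cluster start.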
                
> Master retry on ZK session expiry causes inconsistent region assignments.
> -------------------------------------------------------------------------
>
>                 Key: HBASE-6046
>                 URL: https://issues.apache.org/jira/browse/HBASE-6046
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.92.1, 0.94.0
>            Reporter: Gopinathan A
>            Assignee: ramkrishna.s.vasudevan
>             Fix For: 0.92.2, 0.94.1
>
>
> 1> A ZK session timeout in the HMaster leads to bulk assignment even though 
> all the RSs are online.
> 2> While doing bulk assignment, if the master goes down again and restarts (or 
> a backup comes up), it tries to reassign all the nodes created in ZK to the 
> new RSs. This leads to double assignment.
> We had 2800 regions; 1900 of them got double assigned, taking the region 
> count to 4700. 


        
