Master can fail if ZooKeeper session expires
--------------------------------------------

                 Key: HBASE-5549
                 URL: https://issues.apache.org/jira/browse/HBASE-5549
             Project: HBase
          Issue Type: Bug
          Components: master, zookeeper
    Affects Versions: 0.96.0
         Environment: all
            Reporter: nkeywal
            Assignee: nkeywal
            Priority: Minor


RecoverableZooKeeper has a retry mechanism, but when the session expires the 
whole ZooKeeperWatcher is recreated, so the retries keep running against the 
old object and cannot succeed. This is why a sleep is needed in 
TestZooKeeper#testMasterSessionExpired: we need to wait for ZooKeeperWatcher to 
be recreated before using the connection.

This can also happen in real life, with the following sequence:
- master & zookeeper start
- the zookeeper connection is cut
- master enters the retry loop
- in the meantime the session expires
- the network comes back, the session is recreated
- the retries continue, but on the wrong object, and hence fail.
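
The sequence above can be sketched in a few lines of Java. This is only an
illustration of the failure mode, not HBase code: FakeSession and FakeWatcher
are hypothetical stand-ins for the ZooKeeper handle and for ZooKeeperWatcher,
and readWithRetries is an assumed helper, not the RecoverableZooKeeper API.
It shows that a retry loop holding a handle captured before the expiry never
succeeds, while a loop that looks up the current handle on each attempt does.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

public class StaleHandleRetryDemo {

    /** Hypothetical stand-in for a ZooKeeper session handle (not the real client API). */
    static final class FakeSession {
        private volatile boolean expired = false;

        void expire() { expired = true; }

        String read(String path) {
            if (expired) {
                throw new IllegalStateException("session expired");
            }
            return "data@" + path;
        }
    }

    /** Hypothetical stand-in for the watcher that owns the current session. */
    static final class FakeWatcher {
        private final AtomicReference<FakeSession> current =
            new AtomicReference<>(new FakeSession());

        FakeSession session() { return current.get(); }

        /** Simulates session expiry followed by recreation of the session object. */
        void expireAndRecreate() {
            current.get().expire();
            current.set(new FakeSession());
        }
    }

    /** Retries a read against whatever handle the supplier returns on each attempt. */
    static String readWithRetries(Supplier<FakeSession> handle, String path, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return handle.get().read(path);
            } catch (IllegalStateException e) {
                // Session is unusable; try again with the next handle the supplier gives us.
            }
        }
        return null; // all attempts failed
    }

    public static void main(String[] args) {
        FakeWatcher watcher = new FakeWatcher();
        FakeSession captured = watcher.session(); // handle taken before the expiry
        watcher.expireAndRecreate();              // session expires, a new one is created

        // Retrying on the captured (stale) object fails on every attempt.
        String stale = readWithRetries(() -> captured, "/hbase/master", 3);
        // Re-resolving the handle on each attempt picks up the recreated session.
        String fresh = readWithRetries(watcher::session, "/hbase/master", 3);

        System.out.println("stale handle: " + stale); // prints "stale handle: null"
        System.out.println("fresh handle: " + fresh); // prints "fresh handle: data@/hbase/master"
    }
}
```

Whether the actual fix should re-resolve the handle inside the retry loop or
keep the same wrapper object across session recreation is a design choice this
sketch does not settle.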


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
