Hedwig PubSubServer must wait for its Zookeeper client to be connected upon 
startup
-----------------------------------------------------------------------------------

                 Key: BOOKKEEPER-63
                 URL: https://issues.apache.org/jira/browse/BOOKKEEPER-63
             Project: Bookkeeper
          Issue Type: Bug
          Components: hedwig-server
            Reporter: Matthieu Morel
            Priority: Minor


When a PubSubServer is instantiated in *non-standalone* mode, it creates a 
ZkTopicManager, which takes a ZooKeeper client as an argument.
Unfortunately, this client may not yet be in the CONNECTED state. When that 
is the case, creation of the ZkTopicManager fails, which in turn causes the 
PubSubServer startup to fail.
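
One possible way to close this gap is to block startup on a latch until the 
client reports that it is connected. The sketch below is not taken from any 
patch attached to this issue. ConnectionGate, onConnected and awaitConnected 
are hypothetical names, and the background thread only stands in for the 
client's event thread. In real code, onConnected() would be called from the 
ZooKeeper Watcher.process(WatchedEvent) callback when event.getState() is 
KeeperState.SyncConnected.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: lets startup code block until the client is connected. */
class ConnectionGate {
    private final CountDownLatch connected = new CountDownLatch(1);

    /** Call this from the client's connection callback once it is connected. */
    void onConnected() {
        connected.countDown();
    }

    /** Returns true if the connection was reported within the timeout. */
    boolean awaitConnected(long timeout, TimeUnit unit) throws InterruptedException {
        return connected.await(timeout, unit);
    }
}

class ConnectionGateDemo {
    public static void main(String[] args) throws InterruptedException {
        ConnectionGate gate = new ConnectionGate();

        // Stand-in for the client's event thread reporting the connection later.
        Thread eventThread = new Thread(() -> {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            gate.onConnected();
        });
        eventThread.start();

        // Startup code waits here instead of using an unconnected client.
        if (!gate.awaitConnected(2, TimeUnit.SECONDS)) {
            throw new IllegalStateException("client did not connect in time");
        }
        System.out.println("connected, safe to create the topic manager");
        eventThread.join();
    }
}
```

A bounded wait is used so that startup fails with a clear error if the 
ensemble is unreachable, rather than hanging forever.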

Typical error (adapted, line numbers take into account commented patching code):
java.io.IOException: org.apache.hedwig.exceptions.PubSubException$ServiceDownException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hedwig/standalone/hosts/x.x.x.x:4080:9876
        at org.apache.hedwig.server.netty.PubSubServer.instantiateTopicManager(PubSubServer.java:170)
        at org.apache.hedwig.server.netty.PubSubServer$3.run(PubSubServer.java:294)
        at java.lang.Thread.run(Thread.java:680)
Caused by: org.apache.hedwig.exceptions.PubSubException$ServiceDownException: org.apache.zookeeper.KeeperException$ConnectionLossException: KeeperErrorCode = ConnectionLoss for /hedwig/standalone/hosts/x.x.x.x:4080:9876
        at org.apache.hedwig.server.topics.ZkTopicManager$4.safeProcessResult(ZkTopicManager.java:146)
etc...

This is particularly problematic when running tests that need to pass a 
configuration to the PubSubServer.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
