We are seeing occasional incidents where a ZooKeeper Java client
hangs in CountDownLatch.await() while waiting for a connection to be
established. Our connect() code is pretty standard, I think, and is
similar to this:
private ZooKeeper connect(String hosts, int sessionTimeout)
        throws IOException, InterruptedException {
    final CountDownLatch connectedSignal = new CountDownLatch(1);
    ZooKeeper zk = new ZooKeeper(hosts, sessionTimeout, new Watcher() {
        @Override
        public void process(WatchedEvent event) {
            if (event.getState() == Event.KeeperState.SyncConnected) {
                connectedSignal.countDown();
            }
        }
    });
    connectedSignal.await();
    return zk;
}
Has anyone else had an issue with the await() blocking forever like
this? Any advice?
As a "fix" I am considering adding a timeout to the CountDownLatch
await() call; if we fail to connect within that timeout then retry the
connection attempt. After, say, 3 retries, give up entirely.
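To make the idea concrete, here is a rough sketch of the timeout-and-retry
loop. It is not tested against a real ensemble, and it is written against a
hypothetical Connector interface (not part of the ZooKeeper API) so that it
stands alone; the names TimedConnect, Connector, connectWithRetry,
maxAttempts and timeoutMs are all made up for illustration.

```java
import java.io.IOException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

final class TimedConnect {

    /**
     * Hypothetical factory (an assumption, not a ZooKeeper type): starts one
     * connection attempt and counts the latch down once it is connected.
     */
    interface Connector<T extends AutoCloseable> {
        T open(CountDownLatch connectedSignal) throws Exception;
    }

    /**
     * Tries up to maxAttempts times, waiting at most timeoutMs per attempt.
     * Returns the connected client, or throws IOException if every attempt
     * times out.
     */
    static <T extends AutoCloseable> T connectWithRetry(
            Connector<T> connector, int maxAttempts, long timeoutMs)
            throws Exception {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            // A fresh latch per attempt, so a late event from an abandoned
            // attempt cannot release a later wait.
            CountDownLatch connectedSignal = new CountDownLatch(1);
            T client = connector.open(connectedSignal);
            // await(timeout, unit) returns false if the timeout elapsed
            // before the latch reached zero.
            if (connectedSignal.await(timeoutMs, TimeUnit.MILLISECONDS)) {
                return client;
            }
            // Close the stalled attempt before retrying so it does not leak.
            client.close();
        }
        throw new IOException(
                "Failed to connect after " + maxAttempts + " attempts");
    }
}
```

With the real client, the Connector would presumably wrap the
new ZooKeeper(...) call and watcher from connect() above, passing the latch
in place of connectedSignal. This assumes the client version in use
implements AutoCloseable; if it does not, the sketch would need to call the
client's close() directly instead.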
Thanks!
--
John Lindwall