Hi Yunong,

Yes, this looks like a bug. The problem is that the C client does not handle the case where connect() returns EINPROGRESS or EWOULDBLOCK and the connection attempt eventually fails. I think the right fix is to check SO_ERROR after the socket becomes writable. Please go ahead and open a jira.
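To illustrate the SO_ERROR check, here is a minimal sketch using plain POSIX sockets. It is not the zookeeper.c code and not a proposed patch: the helper names finish_nonblocking_connect and try_connect_loopback are hypothetical, the loopback address and the timeout are assumptions made to keep the example self-contained, and only the EINPROGRESS case is handled.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (not part of the ZooKeeper C client).
 * Waits up to timeout_ms for a pending non-blocking connect() on fd,
 * then reads SO_ERROR to learn whether the connect really succeeded.
 * Returns 0 on success, -1 with errno set on failure or timeout. */
static int finish_nonblocking_connect(int fd, int timeout_ms)
{
    fd_set wfds;
    struct timeval tv;
    int err = 0;
    socklen_t len = sizeof(err);
    int rc;

    assert(fd >= 0 && fd < FD_SETSIZE); /* select() cannot watch larger fds */

    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    rc = select(fd + 1, NULL, &wfds, NULL, &tv);
    if (rc == 0) {
        errno = ETIMEDOUT;   /* never became writable */
        return -1;
    }
    if (rc < 0)
        return -1;           /* select() itself failed */

    /* Writable does not mean connected: a failed connect also wakes
     * select(). SO_ERROR holds the real outcome. */
    if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
        return -1;
    if (err != 0) {
        errno = err;         /* e.g. ECONNREFUSED */
        return -1;
    }
    return 0;
}

/* Hypothetical helper: non-blocking connect to 127.0.0.1:port.
 * Returns a connected fd, or -1 with errno set. */
static int try_connect_loopback(uint16_t port, int timeout_ms)
{
    struct sockaddr_in addr;
    int fd, flags, saved;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        goto fail;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);

    /* EINPROGRESS means "still pending"; any other error is final. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 &&
        errno != EINPROGRESS)
        goto fail;

    if (finish_nonblocking_connect(fd, timeout_ms) < 0)
        goto fail;
    return fd;

fail:
    saved = errno;           /* close() may overwrite errno */
    close(fd);
    errno = saved;
    return -1;
}
```

With a check like this, a refused connection surfaces as an error (typically ECONNREFUSED) as soon as the socket is reported writable, so the caller does not have to wait for a session timeout to notice that the server is down.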
Thanks!

--Michi

On Sun, Nov 4, 2012 at 2:11 PM, Yunong Xiao <yjx...@gmail.com> wrote:
> I have a fairly simple single-threaded C client set up -- single-threaded
> because we are embedding zk in the node.js/libuv runtime -- which consists of
> the following algorithm:
>
> zookeeper_interest(); select();
> // perform zookeeper api calls
> zookeeper_process();
>
> I've noticed that zookeeper_interest in the C client never returns error if it
> is unable to connect to the zk server.
>
> From the spec of the zookeeper_interest API, I see that zookeeper_interest is
> supposed to return ZCONNECTIONLOSS when disconnected from the client. However,
> digging into the code, I see that the client is making a non-blocking connect
> call
> https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1596-1613
> , and returning ZOK
> https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1684
>
> If we assume that the server is not up, this will mean that the subsequent
> select() call would return 0, since the fd is not ready, and future calls to
> zookeeper_interest will always return 0 and not the expected ZCONNECTIONLOSS.
> Thus an upstream client will never be aware that the connection is lost.
>
> I don't think this is the expected behavior. I have temporarily patched the zk
> C client such that zookeeper_interest will return ZCONNECTIONLOSS if it's
> still unable to connect after session_timeout has been exceeded.
>
> Is this the right interpretation of the API? Are you guys open to taking the
> patch I described?
>
> -Yunong