[jira] [Updated] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2016-09-03 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated ZOOKEEPER-1676:

Fix Version/s: (was: 3.5.2)

> C client zookeeper_interest returning ZOK on Connection Loss
>
>              Key: ZOOKEEPER-1676
>              URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1676
>          Project: ZooKeeper
>       Issue Type: Bug
>       Components: c client
> Affects Versions: 3.4.3
>      Environment: All
>         Reporter: Yunong Xiao
>         Assignee: Yunong Xiao
>         Priority: Blocker
>
> I have a fairly simple single-threaded C client set up -- single-threaded
> because we are embedding zk in the node.js/libuv runtime -- which consists of
> the following algorithm:
>   zookeeper_interest();
>   select();
>   // perform zookeeper api calls
>   zookeeper_process();
> I've noticed that zookeeper_interest in the C client never returns an error if
> it is unable to connect to the zk server.
> From the spec of the zookeeper_interest API, I see that zookeeper_interest is
> supposed to return ZCONNECTIONLOSS when disconnected from the server. However,
> digging into the code, I see that the client makes a non-blocking connect call
> (https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1596-1613)
> and returns ZOK
> (https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1684).
> If we assume that the server is not up, the subsequent select() call returns
> 0, since the fd is not ready, and future calls to zookeeper_interest always
> return 0 (ZOK) rather than the expected ZCONNECTIONLOSS.
> Thus an upstream client will never be aware that the connection is lost.
> I don't think this is the expected behavior. I have temporarily patched the zk
> C client so that zookeeper_interest returns ZCONNECTIONLOSS if it is still
> unable to connect after session_timeout has been exceeded.
> I have included a patch for the client which fixes this for release 3.4.3
> (6b35e96) in this branch:
> https://github.com/yunong/zookeeper/tree/release-3.4.3-patched
> Here's the patch: https://gist.github.com/yunong/efe869a0345867d54adf
> For more information, please see this email thread:
> http://mail-archives.apache.org/mod_mbox/zookeeper-dev/201211.mbox/%3c11a8e7c3-4dde-45d8-abec-a8a4d32cf...@gmail.com%3E
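
The workaround described in the report (give up on a pending connect once session_timeout has been exceeded) can be sketched roughly as below. This is not the reporter's patch, only a minimal illustration of the idea. The type connect_tracker, the function check_connect_deadline and the SKETCH_* codes are hypothetical names invented for this sketch, and the numeric values are illustrative; a real client would use the codes from zookeeper.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the client's return codes (illustrative values);
 * real code would use ZOK and ZCONNECTIONLOSS from zookeeper.h. */
enum { SKETCH_OK = 0, SKETCH_CONNECTION_LOSS = -4 };

/* Hypothetical per-handle bookkeeping for the current connect attempt. */
typedef struct {
    bool    connected;           /* true once the connection is established */
    int64_t connect_start_ms;    /* time at which the pending connect began */
    int64_t session_timeout_ms;  /* session timeout requested by the client */
} connect_tracker;

/* Returns SKETCH_CONNECTION_LOSS once a still-pending connect has outlived
 * the session timeout, and SKETCH_OK in every other case. */
int check_connect_deadline(const connect_tracker *t, int64_t now_ms)
{
    assert(t != NULL);
    if (t->connected)
        return SKETCH_OK;
    if (now_ms - t->connect_start_ms > t->session_timeout_ms)
        return SKETCH_CONNECTION_LOSS;
    return SKETCH_OK;
}
```

In an event loop, such a check would run on every iteration before select(), with now_ms taken from a monotonic clock so that wall-clock adjustments do not trigger a false timeout.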



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2016-09-03 Thread Rakesh R (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rakesh R updated ZOOKEEPER-1676:

Fix Version/s: (was: 3.4.9)



[jira] [Resolved] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2016-05-22 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira resolved ZOOKEEPER-1676.
  Resolution: Not A Problem
Release Note: Ok, let's resolve this one and document it.



[jira] [Commented] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2016-05-22 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295629#comment-15295629
 ] 

Michael Han commented on ZOOKEEPER-1676:


[~fpj] I think we can resolve this. I also created ZOOKEEPER-2431 and 
ZOOKEEPER-2432 as follow-ups to improve the documentation.



[jira] [Commented] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2016-05-22 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295626#comment-15295626
 ] 

Michael Han commented on ZOOKEEPER-1676:


Link to documentation JIRA sub-task.



[jira] [Commented] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2016-05-22 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295573#comment-15295573
 ] 

Flavio Junqueira commented on ZOOKEEPER-1676:

[~hanm] You're right. Unfortunately, we don't have good online documentation 
for the C client, and I'm all for documenting it. Perhaps we should have a 
documentation jira for the C client, with one of the tasks being to document 
the use of the single-threaded C client. I'd say that we resolve this issue so 
that we no longer have it as a blocker for 3.5.2. Alternatively, we can keep it 
open and link it to the documentation jira, but drop the priority. What do you 
think?

Btw, I used this code to test the behavior of the single-thread C client:

https://github.com/fpj/zookeeper-book-example/blob/master/src/main/c/master.c  



[jira] [Commented] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2016-05-21 Thread Michael Han (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15295337#comment-15295337
 ] 

Michael Han commented on ZOOKEEPER-1676:


[~fpj] I can confirm that using a watcher delivers the expected connection 
events. Here is my test case using the single-threaded ZK library 
[https://goo.gl/hql4B1]. I've tried starting and stopping the server before and 
after my tests run, and the state changes reported to the watcher are as 
expected. The return code of zookeeper_interest, though, is not reliable (for 
example, zookeeper_interest returns ZOK when the server is dead). Maybe we 
should add some documentation for zookeeper_interest to indicate that its 
return code should NOT be used for deciding the connection status between 
client and server.
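
As a rough illustration of that recommendation, a client can record the session state inside its watcher and consult that record instead of the return value of zookeeper_interest. The sketch below is self-contained and therefore uses stand-ins: conn_status, session_watcher, is_connected and the SK_* constants are hypothetical names with illustrative values. A real client would include zookeeper.h, take a zhandle_t pointer in the callback, and compare against ZOO_SESSION_EVENT and ZOO_CONNECTED_STATE.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins so the sketch compiles without the ZooKeeper headers; names and
 * values are illustrative, not taken from zookeeper.h. */
enum { SK_SESSION_EVENT = -1 };
enum { SK_CONNECTING_STATE = 1, SK_CONNECTED_STATE = 3 };

/* Hypothetical application-side record of the connection status. */
typedef struct {
    int last_state;   /* most recent session state seen by the watcher */
    int transitions;  /* number of session events delivered so far */
} conn_status;

/* Same parameter shape as the C client's watcher callback, with an opaque
 * pointer standing in for the handle. Only session events update the record;
 * node events are ignored. */
void session_watcher(void *zh, int type, int state, const char *path, void *ctx)
{
    conn_status *cs = ctx;
    (void)zh;
    (void)path;
    assert(cs != NULL);
    if (type != SK_SESSION_EVENT)
        return;
    cs->last_state = state;
    cs->transitions++;
}

/* The application asks this instead of inspecting zookeeper_interest(). */
int is_connected(const conn_status *cs)
{
    return cs->last_state == SK_CONNECTED_STATE;
}
```

With the single-threaded library the watcher only fires from inside zookeeper_process(), so the loop has to keep calling it for the recorded state to stay current.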




[jira] [Commented] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2016-05-20 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15294292#comment-15294292
 ] 

Flavio Junqueira commented on ZOOKEEPER-1676:

I'm not convinced that this is an issue. I have tested the single-threaded 
client and I get the state changes correctly via the default watcher. That's 
the right way to observe the state changes rather than relying on the return 
value of {{zookeeper_interest}}. A call to {{zookeeper_process}} will make sure 
to deliver the events. See this for an example:

https://github.com/apache/zookeeper/blob/trunk/src/c/src/cli.c

I could use a second pair of eyes to confirm this, perhaps [~yunong] if still 
around. 



[jira] [Commented] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2014-04-29 Thread Flavio Junqueira (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13984143#comment-13984143
 ] 

Flavio Junqueira commented on ZOOKEEPER-1676:

I initially missed that the bug report concerns the single-threaded C client. 
I have checked the multi-threaded one, and the state transitions 
CONNECTED-CONNECTING-CONNECTED in my client code work fine as I stop and 
restart the server. I'll try to reproduce the problem with the single-threaded 
client.



[jira] [Updated] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2014-04-25 Thread Flavio Junqueira (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Flavio Junqueira updated ZOOKEEPER-1676:


Priority: Blocker  (was: Critical)



[jira] [Updated] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2013-10-03 Thread Patrick Hunt (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-1676:


Assignee: Yunong Xiao



[jira] [Commented] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2013-10-03 Thread Patrick Hunt (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13785393#comment-13785393
 ] 

Patrick Hunt commented on ZOOKEEPER-1676:

[~yunong] this sounds like an important one to fix, can you submit a patch per 
Michi's comment?



[jira] [Commented] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2013-04-20 Thread Michi Mutsuzaki (JIRA)

[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13637482#comment-13637482
 ] 

Michi Mutsuzaki commented on ZOOKEEPER-1676:


Hi Yunong,

Thank you for the patch. Could you please upload your patch here? This page 
describes how to submit patches:

http://wiki.apache.org/hadoop/ZooKeeper/HowToContribute

Thanks!
--Michi

 C client zookeeper_interest returning ZOK on Connection Loss
 

 Key: ZOOKEEPER-1676
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1676
 Project: ZooKeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.4.3
 Environment: All
Reporter: Yunong Xiao
Priority: Critical


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


Re: zookeeper_interest returning ZOK on Connection Loss

2013-04-09 Thread Michi Mutsuzaki
Hi Yunong,

I haven't had a chance to review your patch yet. I think I'll have
some time this weekend.

Thanks!
--Michi

On Mon, Apr 8, 2013 at 7:14 PM, Yunong Xiao yun...@joyent.com wrote:

 On Mar 27, 2013, at 10:49 AM, Marshall McMullen
 marshall.mcmul...@gmail.com wrote:

 We already have a ZOO_NOTCONNECTED_STATE. Seems like that should work for
 these purposes...


 On Wed, Mar 27, 2013 at 11:23 AM, Michi Mutsuzaki
 mi...@cs.stanford.edu wrote:

 Hi Anthony,

 Is there an open jira for this patch? I don't have any problem
 integrating this patch. The only thing I'd suggest is to not call the
 event ZOO_EXPIRED_SESSION_STATE (which should be triggered only by the
 server).

 --Michi


 Michi--

 Have you guys had a chance to look at the Jira and patch? I submitted the
 Jira https://issues.apache.org/jira/browse/ZOOKEEPER-1676 as well as a link to
 my patch https://github.com/yunong/zookeeper/tree/release-3.4.3-patched a
 couple of weeks ago. Just wondered if there's anything else you need from me in
 order to get this patch integrated.

 -Yunong


Re: zookeeper_interest returning ZOK on Connection Loss

2013-04-08 Thread Yunong Xiao

On Mar 27, 2013, at 10:49 AM, Marshall McMullen marshall.mcmul...@gmail.com 
wrote:

 We already have a ZOO_NOTCONNECTED_STATE. Seems like that should work for
 these purposes...
 
 
 On Wed, Mar 27, 2013 at 11:23 AM, Michi Mutsuzaki 
 mi...@cs.stanford.edu wrote:
 
 Hi Anthony,
 
 Is there an open jira for this patch? I don't have any problem
 integrating this patch. The only thing I'd suggest is to not call the
 event ZOO_EXPIRED_SESSION_STATE (which should be triggered only by the
 server).
 
 --Michi
 

Michi--

Have you guys had a chance to look at the Jira and patch? I submitted the Jira 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676 as well as a link to my 
patch https://github.com/yunong/zookeeper/tree/release-3.4.3-patched a couple 
of weeks ago. Just wondered if there's anything else you need from me in order 
to get this patch integrated.

-Yunong

[jira] [Updated] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2013-03-28 Thread Yunong Xiao (JIRA)

 [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yunong Xiao updated ZOOKEEPER-1676:
---

Component/s: c client

 C client zookeeper_interest returning ZOK on Connection Loss
 

 Key: ZOOKEEPER-1676
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1676
 Project: ZooKeeper
  Issue Type: Bug
  Components: c client
Affects Versions: 3.4.3
 Environment: All
Reporter: Yunong Xiao
Priority: Critical




[jira] [Created] (ZOOKEEPER-1676) C client zookeeper_interest returning ZOK on Connection Loss

2013-03-28 Thread Yunong Xiao (JIRA)
Yunong Xiao created ZOOKEEPER-1676:
--

 Summary: C client zookeeper_interest returning ZOK on Connection 
Loss
 Key: ZOOKEEPER-1676
 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1676
 Project: ZooKeeper
  Issue Type: Bug
Affects Versions: 3.4.3
 Environment: All
Reporter: Yunong Xiao
Priority: Critical


I have a fairly simple single-threaded C client set up -- single-threaded
because we are embedding zk in the node.js/libuv runtime -- which consists of
the following algorithm:

zookeeper_interest(); select();
// perform zookeeper api calls
zookeeper_process();

I've noticed that zookeeper_interest in the C client never returns error if it
is unable to connect to the zk server.

From the spec of the zookeeper_interest API, I see that zookeeper_interest is
supposed to return ZCONNECTIONLOSS when disconnected from the server. However,
digging into the code, I see that the client is making a non-blocking connect
call
https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1596-1613
,  and returning ZOK
https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1684

If we assume that the server is not up, this will mean that the subsequent
select() call would return 0, since the fd is not ready, and future calls to
zookeeper_interest will always return 0 and not the expected ZCONNECTIONLOSS.
Thus an upstream client will never be aware that the connection is lost.

I don't think this is the expected behavior. I have temporarily patched the zk
C client such that zookeeper_interest will return ZCONNECTIONLOSS if it's still
unable to connect after session_timeout has been exceeded. 

I have included a patch for the client which fixes this for release 3.4.3 
6b35e96 in this branch: 
https://github.com/yunong/zookeeper/tree/release-3.4.3-patched
Here's the patch: https://gist.github.com/yunong/efe869a0345867d54adf

For more information, please see this email thread. 
http://mail-archives.apache.org/mod_mbox/zookeeper-dev/201211.mbox/%3c11a8e7c3-4dde-45d8-abec-a8a4d32cf...@gmail.com%3E



Re: zookeeper_interest returning ZOK on Connection Loss

2013-03-28 Thread Yunong Xiao

On Mar 27, 2013, at 10:23 AM, Michi Mutsuzaki mi...@cs.stanford.edu wrote:
 
 Is there an open jira for this patch? I don't have any problem
 integrating this patch. The only thing I'd suggest is to not call the
 event ZOO_EXPIRED_SESSION_STATE (which should be triggered only by the
 server).
 
 --Michi

I've submitted the JIRA: https://issues.apache.org/jira/browse/ZOOKEEPER-1676

Of note I've submitted links to the patch: 
https://gist.github.com/yunong/efe869a0345867d54adf as well as my private 
branch of the patch 
https://github.com/yunong/zookeeper/tree/release-3.4.3-patched


On Mar 27, 2013, at 10:49 AM, Marshall McMullen marshall.mcmul...@gmail.com 
wrote:

 We already have a ZOO_NOTCONNECTED_STATE. Seems like that should work for
 these purposes…

At the time I wrote the patch, I don't think 3.4.3 had ZOO_NOTCONNECTED_STATE. 
If it's there now in the newer versions, then it certainly seems viable.

-Yunong

Re: zookeeper_interest returning ZOK on Connection Loss

2013-03-27 Thread Anthony Barré
I would like to reopen this thread.

The fact that no ZOO_EXPIRED_SESSION_STATE event is triggered after a
certain amount of time when the server is offline is a serious problem.

In my company, we are also using the C client in a single thread (as part of
the node-zookeeper module), but we decided to use the patched version
created by yunong to deal with this problem of session expiration.

What is blocking the integration of the patch proposed by yunong? This
patch has great value for us.

Anthony.


Re: zookeeper_interest returning ZOK on Connection Loss

2013-03-27 Thread Michi Mutsuzaki
Hi Anthony,

Is there an open jira for this patch? I don't have any problem
integrating this patch. The only thing I'd suggest is to not call the
event ZOO_EXPIRED_SESSION_STATE (which should be triggered only by the
server).

--Michi

On Wed, Mar 27, 2013 at 2:41 AM, Anthony Barré a...@fasterize.com wrote:
 I would like to reopen this thread.

 The fact that no event ZOO_EXPIRED_SESSION_STATE is triggered after a
 certain amount of time when the server is offline is a serious problem.

 In my company, we are also using the C client in a single thread (as part of
 the node-zookeeper module), but we decided to use the patched version
 created by yunong to deal with this problem of session expiration.

 What is blocking the integration of the patch proposed by yunong? This
 patch has great value for us.

 Anthony.


Re: zookeeper_interest returning ZOK on Connection Loss

2013-03-27 Thread Marshall McMullen
We already have a ZOO_NOTCONNECTED_STATE. Seems like that should work for
these purposes...


On Wed, Mar 27, 2013 at 11:23 AM, Michi Mutsuzaki mi...@cs.stanford.edu wrote:

 Hi Anthony,

 Is there an open jira for this patch? I don't have any problem
 integrating this patch. The only thing I'd suggest is to not call the
 event ZOO_EXPIRED_SESSION_STATE (which should be triggered only by the
 server).

 --Michi

 On Wed, Mar 27, 2013 at 2:41 AM, Anthony Barré a...@fasterize.com wrote:
  I would like to reopen this thread.
 
  The fact that no event ZOO_EXPIRED_SESSION_STATE is triggered after a
  certain amount of time when the server is offline is a serious problem.
 
  In my company, we are also using the C client in a single thread (as part of
  the node-zookeeper module), but we decided to use the patched version
  created by yunong to deal with this problem of session expiration.
 
  What is blocking the integration of the patch proposed by yunong? This
  patch has great value for us.
 
  Anthony.



Re: zookeeper_interest returning ZOK on Connection Loss

2012-11-13 Thread Yunong Xiao
Michi -- 

On Nov 5, 2012, at 11:24 PM, Michi Mutsuzaki mi...@cs.stanford.edu wrote:

 On Mon, Nov 5, 2012 at 2:16 PM, Yunong Xiao yjx...@gmail.com wrote:
 What happens if the server is offline? In the scenario you described, select()
 will never succeed, and zookeeper_interest() will never inform the upstream
 consumer that the server is down. That's the crux of the problem here, when the
 server is unavailable, the client needs to know after some amount of time that
 the connection has failed. Otherwise the upstream consumer has no idea the
 connection to zookeeper has been severed and will hang indefinitely.
 
 If the server is offline, zookeeper client keeps trying to connect to
 the server indefinitely. It's up to the caller to decide when to give
 up. One possibility is to poll the connection state using zoo_state()
 and give up on the handle if it doesn't become connected after certain
 time.

I don't disagree. What you're suggesting makes sense, and indeed my patch does 
a variation of it. However, your approach relies on the consuming client to 
keep track of the time since the client was last connected to the server. I 
think it would be preferable for zookeeper.c to keep track of this time 
internally and inform the consuming client when it exceeds a certain threshold, 
such as session_timeout. The consuming client can then decide to give up. This 
seems like common functionality that all consumers of the C client would want. 
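
The bookkeeping proposed above could look roughly like the sketch below. It is not the actual patch and not code from zookeeper.c; the type and function names (conn_tracker and friends) are hypothetical, and the threshold is assumed to be the session timeout in milliseconds. The tracker only answers "has it been too long since we were last connected?"; per the discussion in this thread, a client would report that as a connection-loss style error, not as session expiration.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: remembers when the connection was last known good. */
typedef struct {
    int64_t last_connected_ms;  /* monotonic time of the last good connection */
    int64_t timeout_ms;         /* threshold, e.g. the session timeout */
} conn_tracker;

/* Monotonic clock in milliseconds; unaffected by wall-clock adjustments. */
static int64_t monotonic_ms(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000 + ts.tv_nsec / 1000000;
}

/* Start tracking; the handle counts as "just connected" at creation time. */
static void conn_tracker_init(conn_tracker *t, int64_t now_ms, int64_t timeout_ms)
{
    t->last_connected_ms = now_ms;
    t->timeout_ms = timeout_ms;
}

/* Call whenever a connection to any server in the list is established. */
static void conn_tracker_connected(conn_tracker *t, int64_t now_ms)
{
    t->last_connected_ms = now_ms;
}

/* Nonzero once more than timeout_ms has passed without a connection. */
static int conn_tracker_timed_out(const conn_tracker *t, int64_t now_ms)
{
    return now_ms - t->last_connected_ms > t->timeout_ms;
}
```

Passing the current time in explicitly keeps the check independent of the event loop, so it can be called once per iteration with monotonic_ms() regardless of which server the client is currently trying.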

Re: zookeeper_interest returning ZOK on Connection Loss

2012-11-05 Thread Michi Mutsuzaki
Hi Yunong,

Yes, this looks like a bug. The problem is that the C client is not
handling the case when connect() returns EINPROGRESS or EWOULDBLOCK
and eventually fails. I think the right fix is to check SO_ERROR after
the socket becomes writable. Please go ahead and open a jira.
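
For readers unfamiliar with the SO_ERROR check mentioned above, here is a minimal sketch using plain POSIX sockets. It is independent of the ZooKeeper client and is not the fix that was applied to zookeeper.c; the function name connect_result and its return convention (0 on success, an errno value otherwise) are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Try a non-blocking TCP connect to an IPv4 address and report the outcome.
 * Returns 0 if connected, ETIMEDOUT if still pending after timeout_ms,
 * or the errno value describing why the connect failed. */
static int connect_result(const char *ip, unsigned short port, int timeout_ms)
{
    struct sockaddr_in addr;
    struct timeval tv;
    fd_set wfds;
    socklen_t len;
    int fd, flags, n, err = 0;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1)
        return EINVAL;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;
    flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0) {
        err = errno;
        close(fd);
        return err;
    }

    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0) {
        close(fd);
        return 0;               /* connected immediately */
    }
    if (errno != EINPROGRESS && errno != EWOULDBLOCK) {
        err = errno;            /* failed immediately */
        close(fd);
        return err;
    }

    /* The connect is pending: wait for the socket to become writable. */
    FD_ZERO(&wfds);
    FD_SET(fd, &wfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;
    n = select(fd + 1, NULL, &wfds, NULL, &tv);
    if (n < 0) {
        err = errno;
    } else if (n == 0) {
        err = ETIMEDOUT;        /* nothing happened yet; the caller decides */
    } else {
        /* Writable does not mean connected: SO_ERROR holds the real result. */
        len = sizeof err;
        if (getsockopt(fd, SOL_SOCKET, SO_ERROR, &err, &len) < 0)
            err = errno;
    }
    close(fd);
    return err;
}
```

Note the two distinct non-success paths: a refused connection makes the socket writable with SO_ERROR set, while an unreachable or silent host leaves select() returning 0, which is the case the original report describes.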

Thanks!
--Michi

On Sun, Nov 4, 2012 at 2:11 PM, Yunong Xiao yjx...@gmail.com wrote:
 I have a fairly simple single-threaded C client set up -- single-threaded
 because we are embedding zk in the node.js/libuv runtime -- which consists of
 the following algorithm:

 zookeeper_interest(); select();
 // perform zookeeper api calls
 zookeeper_process();

 I've noticed that zookeeper_interest in the C client never returns error if it
 is unable to connect to the zk server.

 From the spec of the zookeeper_interest API, I see that zookeeper_interest is
 supposed to return ZCONNECTIONLOSS when disconnected from the client. However,
 digging into the code, I see that the client is making a non-blocking connect
 call
 https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1596-1613
 ,  and returning ZOK
 https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1684

 If we assume that the server is not up, this will mean that the subsequent
 select() call would return 0, since the fd is not ready, and future calls to
 zookeeper_interest will always return 0 and not the expected ZCONNECTIONLOSS.
 Thus an upstream client will never be aware that the connection is lost.

 I don't think this is the expected behavior. I have temporarily patched the zk
 C client such that zookeeper_interest will return ZCONNECTIONLOSS if it's 
 still
 unable to connect after session_timeout has been exceeded.

 Is this the right interpretation of the API? Are you guys open to taking the
 patch I described?

 -Yunong


Re: zookeeper_interest returning ZOK on Connection Loss

2012-11-05 Thread Yunong Xiao
Michi,

I don't think just returning error after checking SO_ERROR is sufficient here,
since this indicates connection loss but not session expiration. The client
could still connect to a different zk server on future iterations of
zookeeper_interest(), which will not occur if we return error. I think what is
needed here is to keep track of the amount of time that has elapsed since the
last time a connection was established to the server, while trying to connect
to servers in the list. If this time exceeds session_timeout, then return an
error such as ZOO_EXPIRED_SESSION_STATE. I've patched my local zookeeper client
to keep track of this state. 

In addition, it's not clear to me how you would go about checking for SO_ERROR
within zookeeper.c, since you'll need to check SO_ERROR after calling select(),
which -- at least in my case, since I'm using the single-threaded client, and I
am embedding in the node.js/libuv runtime -- needs to be outside of the C
client. I'd like to hear your thoughts on how this could be better achieved. I
would propose the following patch:

1) zookeeper_interest needs to return state about its current connection
status, since the zhandle is opaque. This will allow consuming clients to check
for the status of the non-blocking connect.  

2) zookeeper_interest has to keep track of the last time it was connected to a
server, and return ZOO_EXPIRED_SESSION_STATE if it's still unable to connect
after the timeout exceeds the session timeout.  

3) some sample code/documentation around checking for the status of
non-blocking connects in user code for consumers of zookeeper.h.

Does this seem reasonable? Would you guys be open to taking my patch?

-Yunong

On Nov 5, 2012, at 10:33 AM, Michi Mutsuzaki mi...@cs.stanford.edu wrote:

 Hi Yunong,
 
 Yes, this looks like a bug. The problem is that the C client is not
 handling the case when connect() returns EINPROGRESS or EWOULDBLOCK
 and eventually fails. I think the right fix is to check SO_ERROR after
 the socket becomes writable. Please go ahead and open a jira.
 
 Thanks!
 --Michi
 
 On Sun, Nov 4, 2012 at 2:11 PM, Yunong Xiao yjx...@gmail.com wrote:
 I have a fairly simple single-threaded C client set up -- single-threaded
 because we are embedding zk in the node.js/libuv runtime -- which consists of
 the following algorithm:
 
 zookeeper_interest(); select();
 // perform zookeeper api calls
 zookeeper_process();
 
 I've noticed that zookeeper_interest in the C client never returns error if 
 it
 is unable to connect to the zk server.
 
 From the spec of the zookeeper_interest API, I see that zookeeper_interest is
 supposed to return ZCONNECTIONLOSS when disconnected from the client. 
 However,
 digging into the code, I see that the client is making a non-blocking connect
 call
 https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1596-1613
 ,  and returning ZOK
 https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1684
 
 If we assume that the server is not up, this will mean that the subsequent
 select() call would return 0, since the fd is not ready, and future calls to
 zookeeper_interest will always return 0 and not the expected ZCONNECTIONLOSS.
 Thus an upstream client will never be aware that the connection is lost.
 
 I don't think this is the expected behavior. I have temporarily patched the 
 zk
 C client such that zookeeper_interest will return ZCONNECTIONLOSS if it's 
 still
 unable to connect after session_timeout has been exceeded.
 
 Is this the right interpretation of the API? Are you guys open to taking the
 patch I described?
 
 -Yunong



Re: zookeeper_interest returning ZOK on Connection Loss

2012-11-05 Thread Michi Mutsuzaki
Hi Yunong,

Zookeeper client shouldn't return ZOO_EXPIRED_SESSION_STATE unless the
server tells the client the session is expired. Otherwise the client
might think the session is expired when it isn't. In case of the
single-threaded client, it looks like you need to keep calling
zookeeper_interest() until select() succeeds. It would be great to
have sample code / documentation.

--Michi

On Mon, Nov 5, 2012 at 1:19 PM, Yunong Xiao yjx...@gmail.com wrote:
 Michi,

 I don't think just returning error after checking SO_ERROR is sufficient here,
 since this indicates connection loss but not session expiration. The client
 could still connect to a different zk server on future iterations of
 zookeeper_interest(), which will not occur if we return error. I think what is
 needed here is to keep track of the amount of time that has elapsed since the
 last time a connection was established to the server, while trying to connect
 to servers in the list. If this time exceeds session_timeout, then return an
 error such as ZOO_EXPIRED_SESSION_STATE. I've patched my local zookeeper 
 client
 to keep track of this state.

 In addition, It's not clear to me how you would go about checking for SO_ERROR
 within zookeeper.c, since you'll need to check SO_ERROR after calling 
 select(),
 which -- at least in my case, since I'm using the single threaded client, and 
 I
 am embedding in the node.js/libuv runtime -- needs to be outside of the C
 client. I'd like to hear your thoughts on how this could be better achieved. I
 would propose the following patch:

 1) zookeeper_interest needs to return state about it's current connection
 status, since the zhandle is opaque. This will allow consuming clients to 
 check
 for the status of the non-blocking connect.

 2) zookeeper_interest has to keep track of the last time it was connected to a
 server, and return ZOO_EXPIRED_SESSION_STATE if it's still unable to connect
 after the timeout exceeds the session timeout.

 3) some sample code/documentation around checking for the status of
 non-blocking connects in user code for consumers of zookeeper.h.

 Does this seem reasonable? Would you guys be open to taking my patch?

 -Yunong

 On Nov 5, 2012, at 10:33 AM, Michi Mutsuzaki mi...@cs.stanford.edu wrote:

 Hi Yunong,

 Yes, this looks like a bug. The problem is that the C client is not
 handling the case when connect() returns EINPROGRESS or EWOULDBLOCK
 and eventually fails. I think the right fix is to check SO_ERROR after
 the socket becomes writable. Please go ahead and open a jira.

 Thanks!
 --Michi

 On Sun, Nov 4, 2012 at 2:11 PM, Yunong Xiao yjx...@gmail.com wrote:
 I have a fairly simple single-threaded C client set up -- single-threaded
 because we are embedding zk in the node.js/libuv runtime -- which consists 
 of
 the following algorithm:

 zookeeper_interest(); select();
 // perform zookeeper api calls
 zookeeper_process();

 I've noticed that zookeeper_interest in the C client never returns error if 
 it
 is unable to connect to the zk server.

 From the spec of the zookeeper_interest API, I see that zookeeper_interest 
 is
 supposed to return ZCONNECTIONLOSS when disconnected from the client. 
 However,
 digging into the code, I see that the client is making a non-blocking 
 connect
 call
 https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1596-1613
 ,  and returning ZOK
 https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1684

 If we assume that the server is not up, this will mean that the subsequent
 select() call would return 0, since the fd is not ready, and future calls to
 zookeeper_interest will always return 0 and not the expected 
 ZCONNECTIONLOSS.
 Thus an upstream client will never be aware that the connection is lost.

 I don't think this is the expected behavior. I have temporarily patched the 
 zk
 C client such that zookeeper_interest will return ZCONNECTIONLOSS if it's 
 still
 unable to connect after session_timeout has been exceeded.

 Is this the right interpretation of the API? Are you guys open to taking the
 patch I described?

 -Yunong



Re: zookeeper_interest returning ZOK on Connection Loss

2012-11-05 Thread Yunong Xiao
Hi Michi,


 Otherwise the client
 might think the session is expired when it isn't. In case of the
 single-threaded client, it looks like you need to keep calling
 zookeeper_interest() until select() succeeds. 


What happens if the server is offline? In the scenario you described, select()
will never succeed, and zookeeper_interest() will never inform the upstream
consumer that the server is down. That's the crux of the problem: when the
server is unavailable, the client needs to know after some amount of time that
the connection has failed. Otherwise the upstream consumer has no idea that the
connection to zookeeper has been severed and will hang indefinitely.

 Zookeeper client shouldn't return ZOO_EXPIRED_SESSION_STATE unless the
 server tells the client the session is expired. 

That's fair. I could return ZCONNECTIONLOSS here instead.


On Nov 5, 2012, at 1:59 PM, Michi Mutsuzaki mi...@cs.stanford.edu wrote:

 Hi Yunong,
 
 Zookeeper client shouldn't return ZOO_EXPIRED_SESSION_STATE unless the
 server tells the client the session is expired. Otherwise the client
 might think the session is expired when it isn't. In case of the
 single-threaded client, it looks like you need to keep calling
 zookeeper_interest() until select() succeeds. It would be great to
 have sample code / documentation.
 
 --Michi
 
 On Mon, Nov 5, 2012 at 1:19 PM, Yunong Xiao yjx...@gmail.com wrote:
 Michi,
 
 I don't think just returning error after checking SO_ERROR is sufficient 
 here,
 since this indicates connection loss but not session expiration. The client
 could still connect to a different zk server on future iterations of
 zookeeper_interest(), which will not occur if we return error. I think what 
 is
 needed here is to keep track of the amount of time that has elapsed since the
 last time a connection was established to the server, while trying to connect
 to servers in the list. If this time exceeds session_timeout, then return an
 error such as ZOO_EXPIRED_SESSION_STATE. I've patched my local zookeeper 
 client
 to keep track of this state.
 
 In addition, It's not clear to me how you would go about checking for 
 SO_ERROR
 within zookeeper.c, since you'll need to check SO_ERROR after calling 
 select(),
 which -- at least in my case, since I'm using the single threaded client, 
 and I
 am embedding in the node.js/libuv runtime -- needs to be outside of the C
 client. I'd like to hear your thoughts on how this could be better achieved. 
 I
 would propose the following patch:
 
 1) zookeeper_interest needs to return state about it's current connection
 status, since the zhandle is opaque. This will allow consuming clients to 
 check
 for the status of the non-blocking connect.
 
 2) zookeeper_interest has to keep track of the last time it was connected to 
 a
 server, and return ZOO_EXPIRED_SESSION_STATE if it's still unable to connect
 after the timeout exceeds the session timeout.
 
 3) some sample code/documentation around checking for the status of
 non-blocking connects in user code for consumers of zookeeper.h.
 
 Does this seem reasonable? Would you guys be open to taking my patch?
 
 -Yunong
 
 On Nov 5, 2012, at 10:33 AM, Michi Mutsuzaki mi...@cs.stanford.edu wrote:
 
 Hi Yunong,
 
 Yes, this looks like a bug. The problem is that the C client is not
 handling the case when connect() returns EINPROGRESS or EWOULDBLOCK
 and eventually fails. I think the right fix is to check SO_ERROR after
 the socket becomes writable. Please go ahead and open a jira.
 
 Thanks!
 --Michi
 
 On Sun, Nov 4, 2012 at 2:11 PM, Yunong Xiao yjx...@gmail.com wrote:
 I have a fairly simple single-threaded C client set up -- single-threaded
 because we are embedding zk in the node.js/libuv runtime -- which consists 
 of
 the following algorithm:
 
 zookeeper_interest(); select();
 // perform zookeeper api calls
 zookeeper_process();
 
 I've noticed that zookeeper_interest in the C client never returns error 
 if it
 is unable to connect to the zk server.
 
 From the spec of the zookeeper_interest API, I see that zookeeper_interest 
 is
 supposed to return ZCONNECTIONLOSS when disconnected from the client. 
 However,
 digging into the code, I see that the client is making a non-blocking 
 connect
 call
 https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1596-1613
 ,  and returning ZOK
 https://github.com/apache/zookeeper/blob/trunk/src/c/src/zookeeper.c#L1684
 
 If we assume that the server is not up, this will mean that the subsequent
 select() call would return 0, since the fd is not ready, and future calls 
 to
 zookeeper_interest will always return 0 and not the expected 
 ZCONNECTIONLOSS.
 Thus an upstream client will never be aware that the connection is lost.
 
 I don't think this is the expected behavior. I have temporarily patched 
 the zk
 C client such that zookeeper_interest will return ZCONNECTIONLOSS if it's 
 still
 unable to connect after session_timeout has been exceeded.
 
 Is this the right 

Re: zookeeper_interest returning ZOK on Connection Loss

2012-11-05 Thread Michi Mutsuzaki
On Mon, Nov 5, 2012 at 2:16 PM, Yunong Xiao yjx...@gmail.com wrote:
 What happens if the server is offline? In the scenario you described, select()
 will never succeed, and zookeeper_interest() will never inform the upstream
 consumer that the server is down. That's the crux of the problem here, when 
 the
 server is unavailable, the client needs to know after some amount of time that
 the connection has failed. Otherwise the upstream consumer has no idea the
 connection to zookeeper has been severed and will hang indefinitely.

If the server is offline, zookeeper client keeps trying to connect to
the server indefinitely. It's up to the caller to decide when to give
up. One possibility is to poll the connection state using zoo_state()
and give up on the handle if it doesn't become connected after certain
time.
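
The caller-side approach described above might be sketched as follows. This is only an illustration under stated assumptions: wait_for_connection, step_fn and clock_fn are hypothetical names, and the fake_client struct is a stand-in used to exercise the loop without a server. With the real single-threaded client, the step callback would run one zookeeper_interest() / select() / zookeeper_process() iteration and then compare zoo_state(zh) with ZOO_CONNECTED_STATE.

```c
#include <stddef.h>

/* One iteration of the caller's event loop; returns nonzero once connected. */
typedef int (*step_fn)(void *ctx);
/* Monotonic clock in milliseconds. */
typedef long long (*clock_fn)(void *ctx);

/* Returns 0 if step() reported a connection before the time budget ran out,
 * or -1 if the caller should give up on the handle. */
static int wait_for_connection(step_fn step, clock_fn now_ms, void *ctx,
                               long long give_up_after_ms)
{
    long long start = now_ms(ctx);
    for (;;) {
        if (step(ctx))
            return 0;
        if (now_ms(ctx) - start >= give_up_after_ms)
            return -1;
    }
}

/* Hypothetical stand-in for a client handle: every step costs 100 ms of fake
 * time, and the "server" becomes reachable at up_at_ms (negative: never). */
struct fake_client {
    long long clock_ms;
    long long up_at_ms;
};

static int fake_step(void *ctx)
{
    struct fake_client *c = ctx;
    c->clock_ms += 100;
    return c->up_at_ms >= 0 && c->clock_ms >= c->up_at_ms;
}

static long long fake_now(void *ctx)
{
    return ((struct fake_client *)ctx)->clock_ms;
}
```

For example, with a fake server that comes up after 500 ms the call returns 0, and with one that never comes up and a 30000 ms budget it returns -1 once the budget is spent. What to do after giving up (close the handle, create a new one, surface an error) is left to the caller, as the message above says.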

--Michi
