[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16780649#comment-16780649 ]

Dan Benediktson commented on ZOOKEEPER-2471:
--------------------------------------------

[~andorm] I'm no longer working at the company where I was working on ZooKeeper, so I no longer have access to the resources I used for developing, testing, etc. [~hanm] may be interested in advancing this patch and, relatedly, the work for ZOOKEEPER-2869, but at this time I won't be working on them.

> Java Zookeeper Client incorrectly considers time spent sleeping as time spent
> connecting, potentially resulting in infinite reconnect loop
> ------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2471
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: java client
>    Affects Versions: 3.5.3
>         Environment: all
>            Reporter: Dan Benediktson
>            Assignee: Dan Benediktson
>            Priority: Major
>              Labels: pull-request-available
>         Attachments: ZOOKEEPER-2471.patch
>
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> ClientCnxnSocket uses a member variable "now" to track the current time, and lastSend / lastHeard variables to track socket liveness. Implementations, and even ClientCnxn itself, are expected to call updateNow() to reset "now" to System.currentTimeMillis(), and then to call updateLastSend()/updateLastHeard() on IO completions.
> This is a fragile contract, so it's not surprising that there's a bug resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() as soon as startConnect() returns, but it does not call updateNow() first. I expect when this was written, either the expectation was that startConnect() was an asynchronous operation and that updateNow() would have been called very recently, or the requirement to call updateNow() was simply forgotten at this point. As far as I can see, this bug has been present since the "updateNow" method was first introduced in the distant past. As it turns out, since startConnect() calls HostProvider.next(), which can sleep, quite a lot of time can pass, leaving a big gap between "now" and now.
> If you are using very short session timeouts (one of our ZK ensembles has many clients using a 1-second timeout), this is potentially disastrous, because the sleep time may exceed the connection timeout itself, which can result in the Java client being stuck in a perpetual reconnect loop. The exact code path it goes through in this case is complicated, because there has to be a previously-closed socket still waiting in the selector (otherwise, the first timeout evaluation will not fail because "now" still hasn't been updated, and then the actual connect timeout will be applied in ClientCnxnSocket.doTransport()) so that select() will harvest the IO from the previous socket and updateNow(), resulting in the next loop through ClientCnxn.SendThread.run() observing the spurious timeout and failing. In practice it does happen to us fairly frequently; we only got to the bottom of the bug yesterday. Worse, when it does happen, the ZooKeeper client object is rendered unusable: it's stuck in a perpetual reconnect loop where it keeps sleeping, opening a socket, and immediately closing it.
> I have a patch. Rather than calling updateNow() right after startConnect(), my fix is to remove the "now" member variable and the updateNow() method entirely, and to instead just call System.currentTimeMillis() whenever time needs to be evaluated. I realize there is a benefit (aside from a trivial micro-optimization not worth worrying about) to having the time be "fixed", particularly for truth in the logging: if time is fixed by an updateNow() call, then the log for a timeout will show exactly the same value the code reasoned about. However, this benefit is in my opinion not enough to merit the fragility of the contract which led to this (for us) highly impactful and difficult-to-find bug in the first place.
> I'm currently running ant tests locally against my patch on trunk, and then I'll upload it here.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
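The interaction described in the report can be condensed into a small standalone sketch. This is not the actual ClientCnxnSocket code; the field and method names mirror the ones discussed above, but the class and the timing constants are hypothetical, purely to illustrate how a cached "now" charges a pre-connect sleep against the new connection while a fresh clock read does not:

```java
// Hypothetical sketch of the stale-"now" bug, not ZooKeeper source code.
public class StaleNowSketch {
    static long now;        // cached clock, only advanced by updateNow()
    static long lastHeard;  // liveness marker, reset after a connect
    static final int CONNECT_TIMEOUT_MS = 100;

    static void updateNow() { now = System.currentTimeMillis(); }

    static void sleepQuietly(long ms) {
        try { Thread.sleep(ms); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
    }

    // Buggy ordering from the report: lastHeard is reset from the cached "now"
    // captured before startConnect() slept, so the next updateNow() makes the
    // whole sleep look like elapsed connect time.
    static boolean buggyPathTimesOut() {
        updateNow();            // last updateNow() before startConnect()
        sleepQuietly(250);      // stands in for HostProvider.next() sleeping
        lastHeard = now;        // updateLastSendAndHeard() without updateNow()
        updateNow();            // select() harvests IO from the old socket
        return (now - lastHeard) > CONNECT_TIMEOUT_MS;
    }

    // Patched approach: read the clock whenever time is evaluated, so the
    // sleep before the connect is not charged against the new connection.
    static boolean patchedPathTimesOut() {
        sleepQuietly(250);
        lastHeard = System.currentTimeMillis();
        return (System.currentTimeMillis() - lastHeard) > CONNECT_TIMEOUT_MS;
    }

    public static void main(String[] args) {
        System.out.println("buggy path spuriously times out: " + buggyPathTimesOut());
        System.out.println("patched path times out: " + patchedPathTimesOut());
    }
}
```

The sleep durations are arbitrary; the point is only the ordering: lastHeard is reset from a value captured before the sleep, so the first updateNow() after the connect surfaces the entire sleep as apparent idle time.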
[jira] [Commented] (ZOOKEEPER-2901) Session ID that is negative causes mis-calculation of Ephemeral Type
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2901?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16207878#comment-16207878 ]

Dan Benediktson commented on ZOOKEEPER-2901:
--------------------------------------------

[~randgalt] You mean server IDs, not session IDs, and up to 127, correct?

> Session ID that is negative causes mis-calculation of Ephemeral Type
> --------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2901
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2901
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.3
>         Environment: Running 3.5.3-beta in Docker container
>            Reporter: Mark Johnson
>            Assignee: Jordan Zimmerman
>            Priority: Blocker
>
> In the code that determines the EphemeralType, it is looking at the owner (which is the client ID or connection ID), in EphemeralType.java:
>
>     public static EphemeralType get(long ephemeralOwner) {
>         if (ephemeralOwner == CONTAINER_EPHEMERAL_OWNER) {
>             return CONTAINER;
>         }
>         if (ephemeralOwner < 0) {
>             return TTL;
>         }
>         return (ephemeralOwner == 0) ? VOID : NORMAL;
>     }
>
> However, my connection ID is:
>
>     header.getClientId(): -720548323429908480
>
> This causes the code to think this is a TTL ephemeral node instead of a NORMAL ephemeral node. This also explains why this is random - if my client ID is non-negative, then the node gets added correctly.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
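For reference, the reason a session ID can be negative at all comes from the session-ID layout: the configured server id is placed in the topmost byte of the 64-bit session ID (as in SessionTrackerImpl.initializeNextSession), so any server id of 128 or above sets the sign bit. The sketch below is a simplified reconstruction of that layout for illustration, not the exact server code:

```java
// Simplified reconstruction of the 3.5.x session-ID layout; the shift
// constants follow the scheme described above and are illustrative.
public class SessionIdSign {
    static long initializeNextSession(long serverId, long elapsedMs) {
        long nextSid = (elapsedMs << 24) >>> 8; // timestamp in the middle bits
        nextSid |= serverId << 56;              // server id in the top byte
        return nextSid;
    }

    public static void main(String[] args) {
        long okId  = initializeNextSession(127, 1_000_000L);
        long badId = initializeNextSession(128, 1_000_000L);
        System.out.println(okId >= 0);  // true: server ids 0..127 stay non-negative
        System.out.println(badId < 0);  // true: server id 128 flips the sign bit,
                                        // so EphemeralType.get() would report TTL
    }
}
```

This is consistent with the comment above: only server IDs up to 127 keep every generated session ID non-negative, which is why a negative ephemeralOwner was being misread as a TTL node.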
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16155862#comment-16155862 ]

Dan Benediktson commented on ZOOKEEPER-2471:
--------------------------------------------

Hey, sorry, I've been meaning to get back to this, but I wasn't expecting to sign up for porting the exponential backoff retry until this was in. Polishing that patch up so that it can be accepted in mainline will take me a fair bit more time, since our version of the code straight-up replaced the existing logic with jittered exponential backoff (we haven't run a JVM ZK client without jittered exponential backoff in > 1.5 years), and I doubt Apache ZK would be willing to accept that. I simply don't have time right now to do that work, and won't for at least a month.

It also makes me a bit nervous to offer a pull request for the exponential backoff feature without this fix already checked in, since this was an extremely expensive bug for us. But I'm sympathetic to the desire to unit test it; we simply didn't have time to concoct a unit test back when we needed the fix urgently, since the nature of ZooKeeper code makes it generally pretty difficult to add unit tests for most areas, including this one.
[jira] [Commented] (ZOOKEEPER-2886) Permanent session moved error in multi-op only connections
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16147872#comment-16147872 ]

Dan Benediktson commented on ZOOKEEPER-2886:
--------------------------------------------

This is a duplicate of ZOOKEEPER-2440, for which it looks like an svn-style patch was provided about 14 months ago and some review was done about 13 months ago (that was before the git migration).

> Permanent session moved error in multi-op only connections
> ----------------------------------------------------------
>
>                 Key: ZOOKEEPER-2886
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2886
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.4.10, 3.5.3, 3.6.0
>            Reporter: Fangmin Lv
>            Assignee: Fangmin Lv
>
> If there are slow followers, it's possible that the leader and the client disagree on where the client is connecting to, and therefore the client keeps getting "Session Moved" errors. Part of the issue was fixed in ZOOKEEPER-710, but the problem remains for multi-op only connections.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136834#comment-16136834 ]

Dan Benediktson commented on ZOOKEEPER-2471:
--------------------------------------------

Thanks for the reviews, [~nickt] and [~hanm]. Sorry, I should have been more clear about the scenario: although I didn't realize it at the time I initially filed this bug (I wasn't aware of the extent of the divergence of my own fork from mainline), I don't think you can reasonably run into an infinite connect loop on *current* Apache ZooKeeper. On the current code base, hitting an infinite connect loop basically requires falling victim to the random number generator repeatedly; not likely at all.

However, if you pick up exponential backoff from ZOOKEEPER-2869, it becomes a certainty if you hit the right conditions: basically, you just need enough initial failures to rack up a large exponential sleep backoff, which in practice happened to us during leader election on a large ensemble, and then your sleep time can potentially outgrow your connect timeout. That's somewhat dependent on the backoff algorithm used, but the one we have is pretty reasonable in how it caps the backoff, and it was certainly still possible there: until we fixed this, we had applications which went down completely every time we leader-elected the ensemble they talked to, because they were unable to connect to ZK until we recycled the application.

So, that's why I suggested in the pull request that maybe a compromise would be to submit tests along with ZOOKEEPER-2869: a lot of the machinery that's needed to make this testable is actually stuff that can be introduced as part of the product feature in ZOOKEEPER-2869.
[jira] [Commented] (ZOOKEEPER-2879) Adding observers dynamically without server id
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16136823#comment-16136823 ]

Dan Benediktson commented on ZOOKEEPER-2879:
--------------------------------------------

[~hanm] No, local sessions are not a sufficient mitigation for the problems introduced by this proposed feature:
- Local sessions can be upgraded to a global session at any time by creating an ephemeral node; basically, enabling local sessions across an ensemble does not guarantee that all sessions are local. If one of them upgrades, for instance, what happens to the other one?
- Moreover, I believe this proposed feature would let you accidentally reuse a local session across two members of the same ensemble. Imagine two sessions connect to different members and get the same session ID: now, one of them disconnects and tries to reconnect, and lands on the second server (this is possible: local sessions are purely a server-side concept, so in practice clients with a local session DO try to reconnect to another server, even though it is guaranteed to fail). I expect it would succeed, at which point there would be two connections for the same local session, one of which actually originated from another node. I don't know precisely what this would entail, but it's probably also undesirable.

> Adding observers dynamically without server id
> ----------------------------------------------
>
>                 Key: ZOOKEEPER-2879
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2879
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: quorum
>    Affects Versions: 3.6.0
>            Reporter: Fangmin Lv
>            Assignee: Fangmin Lv
>
> Dynamic config requires that each observer have a unique server id, which means we cannot simply add an observer with the dynamic server id -1. For a large observer cluster, it's much easier to add observers without unique server ids if they don't need to be promoted to participants. It will also make dynamic config more efficient, since we don't need to store and send the long list of observers during re-config.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2879) Adding observers dynamically without server id
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16135568#comment-16135568 ]

Dan Benediktson commented on ZOOKEEPER-2879:
--------------------------------------------

It's also unclear to me what problem this feature is intended to solve.
- Efficiency of dynamic config should be completely unimportant: the cost of another < 100 bytes of configuration data per observer is vanishingly small. This doesn't seem to be the real goal, just an added benefit, but let's be clear that it isn't much of a benefit.
- The real goal seems to be ease of management, but I don't really understand that. Can you elaborate on the ways in which it is easier to not have server IDs? FWIW, we operate dozens of ZK clusters, some of them with more than 50 members, so I'm not unfamiliar with the problems of managing "large" ZK ensembles, for some definition of large; having to assign unique server IDs isn't really one of them, at least at our scale.
[jira] [Commented] (ZOOKEEPER-2879) Adding observers dynamically without server id
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2879?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16135539#comment-16135539 ]

Dan Benediktson commented on ZOOKEEPER-2879:
--------------------------------------------

IIRC, the unique server ID is also used as part of ensuring that the session IDs generated on any given server are guaranteed to be unique. If that is violated, you can see very strange behavior (refer to ZOOKEEPER-2504, where I submitted a patch for ensuring the leader permitted at most one server with a given ID at a time, to prevent that strange behavior).
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16120331#comment-16120331 ]

Dan Benediktson commented on ZOOKEEPER-2471:
--------------------------------------------

The core tests that I see failed in the log had both already passed on my local run:
- ChrootClientTest.testNonExistingOpCode
- WatchEventWhenAutoResetTest.testNodeDataChanged

The first one is clearly suspicious because my patch allegedly "fixed" the corresponding ClientTest.testNonExistingOpCode, which it shouldn't have done anything about, so I'm pretty certain that's just a flaky test. I've tried running both of those test cases about 10 times on my MBP to no avail; they pass every time. I also already ran the full "ant test" suite before submitting the patch in the first place, and it all succeeded. Any suggestions here?
[jira] [Commented] (ZOOKEEPER-2869) Allow for exponential backoff in ClientCnxn.SendThread on connection re-establishment
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117581#comment-16117581 ]

Dan Benediktson commented on ZOOKEEPER-2869:
--------------------------------------------

Yeah, ours has been running in production at Twitter for 1.5 years now. It's not rocket surgery, but it at least unearthed that other bug I mentioned, so I think it's fair to call it battle-tested. The only arguable downside I see to the implementation we have is that, while it has a pluggable backoff provider, there's no config for selecting the implementation, and there's no implementation that maintains the existing behavior; it just forces the caller into using exponential backoff for reconnect. And while I'm a firm believer that it's better to just make the product do "the right thing" than it is to provide a bunch of knobs, I can see how this behavior change might not be viewed as obviously preferable. It would be good if one of the committers or PMC could chime in with their view. In the meantime, I'll look at getting a pull request put together. I haven't tried submitting a ZK patch since the project moved off SVN.

> Allow for exponential backoff in ClientCnxn.SendThread on connection
> re-establishment
> ---------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2869
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2869
>             Project: ZooKeeper
>          Issue Type: Improvement
>          Components: java client
>    Affects Versions: 3.4.10, 3.5.3
>            Reporter: Nick Travers
>            Priority: Minor
>
> As part of ZOOKEEPER-961, when the client re-establishes a connection to the server, it will sleep for a random number of milliseconds in the range [0, 1000). Introduced [here|https://github.com/apache/zookeeper/commit/d84dc077d576b7cdfbfd003e3425fab85ca29a44].
> These reconnects can cause excessive logging in clients if the server is unavailable for an extended period of time, with reconnects every 500ms on average.
> One solution could be to allow for exponential backoff in the client. The backoff params could be made configurable.
> [3.5.x code|https://github.com/apache/zookeeper/blob/release-3.5.3/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1059]
> [3.4.x code|https://github.com/apache/zookeeper/blob/release-3.4.9/src/java/main/org/apache/zookeeper/ClientCnxn.java#L1051]

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
[jira] [Commented] (ZOOKEEPER-2869) Allow for exponential backoff in ClientCnxn.SendThread on connection re-establishment
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2869?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16117378#comment-16117378 ]

Dan Benediktson commented on ZOOKEEPER-2869:
--------------------------------------------

Our fork has actually solved this problem, using a standard jittered exponential backoff algorithm (the problem being addressed there was partially about thundering herds, so jittering was deemed necessary). I wouldn't mind porting our code and offering a patch for it; anything that gets us closer to upstream is goodness. However, we really need to take the fix I provided a year ago for ZOOKEEPER-2471 before doing this; otherwise, allowing backoff higher than 1 second will dramatically increase the likelihood of clients getting completely wedged in a sleep/retry loop.
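The jittered, capped exponential backoff discussed in this thread is never shown in it, so here is a minimal sketch of the standard algorithm; the class name, parameters, and cap value are assumptions for illustration, not the fork's actual implementation:

```java
import java.util.Random;

// Hypothetical sketch of a jittered, capped exponential backoff for
// reconnect sleeps; not ZooKeeper or the Twitter fork's code.
public class ReconnectBackoff {
    private final long baseMs;
    private final long capMs;
    private final Random random = new Random();
    private int attempt = 0;

    public ReconnectBackoff(long baseMs, long capMs) {
        this.baseMs = baseMs;
        this.capMs = capMs;
    }

    // "Full jitter": sleep a uniform amount in [0, min(cap, base * 2^attempt)).
    // The cap is what keeps the sleep from outgrowing the connect timeout,
    // which is the interaction with ZOOKEEPER-2471 described above.
    public long nextSleepMs() {
        long ceiling = Math.min(capMs, baseMs << Math.min(attempt, 20));
        attempt++;
        return (long) (random.nextDouble() * ceiling);
    }

    // Call on successful connect so the next outage starts from the base delay.
    public void reset() { attempt = 0; }

    public static void main(String[] args) {
        ReconnectBackoff backoff = new ReconnectBackoff(100, 5000);
        for (int i = 0; i < 8; i++) {
            System.out.println("attempt " + i + ": sleep " + backoff.nextSleepMs() + " ms");
        }
    }
}
```

The cap matters here: per the discussion above, once the sleep can exceed the connect timeout, the stale-"now" bug in ZOOKEEPER-2471 turns a transient outage into a permanent reconnect loop, so the cap should stay well below the connect timeout.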
[jira] [Commented] (ZOOKEEPER-2531) Configuration as "zookeeper.secure=true/false" can be introduced and reading and verifying all ssl related configuration (like secureport, keystore, truststore, cor
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2531?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15437103#comment-15437103 ]

Dan Benediktson commented on ZOOKEEPER-2531:
--------------------------------------------

I'd encourage using a more specific name, e.g. "zookeeper.validateSslConfigurationAtStartup".

> Configuration as "zookeeper.secure=true/false" can be introduced and reading
> and verifying all ssl related configuration (like secureport, keystore,
> truststore, corresponding password) should be done only when
> "zookeeper.secure" flag is true
> -----------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-2531
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2531
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.5.1
>            Reporter: Rakesh Kumar Singh
>            Priority: Minor
>
> A configuration flag such as "zookeeper.secure=true/false" can be introduced, and reading and verifying all SSL-related configuration (like the secure port, keystore, truststore, and corresponding passwords) should be done only when the "zookeeper.secure" flag is true.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (ZOOKEEPER-2504) Enforce that server ids are unique in a cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15419170#comment-15419170 ] Dan Benediktson commented on ZOOKEEPER-2504:

Test results say that the core tests all passed (except one that was skipped), in spite of the above comment...

> Enforce that server ids are unique in a cluster
>
> Key: ZOOKEEPER-2504
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2504
> Project: ZooKeeper
> Issue Type: Bug
> Reporter: Dan Benediktson
> Assignee: Dan Benediktson
> Attachments: ZOOKEEPER-2504.patch
>
> The leader will happily accept connections from learners that have the same server id (i.e., due to misconfiguration). This can lead to various issues, including non-unique session ids being generated by these servers.
> The leader can enforce that all learners come in with unique server ids; if a learner attempts to connect with an id that is already in use, it should be denied.
[jira] [Updated] (ZOOKEEPER-2504) Enforce that server ids are unique in a cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Benediktson updated ZOOKEEPER-2504:

Attachment: ZOOKEEPER-2504.patch

Didn't see the findbugs warnings during local testing...
[jira] [Updated] (ZOOKEEPER-2504) Enforce that server ids are unique in a cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Benediktson updated ZOOKEEPER-2504:

Attachment: (was: ZOOKEEPER-2504.patch)
[jira] [Updated] (ZOOKEEPER-2504) Enforce that server ids are unique in a cluster
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Benediktson updated ZOOKEEPER-2504:

Attachment: ZOOKEEPER-2504.patch
[jira] [Created] (ZOOKEEPER-2504) Enforce that server ids are unique in a cluster
Dan Benediktson created ZOOKEEPER-2504:

Summary: Enforce that server ids are unique in a cluster
Key: ZOOKEEPER-2504
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2504
Project: ZooKeeper
Issue Type: Bug
Reporter: Dan Benediktson
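The attached patch is not reproduced in this thread, so as a sketch only: one way the leader could enforce uniqueness is to track the server ids of currently connected learners and refuse a handshake that reuses a live id. Class and method names here are illustrative, not ZooKeeper's actual LearnerHandler API.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: the leader registers each learner's server id on handshake
// and rejects a second learner claiming an id that is already live.
public class LearnerIdRegistry {
    private final Set<Long> connectedIds = ConcurrentHashMap.newKeySet();

    // Called when a learner's handshake arrives; false means "deny".
    public boolean register(long sid) {
        return connectedIds.add(sid); // add() is atomic: false if already present
    }

    // Called when a learner's connection is torn down, freeing the id.
    public void unregister(long sid) {
        connectedIds.remove(sid);
    }
}
```

Using a concurrent set keeps the check-and-claim step atomic even when multiple LearnerHandler threads accept connections simultaneously, so two learners with the same misconfigured id cannot both slip through.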
[jira] [Commented] (ZOOKEEPER-2503) Inconsistency between myid documentation and implementation
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410221#comment-15410221 ] Dan Benediktson commented on ZOOKEEPER-2503:

Absolutely, I'd be delighted to. I'll file the JIRA today, and will try to get a patch attached today, or on Monday at the latest.

> Inconsistency between myid documentation and implementation
>
> Key: ZOOKEEPER-2503
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2503
> Project: ZooKeeper
> Issue Type: Bug
> Components: server
> Affects Versions: 3.4.9, 3.5.2
> Reporter: Michael Han
> Fix For: 3.5.3, 3.4.10
>
> In the ZK documentation, we have:
> "The myid file consists of a single line containing only the text of that machine's id. So myid of server 1 would contain the text "1" and nothing else. The id must be unique within the ensemble and should have a value between 1 and 255."
> This, however, is not enforced in code. It should be fixed either in the documentation, by removing the restriction to the range 1-255, or in code, by enforcing that constraint.
> Discussion thread: http://zookeeper-user.578899.n2.nabble.com/Is-myid-actually-limited-to-1-255-td7581270.html
[jira] [Commented] (ZOOKEEPER-2503) Inconsistency between myid documentation and implementation
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15410083#comment-15410083 ] Dan Benediktson commented on ZOOKEEPER-2503:

Honestly, it's probably for the best to break those hypothetical existing configurations, since in some ways they are already broken. It sounds like the current behavior would result in truncation down to one byte of effective server id, and from personal experience I can say that two servers running with the same server id in an ensemble can have weird results. We had a configuration issue in one of our ensembles at one point where two servers had been misconfigured to use the same server id, which the leader happily allows, and which we think resulted in duplicate session ids being handed out to two different sessions, and then in strange side effects down the line. We actually have a fix in our ZooKeeper fork that makes LearnerHandler block any server from joining if the learner tries to claim a server id that is already in use by another learner, to prevent this configuration problem from sneaking in again.
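The truncation mentioned above can be made concrete with a simplified model of how ZooKeeper-style session ids pack the server id into the top byte of a long (loosely modeled on ZooKeeperServer.initializeNextSession; treat it as an illustration, not the exact shipped code). Any two server ids that agree in their low 8 bits produce colliding session-id prefixes.

```java
// Model of why server ids above 255 collide: only the low 8 bits of the
// server id survive in the top byte of the 64-bit session id.
public class SessionIdPacking {
    static long initialSessionId(long serverId, long nowMs) {
        long base = (nowMs << 24) >>> 8;  // timestamp portion, top byte cleared
        return base | (serverId << 56);   // server id occupies the top byte only
    }

    static long effectiveServerId(long sessionId) {
        return sessionId >>> 56;          // recover the byte that survived
    }

    public static void main(String[] args) {
        long t = System.currentTimeMillis();
        // Server ids 1 and 257 differ, but 257 = 0x101 keeps only 0x01:
        System.out.println(effectiveServerId(initialSessionId(1, t)));
        System.out.println(effectiveServerId(initialSessionId(257, t)));
    }
}
```

So "myid" values of 1 and 257 would stamp identical high bytes on their session ids, which is exactly the duplicate-session-id failure mode described in the comment.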
[jira] [Updated] (ZOOKEEPER-2447) Zookeeper adds good delay when one of the quorum host is not reachable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Benediktson updated ZOOKEEPER-2447:

Assignee: Edward Ribeiro (was: Dan Benediktson)

> Zookeeper adds good delay when one of the quorum host is not reachable
>
> Key: ZOOKEEPER-2447
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2447
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.4.6, 3.5.0
> Reporter: Vishal Khandelwal
> Assignee: Edward Ribeiro
> Fix For: 3.5.3, 3.6.0
> Attachments: ZOOKEEPER-2447-MinConnectTimeoutOnly.patch, ZOOKEEPER-2447.3.5.patch, withfix.txt, withoutFix.txt
>
> StaticHostProvider.resolveAndShuffle adds all of the addresses in the quorum that resolve to a list, shuffles them, and sends them back to the client connection class. If, after shuffling, the first node happens to be one that is not reachable, ClientCnxn.SendThread.run will keep trying to connect to it until a timeout and then move to a different node. This adds a random delay to the ZooKeeper connection when a host is down. Instead, we could check whether the host is reachable in StaticHostProvider and skip it if isReachable is false, the same as we do for UnknownHostException.
> This can be tested using the following test code by providing a valid host which is not reachable.
> For a quick test, comment out Collections.shuffle(tmpList, sourceOfRandomness); in StaticHostProvider.resolveAndShuffle.
> {code}
> @Test
> public void test() throws Exception {
>     EventsWatcher watcher = new EventsWatcher();
>     QuorumUtil qu = new QuorumUtil(1);
>     qu.startAll();
>
>     ZooKeeper zk = new ZooKeeper(" watcher);
>
>     watcher.waitForConnected(CONNECTION_TIMEOUT * 5);
>     Assert.assertTrue("connection Established", watcher.isConnected());
>     zk.close();
> }
> {code}
> The following fix can be added to StaticHostProvider.resolveAndShuffle:
> {code}
> if (taddr.isReachable(4000 /* can be some value */)) {
>     tmpList.add(new InetSocketAddress(taddr, address.getPort()));
> }
> {code}
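As a self-contained sketch of the reachability filter proposed above (the helper name is illustrative; it is not the patched StaticHostProvider code): filter the resolved addresses through InetAddress.isReachable before handing them to the client. One caveat worth noting is that isReachable() uses an ICMP echo when privileged, falling back to a TCP probe otherwise, and can itself block for up to the timeout per address, so this trades one kind of delay for another.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.util.ArrayList;
import java.util.List;

// Sketch: drop candidate quorum addresses that fail a reachability probe,
// mirroring how unresolvable hosts are already skipped.
public class ReachabilityFilter {
    static List<InetSocketAddress> filterReachable(
            List<InetSocketAddress> candidates, int timeoutMs) {
        List<InetSocketAddress> reachable = new ArrayList<>();
        for (InetSocketAddress addr : candidates) {
            try {
                InetAddress ip = addr.getAddress(); // null if unresolved
                if (ip != null && ip.isReachable(timeoutMs)) {
                    reachable.add(addr);
                }
            } catch (Exception e) {
                // treat probe failures as unreachable and skip the host
            }
        }
        return reachable;
    }
}
```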
[jira] [Commented] (ZOOKEEPER-2447) Zookeeper adds good delay when one of the quorum host is not reachable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15406240#comment-15406240 ] Dan Benediktson commented on ZOOKEEPER-2447:

Correct, I don't think there is a patch in a completed state. The patch I provided was just to avoid a specific problem with the most recently proposed solution: the case where you try to connect using only a slice of the connection time, but the chosen slice is too small to reasonably expect a connection to succeed. I think [~eribeiro] offered to look at a comprehensive solution, so I'm assigning this over to him for the time being.
[jira] [Commented] (ZOOKEEPER-2447) Zookeeper adds good delay when one of the quorum host is not reachable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15371209#comment-15371209 ] Dan Benediktson commented on ZOOKEEPER-2447:

The failures appear to be unrelated to the changes:

junit.framework.AssertionFailedError: Unexpected bean exists! expected:<0> but was:<1>
    at org.apache.zookeeper.test.ClientBase.verifyUnexpectedBeans(ClientBase.java:498)
    at org.apache.zookeeper.test.ClientBase.startServer(ClientBase.java:477)
    at org.apache.zookeeper.test.ClientBase.setUp(ClientBase.java:460)
    at org.apache.zookeeper.test.WatcherTest.setUp(WatcherTest.java:73)

And the same tests are passing locally for me.
[jira] [Assigned] (ZOOKEEPER-2447) Zookeeper adds good delay when one of the quorum host is not reachable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Benediktson reassigned ZOOKEEPER-2447:

Assignee: Dan Benediktson (was: Vishal Khandelwal)
[jira] [Updated] (ZOOKEEPER-2447) Zookeeper adds good delay when one of the quorum host is not reachable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Benediktson updated ZOOKEEPER-2447:

Assignee: Vishal Khandelwal (was: Dan Benediktson)
[jira] [Updated] (ZOOKEEPER-2447) Zookeeper adds good delay when one of the quorum host is not reachable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Benediktson updated ZOOKEEPER-2447:

Attachment: (was: ZOOKEEPER-2447-MinConnectTimeoutOnly.patch)
[jira] [Updated] (ZOOKEEPER-2447) Zookeeper adds good delay when one of the quorum host is not reachable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Benediktson updated ZOOKEEPER-2447:

Attachment: ZOOKEEPER-2447-MinConnectTimeoutOnly.patch

Attaching a patch for "MinConnectTimeout" enforcement - kind of an auxiliary for this issue.
[jira] [Updated] (ZOOKEEPER-2447) Zookeeper adds good delay when one of the quorum host is not reachable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Benediktson updated ZOOKEEPER-2447:

Attachment: ZOOKEEPER-2447-MinConnectTimeoutOnly.patch

Attaching a patch for the MinConnectTimeout feature (kind of an auxiliary for the overall problem here).
[jira] [Updated] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Benediktson updated ZOOKEEPER-2471:

Attachment: ZOOKEEPER-2471.patch

> Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
>
> Key: ZOOKEEPER-2471
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471
> Project: ZooKeeper
> Issue Type: Bug
> Components: java client
> Affects Versions: 3.5.1
> Environment: all
> Reporter: Dan Benediktson
> Assignee: Dan Benediktson
> Attachments: ZOOKEEPER-2471.patch
>
> ClientCnxnSocket uses a member variable "now" to track the current time, and lastSend / lastHeard variables to track socket liveness. Implementations, and even ClientCnxn itself, are expected to call updateNow() to reset "now" to System.currentTimeMillis(), and then call updateLastSend()/updateLastHeard() on IO completions.
> This is a fragile contract, so it's not surprising that there's a bug resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() as soon as startConnect() returns, but it does not call updateNow() first. I expect that when this was written, either the expectation was that startConnect() was an asynchronous operation and that updateNow() would have been called very recently, or the requirement to call updateNow() was simply forgotten at this point. As far as I can see, this bug has been present since the "updateNow" method was first introduced in the distant past. As it turns out, since startConnect() calls HostProvider.next(), which can sleep, quite a lot of time can pass, leaving a big gap between "now" and now.
> If you are using very short session timeouts (one of our ZK ensembles has many clients using a 1-second timeout), this is potentially disastrous, because the sleep time may exceed the connection timeout itself, which can result in the Java client being stuck in a perpetual reconnect loop. The exact code path it goes through in this case is complicated, because there has to be a previously-closed socket still waiting in the selector (otherwise, the first timeout evaluation will not fail because "now" still hasn't been updated, and then the actual connect timeout will be applied in ClientCnxnSocket.doTransport()) so that select() will harvest the IO from the previous socket and call updateNow(), resulting in the next loop through ClientCnxnSocket.SendThread.run() observing the spurious timeout and failing. In practice it happens to us fairly frequently; we only got to the bottom of the bug yesterday. Worse, when it does happen, the Zookeeper client object is rendered unusable: it's stuck in a perpetual reconnect loop where it keeps sleeping, opening a socket, and immediately closing it.
> I have a patch. Rather than calling updateNow() right after startConnect(), my fix is to remove the "now" member variable and the updateNow() method entirely, and to instead just call System.currentTimeMillis() whenever time needs to be evaluated. I realize there is a benefit (aside from a trivial micro-optimization not worth worrying about) to having the time be "fixed", particularly for truth in the logging: if time is fixed by an updateNow() call, then the log for a timeout will show exactly the same value the code reasoned about. However, this benefit is, in my opinion, not enough to merit the fragility of the contract which led to this (for us) highly impactful and difficult-to-find bug in the first place.
> I'm currently running ant tests locally against my patch on trunk, and then I'll upload it here.
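The timing bug described above can be reduced to a few lines. This is a simplified model of the cached-"now" contract, not the actual ClientCnxnSocket code: a clock value is cached, a backoff sleep happens (standing in for HostProvider.next()), lastHeard is stamped from the stale cache, and the next clock refresh makes the sleep look like time spent waiting on the socket.

```java
// Model of the ZOOKEEPER-2471 bug: stamping lastHeard from a cached clock
// that predates a sleep makes the sleep count as "time since last heard".
public class StaleNowModel {
    static long now;        // cached clock, refreshed only by updateNow()
    static long lastHeard;

    static void updateNow() { now = System.currentTimeMillis(); }
    static void updateLastHeard() { lastHeard = now; } // uses the cached clock!

    public static void main(String[] args) throws InterruptedException {
        updateNow();            // clock cached here...
        Thread.sleep(1200);     // ...then the reconnect backoff sleeps
        updateLastHeard();      // bug: stamps lastHeard with the stale value
        updateNow();            // select() returns and refreshes the clock
        // The sleep is now attributed to the connection attempt; with a
        // 1-second session timeout this alone exceeds the connect timeout.
        System.out.println("apparent idle ms: " + (now - lastHeard));
    }
}
```

The patch's approach of reading System.currentTimeMillis() at every evaluation point removes the stale-cache window entirely, at the cost of log lines no longer sharing one fixed timestamp per loop iteration.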
[jira] [Commented] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2471?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15368658#comment-15368658 ] Dan Benediktson commented on ZOOKEEPER-2471: If anyone can assign this to me, I'll upload the patch I have. > Java Zookeeper Client incorrectly considers time spent sleeping as time spent > connecting, potentially resulting in infinite reconnect loop > -- > > Key: ZOOKEEPER-2471 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.1 > Environment: all >Reporter: Dan Benediktson > > ClientCnxnSocket uses a member variable "now" to track the current time, and > lastSend / lastHeard variables to track socket liveness. Implementations, and > even ClientCnxn itself, are expected to call both updateNow() to reset "now" > to System.currentTimeMillis, and then call updateLastSend()/updateLastHeard() > on IO completions. > This is a fragile contract, so it's not surprising that there's a bug > resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() > as soon as startConnect() returns, but it does not call updateNow() first. I > expect when this was written, either the expectation was that startConnect() > was an asynchronous operation and that updateNow() would have been called > very recently, or simply the requirement to call updateNow() was forgotten at > this point. As far as I can see, this bug has been present since the > "updateNow" method was first introduced in the distant past. As it turns out, > since startConnect() calls HostProvider.next(), which can sleep, quite a lot > of time can pass, leaving a big gap between "now" and now. 
[jira] [Created] (ZOOKEEPER-2471) Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
Dan Benediktson created ZOOKEEPER-2471:
--

Summary: Java Zookeeper Client incorrectly considers time spent sleeping as time spent connecting, potentially resulting in infinite reconnect loop
Key: ZOOKEEPER-2471
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2471
Project: ZooKeeper
Issue Type: Bug
Components: java client
Affects Versions: 3.5.1
Environment: all
Reporter: Dan Benediktson

ClientCnxnSocket uses a member variable "now" to track the current time, and lastSend / lastHeard variables to track socket liveness. Implementations, and even ClientCnxn itself, are expected to call updateNow() to reset "now" to System.currentTimeMillis(), and then call updateLastSend()/updateLastHeard() on IO completions.

This is a fragile contract, so it's not surprising that there's a bug resulting from it: ClientCnxn.SendThread.run() calls updateLastSendAndHeard() as soon as startConnect() returns, but it does not call updateNow() first. I expect when this was written, either the expectation was that startConnect() was an asynchronous operation and that updateNow() would have been called very recently, or the requirement to call updateNow() was simply forgotten at this point. As far as I can see, this bug has been present since the updateNow() method was first introduced in the distant past. As it turns out, since startConnect() calls HostProvider.next(), which can sleep, quite a lot of time can pass, leaving a big gap between "now" and now.

If you are using very short session timeouts (one of our ZK ensembles has many clients using a 1-second timeout), this is potentially disastrous, because the sleep time may exceed the connection timeout itself, which can leave the Java client stuck in a perpetual reconnect loop.
The exact code path it goes through in this case is complicated, because there has to be a previously-closed socket still waiting in the selector (otherwise, the first timeout evaluation will not fail because "now" still hasn't been updated, and the actual connect timeout will then be applied in ClientCnxnSocket.doTransport()), so that select() will harvest the IO from the previous socket and call updateNow(), resulting in the next loop through ClientCnxn.SendThread.run() observing the spurious timeout and failing. In practice it does happen to us fairly frequently; we only got to the bottom of the bug yesterday. Worse, when it does happen, the Zookeeper client object is rendered unusable: it's stuck in a perpetual reconnect loop where it keeps sleeping, opening a socket, and immediately closing it.

I have a patch. Rather than calling updateNow() right after startConnect(), my fix is to remove the "now" member variable and the updateNow() method entirely, and instead call System.currentTimeMillis() whenever time needs to be evaluated. I realize there is a benefit (aside from a trivial micro-optimization not worth worrying about) to having the time be "fixed", particularly for truth in the logging: if time is fixed by an updateNow() call, then the log for a timeout will show exactly the same value the code reasoned about. However, in my opinion this benefit is not enough to merit the fragility of the contract which led to this (for us) highly impactful and difficult-to-find bug in the first place.

I'm currently running ant tests locally against my patch on trunk, and then I'll upload it here.
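The fragile updateNow() contract and the proposed replacement can be sketched roughly as follows. This is an illustrative simplification, not the actual ZooKeeper patch: the class and method names only mirror ClientCnxnSocket's, and the fields are assumptions.

```java
// Sketch of the timekeeping after the proposed fix: no cached "now",
// so there is no updateNow() call for a caller to forget.
class ConnectionClock {
    private long lastSend = System.currentTimeMillis();
    private long lastHeard = System.currentTimeMillis();

    // Idle times are computed against the real clock at evaluation time,
    // so a long sleep inside HostProvider.next() cannot leave a stale
    // "now" that makes sleep look like connect time.
    long getIdleSend() {
        return System.currentTimeMillis() - lastSend;
    }

    long getIdleRecv() {
        return System.currentTimeMillis() - lastHeard;
    }

    void updateLastSend() { lastSend = System.currentTimeMillis(); }
    void updateLastHeard() { lastHeard = System.currentTimeMillis(); }
}
```

The trade-off the comment describes remains visible here: two calls to getIdleSend() in the same loop iteration may see slightly different clock values, which is the "fixed time for logging" benefit that the patch deliberately gives up.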
[jira] [Commented] (ZOOKEEPER-2447) Zookeeper adds good delay when one of the quorum host is not reachable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15363322#comment-15363322 ] Dan Benediktson commented on ZOOKEEPER-2447:

I have a patch file for the minConnectTimeout enforcement ready, but I don't seem to be able to attach it to this issue - should I create a separate issue assigned to myself for that one part? Sorry for the novice questions; this is my first time attempting to contribute.

> Zookeeper adds good delay when one of the quorum host is not reachable
> ---
>
> Key: ZOOKEEPER-2447
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2447
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.4.6, 3.5.0
> Reporter: Vishal Khandelwal
> Assignee: Vishal Khandelwal
> Fix For: 3.5.3, 3.6.0
>
> Attachments: ZOOKEEPER-2447.3.5.patch, withfix.txt, withoutFix.txt
>
> StaticHostProvider's resolveAndShuffle method adds all of the addresses which are valid in the quorum to the list, shuffles them, and sends them back to the client connection class. If, after shuffling, the first node appears to be one which is not reachable, ClientCnxn.SendThread.run will keep on connecting to the failed host till a timeout and then move to a different node. This adds a random delay to the zookeeper connection in case a host is down. Rather, we could check whether the host is reachable in StaticHostProvider and ignore it when isReachable is false, the same as we do for UnknownHostException.
> This can be tested using the following test code by providing a valid host which is not reachable.
> For a quick test, comment out Collections.shuffle(tmpList, sourceOfRandomness); in StaticHostProvider.resolveAndShuffle.
> {code}
> @Test
> public void test() throws Exception {
>     EventsWatcher watcher = new EventsWatcher();
>     QuorumUtil qu = new QuorumUtil(1);
>     qu.startAll();
>
>     ZooKeeper zk = new ZooKeeper(" watcher);
>
>     watcher.waitForConnected(CONNECTION_TIMEOUT * 5);
>     Assert.assertTrue("connection Established", watcher.isConnected());
>     zk.close();
> }
> {code}
> The following fix can be added to StaticHostProvider.resolveAndShuffle:
> {code}
> if (taddr.isReachable(4000 /* timeout ms; can be some value */)) {
>     tmpList.add(new InetSocketAddress(taddr, address.getPort()));
> }
> {code}
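The alternative suggested in the comments on this issue, applying the timeout directly when opening the client socket instead of a separate isReachable() probe, might look roughly like this minimal sketch. The class and method names are made up for illustration; this is not ZooKeeper's ClientCnxnSocket API.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class DirectConnectTimeout {
    // One connection attempt doubles as the reachability check: an
    // unreachable host fails here after connectTimeoutMs, with no extra
    // TCP handshake round-trip spent on a probe before every connection.
    static Socket open(InetSocketAddress addr, int connectTimeoutMs) throws IOException {
        Socket s = new Socket();
        s.connect(addr, connectTimeoutMs);
        return s;
    }
}
```

Compared to the isReachable() approach in the patch snippet above, this keeps the happy path at one handshake and only pays the timeout when a host is actually down.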
[jira] [Commented] (ZOOKEEPER-1045) Support Quorum Peer mutual authentication via SASL
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15357399#comment-15357399 ] Dan Benediktson commented on ZOOKEEPER-1045:

This is kind of a side comment on this topic, but please make sure you support the case where all the ZK hosts run as the same Kerberos principal. You don't have to support *only* that case, of course, but it's definitely how I would be deploying ZK when using Kerb auth.

The reason for running all the service instances with the same Kerb principal is to enable clients to do Kerberos AuthN to all the ZK hosts using a single DNS name, which is pretty common, I think; we certainly do it, so that we can scale out the ensemble for more throughput as needed. Since they're pointed at a single DNS name, the clients should always construct the same service principal name, so the client will get a ticket that's only good for a single Kerberos service principal. All the services must be running as that same principal, otherwise they won't be able to crack the Kerberos ticket.

Basically, since the clients can't see a difference between the servers (due to the shared DNS name), and since the clients are authenticating the servers' Kerberos identity, the servers have to be identical (according to Kerberos identity).
> Support Quorum Peer mutual authentication via SASL
> --
>
> Key: ZOOKEEPER-1045
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1045
> Project: ZooKeeper
> Issue Type: New Feature
> Components: server
> Reporter: Eugene Koontz
> Assignee: Rakesh R
> Priority: Critical
> Fix For: 3.4.9, 3.5.3
>
> Attachments: 0001-ZOOKEEPER-1045-br-3-4.patch, 1045_failing_phunt.tar.gz, ZK-1045-test-case-failure-logs.zip, ZOOKEEPER-1045-00.patch, ZOOKEEPER-1045-Rolling Upgrade Design Proposal.pdf, ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch, ZOOKEEPER-1045-br-3-4.patch
>
> ZOOKEEPER-938 addresses mutual authentication between clients and servers. This bug, on the other hand, is for authentication among quorum peers. Hopefully much of the work done on SASL integration with Zookeeper for ZOOKEEPER-938 can be used as a foundation for this enhancement.
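For concreteness, the single-principal deployment described in the comment above is normally expressed in the server's JAAS login configuration. The fragment below is an assumption-laden sketch, not a config taken from this issue: the section name follows ZooKeeper's client-server SASL convention, and the keytab path, service name, hostname, and realm are placeholders. Every host in the ensemble would carry the same principal and a keytab containing its key, so a ticket obtained for the shared DNS name can be decrypted by any of them.

```
Server {
  com.sun.security.auth.module.Krb5LoginModule required
  useKeyTab=true
  storeKey=true
  keyTab="/etc/zookeeper/zk.keytab"
  principal="zookeeper/zk.example.com@EXAMPLE.COM";
};
```

The point of the comment is that this file would be byte-for-byte identical on every server, which is exactly what makes the servers indistinguishable to Kerberos-authenticating clients.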
[jira] [Issue Comment Deleted] (ZOOKEEPER-2447) Zookeeper adds good delay when one of the quorum host is not reachable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dan Benediktson updated ZOOKEEPER-2447:
---
Comment: was deleted (was: Given that the goal is to open a socket, implementing an isReachable() method which is just opening a socket with a timeout seems wasteful. It will add unnecessary additional latency (the RTT latency for TCP handshake) and overhead to every connection, even successful ones where the ensemble isn't suffering due to a down host; it'll also put a little extra load on the servers, not just clients. Would it make sense to make ClientCnxnSocket.open() implementations apply the timeout directly? As far as choosing the connection timeout to use, I would suggest just using the time-sliced session timeout (I thought this was already done, but perhaps it's only applied later in the connection handshake?) but applying some reasonable lower bound (say, a default of 1 second, overridable by a Java system property). We actually have a fork of the ZK code which has that minimum bound for connection timeout logic in the Java client, and I'd be happy to prepare a patch for that part. It's pretty trivial code, but it's important if you start using a combination of large ZK clusters and very small session timeouts, which we have in some of our ensembles.)

> Zookeeper adds good delay when one of the quorum host is not reachable
[jira] [Commented] (ZOOKEEPER-2447) Zookeeper adds good delay when one of the quorum host is not reachable
[ https://issues.apache.org/jira/browse/ZOOKEEPER-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15355843#comment-15355843 ] Dan Benediktson commented on ZOOKEEPER-2447:

Given that the goal is to open a socket, implementing an isReachable() method which is just opening a socket with a timeout seems wasteful. It will add unnecessary additional latency (the RTT latency for TCP handshake) and overhead to every connection, even successful ones where the ensemble isn't suffering due to a down host; it'll also put a little extra load on the servers, not just clients. Would it make sense to make ClientCnxnSocket.open() implementations apply the timeout directly?

As far as choosing the connection timeout to use, I would suggest just using the time-sliced session timeout (I thought this was already done, but perhaps it's only applied later in the connection handshake?) but applying some reasonable lower bound (say, a default of 1 second, overridable by a Java system property). We actually have a fork of the ZK code which has that minimum bound for connection timeout logic in the Java client, and I'd be happy to prepare a patch for that part. It's pretty trivial code, but it's important if you start using a combination of large ZK clusters and very small session timeouts, which we have in some of our ensembles.

> Zookeeper adds good delay when one of the quorum host is not reachable
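The lower-bound idea from the comment above can be sketched as follows. The property name, the default floor, and the time-slicing formula are all assumptions for illustration, not ZooKeeper's actual configuration surface.

```java
public class ConnectTimeoutPolicy {
    // Hypothetical system property for overriding the floor;
    // defaults to the 1-second minimum suggested in the comment.
    static final int MIN_CONNECT_TIMEOUT_MS =
            Integer.getInteger("zk.example.minConnectTimeoutMs", 1000);

    // "Time-sliced" here means the session timeout is divided across the
    // ensemble so a full pass over every host fits in one session timeout;
    // the floor keeps large-ensemble/short-session setups from ending up
    // with connect timeouts too small to ever succeed.
    static int connectTimeoutMs(int sessionTimeoutMs, int ensembleSize) {
        int sliced = sessionTimeoutMs / Math.max(1, ensembleSize);
        return Math.max(sliced, MIN_CONNECT_TIMEOUT_MS);
    }
}
```

For example, under these assumptions a 1-second session timeout across a 5-host ensemble would slice to 200 ms per host, and the floor would raise it back to 1000 ms.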