[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13634365#comment-13634365 ] Shevek commented on ZOOKEEPER-1683: --- I think it looks OK. We put the code into production on several clusters on Monday afternoon, and we're waiting on results. It's a bit early to promise anything from tests. I'm still somewhat unhappy with the client code overall, as I don't think the contracts of many of the states are clear enough to definitively claim correctness. > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.0 >Reporter: Shevek >Assignee: Alexander Shraer > Fix For: 3.5.0 > > Attachments: ZOOKEEPER-1683.patch, ZOOKEEPER-1683-ver1.patch, > ZOOKEEPER-1683-ver2.patch, ZOOKEEPER-1683-ver2.patch > > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Updated] (ZOOKEEPER-1684) Failure to update socket addresses on immedate connection
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shevek updated ZOOKEEPER-1684: -- Attachment: ZOOKEEPER-1684.diff My suggestion as a patch. Don't update variables; make things definitively correct. > Failure to update socket addresses on immedate connection > - > > Key: ZOOKEEPER-1684 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1684 > Project: ZooKeeper > Issue Type: Bug >Reporter: Shevek > Attachments: ZOOKEEPER-1684.diff > > > I quote: > void registerAndConnect(SocketChannel sock, InetSocketAddress addr) > throws IOException { > sockKey = sock.register(selector, SelectionKey.OP_CONNECT); > boolean immediateConnect = sock.connect(addr); > if (immediateConnect) > { sendThread.primeConnection(); } > } > In the immediate case, there are several bugs: > a) updateSocketAddresses() is never called, as it is when the select-loop in > doTransport(). This means that clientCnxnSocket.getRemoteSocketAddress() will > return null for the lifetime of this socket? > b) CONNECT still in the interest set for the socket. > c) updateLastSendAndHeard() is never called either. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1677) Misuse of INET_ADDRSTRLEN
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13633176#comment-13633176 ] Shevek commented on ZOOKEEPER-1677: --- Looks kosher to me. Note: I found this bug by reading source, not by occurrence, so I don't have an actual test. The casts within the memcmp are unnecessary, and of no benefit, since they aren't checked. > Misuse of INET_ADDRSTRLEN > - > > Key: ZOOKEEPER-1677 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1677 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.0 >Reporter: Shevek >Assignee: Marshall McMullen > Attachments: ZOOKEEPER-1677.patch > > > ZOOKEEPER-1355. Add zk.updateServerList(newServerList) (Alex Shraer, > Marshall McMullen via fpj) > > > > git-svn-id: https://svn.apache.org/repos/asf/zookeeper/trunk@1410731 > 13f79535-47bb-0310-9956-ffa450edef68 > +int addrvec_contains(const addrvec_t *avec, const struct sockaddr_storage > *addr) > +{ > +if (!avec || !addr) > +{ > +return 0; > +} > + > +int i = 0; > +for (i = 0; i < avec->count; i++) > +{ > +if(memcmp(&avec->data[i], addr, INET_ADDRSTRLEN) == 0) > +return 1; > +} > + > +return 0; > +} > Pretty sure that should be sizeof(sockaddr_storage). INET_ADDRSTRLEN is the > size of the character buffer which needs to be allocated for the return value > of inet_ntop, which seems to be totally wrong. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13629780#comment-13629780 ] Shevek commented on ZOOKEEPER-1683: --- It's going to be a while, I'm afraid; I'm flying every other day for a bit. Right now, I just catch the NPE in upstream code and reset totally, ditching the session. :( > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.0 >Reporter: Shevek >Assignee: Alexander Shraer > Fix For: 3.5.0 > > Attachments: ZOOKEEPER-1683.patch, ZOOKEEPER-1683-ver1.patch > > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628652#comment-13628652 ] Shevek commented on ZOOKEEPER-1683: --- volatile causes a write-read pair of the variable to form a transitive happen-before relationship in the JMM. It does not solve the race - we (A) could check sockKey != null, then the close() (B) could clear it, then we (A) indirect on it, and we have an ABA race. Personally, I would probably change testableCloseSocket to load sockKey into a temporary local, then test for nullness of the local before indirecting on that. No synchronization needed, and pointer assignment is always atomic, even in 64-bit JVMs. > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.0 >Reporter: Shevek >Assignee: Alexander Shraer > Fix For: 3.5.0 > > Attachments: ZOOKEEPER-1683.patch, ZOOKEEPER-1683-ver1.patch > > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628622#comment-13628622 ] Shevek commented on ZOOKEEPER-1683: --- I think you can't just check isConnected() / sockKey != null because it's being set to null on a different thread, so then there will be a race for the NPE instead of just an NPE. One of the other issues with this thread contract, where values are set within an object which is accessible to more than one thread is that JSR133 says that assignments performed in constructors do not happen-before the constructor returns unless the variable being assigned is final. Since ClientCnxnNIO is accessible to more than one thread at the point variables like sockKey are initialized, I suspect the code violates the JMM anyway. A much better solution would be to use more immutable variables, and discard and reconstruct objects rather than attempting to modify them. An ideal solution would be to use something like netty, which is designed to handle and abstract all these cases. > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.0 >Reporter: Shevek >Assignee: Alexander Shraer > Fix For: 3.5.0 > > Attachments: ZOOKEEPER-1683.patch, ZOOKEEPER-1683-ver1.patch > > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627495#comment-13627495 ] Shevek commented on ZOOKEEPER-1683: --- sockKey == null, because previously: 2013-04-10 05:19:52,787 WARN [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxn$SendThread.run (ClientCnxn.java:1122 ) - Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect java.net.ConnectException: Connection refused at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:1.6.0_24] at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:592) ~[?:1.6.0_24] at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:353) ~[zookeeper-3.5.0.jar:3.5.0--1] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1101) [zookeeper-3.5.0.jar:3.5.0--1] 2013-04-10 05:19:52,788 DEBUG [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxnSocketNIO.cleanup (ClientCnxnSocketNI O.java:193) - Ignoring exception during shutdown input java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.shutdownInput(SocketChannelImpl.java:656) ~[?:1.6.0_24] at sun.nio.ch.SocketAdaptor.shutdownInput(SocketAdaptor.java:378) ~[?:1.6.0_24] at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:190) [zookeeper-3.5.0.jar:3.5.0--1] at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1190) [zookeeper-3.5.0.jar:3.5.0--1] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1130) [zookeeper-3.5.0.jar:3.5.0--1] 2013-04-10 05:19:52,789 DEBUG [main-SendThread(localhost:2181)] org.apache.zookeeper.ClientCnxnSocketNIO.cleanup (ClientCnxnSocketNI O.java:200) - Ignoring exception during shutdown output java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.shutdownOutput(SocketChannelImpl.java:667) ~[?:1.6.0_24] at sun.nio.ch.SocketAdaptor.shutdownOutput(SocketAdaptor.java:386) ~[?:1.6.0_24] at org.apache.zookeeper.ClientCnxnSocketNIO.cleanup(ClientCnxnSocketNIO.java:197) [zookeeper-3.5.0.jar:3.5.0--1] at org.apache.zookeeper.ClientCnxn$SendThread.cleanup(ClientCnxn.java:1190) [zookeeper-3.5.0.jar:3.5.0--1] at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1130) [zookeeper-3.5.0.jar:3.5.0--1] thus clearing sockKey. > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.0 >Reporter: Shevek >Assignee: Alexander Shraer > Fix For: 3.5.0 > > Attachments: ZOOKEEPER-1683.patch, ZOOKEEPER-1683-ver1.patch > > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13627493#comment-13627493 ] Shevek commented on ZOOKEEPER-1683: --- Actual testing failed, when first connected to invalid server, then attempting to update to valid server. java.lang.NullPointerException at org.apache.zookeeper.ClientCnxnSocketNIO.testableCloseSocket(ClientCnxnSocketNIO.java:376) ~[zookeeper-3.5.0.jar:3.5.0--1] at org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:187) ~[zookeeper-3.5.0.jar:3.5.0--1] > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.0 >Reporter: Shevek >Assignee: Alexander Shraer > Fix For: 3.5.0 > > Attachments: ZOOKEEPER-1683.patch, ZOOKEEPER-1683-ver1.patch > > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625598#comment-13625598 ] Shevek commented on ZOOKEEPER-1683: --- Withdraw that; it's the previous server list. > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.0 >Reporter: Shevek >Assignee: Alexander Shraer > Fix For: 3.5.0 > > Attachments: ZOOKEEPER-1683.patch, ZOOKEEPER-1683-ver1.patch > > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13625596#comment-13625596 ] Shevek commented on ZOOKEEPER-1683: --- + if (lastIndex >=0) { + // take the last server to which we were connected + myServer = this.serverAddresses.get(lastIndex); I don't see why this is guaranteed not to throw an IndexOutOfBoundsException. There is no relationship between lastIndex from the previous server list, and serverAddresses, the new server list. > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug > Components: java client >Affects Versions: 3.5.0 >Reporter: Shevek >Assignee: Alexander Shraer > Fix For: 3.5.0 > > Attachments: ZOOKEEPER-1683.patch, ZOOKEEPER-1683-ver1.patch > > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624089#comment-13624089 ] Shevek commented on ZOOKEEPER-1683: --- TL;DR: It should never be null. That's the bug. When "new ZooKeeper()" returns, we know which host we intend to connect to. We just don't have that information due to a sequence of bugs and races. > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug >Reporter: Shevek >Assignee: Alexander Shraer > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624086#comment-13624086 ] Shevek commented on ZOOKEEPER-1683: --- The trouble is that there's hidden state in flight. The question is, what is the state of the ZooKeeper client when the constructor returns. If it is determined that it has a remote socket address, i.e. updateServerList should be callable as-written, and be safe, then ClientCnxn needs to SYNCHRONOUSLY, i.e. not via SendThread.start() record the "intended" target connection address. If we follow the proposal to allow the address to be null in updateServerList, then either: a) We nulls (treat it as always in the list), in which case the "invisible" state of the intended server address of the in-flight connect may still allow us to connect to a "wrong" server. b) We say a null is never in the list, in which case repeated fast calls to updateServerList could cause a denial of service of the ZooKeeper client. The difficulty, I think, is that the state of the client when "new ZooKeeper" returns is defined by a race condition. The INTENT of the connection needs to be recorded in 'new ClientCnxn' before SendThread.start() goes asynchronous. A somewhat uglier hack might be to make StaticHostProvider cache the previously returned value and give the ClientCnxn a kicking if the cached value was removed from the list. While this would also work, I feel that it would contribute negatively to the overall complexity and coupledness of the code. > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug >Reporter: Shevek >Assignee: Alexander Shraer > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (ZOOKEEPER-1684) Failure to update socket addresses on immedate connection
Shevek created ZOOKEEPER-1684: - Summary: Failure to update socket addresses on immedate connection Key: ZOOKEEPER-1684 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1684 Project: ZooKeeper Issue Type: Bug Reporter: Shevek I quote: void registerAndConnect(SocketChannel sock, InetSocketAddress addr) throws IOException { sockKey = sock.register(selector, SelectionKey.OP_CONNECT); boolean immediateConnect = sock.connect(addr); if (immediateConnect) { sendThread.primeConnection(); } } In the immediate case, there are several bugs: a) updateSocketAddresses() is never called, as it is when the select-loop in doTransport(). This means that clientCnxnSocket.getRemoteSocketAddress() will return null for the lifetime of this socket? b) CONNECT still in the interest set for the socket. c) updateLastSendAndHeard() is never called either. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624042#comment-13624042 ] Shevek commented on ZOOKEEPER-1683: --- Note that since connect() is called by SendThread, if connect() is NOT immediate ClientCnxn.start() may return immediately, and so the following call sequence will return null: zk = new ZooKeeper(...) zk.[].getRemoteSocketAddress() So I think this breaks on both code paths. Perhaps the immediate-case above is a strictly separate bug. > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug >Reporter: Shevek >Assignee: Alexander Shraer > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624039#comment-13624039 ] Shevek commented on ZOOKEEPER-1683: --- This may be caused by ClientCnxnSocketNIO.java: void registerAndConnect(SocketChannel sock, InetSocketAddress addr) throws IOException { sockKey = sock.register(selector, SelectionKey.OP_CONNECT); boolean immediateConnect = sock.connect(addr); if (immediateConnect) { sendThread.primeConnection(); } } In the immediate case, there are several bugs: a) updateSocketAddresses() is never called, as it is when the select-loop in doTransport(). This means that clientCnxnSocket.getRemoteSocketAddress() will return null for the lifetime of this socket? b) CONNECT still in the interest set for the socket. c) updateLastSendAndHeard() is never called either. > ZooKeeper client NPE when updating server list on disconnected client > - > > Key: ZOOKEEPER-1683 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 > Project: ZooKeeper > Issue Type: Bug >Reporter: Shevek >Assignee: Alexander Shraer > > 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] > com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - > Background exception caught > java.lang.NullPointerException > at > org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) > ~[zookeeper-3.5.0.jar:3.5.0--1] > at > com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) > ~[curator-client-1.3.5-SNAPSHOT.jar:?] > The duff code is this: > ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); > InetSocketAddress currentHost = (InetSocketAddress) > clientCnxnSocket.getRemoteSocketAddress(); > boolean reconfigMode = hostProvider.updateServerList(serverAddresses, > currentHost); > Now, currentHost might be null, if we're not yet connected. But > StaticHostProvider.updateServerList indirects on it unconditionally. > This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (ZOOKEEPER-1683) ZooKeeper client NPE when updating server list on disconnected client
Shevek created ZOOKEEPER-1683: - Summary: ZooKeeper client NPE when updating server list on disconnected client Key: ZOOKEEPER-1683 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683 Project: ZooKeeper Issue Type: Bug Reporter: Shevek 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - Background exception caught java.lang.NullPointerException at org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161) ~[zookeeper-3.5.0.jar:3.5.0--1] at org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) ~[zookeeper-3.5.0.jar:3.5.0--1] at com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121) ~[curator-client-1.3.5-SNAPSHOT.jar:?] The duff code is this: ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket(); InetSocketAddress currentHost = (InetSocketAddress) clientCnxnSocket.getRemoteSocketAddress(); boolean reconfigMode = hostProvider.updateServerList(serverAddresses, currentHost); Now, currentHost might be null, if we're not yet connected. But StaticHostProvider.updateServerList indirects on it unconditionally. This would be caught by findbugs. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (ZOOKEEPER-1680) Cannot connect with a given sessionId - it is discarded
Shevek created ZOOKEEPER-1680: - Summary: Cannot connect with a given sessionId - it is discarded Key: ZOOKEEPER-1680 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1680 Project: ZooKeeper Issue Type: Bug Reporter: Shevek While the API permits construction of a ZooKeeper client object with a given sessionId, the sessionId can never be used: ClientCnxn line 850: long sessId = (seenRwServerBefore) ? sessionId : 0; The only person who sets seenRwServerBefore is onConnected(). Therefore, it appears that passing a sessionId into a ZooKeeper constructor has no effect, as the ClientCnxn has never seen an RW server before, so it discards it anyway. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (ZOOKEEPER-1677) Misuse of INET_ADDRSTRLEN
[ https://issues.apache.org/jira/browse/ZOOKEEPER-1677?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13616651#comment-13616651 ] Shevek commented on ZOOKEEPER-1677: --- Additional bugs: a) Since grow() does not zero the memory, these comparisons will fail in any case, as the outstanding bytes of a sockaddr_storage over (say) a sockaddr_in will be nondeterministic (unless caller also zeros the passed-in ram, which never happens) b) addrvec_append_addrinfo only copies part of the buffer, which exposes the non-zeroing locally. So there are two different code paths where the trailing bytes of the _storage will be nondeterministic; one fixable locally and one not. > Misuse of INET_ADDRSTRLEN > - > > Key: ZOOKEEPER-1677 > URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1677 > Project: ZooKeeper > Issue Type: Bug >Affects Versions: 3.5.0 >Reporter: Shevek > > ZOOKEEPER-1355. Add zk.updateServerList(newServerList) (Alex Shraer, > Marshall McMullen via fpj) > > > > git-svn-id: https://svn.apache.org/repos/asf/zookeeper/trunk@1410731 > 13f79535-47bb-0310-9956-ffa450edef68 > +int addrvec_contains(const addrvec_t *avec, const struct sockaddr_storage > *addr) > +{ > +if (!avec || !addr) > +{ > +return 0; > +} > + > +int i = 0; > +for (i = 0; i < avec->count; i++) > +{ > +if(memcmp(&avec->data[i], addr, INET_ADDRSTRLEN) == 0) > +return 1; > +} > + > +return 0; > +} > Pretty sure that should be sizeof(sockaddr_storage). INET_ADDRSTRLEN is the > size of the character buffer which needs to be allocated for the return value > of inet_ntop, which seems to be totally wrong. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Created] (ZOOKEEPER-1677) Misuse of INET_ADDRSTRLEN
Shevek created ZOOKEEPER-1677: - Summary: Misuse of INET_ADDRSTRLEN Key: ZOOKEEPER-1677 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1677 Project: ZooKeeper Issue Type: Bug Affects Versions: 3.5.0 Reporter: Shevek ZOOKEEPER-1355. Add zk.updateServerList(newServerList) (Alex Shraer, Marshall McMullen via fpj) git-svn-id: https://svn.apache.org/repos/asf/zookeeper/trunk@1410731 13f79535-47bb-0310-9956-ffa450edef68 +int addrvec_contains(const addrvec_t *avec, const struct sockaddr_storage *addr) +{ +if (!avec || !addr) +{ +return 0; +} + +int i = 0; +for (i = 0; i < avec->count; i++) +{ +if(memcmp(&avec->data[i], addr, INET_ADDRSTRLEN) == 0) +return 1; +} + +return 0; +} Pretty sure that should be sizeof(sockaddr_storage). INET_ADDRSTRLEN is the size of the character buffer which needs to be allocated for the return value of inet_ntop, which seems to be totally wrong. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira