[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13624086#comment-13624086
 ] 

Shevek commented on ZOOKEEPER-1683:
-----------------------------------

The trouble is that there's hidden state in flight. The question is, what is 
the state of the ZooKeeper client when the constructor returns.

If it is determined that it has a remote socket address, i.e. updateServerList 
should be callable as-written, and be safe, then ClientCnxn needs to 
SYNCHRONOUSLY, i.e. not via SendThread.start() record the "intended" target 
connection address.

If we follow the proposal to allow the address to be null in updateServerList, 
then either:
a) We nulls (treat it as always in the list), in which case the "invisible" 
state of the intended server address of the in-flight connect may still allow 
us to connect to a "wrong" server.
b) We say a null is never in the list, in which case repeated fast calls to 
updateServerList could cause a denial of service of the ZooKeeper client.

The difficulty, I think, is that the state of the client when "new ZooKeeper" 
returns is defined by a race condition. The INTENT of the connection needs to 
be recorded in 'new ClientCnxn' before SendThread.start() goes asynchronous.

A somewhat uglier hack might be to make StaticHostProvider cache the previously 
returned value and give the ClientCnxn a kicking if the cached value was 
removed from the list. While this would also work, I feel that it would 
contribute negatively to the overall complexity and coupledness of the code.
                
> ZooKeeper client NPE when updating server list on disconnected client
> ---------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-1683
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1683
>             Project: ZooKeeper
>          Issue Type: Bug
>            Reporter: Shevek
>            Assignee: Alexander Shraer
>
> 2013-04-04 22:16:15,872 ERROR [pool-4-thread-1] 
> com.netflix.curator.ConnectionState.getZooKeeper (ConnectionState.java:84) - 
> Background exception caught
> java.lang.NullPointerException
>         at 
> org.apache.zookeeper.client.StaticHostProvider.updateServerList(StaticHostProvider.java:161)
>  ~[zookeeper-3.5.0.jar:3.5.0--1]
>         at 
> org.apache.zookeeper.ZooKeeper.updateServerList(ZooKeeper.java:183) 
> ~[zookeeper-3.5.0.jar:3.5.0--1]
>         at 
> com.netflix.curator.HandleHolder$1$1.setConnectionString(HandleHolder.java:121)
>  ~[curator-client-1.3.5-SNAPSHOT.jar:?]
> The duff code is this:
> ClientCnxnSocket clientCnxnSocket = cnxn.sendThread.getClientCnxnSocket();
> InetSocketAddress currentHost = (InetSocketAddress) 
> clientCnxnSocket.getRemoteSocketAddress();
> boolean reconfigMode = hostProvider.updateServerList(serverAddresses, 
> currentHost);
> Now, currentHost might be null, if we're not yet connected. But 
> StaticHostProvider.updateServerList indirects on it unconditionally.
> This would be caught by findbugs.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to