We tracked this down to a mismatch between /etc/hosts and the loopback adapter. The hosts file defined IPv6 ::1 as localhost, but the loopback adapter only had an IPv4 interface. Every time ZK tried to use IPv6, it failed. The IP address that ZK logged in these cases is, to my mind, mighty strange. I wonder if it might be pointing to some subtle ZK bug.
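For anyone who wants to check for the same mismatch on their own machine, here is a minimal sketch using only the standard java.net classes. It compares what "localhost" resolves to against the addresses actually bound to the loopback interface. The class name LoopbackCheck and its helper methods are made up for illustration; this is not ZooKeeper code and it does not reproduce how the ZK client picks an address.

```java
import java.net.Inet6Address;
import java.net.InetAddress;
import java.net.NetworkInterface;
import java.net.SocketException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Enumeration;
import java.util.List;

// Hypothetical diagnostic helper, not part of ZooKeeper.
public class LoopbackCheck {

    /** True if at least one address in the list is IPv6. */
    static boolean anyIpv6(List<InetAddress> addrs) {
        for (InetAddress a : addrs) {
            if (a instanceof Inet6Address) {
                return true;
            }
        }
        return false;
    }

    /** Collects every address bound to an interface that reports itself as loopback. */
    static List<InetAddress> loopbackInterfaceAddresses() throws SocketException {
        List<InetAddress> result = new ArrayList<>();
        Enumeration<NetworkInterface> nics = NetworkInterface.getNetworkInterfaces();
        if (nics == null) {
            return result; // no interfaces reported
        }
        for (NetworkInterface nic : Collections.list(nics)) {
            if (nic.isLoopback()) {
                result.addAll(Collections.list(nic.getInetAddresses()));
            }
        }
        return result;
    }

    /** The situation described above: the name resolves to IPv6, but loopback has no IPv6 address. */
    static boolean mismatch(List<InetAddress> resolved, List<InetAddress> onLoopback) {
        return anyIpv6(resolved) && !anyIpv6(onLoopback);
    }

    public static void main(String[] args) throws Exception {
        List<InetAddress> resolved = Arrays.asList(InetAddress.getAllByName("localhost"));
        List<InetAddress> onLoopback = loopbackInterfaceAddresses();
        System.out.println("localhost resolves to: " + resolved);
        System.out.println("loopback interface has: " + onLoopback);
        System.out.println("IPv6 mismatch: " + mismatch(resolved, onLoopback));
    }
}
```

If the last line prints true, the resolver can hand out an IPv6 loopback address that the loopback interface cannot serve, which matches what we found here.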
Chris

From: Will Martin <[email protected]>
To: "[email protected]" <[email protected]>
Date: 11/17/2016 11:09 PM
Subject: Re: Very strange trace

can you share
a. the application being co-ordinated?
b. a cfg from a zookeeper server ?

-will

On 11/17/2016 5:25 PM, Chris Barlock wrote:

Have a variation on this from a new set of trace logs:

[2016-11-17 14:29:56,798] INFO Initiating client connection, connectString=localhost:2181 sessionTimeout=5000 watcher=com.ibm.tivoli.ccm.config.rest.ConfigClient@34087499 (org.apache.zookeeper.ZooKeeper)
[2016-11-17 14:29:56,799] INFO Opening socket connection to server 229.127.0.0/229.127.0.0:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2016-11-17 14:29:56,799] ERROR Unable to open socket to 229.127.0.0/229.127.0.0:2181 (org.apache.zookeeper.ClientCnxnSocketNIO)
[2016-11-17 14:29:56,800] WARN Session 0x0 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
java.net.SocketException: Network is unreachable
        at sun.nio.ch.Net.connect0(Native Method)
        at sun.nio.ch.Net.connect(Net.java:481)
        at sun.nio.ch.Net.connect(Net.java:473)
        at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:662)
        at org.apache.zookeeper.ClientCnxnSocketNIO.registerAndConnect(ClientCnxnSocketNIO.java:277)
        at org.apache.zookeeper.ClientCnxnSocketNIO.connect(ClientCnxnSocketNIO.java:287)
        at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:967)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1003)

66.127.0.0 has become 229.127.0.0. This looks suspiciously like localhost/127.0.0.1 is being transformed into <something>.127.0.0.
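To make the <something>.127.0.0 pattern concrete, here is a small sketch that checks whether a logged address is 127.0.0.1 moved over by one octet. The class and method names are hypothetical, and the check only confirms the pattern in the two logged addresses. It says nothing about where the leading octet comes from.

```java
// Hypothetical helper for comparing logged addresses against the expected one.
public class OctetShiftCheck {

    /** Parses a dotted-quad IPv4 string into its four octets. */
    static int[] octets(String dotted) {
        String[] parts = dotted.split("\\.");
        if (parts.length != 4) {
            throw new IllegalArgumentException("not a dotted quad: " + dotted);
        }
        int[] out = new int[4];
        for (int i = 0; i < 4; i++) {
            out[i] = Integer.parseInt(parts[i]);
        }
        return out;
    }

    /** True if the last three octets of logged equal the first three octets of expected. */
    static boolean isShiftedByOneOctet(String logged, String expected) {
        int[] l = octets(logged);
        int[] e = octets(expected);
        return l[1] == e[0] && l[2] == e[1] && l[3] == e[2];
    }

    public static void main(String[] args) {
        // The two addresses that appeared in the traces.
        for (String logged : new String[] {"66.127.0.0", "229.127.0.0"}) {
            System.out.println(logged + " shifted from 127.0.0.1: "
                    + isShiftedByOneOctet(logged, "127.0.0.1"));
        }
    }
}
```

Both logged addresses pass this check; only the first octet differs between them.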
Chris
IBM Cloud Application Performance Management
Research Triangle Park, NC
(919) 543-1286
Internet: [email protected]

From: Chris Barlock/Raleigh/IBM@IBMUS
To: "ZooKeeper Users" <[email protected]>
Date: 11/15/2016 03:52 PM
Subject: Very strange trace

In one of our logs, I noticed a very strange trace:

[2016-11-01 15:11:38,728] INFO Initiating client connection, connectString=localhost:2181 sessionTimeout=5000 watcher=com.ibm.tivoli.ccm.config.rest.ConfigClient@2d4a7662 (org.apache.zookeeper.ZooKeeper)
[2016-11-01 15:11:38,730] INFO Opening socket connection to server adsl-66-127-0-0.dsl.lsan03.pacbell.net/66.127.0.0:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
Nov 01, 2016 3:11:40 PM com.ibm.tivoli.ccm.config.rest.ZooKeeperClient zkConnect
SEVERE: Unable to connect to ZooKeeper!

Even though:

connectString=localhost:2181

this happens:

Opening socket connection to server adsl-66-127-0-0.dsl.lsan03.pacbell.net/66.127.0.0:2181

It should be:

Opening socket connection to server localhost/127.0.0.1:2181

The log has many successful connections using localhost and only a few that fail with pacbell.net. Any thoughts on what might be causing this?

Chris
