Thanks Chris. But it happens even if I just take down the ZK process. The IP addresses seem to remain the same.

On Wed, Sep 9, 2015 at 11:23 PM, Chris Nauroth <[email protected]> wrote:

> Hello Bill,
>
> When the VMs restart, is it possible that they are assigned different IP
> addresses, despite retaining their original hostnames?
>
> The reason I ask is that we currently have a known issue in that a running
> ZooKeeper server will not redo DNS resolution for previously encountered
> hostnames in the ensemble. This is documented in issue ZOOKEEPER-1506,
> where a proposed patch is undergoing review and testing.
>
> https://issues.apache.org/jira/browse/ZOOKEEPER-1506
>
> If IP addresses are changing after VM restarts in your environment, then
> it seems plausible that you're seeing the symptoms of ZOOKEEPER-1506.
>
> --Chris Nauroth
>
> On 9/9/15, 11:09 PM, "Bill Hastings" <[email protected]> wrote:
>
> >On the node, which is not the leader I get the following messages in the
> >log:
> >
> >04:26:43,076 WARN QuorumCnxManager:382 - Cannot open channel to 2 at election address hvs2.dwa.local/192.168.8.11:4000
> >04:26:43,089 WARN QuorumCnxManager:382 - Cannot open channel to 2 at election address hvs2.dwa.local/192.168.8.11:4000
> >06:35:25,844 INFO QuorumCnxManager:511 - Received connection request /192.168.8.11:51367
> >06:38:00,399 INFO QuorumCnxManager:511 - Received connection request /192.168.8.11:51539
> >07:18:27,940 INFO QuorumCnxManager:511 - Received connection request /192.168.8.11:52720
> >07:33:58,042 INFO LeaderElection:187 - Server address: hvs2.dwa.local/192.168.8.11:3000
> >07:33:59,449 INFO LeaderElection:187 - Server address: hvs2.dwa.local/192.168.8.11:3000
> >07:34:00,854 INFO LeaderElection:187 - Server address: hvs2.dwa.local/192.168.8.11:3000
> >07:34:02,257 INFO LeaderElection:187 - Server address: hvs2.dwa.local/192.168.8.11:3000
> >07:34:03,660 INFO LeaderElection:187 - Server address: hvs2.dwa.local/192.168.8.11:3000
> >07:34:05,063 INFO LeaderElection:187 - Server address: hvs2.dwa.local/192.168.8.11:3000
> >07:34:06,266 INFO LeaderElection:187 - Server address: hvs2.dwa.local/192.168.8.11:3000
> >07:34:06,585 WARN Learner:234 - Unexpected exception, tries=0, connecting to hvs2.dwa.local/192.168.8.11:3000
> >07:55:28,865 WARN QuorumCnxManager:382 - Cannot open channel to 2 at election address hvs2.dwa.local/192.168.8.11:4000
> >07:55:29,066 WARN QuorumCnxManager:382 - Cannot open channel to 2 at election address hvs2.dwa.local/192.168.8.11:4000
> >07:55:29,471 WARN QuorumCnxManager:382 - Cannot open channel to 2 at election address hvs2.dwa.local/192.168.8.11:4000
> >07:55:30,275 WARN QuorumCnxManager:382 - Cannot open channel to 2 at election address hvs2.dwa.local/192.168.8.11:4000
> >07:55:31,878 WARN QuorumCnxManager:382 - Cannot open channel to 2 at election address hvs2.dwa.local/192.168.8.11:4000
> >07:55:34,106 INFO QuorumCnxManager:511 - Received connection request /192.168.8.11:55863
> >07:58:01,872 INFO QuorumCnxManager:511 - Received connection request /192.168.8.11:56662
> >
> >On the leader I get the following:
> >
> >4:19:50,815 WARN QuorumCnxManager:382 - Cannot open channel to 1 at election address hvs1.dwa.local/192.168.8.10:4000
> >04:20:46,903 INFO QuorumCnxManager:511 - Received connection request /192.168.8.10:46459
> >06:36:04,561 WARN QuorumCnxManager:382 - Cannot open channel to 1 at election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:04,771 WARN QuorumCnxManager:382 - Cannot open channel to 1 at election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:05,175 WARN QuorumCnxManager:382 - Cannot open channel to 1 at election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:05,980 WARN QuorumCnxManager:382 - Cannot open channel to 1 at election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:07,585 WARN QuorumCnxManager:382 - Cannot open channel to 1 at election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:10,789 WARN QuorumCnxManager:382 - Cannot open channel to 1 at election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:17,194 WARN QuorumCnxManager:382 - Cannot open channel to 1 at election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:29,999 WARN QuorumCnxManager:382 - Cannot open channel to 1 at election address hvs1.dwa.local/192.168.8.10:4000
> >06:36:53,578 INFO QuorumCnxManager:511 - Received connection request /192.168.8.10:50285
> >07:16:53,244 WARN LearnerHandler:646 - ******* GOODBYE /192.168.8.10:42097 ********
> >07:17:21,117 INFO QuorumCnxManager:511 - Received connection request /192.168.8.10:51044
> >07:32:57,213 INFO LeaderElection:187 - Server address: hvs1.dwa.local/192.168.8.10:3000
> >07:32:58,427 INFO LeaderElection:187 - Server address: hvs1.dwa.local/192.168.8.10:3000
> >07:32:59,631 INFO LeaderElection:187 - Server address: hvs1.dwa.local/192.168.8.10:3000
> >07:34:00,575 WARN LearnerHandler:646 - ******* GOODBYE /192.168.8.10:43186 ********
> >07:56:11,493 WARN LearnerHandler:646 - ******* GOODBYE /192.168.8.10:43536 ********
> >07:56:55,045 INFO QuorumCnxManager:511 - Received connection request /192.168.8.10:51949
> >
> >On Wed, Sep 9, 2015 at 10:42 PM, Bill Hastings <[email protected]> wrote:
> >
> >> Hi All
> >>
> >> I am running ZK as a 3 node cluster. Each ZK instance is a VMWare VM in a
> >> distinct ESX host. Let's assume the three VMs are A, B and C where A is the
> >> leader. Now if I take down VM B and C and then bring one of them back up.
> >> However the ZK cluster is never formed unless I bounce VM A. How can I
> >> troubleshoot this? This however does not happen in a physical environment.
> >>
> >> --
> >> Cheers
> >> Bill
> >
> >--
> >Cheers
> >Bill

--
Cheers
Bill
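
One way to check the question raised in this thread (whether a peer's hostname still resolves to the IP address that appears in the log) is to pull the failing election addresses out of the log and compare them with a fresh lookup. The following is a minimal Python sketch and not part of ZooKeeper: the function names `election_failures` and `stale_addresses` are hypothetical, and the regular expression assumes the `Cannot open channel to N at election address host/ip:port` layout seen in the logs quoted above.

```python
import re
import socket
from collections import Counter

# Assumed layout, taken from the WARN lines quoted in this thread, e.g.
# "... Cannot open channel to 2 at election address hvs2.dwa.local/192.168.8.11:4000"
# \s+ tolerates the line wrap that mail clients insert after "at".
FAILURE_RE = re.compile(
    r"Cannot open channel to (\d+) at\s+election address\s+"
    r"([\w.-]+)/\s*(\d+\.\d+\.\d+\.\d+):(\d+)"
)


def election_failures(log_text):
    """Count failed election-channel attempts per (server id, host, ip, port)."""
    return Counter(
        (int(sid), host, ip, int(port))
        for sid, host, ip, port in FAILURE_RE.findall(log_text)
    )


def stale_addresses(failures, resolve=socket.gethostbyname):
    """Return peers whose hostname now resolves to a different IP than logged.

    `resolve` is injectable so the check can be exercised without real DNS.
    """
    stale = []
    for sid, host, ip, _port in failures:
        try:
            current = resolve(host)
        except OSError:
            current = None  # hostname did not resolve at all
        if current != ip:
            stale.append((sid, host, ip, current))
    return stale
```

Feeding a node's log text to `election_failures` and the result to `stale_addresses` yields an empty list when every hostname still maps to the IP recorded in the log, which would point away from a changed address as the cause.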
