[
https://issues.apache.org/jira/browse/CURATOR-578?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17184838#comment-17184838
]
J Robert Ray commented on CURATOR-578:
--------------------------------------
I am experiencing a combination of this problem and Curator eventually
attempting to connect to 0.0.0.0, as in CURATOR-392; the fix for that issue
does not handle the configuration scenario suggested by the [official Zookeeper
Docker image|https://hub.docker.com/_/zookeeper] for deploying to Docker Swarm,
specifically, using "0.0.0.0" as the bind address for the local server entry.
I have a deployment of three Zookeeper nodes in Docker Swarm, and have
attempted to give them stable IPs by pinning each container to a dedicated node
and using the node hostname when advertising the service:
{{ZOO_SERVERS: server.1=0.0.0.0:2888:3888;2181 server.2=host2:2888:3888;host2:2181 server.3=host3:2888:3888;host3:2181}}
{{ZOO_SERVERS: server.1=host1:2888:3888;host1:2181 server.2=0.0.0.0:2888:3888;2181 server.3=host3:2888:3888;host3:2181}}
{{ZOO_SERVERS: server.1=host1:2888:3888;host1:2181 server.2=host2:2888:3888;host2:2181 server.3=0.0.0.0:2888:3888;2181}}
Curator is initially configured with the connection string
{{host1:2181,host2:2181,host3:2181}}. Everything is fine until one of the
Zookeeper nodes is restarted for any reason.
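For reference, the client setup on my side is just the standard factory
builder; a minimal sketch (the class name and the retry values here are
illustrative, not my exact configuration):
{code:java}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class ZkClientSetup {
    public static void main(String[] args) throws Exception {
        // The connection string lists the stable node hostnames, not container IPs.
        CuratorFramework client = CuratorFrameworkFactory.builder()
                .connectString("host1:2181,host2:2181,host3:2181")
                .retryPolicy(new ExponentialBackoffRetry(1000, 3))
                .build();
        client.start();
        client.blockUntilConnected();
        // ... normal application usage ...
    }
}
{code}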
This log is from the application using Curator, captured at the moment
Zookeeper is killed on host3. The 10.5.x.x addresses are the valid IP addresses
of the Docker hosts; the 10.0.x.x addresses are the Docker Swarm internal
addresses, which change when Zookeeper restarts.
[^curator.log]
The client ends up in a loop, trying to connect to the 10.0.x.x addresses,
which may no longer be valid, and to 0.0.0.0.
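One mitigation I have been considering, but have not verified, is pinning the
connection string with a {{FixedEnsembleProvider}} whose updateServerListEnabled
flag is false, so that tracked ensemble changes are not pushed into the live
ZooKeeper handle. A rough sketch, assuming that constructor flag behaves as I
understand it (it may not cover the reconnect path at all):
{code:java}
import org.apache.curator.ensemble.fixed.FixedEnsembleProvider;
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;

public class PinnedConnectString {
    public static void main(String[] args) throws Exception {
        // Keep the hostname-based connection string; 'false' disables pushing
        // ensemble config changes to the live ZooKeeper handle.
        FixedEnsembleProvider ensemble =
                new FixedEnsembleProvider("host1:2181,host2:2181,host3:2181", false);

        CuratorFramework client = CuratorFrameworkFactory.builder()
                .ensembleProvider(ensemble)
                .retryPolicy(new ExponentialBackoffRetry(1000, 3))
                .build();
        client.start();
        client.blockUntilConnected();
    }
}
{code}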
Apart from this, my Zookeeper cluster does not reliably recover from a node
restart unless I manually stop all but one node (for example, the restarted
node rejects new connections because a client has a higher zxid), which is
making me reconsider trying to run the cluster under Swarm/k8s.
Curator 5.1.0, Zookeeper 3.6.1.
> EnsembleTracker replace hostname connectString with wrong ip from zk config
> ---------------------------------------------------------------------------
>
> Key: CURATOR-578
> URL: https://issues.apache.org/jira/browse/CURATOR-578
> Project: Apache Curator
> Issue Type: Bug
> Components: Client
> Affects Versions: 4.0.1
> Reporter: ying.li
> Priority: Major
> Attachments: curator.log
>
>
> I have a Zookeeper cluster which runs on a k8s cluster, and I use hostnames
> to connect to Zookeeper (like:
> zookeeper-0.zookeeper-headless.default.svc.cluster.local:2181,zookeeper-1.zookeeper-headless.default.svc.cluster.local:2181,zookeeper-2.zookeeper-headless.default.svc.cluster.local:2181).
>
> When Zookeeper restarts, the zk pod's IP changes. I then find that my client
> uses the IP to recreate the connection instead of using the hostname, but
> that IP is not the latest IP for the hostname, so the client can never
> connect to zk unless the client is restarted.
>
> After some debugging, I find that the EnsembleTracker changes the
> connectString from hostnames to IPs when it receives the config change event.
> But in many cases the IP recorded from the hostname is not updated after zk
> restarts in k8s, so the client can never connect to zk unless the client is
> restarted.
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)