[ 
https://issues.apache.org/jira/browse/KAFKA-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459559#comment-16459559
 ] 

Edoardo Comar commented on KAFKA-6839:
--------------------------------------

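The JVM caches DNS lookups, so the stale A records may be coming from Java's own DNS cache (controlled by the networkaddress.cache.ttl security property):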

[https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-jvm-ttl.html]
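
As a minimal sketch of what the linked page describes, the positive DNS cache TTL can be lowered programmatically through the standard java.security.Security API. The class and method names below (DnsCacheTtl, setPositiveTtl) are hypothetical and only for illustration, and the 60-second value is an arbitrary example. The property has to be set before the JVM performs its first lookup, and it only affects the JVM's own cache; whether it resolves this particular issue is not confirmed here, since a client that holds on to already-resolved addresses would not be affected by it.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;

public class DnsCacheTtl {
    /**
     * Sets the JVM-wide TTL (in seconds) for successful DNS lookups and
     * returns the value now stored in the security property.
     * Hypothetical helper; must run before the first name lookup to take effect.
     */
    public static String setPositiveTtl(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
        return Security.getProperty("networkaddress.cache.ttl");
    }

    public static void main(String[] args) throws UnknownHostException {
        // 60 seconds is an example value, not a recommendation from the issue.
        System.out.println("networkaddress.cache.ttl = " + setPositiveTtl(60));

        // Resolve a host name and print every address the JVM currently sees for it.
        String host = args.length > 0 ? args[0] : "localhost";
        for (InetAddress address : InetAddress.getAllByName(host)) {
            System.out.println(host + " -> " + address.getHostAddress());
        }
    }
}
```

The same property can also be set in the JRE's java.security file instead of in code, which avoids having to change the application.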


> ZK session retry with cname record
> ----------------------------------
>
>                 Key: KAFKA-6839
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6839
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Tyler Monahan
>            Priority: Major
>
> I have a 3-node Kafka cluster in AWS that talks to a 3-node ZooKeeper cluster 
> behind an ELB. The Kafka instances are configured with a DNS CNAME record that 
> points to the AWS ELB name, which is itself a CNAME record pointing to two A 
> records. When the two A records behind the ELB CNAME change and Kafka tries to 
> reconnect to ZooKeeper after losing a session, it uses the old A records 
> rather than the new ones, so the reconnect attempt fails. There appears to be 
> some kind of caching of the resolved addresses, rather than a fresh lookup of 
> the record that is set in the config file.
> This is the error message I am seeing in the broker logs.
> {code:java}
> [2018-04-30 20:09:21,449] INFO Opening socket connection to server 
> ip-10-65-68-244.us-west-2.compute.internal/10.65.68.244:2181. Will not 
> attempt to authenticate using SASL (unknown error) 
> (org.apache.zookeeper.ClientCnxn)
> [2018-04-30 20:09:24,450] WARN Client session timed out, have not heard from 
> server in 3962ms for sessionid 0x263094512190001 
> (org.apache.zookeeper.ClientCnxn)
> [2018-04-30 20:09:24,451] INFO Client session timed out, have not heard from 
> server in 3962ms for sessionid 0x263094512190001, closing socket connection 
> and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2018-04-30 20:09:26,532] INFO Opening socket connection to server 
> ip-10-65-84-102.us-west-2.compute.internal/10.65.84.102:2181. Will not 
> attempt to authenticate using SASL (unknown error) 
> (org.apache.zookeeper.ClientCnxn)
> [2018-04-30 20:09:29,531] WARN Session 0x263094512190001 for server null, 
> unexpected error, closing socket connection and attempting reconnect 
> (org.apache.zookeeper.ClientCnxn)
> java.net.NoRouteToHostException: No route to host
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
> {code}
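
The behaviour the reporter expects (looking the configured name up again on every reconnect instead of reusing earlier addresses) can be sketched as follows. This is an illustration only and not how the ZooKeeper client is implemented; ReResolvingEndpoint and resolveNow are hypothetical names. It keeps the configured host name unresolved and asks the JVM for the current addresses each time it is called, so the result is still subject to the JVM DNS cache TTL.

```java
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.List;

public class ReResolvingEndpoint {
    // Holds only the configured host name and port; no address is resolved here.
    private final InetSocketAddress unresolved;

    public ReResolvingEndpoint(String host, int port) {
        this.unresolved = InetSocketAddress.createUnresolved(host, port);
    }

    /**
     * Looks the host name up again and returns one socket address per
     * address currently returned by the resolver (hypothetical helper,
     * intended to be called before each reconnect attempt).
     */
    public List<InetSocketAddress> resolveNow() {
        try {
            List<InetSocketAddress> current = new ArrayList<>();
            for (InetAddress address : InetAddress.getAllByName(unresolved.getHostString())) {
                current.add(new InetSocketAddress(address, unresolved.getPort()));
            }
            return current;
        } catch (UnknownHostException e) {
            // Surface lookup failures without forcing callers to declare a checked exception.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String host = args.length > 0 ? args[0] : "localhost";
        // 2181 is the ZooKeeper client port that appears in the log above.
        for (InetSocketAddress address : new ReResolvingEndpoint(host, 2181).resolveNow()) {
            System.out.println(address);
        }
    }
}
```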



