[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-4842?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17870093#comment-17870093
 ] 

Swathi Mocharla commented on ZOOKEEPER-4842:
--------------------------------------------

[~phunt], could we get some help with this issue?

> Zookeeper quorum is not formed intermittently with trailing dot in the 
> cluster domain name
> ------------------------------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-4842
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4842
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: quorum
>    Affects Versions: 3.8.4
>            Reporter: Swathi Mocharla
>            Priority: Major
>
> On Kubernetes, we have set up the cluster domain with a trailing dot. With 
> this setup, we very often see that the ZooKeeper quorum itself is not 
> established. 
>  
> {code:java}
> bash-4.4$ env -u KAFKA_OPTS zookeeper-shell localhost:2181 config
> Connecting to localhost:2181
> [2024-06-25 10:36:39,178] WARN Client session timed out, have not heard from server in 30031ms for session id 0x0 (org.apache.zookeeper.ClientCnxn)
> [2024-06-25 10:36:39,182] WARN Session 0x0 for server localhost/[0:0:0:0:0:0:0:1]:2181, Closing socket connection. Attempting reconnect except it is a SessionExpiredException. (org.apache.zookeeper.ClientCnxn)
> org.apache.zookeeper.ClientCnxn$SessionTimeoutException: Client session timed out, have not heard from server in 30031ms for session id 0x0
>         at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1257)
> KeeperErrorCode = ConnectionLoss for /zookeeper/config
>  
> {code}
>  
> In the ZooKeeper logs, we see many IOException, UnknownHostException, and 
> InterruptedException entries.
>  
> {code:java}
> java.io.IOException: ZooKeeperServer not running
>         at org.apache.zookeeper.server.NIOServerCnxn.readLength(NIOServerCnxn.java:565)
>         at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:350)
>         at org.apache.zookeeper.server.NIOServerCnxnFactory$IOWorkRequest.doWork(NIOServerCnxnFactory.java:508)
>         at org.apache.zookeeper.server.WorkerService$ScheduledWorkRequest.run(WorkerService.java:153)
>         at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
>         at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
>         at java.base/java.lang.Thread.run(Unknown Source)
> {"type":"log", "host":"zk-swkf-2.default", "level":"WARN", "systemid":"zookeeper-2b13339237454984887b4908dc3a6df0", "system":"zookeeper", "time":"2024-06-25T10:23:16.325Z", "timezone":"UTC", "log":{"message":"NIOWorkerThread-1 - org.apache.zookeeper.server.NIOServerCnxn - Close of session 0x0"}}
>  
> java.lang.InterruptedException
>         at java.base/java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(Unknown Source)
>         at org.apache.zookeeper.util.CircularBlockingQueue.poll(CircularBlockingQueue.java:105)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.pollSendQueue(QuorumCnxManager.java:1453)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager.access$900(QuorumCnxManager.java:99)
>         at org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:1277)
> {code}
>  
>  
> This is the content of /etc/resolv.conf:
> {code:java}
> bash-4.4$ cat /etc/resolv.conf
> search default.svc.cluster.local svc.cluster.local cluster.local bcmt
> nameserver 10.254.0.10
> options ndots:5
> {code}
>  
>  
> {code:java}
> [root@vm-10-76-72-33 ckaf-kafka]# nslookup zk-swkf.default.svc.cluster.local.
> Server:         10.76.72.33
> Address:        10.76.72.33#53
> Name:   zk-swkf.default.svc.cluster.local
> Address: 10.254.94.24
> [root@vm-10-76-72-33 ckaf-kafka]# nslookup zk-swkf.default.svc.cluster.local
> Server:         10.76.72.33
> Address:        10.76.72.33#53
> Name:   zk-swkf.default.svc.cluster.local
> Address: 10.254.94.24
> {code}
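
For context on the resolv.conf and nslookup output quoted above, here is a minimal sketch of how a glibc-style stub resolver picks the names it queries. It may help when comparing lookups with and without the trailing dot. The class name `ResolverCandidates` and the method `candidates` are hypothetical and are not part of ZooKeeper or the JDK. The sketch assumes the usual rules: a name ending in a dot is treated as absolute and skips the search list, and a name with fewer dots than `ndots` is tried against the search list before it is tried as-is. It does not perform any DNS queries and does not claim to reproduce what ZooKeeper or `InetAddress` does internally.

```java
import java.util.ArrayList;
import java.util.List;

public class ResolverCandidates {

    // Returns, in order, the fully qualified names a glibc-style stub
    // resolver would be expected to query for the given name.
    public static List<String> candidates(String name, List<String> search, int ndots) {
        List<String> out = new ArrayList<>();
        if (name.endsWith(".")) {
            // Trailing dot: the name is absolute, so the search list is skipped.
            out.add(name);
            return out;
        }
        long dots = name.chars().filter(c -> c == '.').count();
        if (dots >= ndots) {
            // Enough dots: try the name as-is before the search list.
            out.add(name + ".");
        }
        for (String domain : search) {
            out.add(name + "." + domain + ".");
        }
        if (dots < ndots) {
            // Too few dots: the name as-is is tried only after the search list.
            out.add(name + ".");
        }
        return out;
    }

    public static void main(String[] args) {
        // Search list and ndots taken from the resolv.conf in the report.
        List<String> search = List.of(
                "default.svc.cluster.local", "svc.cluster.local", "cluster.local", "bcmt");
        for (String name : List.of(
                "zk-swkf.default.svc.cluster.local",
                "zk-swkf.default.svc.cluster.local.")) {
            System.out.println(name + " ->");
            for (String candidate : candidates(name, search, 5)) {
                System.out.println("    " + candidate);
            }
        }
    }
}
```

Under these assumptions, the name without the trailing dot has four dots, which is below `ndots:5`, so it goes through all four search domains before being queried as-is, while the name with the trailing dot is queried exactly once.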



