[
https://issues.apache.org/jira/browse/KAFKA-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Jonathan Santilli resolved KAFKA-5971.
--------------------------------------
Resolution: Duplicate
Closing this issue since KAFKA-7165 has been resolved.
> Broker keeps running even though not registered in ZK
> -----------------------------------------------------
>
> Key: KAFKA-5971
> URL: https://issues.apache.org/jira/browse/KAFKA-5971
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.11.0.0
> Reporter: Igor Canadi
> Priority: Major
>
> We had a curious situation happen to our Kafka cluster running version
> 0.11.0.0. One of the brokers was happily running, even though its ID was not
> registered in ZooKeeper under `/brokers/ids`.
> Based on the logs, it appears that the broker restarted very quickly and
> there was a node under `/brokers/ids/2` still present from the previous run.
> However, in that case I'd expect the broker to try again or just exit. In
> reality it continued running without any errors in the logs.
> Here's the relevant part of the logs:
> ```
> [2017-09-06 23:50:26,095] INFO Opening socket connection to server
> zookeeper.kafka.svc.cluster.local/100.66.99.54:2181. Will not attempt to
> authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,096] INFO Socket connection established to
> zookeeper.kafka.svc.cluster.local/100.66.99.54:2181, initiating session
> (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,099] WARN Unable to reconnect to ZooKeeper service,
> session 0x15e4477405f1d40 has expired (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,099] INFO zookeeper state changed (Expired)
> (org.I0Itec.zkclient.ZkClient)
> [2017-09-06 23:50:26,099] INFO Unable to reconnect to ZooKeeper service,
> session 0x15e4477405f1d40 has expired, closing socket connection
> (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,099] INFO Initiating client connection,
> connectString=zookeeper:2181 sessionTimeout=6000
> watcher=org.I0Itec.zkclient.ZkClient@2cb4893b (org.apache.zookeeper.ZooKeeper)
> [2017-09-06 23:50:26,102] INFO EventThread shut down for session:
> 0x15e4477405f1d40 (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,107] INFO Opening socket connection to server
> zookeeper.kafka.svc.cluster.local/100.66.99.54:2181. Will not attempt to
> authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,108] INFO Socket connection established to
> zookeeper.kafka.svc.cluster.local/100.66.99.54:2181, initiating session
> (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,111] INFO Session establishment complete on server
> zookeeper.kafka.svc.cluster.local/100.66.99.54:2181, sessionid =
> 0x15e599a1a3e0013, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,112] INFO zookeeper state changed (SyncConnected)
> (org.I0Itec.zkclient.ZkClient)
> [2017-09-06 23:50:26,114] INFO re-registering broker info in ZK for broker 2
> (kafka.server.KafkaHealthcheck$SessionExpireListener)
> [2017-09-06 23:50:26,115] INFO Creating /brokers/ids/2 (is it secure? false)
> (kafka.utils.ZKCheckedEphemeral)
> [2017-09-06 23:50:26,123] INFO Result of znode creation is: NODEEXISTS
> (kafka.utils.ZKCheckedEphemeral)
> [2017-09-06 23:50:26,124] ERROR Error handling event ZkEvent[New session
> event sent to kafka.server.KafkaHealthcheck$SessionExpireListener@699f40a0]
> (org.I0Itec.zkclient.ZkEventThread)
> java.lang.RuntimeException: A broker is already registered on the path
> /brokers/ids/2. This probably indicates that you either have configured a
> brokerid that is already in use, or else you have shutdown this broker and
> restarted it faster than the zookeeper timeout so it
> at kafka.utils.ZkUtils.registerBrokerInZk(ZkUtils.scala:417)
> at kafka.utils.ZkUtils.registerBrokerInZk(ZkUtils.scala:403)
> at kafka.server.KafkaHealthcheck.register(KafkaHealthcheck.scala:70)
> at
> kafka.server.KafkaHealthcheck$SessionExpireListener.handleNewSession(KafkaHealthcheck.scala:104)
> at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:736)
> at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:72)
> [2017-09-06 23:51:42,257] INFO [Group Metadata Manager on Broker 2]: Removed
> 0 expired offsets in 0 milliseconds.
> (kafka.coordinator.group.GroupMetadataManager)
> [2017-09-07 00:00:06,198] INFO Unable to read additional data from server
> sessionid 0x15e599a1a3e0013, likely server has closed socket, closing socket
> connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2017-09-07 00:00:06,354] INFO zookeeper state changed (Disconnected)
> (org.I0Itec.zkclient.ZkClient)
> [2017-09-07 00:00:07,675] INFO Opening socket connection to server
> zookeeper.kafka.svc.cluster.local/100.66.99.54:2181. Will not attempt to
> authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2017-09-07 00:00:07,676] INFO Socket connection established to
> zookeeper.kafka.svc.cluster.local/100.66.99.54:2181, initiating session
> (org.apache.zookeeper.ClientCnxn)
> [2017-09-07 00:00:07,680] INFO Session establishment complete on server
> zookeeper.kafka.svc.cluster.local/100.66.99.54:2181, sessionid =
> 0x15e599a1a3e0013, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
> [2017-09-07 00:00:07,681] INFO zookeeper state changed (SyncConnected)
> (org.I0Itec.zkclient.ZkClient)
> [2017-09-07 00:01:42,257] INFO [Group Metadata Manager on Broker 2]: Removed
> 0 expired offsets in 0 milliseconds.
> (kafka.coordinator.group.GroupMetadataManager)
> [2017-09-07 00:11:42,257] INFO [Group Metadata Manager on Broker 2]: Removed
> 0 expired offsets in 0 milliseconds.
> (kafka.coordinator.group.GroupMetadataManager)
> [2017-09-07 00:21:42,257] INFO [Group Metadata Manager on Broker 2]: Removed
> 0 expired offsets in 0 milliseconds.
> (kafka.coordinator.group.GroupMetadataManager)
> [2017-09-07 00:31:42,257] INFO [Group Metadata Manager on Broker 2]: Removed
> 0 expired offsets in 0 milliseconds.
> (kafka.coordinator.group.GroupMetadataManager)
> ```
> The only message that appears after this point is "Removed 0 expired
> offsets", which is logged every 10 minutes.
> Let me know if I can provide any more information!
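
For illustration only, the behaviour the reporter expected ("try again or just exit") can be sketched as follows. This is not Kafka's code and not the fix applied in KAFKA-7165; `FakeZk`, `register_broker`, and the `retries`, `backoff_s`, and `on_retry` parameters are hypothetical names invented for this sketch, and the in-memory store only imitates ZooKeeper ephemeral nodes. The session IDs are the two that appear in the log above.

```python
import time


class NodeExistsError(Exception):
    """Raised by the fake store when the path is already present."""


class FakeZk:
    """Hypothetical in-memory stand-in for ZooKeeper ephemeral nodes."""

    def __init__(self):
        self.nodes = {}  # path -> id of the session that owns the node

    def create_ephemeral(self, path, session_id):
        if path in self.nodes:
            raise NodeExistsError(path)
        self.nodes[path] = session_id

    def expire_session(self, session_id):
        # An expired session loses every ephemeral node it owned.
        self.nodes = {p: s for p, s in self.nodes.items() if s != session_id}


def register_broker(zk, broker_id, session_id, retries=3, backoff_s=0.0, on_retry=None):
    """Create /brokers/ids/<id>; retry on NODEEXISTS, then exit instead of running unregistered."""
    path = f"/brokers/ids/{broker_id}"
    for attempt in range(1, retries + 1):
        try:
            zk.create_ephemeral(path, session_id)
            return attempt  # number of attempts that were needed
        except NodeExistsError:
            if on_retry is not None:
                on_retry(attempt)  # hook used here to simulate the stale node expiring
            time.sleep(backoff_s)
    raise SystemExit(f"broker {broker_id} could not register at {path}; shutting down")


# Scenario from the log: a node left over from the expired session blocks the new one.
OLD_SESSION = "0x15e4477405f1d40"
NEW_SESSION = "0x15e599a1a3e0013"

zk = FakeZk()
zk.create_ephemeral("/brokers/ids/2", OLD_SESSION)
attempts = register_broker(
    zk, 2, NEW_SESSION,
    on_retry=lambda attempt: zk.expire_session(OLD_SESSION),
)
```

In this sketch the first attempt hits the stale node and the second succeeds once the old session's node is gone. If the node never disappears, `register_broker` raises `SystemExit`, so the process stops rather than continuing to run without an entry under `/brokers/ids`.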