[
https://issues.apache.org/jira/browse/KAFKA-4634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15823187#comment-15823187
]
Maharajan Shunmuga Sundaram commented on KAFKA-4634:
----------------------------------------------------
I do see following issues in ez6 (Node id : 211) server log
[2017-01-13 01:05:28,418] INFO Closing socket connection to /10.254.2.29.
(kafka.network.Processor)
[2017-01-13 01:05:28,507] INFO re-registering broker info in ZK for broker 211
(kafka.server.KafkaHealthcheck)
[2017-01-13 01:05:28,537] INFO conflict in /brokers/ids/211 data:
{"jmx_port":-1,"timestamp":"1484287528509","host":"ez6","version":1,"port":9092}
stored data:
{"jmx_port":-1,"timestamp":"1484287051977","host":"ez6","version":1,"port":9092}
(kafka.utils.ZkUtils$)
[2017-01-13 01:05:28,562] INFO I wrote this conflicted ephemeral node
[{"jmx_port":-1,"timestamp":"1484287528509","host":"ez6","version":1,"port":9092}]
at /brokers/ids/211 a while back in a different session, hence I will backoff
for this node to be deleted by Zookeeper and retry (kafka.utils.ZkUtils$)
to
[2017-01-13 01:35:26,146] INFO I wrote this conflicted ephemeral node
[{"jmx_port":-1,"timestamp":"1484287528509","host":"ez6","version":1,"port":9092}]
at /brokers/ids/211 a while back in a different session, hence I will backoff
for this node to be deleted by Zookeeper and retry (kafka.utils.ZkUtils$)
[2017-01-13 01:35:32,151] INFO conflict in /brokers/ids/211 data:
{"jmx_port":-1,"timestamp":"1484287528509","host":"ez6","version":1,"port":9092}
stored data:
{"jmx_port":-1,"timestamp":"1484287051977","host":"ez6","version":1,"port":9092}
(kafka.utils.ZkUtils$)
[2017-01-13 01:35:32,157] INFO I wrote this conflicted ephemeral node
[{"jmx_port":-1,"timestamp":"1484287528509","host":"ez6","version":1,"port":9092}]
at /brokers/ids/211 a while back in a different session, hence I will backoff
for this node to be deleted by Zookeeper and retry (kafka.utils.ZkUtils$)
[2017-01-13 01:35:51,958] INFO [ReplicaFetcherManager on broker 211] Removed
fetcher for partitions [Tracing,0] (kafka.server.ReplicaFetcherManager)
[2017-01-13 01:35:52,049] INFO Closing socket connection to /10.254.2.19.
(kafka.network.Processor)
[2017-01-13 01:35:52,078] INFO Registered broker 211 at path /brokers/ids/211
with address ez6:9092. (kafka.utils.ZkUtils$)
[2017-01-13 01:35:52,078] INFO done re-registering broker
(kafka.server.KafkaHealthcheck)
[2017-01-13 01:35:52,078] INFO Subscribing to /brokers/topics path to watch for
new topics (kafka.server.KafkaHealthcheck)
[2017-01-13 01:35:52,089] INFO New leader is 113
(kafka.server.ZookeeperLeaderElector$LeaderChangeListener)
Almost 30 secs, server tried to connect zookeeper resulted conflicts until it
succeeds.
I brought this in case it may hint any problem.
> Issue of one kafka brokers not listed in zookeeper
> --------------------------------------------------
>
> Key: KAFKA-4634
> URL: https://issues.apache.org/jira/browse/KAFKA-4634
> Project: Kafka
> Issue Type: Bug
> Components: core
> Affects Versions: 0.8.2.1
> Environment: Zookeeper version: 3.4.6-1569965, built on 02/20/2014
> 09:09 GMT
> kafka_2.10-0.8.2.1
> Reporter: Maharajan Shunmuga Sundaram
>
> Hi,
> We have incident that one of the 10 brokers not listed in brokers list of
> zookeeper.
> This is verified by running following command
> >> echo dump | nc cz2 2181
> SessionTracker dump:
> Session Sets (4):
> 0 expire at Fri Jan 13 22:32:14 EST 2017:
> 0 expire at Fri Jan 13 22:32:16 EST 2017:
> 7 expire at Fri Jan 13 22:32:18 EST 2017:
> 0x259968e41e30000
> 0x35996670d5d0001
> 0x35996670d5d0000
> 0x159966708470004
> 0x159966e47760000
> 0x159966708470003
> 0x2599672df260000
> 3 expire at Fri Jan 13 22:32:20 EST 2017:
> 0x159968e41dd0000
> 0x259966708550001
> 0x259966708550000
> ephemeral nodes dump:
> Sessions with Ephemerals (9):
> 0x259966708550000:
> /brokers/ids/112
> 0x259968e41e30000:
> /brokers/ids/213
> 0x159968e41dd0000:
> /brokers/ids/19
> 0x159966708470003:
> /brokers/ids/110
> 0x35996670d5d0000:
> /brokers/ids/113
> /controller
> 0x259966708550001:
> /brokers/ids/111
> 0x159966708470004:
> /brokers/ids/212
> 0x2599672df260000:
> /brokers/ids/29
> 0x35996670d5d0001:
> /brokers/ids/210
> ------
> There are 10 sessions, but only 9 sessions are listed with brokers.
> Broker with id 211 is not listed. Session 0x159966e47760000 is not shown with
> broker id 211.
> In the broker side log, I do see it is connected
> >> zgrep "0x159966e47760000" *log*
>
> zk.log:[2017-01-13 01:05:28,513] INFO Session establishment complete on
> server cz1/10.254.2.19:2181, sessionid = 0x159966e47760000, negotiated
> timeout = 6000 (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:35:38,163] INFO Unable to read additional data from
> server sessionid 0x159966e47760000, likely server has closed socket, closing
> socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:35:39,101] WARN Session 0x159966e47760000 for server
> null, unexpected error, closing socket connection and attempting reconnect
> (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:35:40,121] INFO Unable to read additional data from
> server sessionid 0x159966e47760000, likely server has closed socket, closing
> socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:35:41,770] WARN Session 0x159966e47760000 for server
> null, unexpected error, closing socket connection and attempting reconnect
> (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:35:42,439] WARN Session 0x159966e47760000 for server
> null, unexpected error, closing socket connection and attempting reconnect
> (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:35:43,235] INFO Unable to read additional data from
> server sessionid 0x159966e47760000, likely server has closed socket, closing
> socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:35:44,950] WARN Session 0x159966e47760000 for server
> null, unexpected error, closing socket connection and attempting reconnect
> (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:35:45,837] INFO Unable to read additional data from
> server sessionid 0x159966e47760000, likely server has closed socket, closing
> socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> .
> .
> .
> .
> zk.log:[2017-01-13 01:40:14,818] INFO Unable to read additional data from
> server sessionid 0x159966e47760000, likely server has closed socket, closing
> socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:40:15,916] WARN Session 0x159966e47760000 for server
> null, unexpected error, closing socket connection and attempting reconnect
> (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:40:19,692] INFO Client session timed out, have not
> heard from server in 3676ms for sessionid 0x159966e47760000, closing socket
> connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:40:20,632] INFO Unable to read additional data from
> server sessionid 0x159966e47760000, likely server has closed socket, closing
> socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:40:20,814] WARN Session 0x159966e47760000 for server
> null, unexpected error, closing socket connection and attempting reconnect
> (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:40:22,089] WARN Session 0x159966e47760000 for server
> null, unexpected error, closing socket connection and attempting reconnect
> (org.apache.zookeeper.ClientCnxn)
> zk.log:[2017-01-13 01:40:22,562] INFO Session establishment complete on
> server cz2/10.254.2.29:2181, sessionid = 0x159966e47760000, negotiated
> timeout = 6000 (org.apache.zookeeper.ClientCnxn)
> After several tries, broker connected with zookeeper cz2:2181. I am not sure
> how to debug this issue. It would be helpful if someone provides way to debug
> this issue.
> Regards,
> Maharajan S
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)