Hans Kowallik created KAFKA-4562:
------------------------------------
Summary: deadlock heartbeat, metadata-manager, request-handler
Key: KAFKA-4562
URL: https://issues.apache.org/jira/browse/KAFKA-4562
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.10.1.0
Environment: rhel7, java 1.8.0_77, vmware, two broker setup
Reporter: Hans Kowallik
Found one Java-level deadlock:
=============================
"executor-Heartbeat":
waiting to lock monitor 0x00007f8c9c954378 (object 0x00000006cd17dd18, a
kafka.coordinator.GroupMetadata),
which is held by "group-metadata-manager-0"
"group-metadata-manager-0":
waiting to lock monitor 0x00007f8d60002bc8 (object 0x00000006d9e386e8, a
java.util.LinkedList),
which is held by "kafka-request-handler-1"
"kafka-request-handler-1":
waiting to lock monitor 0x00007f8c9c954378 (object 0x00000006cd17dd18, a
kafka.coordinator.GroupMetadata),
which is held by "group-metadata-manager-0"
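
The cycle in the dump is between "group-metadata-manager-0" and "kafka-request-handler-1": each holds one of the two monitors and waits for the other, while "executor-Heartbeat" is blocked behind them. The following is a minimal Java sketch of that lock-order inversion and of detecting it from inside the JVM with ThreadMXBean. It is not Kafka code: the class DeadlockSketch, its methods, and the two lock objects are hypothetical stand-ins, and only the thread names are taken from the dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

public class DeadlockSketch {
    // Hypothetical stand-ins for the GroupMetadata and LinkedList monitors in the dump.
    static final Object groupMetadata = new Object();
    static final LinkedList<String> queue = new LinkedList<>();

    // Starts a daemon thread that locks `first`, waits until the other thread
    // holds its own first lock, then tries to lock `second`.
    static Thread lockInOrder(String name, Object first, Object second,
                              CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached once both threads hold their first lock.
                }
            }
        }, name);
        t.setDaemon(true); // lets the JVM exit despite the deadlock
        t.start();
        return t;
    }

    // Names of threads the JVM reports as deadlocked on object monitors.
    static List<String> deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findMonitorDeadlockedThreads(); // null when there is no deadlock
        List<String> names = new ArrayList<>();
        if (ids == null) {
            return names;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    // Creates the inverted lock order and polls (up to about 5 s) until it is detected.
    static List<String> reproduce() {
        CountDownLatch latch = new CountDownLatch(2);
        lockInOrder("group-metadata-manager-0", groupMetadata, queue, latch);
        lockInOrder("kafka-request-handler-1", queue, groupMetadata, latch);
        List<String> names = deadlockedThreadNames();
        try {
            for (int i = 0; i < 100 && names.size() < 2; i++) {
                Thread.sleep(50);
                names = deadlockedThreadNames();
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(reproduce());
    }
}
```

In this sketch the deadlock disappears if both threads take the two locks in the same order. Whether that is the appropriate fix for the broker is a separate question that this report does not answer.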
When this happens, RAM usage, network connections, and thread count increase linearly.
The controller can't talk to the local broker:
[2016-12-19 16:22:44,639] INFO
[Controller-614897-to-broker-614897-send-thread], Controller 614897 connected
to kafka-dev-614897.lhotse.ov.otto.de:9092 (id: 614897 rack: null) for sending
state change requests (kafka.controller.RequestSendThread)
The replication thread can't talk to the remote broker:
[2016-12-19 16:22:42,014] WARN [ReplicaFetcherThread-0-614897], Error in fetch
kafka.server.ReplicaFetcherThread$FetchRequest@6cae17f6
(kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 614897 was disconnected before the response
was read
No failover happens until the machine runs out of swap space or Kafka is restarted
manually.