Hans Kowallik created KAFKA-4562:
------------------------------------

             Summary: deadlock heartbeat, metadata-manager, request-handler
                 Key: KAFKA-4562
                 URL: https://issues.apache.org/jira/browse/KAFKA-4562
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 0.10.1.0
         Environment: rhel7, java 1.8.0_77, vmware, two broker setup
            Reporter: Hans Kowallik



Found one Java-level deadlock:
=============================
"executor-Heartbeat":
  waiting to lock monitor 0x00007f8c9c954378 (object 0x00000006cd17dd18, a kafka.coordinator.GroupMetadata),
  which is held by "group-metadata-manager-0"
"group-metadata-manager-0":
  waiting to lock monitor 0x00007f8d60002bc8 (object 0x00000006d9e386e8, a java.util.LinkedList),
  which is held by "kafka-request-handler-1"
"kafka-request-handler-1":
  waiting to lock monitor 0x00007f8c9c954378 (object 0x00000006cd17dd18, a kafka.coordinator.GroupMetadata),
  which is held by "group-metadata-manager-0"
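
Reading the dump above, the cycle itself appears to be between "group-metadata-manager-0" and "kafka-request-handler-1", which each hold one of the two monitors and wait for the other; "executor-Heartbeat" looks to be blocked behind that cycle rather than part of it. The following is a minimal Java sketch of that lock-ordering pattern, not Kafka's actual code: the class DeadlockSketch, the two plain Object monitors, and the thread names are hypothetical stand-ins. It also shows how the same condition could be detected in-process with the standard ThreadMXBean API instead of a manual thread dump.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

final class DeadlockSketch {
    // Hypothetical stand-ins for the GroupMetadata and LinkedList monitors in the dump.
    static final Object GROUP_METADATA = new Object();
    static final Object LINKED_LIST = new Object();

    /** Starts two daemon threads that take the two monitors in opposite order. */
    static void startCycle() throws InterruptedException {
        CountDownLatch bothHoldFirst = new CountDownLatch(2);
        Thread manager = new Thread(
                () -> lockInOrder(GROUP_METADATA, LINKED_LIST, bothHoldFirst),
                "sketch-group-metadata-manager");
        Thread handler = new Thread(
                () -> lockInOrder(LINKED_LIST, GROUP_METADATA, bothHoldFirst),
                "sketch-request-handler");
        manager.setDaemon(true); // daemon threads do not keep the JVM alive
        handler.setDaemon(true);
        manager.start();
        handler.start();
        bothHoldFirst.await(); // both threads now hold their first monitor
    }

    static void lockInOrder(Object first, Object second, CountDownLatch bothHoldFirst) {
        synchronized (first) {
            bothHoldFirst.countDown();
            try {
                bothHoldFirst.await(); // wait until the other thread holds its first monitor
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            synchronized (second) {
                // Never reached: the other thread holds 'second' and waits for 'first'.
            }
        }
    }

    /** Names of threads the JVM currently reports as monitor-deadlocked (empty if none). */
    static List<String> deadlockedThreadNames() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.findMonitorDeadlockedThreads(); // null when no deadlock is found
        List<String> names = new ArrayList<>();
        if (ids == null) {
            return names;
        }
        for (ThreadInfo info : mx.getThreadInfo(ids)) {
            if (info != null) {
                names.add(info.getThreadName());
            }
        }
        return names;
    }

    /** Polls until a deadlock is reported or the timeout elapses. */
    static List<String> awaitDeadlock(long timeoutMillis) throws InterruptedException {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        List<String> names = deadlockedThreadNames();
        while (names.isEmpty() && System.currentTimeMillis() < deadline) {
            Thread.sleep(50);
            names = deadlockedThreadNames();
        }
        return names;
    }

    public static void main(String[] args) throws InterruptedException {
        startCycle();
        System.out.println("Deadlocked threads: " + awaitDeadlock(5000));
    }
}
```

In this sketch the usual remedy would be to acquire the two monitors in the same order on every code path; whether that is the appropriate fix for the broker is not something the dump alone establishes.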

When this happens, RAM usage, the number of network connections, and the thread count increase linearly.

The controller can't talk to the local broker:
[2016-12-19 16:22:44,639] INFO [Controller-614897-to-broker-614897-send-thread], Controller 614897 connected to kafka-dev-614897.lhotse.ov.otto.de:9092 (id: 614897 rack: null) for sending state change requests (kafka.controller.RequestSendThread)

The replication thread can't talk to the remote broker:
[2016-12-19 16:22:42,014] WARN [ReplicaFetcherThread-0-614897], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest@6cae17f6 (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 614897 was disconnected before the response was read

No failover happens until the machine runs out of swap space or Kafka is restarted manually.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
