Hans Kowallik created KAFKA-4562:
------------------------------------

Summary: deadlock heartbeat, metadata-manager, request-handler
Key: KAFKA-4562
URL: https://issues.apache.org/jira/browse/KAFKA-4562
Project: Kafka
Issue Type: Bug
Components: core
Affects Versions: 0.10.1.0
Environment: rhel7, java 1.8.0_77, vmware, two broker setup
Reporter: Hans Kowallik

Found one Java-level deadlock:
=============================
"executor-Heartbeat":
  waiting to lock monitor 0x00007f8c9c954378 (object 0x00000006cd17dd18, a kafka.coordinator.GroupMetadata),
  which is held by "group-metadata-manager-0"
"group-metadata-manager-0":
  waiting to lock monitor 0x00007f8d60002bc8 (object 0x00000006d9e386e8, a java.util.LinkedList),
  which is held by "kafka-request-handler-1"
"kafka-request-handler-1":
  waiting to lock monitor 0x00007f8c9c954378 (object 0x00000006cd17dd18, a kafka.coordinator.GroupMetadata),
  which is held by "group-metadata-manager-0"

When this happens, RAM usage, the number of network connections, and the number of threads increase linearly.

The controller can't talk to the local broker:

[2016-12-19 16:22:44,639] INFO [Controller-614897-to-broker-614897-send-thread], Controller 614897 connected to kafka-dev-614897.lhotse.ov.otto.de:9092 (id: 614897 rack: null) for sending state change requests (kafka.controller.RequestSendThread)

The replication thread can't talk to the remote broker:

[2016-12-19 16:22:42,014] WARN [ReplicaFetcherThread-0-614897], Error in fetch kafka.server.ReplicaFetcherThread$FetchRequest@6cae17f6 (kafka.server.ReplicaFetcherThread)
java.io.IOException: Connection to 614897 was disconnected before the response was read

No failover happens until the machine runs out of swap space or Kafka is restarted manually.
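
For readers unfamiliar with this kind of thread dump: the cycle reported above is between "group-metadata-manager-0" (holds the GroupMetadata monitor, wants the LinkedList monitor) and "kafka-request-handler-1" (holds the LinkedList monitor, wants the GroupMetadata monitor), while "executor-Heartbeat" is simply blocked behind that cycle. The following is a minimal, self-contained Java sketch of the same lock-ordering pattern and of detecting it from inside the JVM. It is not Kafka code: the class name, thread names, and lock objects are hypothetical stand-ins chosen for illustration only.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.LinkedList;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.CountDownLatch;

public class LockOrderDeadlockSketch {

    // Starts a daemon thread that locks `first`, waits until the other
    // thread also holds its first lock, then tries to lock `second`.
    static void startWorker(String name, Object first, Object second,
                            CountDownLatch bothHoldFirst) {
        Thread t = new Thread(() -> {
            synchronized (first) {
                bothHoldFirst.countDown();
                try {
                    bothHoldFirst.await();
                } catch (InterruptedException e) {
                    return;
                }
                synchronized (second) {
                    // Never reached: the other thread holds `second`.
                }
            }
        }, name);
        t.setDaemon(true); // daemon threads do not keep the JVM alive
        t.start();
    }

    // Provokes a two-thread monitor deadlock and returns the names of the
    // threads the JVM reports as deadlocked (empty if none is detected).
    static Set<String> provokeAndDetect() {
        // Hypothetical stand-ins for the two monitors in the dump.
        Object groupLock = new Object();
        LinkedList<String> queueLock = new LinkedList<>();
        CountDownLatch bothHoldFirst = new CountDownLatch(2);

        // Opposite acquisition order is what creates the cycle.
        startWorker("sketch-metadata-manager", groupLock, queueLock, bothHoldFirst);
        startWorker("sketch-request-handler", queueLock, groupLock, bothHoldFirst);

        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        Set<String> names = new TreeSet<>();
        // Poll for up to about five seconds until the deadlock is visible.
        for (int i = 0; i < 100 && names.isEmpty(); i++) {
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
            long[] ids = mx.findMonitorDeadlockedThreads();
            if (ids == null) {
                continue;
            }
            for (ThreadInfo info : mx.getThreadInfo(ids)) {
                if (info != null) {
                    names.add(info.getThreadName());
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println("Deadlocked threads: " + provokeAndDetect());
    }
}
```

In this sketch the deadlock disappears if both workers acquire the two locks in the same order. Whether a consistent lock order is the appropriate fix for the broker itself is not something this report establishes.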