Ryan Leslie created KAFKA-6479:
----------------------------------

             Summary: Broker file descriptor leak after consumer request timeout
                 Key: KAFKA-6479
                 URL: https://issues.apache.org/jira/browse/KAFKA-6479
             Project: Kafka
          Issue Type: Bug
          Components: controller
    Affects Versions: 1.0.0
            Reporter: Ryan Leslie


When a consumer request times out, i.e. takes longer than request.timeout.ms, 
and the client disconnects from the coordinator, the coordinator may leak file 
descriptors. The following code produces this behavior:


{code:java}
import java.util.Collections;
import java.util.List;
import java.util.Properties;

import org.apache.kafka.clients.consumer.KafkaConsumer;

// BROKERS is the bootstrap server list for the test cluster.
Properties config = new Properties();
config.put("bootstrap.servers", BROKERS);
config.put("group.id", "leak-test");
config.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
config.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
// Never evict a consumer from the group just because it stops polling.
config.put("max.poll.interval.ms", Integer.MAX_VALUE);
// Give up on in-flight requests (e.g. JOIN_GROUP) after 12 seconds.
config.put("request.timeout.ms", 12000);

KafkaConsumer<String, String> consumer1 = new KafkaConsumer<>(config);
KafkaConsumer<String, String> consumer2 = new KafkaConsumer<>(config);

List<String> topics = Collections.singletonList("leak-test");
consumer1.subscribe(topics);
consumer2.subscribe(topics);

// consumer1 joins the group and is never polled again; consumer2's join
// then triggers a rebalance that can never complete.
consumer1.poll(100);
consumer2.poll(100);
{code}

When the above executes, consumer 2 attempts to rebalance indefinitely, blocked 
by the inactive consumer 1. Every 12 seconds it gives up on its JOIN_GROUP 
request, disconnects, and logs a _Marking the coordinator dead_ message. Unless 
the consumer exits or times out, each of these disconnects leaves a socket stuck 
in CLOSE_WAIT on the coordinator, and the broker will eventually run out of file 
descriptors and crash.
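
The growth in open file descriptors can be watched from a JMX client while the 
snippet above is running. The following is only a minimal sketch, assuming the 
broker has remote JMX enabled and reachable at localhost:9999 (host and port are 
placeholders, not part of this report); it simply polls the standard 
java.lang:type=OperatingSystem MBean's OpenFileDescriptorCount attribute:

{code:java}
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

// Polls the broker's open file descriptor count every 12 seconds, i.e.
// once per request.timeout.ms, so each abandoned JOIN_GROUP shows up as
// roughly one more descriptor held by a CLOSE_WAIT socket.
public class FdWatcher {
    public static void main(String[] args) throws Exception {
        // Placeholder JMX address; point this at the broker under test.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            ObjectName os = new ObjectName("java.lang:type=OperatingSystem");
            while (true) {
                Object fds = mbsc.getAttribute(os, "OpenFileDescriptorCount");
                System.out.println("open fds: " + fds);
                Thread.sleep(12_000);
            }
        } finally {
            connector.close();
        }
    }
}
{code}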

Aside from faulty code as in the example above, or an intentional DoS, any 
client bug causing a consumer to block, e.g. KAFKA-6397, could also result in 
this leak.
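
For comparison, the leak does not reproduce when the consumers keep polling, 
since the rebalance can then complete instead of timing out repeatedly. Below is 
a minimal sketch of the non-faulty pattern, continuing from the snippet above 
with one thread per consumer (the threading here is illustrative only):

{code:java}
import java.util.Arrays;

// Poll each consumer continuously from its own thread, so both members
// rejoin whenever a rebalance is triggered, the JOIN_GROUP completes,
// and no CLOSE_WAIT sockets accumulate on the coordinator.
for (KafkaConsumer<String, String> consumer : Arrays.asList(consumer1, consumer2)) {
    new Thread(() -> {
        while (true) {
            consumer.poll(100);
        }
    }).start();
}
{code}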


