Chia-Ping Tsai created KAFKA-18160:
--------------------------------------
Summary: Interrupting or waking up onPartitionsAssigned in
AsyncConsumer can cause the ConsumerRebalanceListenerCallbackCompletedEvent to
be skipped, potentially leading to corrupted added partitions
Key: KAFKA-18160
URL: https://issues.apache.org/jira/browse/KAFKA-18160
Project: Kafka
Issue Type: Improvement
Reporter: Chia-Ping Tsai
Assignee: Chia-Ping Tsai
I noticed this issue when testing KAFKA-17962. It includes two bugs listed
below.
*ConsumerRebalanceListenerCallbackCompletedEvent is skipped*
`invokeRebalanceCallbacks`could throw WakeupException/InterruptException [0]
and they are NOT handled. Hence, the event
`ConsumerRebalanceListenerCallbackCompletedEvent` is NOT sent to background
thread.
*Solution*: We should use try-catch blocks to propagate both
InterruptedException and WakeupException to the background thread.
*corrupted added partitions*
In the next iteration of invokeRebalanceCallbacks, non-fetchable assigned
partitions are treated as owned partitions [1]. This results in "empty"
partitions being passed to the listener, meaning that the listener never
receives the correctly added partitions after the first execution fails.
Consequently, this causes the test_pause_and_resume_sink (KAFKA-17962) to
become unstable when using AsyncConsumer.
*Solution*: We should add only partitions where pendingOnAssignedCallback is
false to the owned partitions.
[0]
https://github.com/apache/kafka/blob/2d39d5be64d4f5b6446f4b9ec3f32b039707d9d1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AsyncKafkaConsumer.java#L2046
[1]
https://github.com/apache/kafka/blob/2d39d5be64d4f5b6446f4b9ec3f32b039707d9d1/clients/src/main/java/org/apache/kafka/clients/consumer/internals/AbstractMembershipManager.java#L828
--
This message was sent by Atlassian Jira
(v8.20.10#820010)