Luke Chen created KAFKA-14010:
---------------------------------

             Summary: alterISR request won't retry when receiving retriable error
                 Key: KAFKA-14010
                 URL: https://issues.apache.org/jira/browse/KAFKA-14010
             Project: Kafka
          Issue Type: Bug
          Components: core
    Affects Versions: 3.2.0
            Reporter: Luke Chen
            Assignee: Luke Chen

When submitting the AlterIsr request, we register a future listener to handle the response [here|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/cluster/Partition.scala#L1585-L1610]. When a retriable error is received, we expect the AlterIsr request to be retried, so the listener re-submits the request.

However, `unsentIsrUpdates` has not been cleared by the time the future listener is called, so the re-submitted request fails to be enqueued because we think there is still an in-flight request. We use "try/finally" to make sure `unsentIsrUpdates` gets cleared ([here|https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/server/AlterPartitionManager.scala#L362-L370]), but that happens after the retry has already been attempted.

Although the AlterIsr request will still be sent the next time the follower sends a fetch request to the leader, we should fix this issue so that the AlterIsr request is retried as expected.

--
This message was sent by Atlassian Jira
(v8.20.7#820007)
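
The ordering problem described in this issue can be reproduced in isolation. The following is a minimal Java sketch, not the actual Kafka code: the class `AlterIsrRetrySketch`, the `enqueue` helper, the `"RETRIABLE"` error string, and the `"topic-0"` key are all hypothetical stand-ins, and only the name `unsentIsrUpdates` is taken from the description. It relies on the fact that a listener registered with `thenAccept` runs synchronously inside `complete()`, so a retry attempted from the listener still sees the old entry if the map is cleared in `finally`. Clearing the entry before completing the future is shown as one possible reordering, not necessarily the fix that was applied.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicBoolean;

public class AlterIsrRetrySketch {
    // Hypothetical stand-in for the pending-update map: one in-flight entry per partition.
    static final Map<String, CompletableFuture<String>> unsentIsrUpdates = new ConcurrentHashMap<>();

    // Enqueue succeeds only when no update is already pending for the partition.
    static boolean enqueue(String partition, CompletableFuture<String> future) {
        return unsentIsrUpdates.putIfAbsent(partition, future) == null;
    }

    // Runs one request/response cycle and reports whether the retry was enqueued.
    static boolean retryEnqueued(boolean clearBeforeCompleting) {
        unsentIsrUpdates.clear();
        String partition = "topic-0"; // hypothetical partition key
        AtomicBoolean retried = new AtomicBoolean(false);

        CompletableFuture<String> future = new CompletableFuture<>();
        enqueue(partition, future);

        // The listener re-submits the request when the response carries a retriable error.
        future.thenAccept(error -> {
            if ("RETRIABLE".equals(error)) {
                retried.set(enqueue(partition, new CompletableFuture<>()));
            }
        });

        if (clearBeforeCompleting) {
            // Possible reordering: clear the pending entry first, then run the listener.
            unsentIsrUpdates.remove(partition);
            future.complete("RETRIABLE");
        } else {
            // Ordering described in the issue: the listener runs inside complete(),
            // and the pending entry is cleared only afterwards in finally.
            try {
                future.complete("RETRIABLE");
            } finally {
                unsentIsrUpdates.remove(partition);
            }
        }
        return retried.get();
    }

    public static void main(String[] args) {
        System.out.println("clear in finally: retry enqueued = " + retryEnqueued(false));
        System.out.println("clear first:      retry enqueued = " + retryEnqueued(true));
    }
}
```

In this sketch the first scenario drops the retry and leaves nothing pending, which mirrors the situation where the request is only sent again on the next follower fetch.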