[jira] [Updated] (KAFKA-20114) Fix race between requestInFlight and backoffDeadlineMs in RPCProducerIdManager causing premature retries

sanghyeok An (Jira) Mon, 02 Feb 2026 16:14:09 -0800


     [ 
https://issues.apache.org/jira/browse/KAFKA-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


sanghyeok An updated KAFKA-20114:
---------------------------------
    Labels: producer transaction  (was: transaction)

> Fix race between requestInFlight and backoffDeadlineMs in 
> RPCProducerIdManager causing premature retries
> --------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-20114
>                 URL: https://issues.apache.org/jira/browse/KAFKA-20114
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: sanghyeok An
>            Assignee: sanghyeok An
>            Priority: Minor
>              Labels: producer, transaction
>         Attachments: image-2026-02-03-08-53-05-655.png
>
>
> RPCProducerIdManager tracks request state with two independent atomics, 
> requestInFlight and backoffDeadlineMs. Because the two are not read or 
> updated as a unit, a race remains that can cause premature retries: 
> maybeRequestNextBlock can read a stale backoffDeadlineMs just before the 
> failure handler of the in-flight request applies a new backoff and clears 
> requestInFlight.
> The race occurs when:
>  * maybeRequestNextBlock reads backoffDeadlineMs before the failure handler 
> updates it, and
>  * the failure handler clears requestInFlight before maybeRequestNextBlock 
> attempts compareAndSet.
> In that case maybeRequestNextBlock successfully sets requestInFlight and 
> calls sendRequest immediately, ignoring the newly applied retry backoff.
>
> !image-2026-02-03-08-53-05-655.png|width=1040,height=538!
>
> *Previous discussion in another PR*
> [https://github.com/apache/kafka/pull/21279#issuecomment-3836196135]
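
The interleaving described in the issue can be replayed deterministically on a single thread. The sketch below is not the actual RPCProducerIdManager code: the class BackoffRaceSketch, the RETRY_BACKOFF_MS value, and the method shapes are hypothetical, and only the names requestInFlight, backoffDeadlineMs, and compareAndSet come from the report. It assumes the failure handler updates the deadline first and clears requestInFlight second, as the description implies. The trySendWithRecheck variant shows one possible mitigation (re-reading the deadline after winning the compareAndSet); it is an illustration only and not necessarily what the fix adopts.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical, simplified model of the two atomics named in the report.
final class BackoffRaceSketch {
    static final long RETRY_BACKOFF_MS = 50L; // assumed value, for illustration

    final AtomicBoolean requestInFlight = new AtomicBoolean(false);
    final AtomicLong backoffDeadlineMs = new AtomicLong(0L);
    int sendCount = 0; // stands in for calls to sendRequest

    // First half of the check: read the deadline (this value may go stale).
    long readDeadline() {
        return backoffDeadlineMs.get();
    }

    // Second half: decide using the previously observed deadline only.
    boolean trySend(long observedDeadlineMs, long nowMs) {
        if (nowMs >= observedDeadlineMs && requestInFlight.compareAndSet(false, true)) {
            sendCount++;
            return true;
        }
        return false;
    }

    // Possible mitigation (assumption): re-read the deadline after winning the
    // compareAndSet and release the flag if a newer backoff is now in effect.
    boolean trySendWithRecheck(long observedDeadlineMs, long nowMs) {
        if (nowMs < observedDeadlineMs) {
            return false;
        }
        if (!requestInFlight.compareAndSet(false, true)) {
            return false;
        }
        if (nowMs < backoffDeadlineMs.get()) {
            requestInFlight.set(false);
            return false;
        }
        sendCount++;
        return true;
    }

    // Failure handler: apply the backoff, then clear the in-flight flag.
    void onFailure(long nowMs) {
        backoffDeadlineMs.set(nowMs + RETRY_BACKOFF_MS);
        requestInFlight.set(false);
    }

    public static void main(String[] args) {
        long now = 1000L;

        BackoffRaceSketch racy = new BackoffRaceSketch();
        racy.requestInFlight.set(true);      // a request is already in flight
        long stale = racy.readDeadline();    // caller reads the old deadline
        racy.onFailure(now);                 // failure handler runs in between
        System.out.println("premature send: " + racy.trySend(stale, now));

        BackoffRaceSketch guarded = new BackoffRaceSketch();
        guarded.requestInFlight.set(true);
        long stale2 = guarded.readDeadline();
        guarded.onFailure(now);
        System.out.println("send with re-check: " + guarded.trySendWithRecheck(stale2, now));
    }
}
```

In the first scenario the send goes out even though the current time is still before the freshly written backoffDeadlineMs. In the second, the re-check observes the new deadline because the handler wrote it before clearing requestInFlight, so the caller backs off and releases the flag.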



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
