[ 
https://issues.apache.org/jira/browse/KAFKA-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kirk True updated KAFKA-16555:
------------------------------
    Issue Type: Bug  (was: Task)

> Consumer's RequestState has incorrect logic to determine if inflight
> --------------------------------------------------------------------
>
>                 Key: KAFKA-16555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16555
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer
>    Affects Versions: 3.7.0
>            Reporter: Kirk True
>            Assignee: Kirk True
>            Priority: Critical
>              Labels: consumer-threading-refactor, kip-848-client-support
>             Fix For: 3.8.0
>
>
> When running system tests for the new consumer, I've hit an issue where the 
> {{HeartbeatRequestManager}} is sending out multiple concurrent 
> {{CONSUMER_GROUP_REQUEST}} RPCs. As a result, the coordinator creates 
> multiple members, which causes downstream assignment problems.
> Here's the order of events:
> * Time 202: {{HeartbeatRequestManager.poll()}} determines it's OK to send a 
> request. In so doing, it updates the {{RequestState}}'s {{lastSentMs}} to the 
> current timestamp, 202
> * Time 236: the response is received and response handler is invoked, setting 
> the {{RequestState}}'s {{lastReceivedMs}} to the current timestamp, 236
> * Time 236: {{HeartbeatRequestManager.poll()}} is invoked again, and it sees 
> that it's OK to send a request. It creates another request, once again 
> updating the {{RequestState}}'s {{lastSentMs}} to the current timestamp, 236
> * Time 237: {{HeartbeatRequestManager.poll()}} is invoked again, and 
> ERRONEOUSLY decides it's OK to send another request, despite one already 
> being in flight.
> Here's the problem with {{requestInFlight()}}:
> {code:java}
> public boolean requestInFlight() {
>     return this.lastSentMs > -1 && this.lastReceivedMs < this.lastSentMs;
> }
> {code}
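> In our case, {{lastReceivedMs}} is 236 and {{lastSentMs}} is _also_ 236. So 
> the received timestamp is _equal_ to the sent timestamp, not _less_, and 
> {{requestInFlight()}} returns false even though the second request has not 
> been answered yet.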
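To make the sequence above easier to follow, here is a minimal, self-contained sketch that replays the timestamps from the description (202, 236, 236). The class and method names (InflightSketch, TimestampState, FlagState, onSend, onReceive) are hypothetical and are not the actual Kafka classes. TimestampState copies only the comparison quoted above. FlagState shows one possible alternative, an explicit boolean set on send and cleared on receive, and is an assumption for illustration rather than a statement of how the issue was or should be fixed.

```java
public class InflightSketch {

    /** Hypothetical stand-in that reuses the timestamp comparison quoted above. */
    static class TimestampState {
        long lastSentMs = -1;
        long lastReceivedMs = -1;

        void onSend(long nowMs) { lastSentMs = nowMs; }

        void onReceive(long nowMs) { lastReceivedMs = nowMs; }

        boolean requestInFlight() {
            // Same expression as in the issue description.
            return lastSentMs > -1 && lastReceivedMs < lastSentMs;
        }
    }

    /** Hypothetical alternative: track the in-flight status explicitly. */
    static class FlagState {
        private boolean inFlight = false;

        // The timestamp is accepted for symmetry but not used for the decision.
        void onSend(long nowMs) { inFlight = true; }

        void onReceive(long nowMs) { inFlight = false; }

        boolean requestInFlight() { return inFlight; }
    }

    public static void main(String[] args) {
        TimestampState ts = new TimestampState();
        ts.onSend(202);     // first request sent
        ts.onReceive(236);  // first response received
        ts.onSend(236);     // second request sent in the same millisecond
        System.out.println("timestamp-based requestInFlight: " + ts.requestInFlight());

        FlagState flag = new FlagState();
        flag.onSend(202);
        flag.onReceive(236);
        flag.onSend(236);
        System.out.println("flag-based requestInFlight: " + flag.requestInFlight());
    }
}
```

With these inputs the timestamp-based check reports false while a request is still outstanding, because 236 < 236 does not hold. The flag-based variant reports true, since it does not depend on the clock advancing between a response and the next send.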



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
