[ https://issues.apache.org/jira/browse/KAFKA-16555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Kirk True updated KAFKA-16555:
------------------------------
    Issue Type: Bug  (was: Task)

> Consumer's RequestState has incorrect logic to determine if inflight
> --------------------------------------------------------------------
>
>                 Key: KAFKA-16555
>                 URL: https://issues.apache.org/jira/browse/KAFKA-16555
>             Project: Kafka
>          Issue Type: Bug
>          Components: clients, consumer
>    Affects Versions: 3.7.0
>            Reporter: Kirk True
>            Assignee: Kirk True
>            Priority: Critical
>              Labels: consumer-threading-refactor, kip-848-client-support
>             Fix For: 3.8.0
>
>
> When running system tests for the new consumer, I've hit an issue where the
> {{HeartbeatRequestManager}} is sending out multiple concurrent
> {{CONSUMER_GROUP_REQUEST}} RPCs. The effect is that the coordinator creates
> multiple members, which causes downstream assignment problems.
> Here's the order of events:
> * Time 202: {{HeartbeatRequestManager.poll()}} determines it's OK to send a
> request. In so doing, it updates the {{RequestState}}'s {{lastSentMs}} to the
> current timestamp, 202
> * Time 236: the response is received and the response handler is invoked,
> setting the {{RequestState}}'s {{lastReceivedMs}} to the current timestamp, 236
> * Time 236: {{HeartbeatRequestManager.poll()}} is invoked again, and it sees
> that it's OK to send a request. It creates another request, once again
> updating the {{RequestState}}'s {{lastSentMs}} to the current timestamp, 236
> * Time 237: {{HeartbeatRequestManager.poll()}} is invoked again, and
> ERRONEOUSLY decides it's OK to send another request, despite one already
> being in flight.
> Here's the problem with {{requestInFlight()}}:
> {code:java}
>     public boolean requestInFlight() {
>         return this.lastSentMs > -1 && this.lastReceivedMs < this.lastSentMs;
>     }
> {code}
> In our case, {{lastReceivedMs}} is 236 and {{lastSentMs}} is _also_ 236. So
> the received timestamp is _equal_ to the sent timestamp, not _less_. As a
> result, {{requestInFlight()}} returns {{false}} even though the second
> request has not yet been answered.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
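
The timeline in the description can be replayed in isolation. The following is a minimal sketch, not the actual Kafka classes: {{TimestampState}} and {{FlagState}} and their {{onSend}}/{{onReceive}} methods are hypothetical names introduced here for illustration. {{TimestampState}} uses the same predicate quoted above. {{FlagState}} shows one possible alternative, tracking in-flight status with an explicit boolean; it is an assumption for comparison and is not presented as the change that was merged for this issue.

```java
class RequestStateSketch {

    // Hypothetical stand-in for RequestState: models only the two timestamps
    // mentioned in the report and the quoted requestInFlight() predicate.
    static class TimestampState {
        long lastSentMs = -1;
        long lastReceivedMs = -1;

        void onSend(long nowMs) {
            lastSentMs = nowMs;
        }

        void onReceive(long nowMs) {
            lastReceivedMs = nowMs;
        }

        boolean requestInFlight() {
            return lastSentMs > -1 && lastReceivedMs < lastSentMs;
        }
    }

    // Assumed alternative: an explicit flag that does not depend on the
    // clock advancing between a response and the next send.
    static class FlagState {
        boolean inFlight = false;

        void onSend() {
            inFlight = true;
        }

        void onReceive() {
            inFlight = false;
        }

        boolean requestInFlight() {
            return inFlight;
        }
    }

    public static void main(String[] args) {
        TimestampState byTimestamp = new TimestampState();
        FlagState byFlag = new FlagState();

        // Time 202: first request is sent.
        byTimestamp.onSend(202);
        byFlag.onSend();

        // Time 236: response to the first request is received.
        byTimestamp.onReceive(236);
        byFlag.onReceive();

        // Time 236: second request is sent within the same millisecond.
        byTimestamp.onSend(236);
        byFlag.onSend();

        // Time 237: the next poll asks whether a request is in flight.
        System.out.println("timestamp comparison: " + byTimestamp.requestInFlight());
        System.out.println("explicit flag: " + byFlag.requestInFlight());
    }
}
```

Run as written, the timestamp comparison reports false for the second request (236 is not less than 236), which matches the behavior described in the report, while the flag-based variant reports true.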