Dhruvil Shah created KAFKA-10517:
------------------------------------

             Summary: Inefficient consumer processing with fetch sessions
                 Key: KAFKA-10517
                 URL: https://issues.apache.org/jira/browse/KAFKA-10517
             Project: Kafka
          Issue Type: Bug
            Reporter: Dhruvil Shah

With the introduction of fetch sessions, the consumer and the broker share a unified view of the partitions being consumed and their current state (fetch_offset, last_propagated_hwm, last_propagated_start_offset, etc.). The consumer is still expected to consume in a round-robin manner; however, we have observed cases where it does not.

Because of how we perform memory management on the consumer and implement fetch pipelining, we exclude partitions from a FetchRequest when they have not been drained by the application. This is done by adding these partitions to the `toForget` list in the `FetchRequest`. When partitions are added to the `toForget` list, the broker removes them from its session cache. This causes a divergence between the broker's and the client's view of the metadata.

When forgotten partitions are added back to the fetch after the application has drained them, the server immediately adds them back to the session cache and returns a response for them, even if there is no corresponding data. This re-triggers the consumer behavior that puts the partition on the `toForget` list, incorrectly, since no data may have been returned for it.

We have seen this behavior cause an imbalance in lag across partitions, because the consumer no longer obeys the round-robin sequence while partitions keep shuffling between the `toForget` and `toSend` lists.

At a high level, this is caused by the session caches on the consumer and broker being out of sync. The result is a state where partition balance is determined by external factors (such as whether metadata was returned for a partition) rather than by the round-robin ordering.
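
To make the cycle described above concrete, here is a minimal toy model of it. This is only a sketch under stated assumptions: `FetchSessionSketch`, `BrokerSession`, `handleFetch`, and `simulate` are hypothetical names invented for illustration, and the logic is a simplification of the behavior reported in this issue, not Kafka's actual consumer or broker code. It tracks a single partition, `p0`, that never receives new data, and records whether the consumer places it on `toForget` or `toSend` in each fetch round.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Toy model only: names and logic are illustrative assumptions, not Kafka's client or broker code.
class FetchSessionSketch {

    /** Hypothetical broker-side session cache for one consumer. */
    static final class BrokerSession {
        final Set<String> cached = new LinkedHashSet<>();

        /** Returns the partitions that appear in the fetch response. */
        List<String> handleFetch(List<String> toSend, List<String> toForget, Set<String> withData) {
            cached.removeAll(toForget); // forgotten partitions leave the session cache
            List<String> response = new ArrayList<>();
            for (String p : toSend) {
                // A partition newly (re)added to the session gets a response entry
                // even when it has no data to return.
                if (cached.add(p)) {
                    response.add(p);
                }
            }
            for (String p : cached) {
                if (withData.contains(p) && !response.contains(p)) {
                    response.add(p);
                }
            }
            return response;
        }
    }

    /** Records what the consumer does with partition "p0" in each fetch round. */
    static List<String> simulate(int rounds) {
        BrokerSession broker = new BrokerSession();
        broker.cached.add("p0");
        boolean buffered = true;   // p0 starts with undrained data on the consumer
        boolean inSession = true;  // the consumer's own view of the session
        List<String> log = new ArrayList<>();
        for (int i = 0; i < rounds; i++) {
            List<String> toSend = new ArrayList<>();
            List<String> toForget = new ArrayList<>();
            if (buffered && inSession) {
                toForget.add("p0");
                inSession = false;
                log.add("forget");
            } else if (!buffered && !inSession) {
                toSend.add("p0");
                inSession = true;
                log.add("send");
            } else {
                log.add("none");
            }
            // No partition ever has new data in this scenario.
            List<String> response = broker.handleFetch(toSend, toForget, Set.of());
            // Any response entry marks p0 as buffered again; otherwise the application drains it.
            buffered = response.contains("p0");
        }
        return log;
    }

    public static void main(String[] args) {
        System.out.println(simulate(4)); // prints [forget, send, forget, send]
    }
}
```

In this model `p0` alternates between the two lists indefinitely although no data is ever returned for it, which is the shuffling described in this report. A real consumer handles many partitions and applies further conditions that the sketch leaves out.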