Luke Chen created KAFKA-19460:
---------------------------------

             Summary: fetch result might have size < fetch.min.bytes even if data is available in replica
                 Key: KAFKA-19460
                 URL: https://issues.apache.org/jira/browse/KAFKA-19460
             Project: Kafka
          Issue Type: Improvement
            Reporter: Luke Chen

In the doc of "[fetch.min.bytes|https://kafka.apache.org/documentation/#consumerconfigs_fetch.min.bytes]", it says:

??The minimum amount of data the server should return for a fetch request. If insufficient data is available the request will wait for that much data to accumulate before answering the request.??

This leads users to believe that the returned records will always total at least fetch.min.bytes when sufficient data is available in the replica. But even when sufficient data is available in the replica, the returned records can still total less than fetch.min.bytes. For example:

# Config:
    fetch.max.bytes=1500
    max.partition.fetch.bytes=1000
    fetch.min.bytes=1100
    fetch.max.wait.ms=500
# Topic foo has 2 partitions, and each partition contains 1 record of 1000 bytes.
# When a consumer fetches data from these 2 partitions, it starts from foo-0 and fetches 1000 bytes of data, leaving 500 bytes before reaching fetch.max.bytes.
# When fetching foo-1, only 500 bytes remain available to be fetched, and the first batch in foo-1 is 1000 bytes, which is greater than 500, so we don't fetch it.
# In the end, the total returned size is 1000 bytes, which is less than fetch.min.bytes, and the response is sent without waiting for `fetch.max.wait.ms` to expire.

This happens because we check that the total size available in the replicas is more than "fetch.min.bytes", so we don't wait for "fetch.max.wait.ms". I think the logic is correct; we just need to update the doc to make this clear to users.

We might also need to check the `replica.fetch.min.bytes` config.
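
The scenario above can be reproduced with a small simulation. This is only a sketch of the behavior described in this issue, not the broker's actual code: `FetchConfig` and `simulate_fetch` are hypothetical names, and the model assumes that the fetch.min.bytes check looks at the total bytes available in the replicas, while the read itself takes whole batches bounded by both max.partition.fetch.bytes and the remaining fetch.max.bytes budget. Other broker details are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FetchConfig:
    fetch_max_bytes: int
    max_partition_fetch_bytes: int
    fetch_min_bytes: int


def simulate_fetch(partitions, config):
    """Simplified model of one fetch request (hypothetical helper, not Kafka code).

    `partitions` maps a partition name to the list of batch sizes (in bytes)
    available from the fetch offset. Returns a tuple of
    (responds_immediately, bytes_returned_per_partition).
    """
    # Step 1: decide whether to answer right away by comparing the data
    # available in the replicas against fetch.min.bytes (assumed check).
    available = sum(sum(batches) for batches in partitions.values())
    responds_immediately = available >= config.fetch_min_bytes

    # Step 2: read partitions in order, whole batches only, bounded by both
    # the per-partition limit and what is left of the request-level limit.
    remaining = config.fetch_max_bytes
    returned = {}
    for name, batches in partitions.items():
        limit = min(config.max_partition_fetch_bytes, remaining)
        taken = 0
        for size in batches:
            if taken + size > limit:
                break  # the next batch does not fit, so stop reading here
            taken += size
        returned[name] = taken
        remaining -= taken
    return responds_immediately, returned


# Values from the example in this issue.
config = FetchConfig(
    fetch_max_bytes=1500,
    max_partition_fetch_bytes=1000,
    fetch_min_bytes=1100,
)
immediate, returned = simulate_fetch({"foo-0": [1000], "foo-1": [1000]}, config)
print(immediate, returned, sum(returned.values()))
# prints: True {'foo-0': 1000, 'foo-1': 0} 1000
```

In this model the request is answered immediately because 2000 bytes are available (more than fetch.min.bytes=1100), yet only 1000 bytes are returned because the 1000-byte batch in foo-1 does not fit into the 500 bytes left under fetch.max.bytes.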