Luke Chen created KAFKA-19460:
---------------------------------

             Summary: fetch result might have size < fetch.min.bytes even if 
data is available in replica 
                 Key: KAFKA-19460
                 URL: https://issues.apache.org/jira/browse/KAFKA-19460
             Project: Kafka
          Issue Type: Improvement
            Reporter: Luke Chen


In the doc of 
"[fetch.min.bytes|https://kafka.apache.org/documentation/#consumerconfigs_fetch.min.bytes]",
 it says:

??The minimum amount of data the server should return for a fetch request. If 
insufficient data is available the request will wait for that much data to 
accumulate before answering the request.??

This leads users to believe that the returned records will always total at 
least fetch.min.bytes if there is sufficient data in the replica. But even when 
sufficient data is available in the replica, it is still possible for the 
returned records size to be < fetch.min.bytes.

 

For example:
 # Config 
fetch.max.bytes=1500
max.partition.fetch.bytes=1000
fetch.min.bytes=1100
fetch.max.wait.ms=500
 # topic foo has 2 partitions, and each partition contains 1 record with size 
1000 bytes.
 # When a consumer fetches data from these 2 partitions, it starts from foo-0 
and fetches 1000 bytes of data, leaving 500 bytes before reaching 
fetch.max.bytes.
 # When fetching foo-1, only 500 bytes remain available to be fetched, and the 
first batch in foo-1 is 1000 bytes, which is greater than 500, so we don't 
fetch it.
 # In the end, the total returned size is 1000 bytes, which is less than 
fetch.min.bytes, and the response is sent without waiting for 
`fetch.max.wait.ms` to expire. This is because we checked that the total size 
in the replicas is more than "fetch.min.bytes", so there is no wait for 
"fetch.max.wait.ms".
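
The steps above can be reproduced with a small toy model. This is only a sketch of the behavior as described in this issue, not Kafka's actual broker code: the function name `simulate_fetch` and its parameters are hypothetical, batches are treated as indivisible, the wait decision is modeled as a simple comparison of total available bytes against fetch.min.bytes, and `fetch.max.wait.ms` itself is not modeled.

```python
def simulate_fetch(partitions, fetch_max_bytes, max_partition_fetch_bytes,
                   fetch_min_bytes):
    """Toy model of the fetch described in this issue (hypothetical helper).

    partitions maps a partition name to a list of batch sizes in bytes,
    in log order. Returns (waits, returned_bytes_per_partition).
    """
    # Assumption: the wait decision looks at the total bytes available in
    # the replicas, not at what will actually fit into the response.
    available = sum(sum(batches) for batches in partitions.values())
    waits = available < fetch_min_bytes

    remaining = fetch_max_bytes
    returned = {}
    for name, batches in partitions.items():
        # Each partition is limited by max.partition.fetch.bytes and by
        # whatever is left of fetch.max.bytes.
        limit = min(max_partition_fetch_bytes, remaining)
        taken = 0
        for size in batches:
            if taken + size > limit:
                break  # a batch that does not fit is skipped entirely
            taken += size
        returned[name] = taken
        remaining -= taken
    return waits, returned


waits, returned = simulate_fetch(
    {"foo-0": [1000], "foo-1": [1000]},
    fetch_max_bytes=1500,
    max_partition_fetch_bytes=1000,
    fetch_min_bytes=1100,
)
total = sum(returned.values())
print(waits, returned, total)
# prints: False {'foo-0': 1000, 'foo-1': 0} 1000
```

In this model the request does not wait (2000 bytes are available, which exceeds fetch.min.bytes=1100), yet only 1000 bytes are returned because the foo-1 batch does not fit in the remaining 500 bytes.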

 

I think the logic is correct. We just need to update the doc to make this 
clear to users. We might also need to check the `replica.fetch.min.bytes` config.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
