[ https://issues.apache.org/jira/browse/KAFKA-19462?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Luke Chen updated KAFKA-19462:
------------------------------
    Description:
Currently in local fetch case, we'll calculate the remaining bytes to be fetched for each partition via "fetch.max.bytes" and "max.partition.fetch.bytes" configs. For example:
 # Config:
max.partition.fetch.bytes = 1MB
fetch.max.bytes = 1.5MB
 # Topic foo has 2 partitions.
 # Consumer fetches data from topic foo
 # Fetches from foo-0 first, it got 1MB of data (max.partition.fetch.bytes), so remaining 0.5 MB of data available to be fetched
 # Fetches from foo-1 for max 0.5MB.
 # Total returned 1.5MB records

However, in remote + local fetch case, because we don't know how much data we can fetch before querying remote log metadata manager or other resource, we can't have a value to tell replicaManager beforehand. Currently, we treat it as 0 bytes read. And that's why the final returned data could exceed the "fetch.max.bytes" value.

For example:
 # Config:
max.partition.fetch.bytes = 1MB
fetch.max.bytes = 1.5MB
 # Topic foo has 2 partitions + topic boo has 1 partition with tiered storage enabled.
 # Consumer fetches data from topic foo and boo
 # Fetches from boo-0, because we don't know how much data we can get, return 0, and send to remote async read.
 # Fetches from foo-0, it got 1MB of data, so remaining 0.5 MB of data available to be fetched
 # Fetches from foo-1 for max 0.5MB.
 # remote async read for boo-0, and it got 1MB data (max.partition.fetch.bytes).
 # Total returned 2.5MB records, which exceeds `fetch.max.bytes = 1.5MB`

  was:
Currently in local fetch case, we'll calculate the remaining bytes to be fetched for each partition via "fetch.max.bytes" and "max.partition.fetch.bytes" configs. For example:
 # Config:
max.partition.fetch.bytes = 1MB
fetch.max.bytes = 1.5MB
 # Topic foo has 2 partitions.
 # Consumer fetches data from topic foo
 # Fetches from foo-0 first, it got 1MB of data, so remaining 0.5 MB of data available to be fetched
 # Fetches from foo-1 for max 0.5MB.
 # Total returned 1.5MB records

However, in remote + local fetch case, because we don't know how much data we can fetch before querying remote log metadata manager or other resource, we can't have a value to tell replicaManager beforehand. Currently, we treat it as 0 bytes read. And that's why the final returned data could exceed the "fetch.max.bytes" value.

For example:
 # Config:
max.partition.fetch.bytes = 1MB
fetch.max.bytes = 1.5MB
 # Topic foo has 2 partitions + topic boo has 1 partition with tiered storage enabled.
 # Consumer fetches data from topic foo and boo
 # Fetches from boo-0, because we don't know how much data we can get, return 0, and send to remote async read.
 # Fetches from foo-0, it got 1MB of data, so remaining 0.5 MB of data available to be fetched
 # Fetches from foo-1 for max 0.5MB.
 # remote async read for boo-0, and it got 1MB data (max.partition.fetch.bytes).
 # Total returned 2.5MB records, which exceeds `fetch.max.bytes = 1.5MB`


> "fetch.max.bytes" config is not honored when remote + local fetch
> -----------------------------------------------------------------
>
>                 Key: KAFKA-19462
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19462
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Luke Chen
>            Assignee: Luke Chen
>            Priority: Major
>
> Currently in local fetch case, we'll calculate the remaining bytes to be
> fetched for each partition via "fetch.max.bytes" and
> "max.partition.fetch.bytes" configs. For example:
>  # Config:
> max.partition.fetch.bytes = 1MB
> fetch.max.bytes = 1.5MB
>  # Topic foo has 2 partitions.
>  # Consumer fetches data from topic foo
>  # Fetches from foo-0 first, it got 1MB of data (max.partition.fetch.bytes),
> so remaining 0.5 MB of data available to be fetched
>  # Fetches from foo-1 for max 0.5MB.
>  # Total returned 1.5MB records
> However, in remote + local fetch case, because we don't know how much data we
> can fetch before querying remote log metadata manager or other resource, we
> can't have a value to tell replicaManager beforehand. Currently, we treat it
> as 0 bytes read. And that's why the final returned data could exceed the
> "fetch.max.bytes" value.
> For example:
>  # Config:
> max.partition.fetch.bytes = 1MB
> fetch.max.bytes = 1.5MB
>  # Topic foo has 2 partitions + topic boo has 1 partition with tiered storage
> enabled.
>  # Consumer fetches data from topic foo and boo
>  # Fetches from boo-0, because we don't know how much data we can get, return
> 0, and send to remote async read.
>  # Fetches from foo-0, it got 1MB of data, so remaining 0.5 MB of data
> available to be fetched
>  # Fetches from foo-1 for max 0.5MB.
>  # remote async read for boo-0, and it got 1MB data
> (max.partition.fetch.bytes).
>  # Total returned 2.5MB records, which exceeds `fetch.max.bytes = 1.5MB`
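
The byte accounting in the two examples above can be reproduced with a small simulation. This is only a sketch of the arithmetic described in this issue, not the broker code: `simulate_fetch`, the partition tuples, and the 10MB of available data per partition are hypothetical names and values chosen for illustration. It also ignores other broker behaviour, such as batch boundaries.

```python
MB = 1024 * 1024


def simulate_fetch(partitions, fetch_max_bytes, max_partition_fetch_bytes):
    """Model the per-request byte budget described in this issue.

    partitions: list of (name, is_remote, available_bytes) in fetch order.
    Returns a dict mapping partition name to bytes returned.
    """
    remaining = fetch_max_bytes
    returned = {}
    remote_pending = []

    for name, is_remote, available in partitions:
        if is_remote:
            # Size is unknown before the remote read, so the partition is
            # counted as 0 bytes read and the budget is left untouched.
            returned[name] = 0
            remote_pending.append((name, available))
            continue
        size = min(available, max_partition_fetch_bytes, remaining)
        returned[name] = size
        remaining -= size

    # The async remote read is capped only by the per-partition limit,
    # not by what is left of fetch.max.bytes.
    for name, available in remote_pending:
        returned[name] = min(available, max_partition_fetch_bytes)

    return returned


FETCH_MAX_BYTES = 3 * MB // 2      # fetch.max.bytes = 1.5MB
MAX_PARTITION_FETCH_BYTES = MB     # max.partition.fetch.bytes = 1MB

# Local-only case: foo-0 and foo-1.
local_only = simulate_fetch(
    [("foo-0", False, 10 * MB), ("foo-1", False, 10 * MB)],
    FETCH_MAX_BYTES,
    MAX_PARTITION_FETCH_BYTES,
)

# Remote + local case: boo-0 (tiered storage) is visited first.
remote_and_local = simulate_fetch(
    [("boo-0", True, 10 * MB), ("foo-0", False, 10 * MB), ("foo-1", False, 10 * MB)],
    FETCH_MAX_BYTES,
    MAX_PARTITION_FETCH_BYTES,
)

print(sum(local_only.values()) / MB)        # 1.5
print(sum(remote_and_local.values()) / MB)  # 2.5
```

In this model the second total exceeds `fetch.max.bytes` by exactly the 1MB returned for boo-0, because the remote partition never reduced the remaining budget.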