I want my consumers to process large batches, so I aim to have the consumer
listener wake up, say, on 1800mb of data or every 5min, whichever comes
first.

Mine is a Kafka Spring Boot application, the topic has 28 partitions, and
this is the configuration I explicitly change:

| Parameter                 | Value I set | Default Value | Why I set it this way    |
| ------------------------- | ----------- | ------------- | ------------------------ |
| fetch.max.bytes           | 1801mb      | 50mb          | fetch.min.bytes + 1mb    |
| fetch.min.bytes           | 1800mb      | 1b            | desired batch size       |
| fetch.max.wait.ms         | 5min        | 500ms         | desired cadence          |
| max.partition.fetch.bytes | 1801mb      | 1mb           | unbalanced partitions    |
| request.timeout.ms        | 5min+1sec   | 30sec         | fetch.max.wait.ms + 1sec |
| max.poll.records          | 10000       | 500           | 1500 found too low       |
| max.poll.interval.ms      | 5min+1sec   | 5min          | fetch.max.wait.ms + 1sec |
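For concreteness, here is the table spelled out as raw consumer property values (the helper class name is hypothetical; in the Spring Boot app these end up in the `ConsumerFactory` properties / `spring.kafka.consumer.properties`):

```java
import java.util.Properties;

// Hypothetical helper that expands the table above into raw Kafka
// consumer properties (mb and min values converted to bytes and ms).
public class BatchConsumerProps {
    public static Properties build() {
        Properties p = new Properties();
        long mb = 1024L * 1024L;
        long fiveMinMs = 5 * 60_000L;
        p.put("fetch.min.bytes", String.valueOf(1800 * mb));            // desired batch size
        p.put("fetch.max.bytes", String.valueOf(1801 * mb));            // fetch.min.bytes + 1mb
        p.put("max.partition.fetch.bytes", String.valueOf(1801 * mb));  // unbalanced partitions
        p.put("fetch.max.wait.ms", String.valueOf(fiveMinMs));          // desired cadence
        p.put("request.timeout.ms", String.valueOf(fiveMinMs + 1_000)); // fetch.max.wait.ms + 1sec
        p.put("max.poll.interval.ms", String.valueOf(fiveMinMs + 1_000));
        p.put("max.poll.records", "10000");                             // 1500 found too low
        return p;
    }

    public static void main(String[] args) {
        build().forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```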

Nevertheless, when I produce ~2gb of data to the topic, I see the
consumer listener (a Batch Listener) being called many times per second --
far more often than the desired rate.

I logged the serialized size of the `ConsumerRecords<?,?>` argument, and
found that it never exceeds 55mb.
This hints that I was not able to raise fetch.max.bytes above the default
50mb.
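For reference, the size logging looked roughly like this (a reconstructed fragment, not standalone code; the spring-kafka batch-listener wiring and the per-record summing via kafka-clients' `serializedKeySize()`/`serializedValueSize()` are assumptions about my exact code):

```java
// Fragment from the batch listener method (spring-kafka annotations assumed).
@KafkaListener(topics = "my-topic", batch = "true")
public void listen(ConsumerRecords<String, String> records) {
    long bytes = 0;
    for (ConsumerRecord<String, String> r : records) {
        // serializedKeySize()/serializedValueSize() return -1 for null key/value
        bytes += Math.max(r.serializedKeySize(), 0)
               + Math.max(r.serializedValueSize(), 0);
    }
    log.info("batch of {} records, ~{} bytes", records.count(), bytes);
}
```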

Any idea how I can troubleshoot this?



-----
Edit:
I found this question:
https://stackoverflow.com/questions/72812954/kafka-msk-a-configuration-of-high-fetch-max-wait-ms-and-fetch-min-bytes-is-beh?rq=1

Is it really impossible as stated?
