Hi,

You mention topic size, but retention in Kafka is configured per partition, so I will explain using partition size. If you want to cap the combined size of all partitions, take the desired total topic size, divide it by the number of partitions, and set retention.bytes to the result.
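As a quick sketch of that division (the partition count here is just an example value, not from your setup):

```python
def retention_bytes_per_partition(total_topic_bytes: int, num_partitions: int) -> int:
    """retention.bytes applies per partition, so divide the topic-wide budget."""
    return total_topic_bytes // num_partitions

# Hypothetical figures: a 10 GiB budget for the whole topic, spread over 4 partitions.
total_bytes = 10 * 1024**3   # 10 GiB
partitions = 4               # assumption for illustration
print(retention_bytes_per_partition(total_bytes, partitions))  # 2684354560
```

You would then set retention.bytes on the topic to that per-partition value.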
A topic partition is stored in segments, and only the latest segment, called the active segment, is written to. If the topic cleanup policy is set to delete, Kafka cleans up only inactive segments, either because they contain records older than the configured retention time, or because the partition has exceeded its maximum size.

In your case the per-partition retention size is 10 GB and the retention time is 1 hour. With a produce rate of 11 GB per hour, the size-based cleanup rule triggers before the time-based one: the oldest inactive segments, which contain the oldest messages, are deleted even if they were produced within the last hour. Inactive segments are deleted until the partition size drops below the retention.bytes value, or until no inactive segments remain.

Segment size and message age can also be controlled with the segment.bytes and segment.ms settings; a new segment is created when the active segment goes beyond either of these. Be aware that creating more files to track can cause other performance issues, because of CPU, RAM, and file system limitations, so keep monitoring to get a feel for your cluster's performance.

Producers should not notice these cleanups unless the broker has to create and clean segments too often, in which case there can be delays or producer errors.

I hope this answers your question.

Kind regards,

Richard Bosch
Developer Advocate
Axual BV
https://axual.com/

On Thu, Nov 9, 2023 at 3:00 PM Yeikel Santana <em...@yeikel.com> wrote:

> Hi all,
>
> This might be a common question, but unfortunately, I couldn't find a
> reliable answer or documentation to guide me. There are various conflicting
> ideas.
>
> If a producer tries to ingest at a faster rate than the configuration set
> in the topic, what will happen?
>
> Example:
>
> - Topic size: 10 GB
> - Retention Period: 1h
> - Producer rate: 11 GB/h
>
> Will Kafka:
>
> - Aggressively delete older messages even if the retention period is
> greater than the age of the message?
> - Reject messages from the producer until there is room for new messages?
> - Potentially delete newer or older messages to make room?
> - Any other type of data handling?
>
> Thanks!