Hi, my name is Pere. I am a software engineer turned consultant for the last few years :-/ but I still work with Kafka in much of my day-to-day.
I know it is not a sexy issue, but over the last few days I worked with customers who again hit https://issues.apache.org/jira/browse/KAFKA-4169 when producing with compression, getting a few uncomfortable RecordTooLargeExceptions. While I see and agree with the community that this issue is not that important, it keeps annoying beginners, so I thought I would try to do something about it.

I updated the issue with some comments of mine, but in a nutshell, unless I'm missing something (which certainly could be the case), I think the max.request.size config and the related check are wrong and problematic, as Jorge already mentioned in the ticket. In my opinion, the check mixes requests (batches) with single messages, and it does not take compression into account. I can understand why people get confused; so do I from time to time.

Before starting a KIP and writing a ton, I wanted to drop a message here and see how you feel about the possible solutions I have thought of. To my understanding, all of them might need a KIP, but please let me know if I got that wrong.

* As Jorge proposed in the ticket, use the CompressionRateEstimator to complement the existing check and make it more accurate for single messages than it is right now, keeping in mind that it will be working with an estimate. To my mind this will require adding a new configuration for the single message size, won't it?

* Completely remove the fail-fast check mentioned earlier and currently in place, and introduce a new check in the tryAppend methods within the RecordAccumulator. The check in this case would use, as in option 1, the CompressionRateEstimator. In this case I see the current variable (max.request.size) as being more accurate, as we would be doing request-level (ProducerBatch-level) checks.

What do you think?

Looking forward to helping out,

--
Pere Urbon-Bayes
Software Architect
https://twitter.com/purbon
https://www.linkedin.com/in/purbon/
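
For illustration only, the batch-level check described in the second option could look roughly like the sketch below. This is not Kafka code: the class and method names are hypothetical, and the compression ratio is passed in as a plain number instead of coming from the producer's estimator. The only value taken from Kafka is the 1048576-byte default of max.request.size.

```java
// Hypothetical sketch, not part of the Kafka code base.
public class BatchSizeCheckSketch {

    // Estimated size of a record after compression, given an estimated
    // ratio (compressed size / uncompressed size, e.g. 0.5).
    static long estimateCompressedSize(long uncompressedBytes, double estimatedRatio) {
        return (long) Math.ceil(uncompressedBytes * estimatedRatio);
    }

    // Batch-level check: does the record, at its estimated compressed size,
    // still fit into the batch without exceeding the request limit?
    static boolean fitsInBatch(long batchBytesSoFar, long recordBytes,
                               double estimatedRatio, long maxRequestSize) {
        long estimated = estimateCompressedSize(recordBytes, estimatedRatio);
        return batchBytesSoFar + estimated <= maxRequestSize;
    }

    public static void main(String[] args) {
        long maxRequestSize = 1_048_576L; // default max.request.size
        long recordBytes = 2_000_000L;    // uncompressed record size (made-up value)

        // With a ratio of 1.0 (no compression assumed) the record is rejected.
        System.out.println(fitsInBatch(0, recordBytes, 1.0, maxRequestSize));
        // With an assumed ratio of 0.4 the same record is estimated to fit.
        System.out.println(fitsInBatch(0, recordBytes, 0.4, maxRequestSize));
    }
}
```

The point of the sketch is that the limit is compared against the estimated compressed size accumulated in the batch, not against the uncompressed size of a single record. Since the ratio is only an estimate, a real implementation would still need to handle batches that turn out larger than predicted.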