Hi,
  my name is Pere, a Software Engineer turned Consultant for the last few
years :-/, but I still work with Kafka quite a lot in my day-to-day.

I know this is not a sexy issue, but over the last few days I worked with
customers who again hit https://issues.apache.org/jira/browse/KAFKA-4169
when producing with compression, getting a few uncomfortable
RecordTooLargeExceptions.

While I see and agree with the community that this issue is not that
important, it keeps annoying newcomers, so I thought I would try to do
something about it.

I updated the issue with some comments of mine, but in a nutshell, unless
I'm missing something (which certainly could be the case), I think the
max.request.size config and the related check are wrong and problematic, as
Jorge already mentioned in the ticket. In my opinion, the check mixes
requests (batches) with single messages, and it does not take compression
into account. I can understand why people get confused, and so do I from
time to time.
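
To make the confusion concrete, here is a minimal sketch of the situation as
I read the ticket: a size check applied to the uncompressed bytes rejects a
record that would easily fit once compressed. This is not the producer code.
The class and method names are made up for illustration, gzip from the JDK
stands in for whatever compression.type is configured, and 1 MiB is just an
illustrative limit.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.zip.GZIPOutputStream;

// Hypothetical illustration only; none of these names exist in Kafka.
class SizeCheckSketch {

    // Stand-in for a fail-fast check that looks at the uncompressed size.
    static boolean passesUncompressedCheck(byte[] value, int maxRequestSize) {
        return value.length <= maxRequestSize;
    }

    // Size of the value after gzip compression (JDK gzip used as a stand-in).
    static int gzipSize(byte[] value) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(out)) {
            gzip.write(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.size();
    }

    public static void main(String[] args) {
        int maxRequestSize = 1024 * 1024;          // illustrative limit
        byte[] value = new byte[2 * 1024 * 1024];  // 2 MiB, highly compressible
        Arrays.fill(value, (byte) 'a');

        // The uncompressed check rejects the record...
        System.out.println("uncompressed check passes: "
                + passesUncompressedCheck(value, maxRequestSize)); // false
        // ...although the compressed payload is well under the limit.
        System.out.println("compressed size fits: "
                + (gzipSize(value) <= maxRequestSize));            // true
    }
}
```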

Before starting a KIP and writing a ton, I wanted to drop a message here
and see how you feel about the possible solutions I have thought of. To my
understanding, all of them might need a KIP, but please let me know if I
got that wrong.

* As Jorge proposed in the ticket, use the CompressionRatioEstimator to
complement the existing check and make it more accurate for single messages
than it is right now, keeping in mind that it will be working with an
estimate. In my mind, this will require adding a new configuration for the
single message size, won't it?

* Completely remove the fail-fast check mentioned earlier, which is
currently in place, and introduce a new check in the tryAppend methods of
the RecordAccumulator. As in option 1, the check would use the
CompressionRatioEstimator. In this case, I see the current variable as
being more accurate, since we would be doing request-level
(ProducerBatch-level) checks.
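
A rough sketch of how the two options differ, under stated assumptions: the
ratio is taken to mean compressed size divided by uncompressed size (so 1.0
means no compression), and every name below, including the single-message
limit, is hypothetical and not an existing Kafka API or config.

```java
// Hypothetical illustration of both options; not Kafka code.
class EstimatedSizeCheckSketch {

    // Estimated compressed size, assuming ratio = compressed / uncompressed.
    static int estimatedSize(int uncompressedBytes, float estimatedRatio) {
        return (int) Math.ceil(uncompressedBytes * (double) estimatedRatio);
    }

    // Option 1: per-record check against a (hypothetical) single-message limit.
    static boolean recordFits(int uncompressedBytes, float estimatedRatio,
                              int maxMessageBytes) {
        return estimatedSize(uncompressedBytes, estimatedRatio) <= maxMessageBytes;
    }

    // Option 2: batch-level check at append time against the request limit.
    static boolean batchHasRoomFor(int batchEstimatedBytes, int recordUncompressedBytes,
                                   float estimatedRatio, int maxRequestSize) {
        long total = (long) batchEstimatedBytes
                + estimatedSize(recordUncompressedBytes, estimatedRatio);
        return total <= maxRequestSize;
    }

    public static void main(String[] args) {
        int limit = 1024 * 1024; // illustrative limit
        // A 2,000,000-byte record estimated to compress to a quarter of its size.
        System.out.println(recordFits(2_000_000, 0.25f, limit));               // true
        // The same record without compression would be rejected.
        System.out.println(recordFits(2_000_000, 1.0f, limit));                // false
        // Batch-level: the record fits only if the batch has enough room left.
        System.out.println(batchHasRoomFor(100_000, 2_000_000, 0.25f, limit)); // true
        System.out.println(batchHasRoomFor(900_000, 2_000_000, 0.25f, limit)); // false
    }
}
```

Either way the result is only as good as the estimate, so a record that
compresses worse than expected could still be rejected later by the broker.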

What do you think?

Looking forward to helping out,


-- 
Pere Urbon-Bayes
Software Architect
https://twitter.com/purbon
https://www.linkedin.com/in/purbon/
