Hi Sergio,

I wonder if "max-bytes" would be a better name than "max-batches-size".
The intent is more explicit. What do you think?

Best,
David

On Sat, Mar 5, 2022 at 10:36, Luke Chen <show...@gmail.com> wrote:

> Hi Sergio,
>
> Thanks for the explanation! Very clear!
> I think we should put this example and explanation into the KIP.
>
> Other comments:
> 1. If *max-batches-size* is so small that no records are output, will we
> show any information to the user?
> 2. After your explanation, I guess the use of *max-batches-size* won't
> conflict with *max-message-size*, right?
> That is, the user can set both arguments at the same time. Is that correct?
>
> Thank you.
> Luke
>
> On Sat, Mar 5, 2022 at 4:47 PM Sergio Daniel Troiano
> <sergio.troi...@adevinta.com.invalid> wrote:
>
> > Hey Luke,
> >
> > Thanks for your interest. It is a good question; please let me explain.
> >
> > *max-message-size* is a filter on the size of each batch. For example, if
> > I set --max-message-size to 1000 bytes and my segment log has 300 batches,
> > 150 of which have a size of 500 bytes and the other 150 a size of 2000
> > bytes, then the script will skip the last 150, as each of those batches is
> > larger than the limit.
> >
> > On the other hand, following the same example with *max-batches-size* set
> > to 1000 bytes, it will only print out the first 2 batches (500 bytes each)
> > and stop. This avoids reading the whole file.
> >
> > Also, with *max-message-size*, if all of the batches are smaller than 1000
> > bytes, it will end up printing out all the batches.
> > The idea of my change is to limit the *amount* of batches no matter their
> > size.
> >
> > I hope this reply helps.
> > Best regards.
> >
> > On Sat, 5 Mar 2022 at 08:00, Luke Chen <show...@gmail.com> wrote:
> >
> > > Hi Sergio,
> > >
> > > Thanks for the KIP!
> > >
> > > One question:
> > > I saw there's a `max-message-size` argument that seems to do the same
> > > thing as you want.
> > > Could you help explain the difference between `max-message-size` and
> > > `max-batches-size`?
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Sat, Mar 5, 2022 at 3:21 AM Kirk True <k...@mustardgrain.com> wrote:
> > >
> > > > Hi Sergio,
> > > >
> > > > Thanks for the KIP. I don't know anything about the log segment
> > > > internals, but the logic and implementation seem sound.
> > > >
> > > > Three questions:
> > > >  1. Since the --max-batches-size unit is bytes, does it matter if that
> > > > size doesn't align to a record boundary?
> > > >  2. Can you add a check to make sure that --max-batches-size doesn't
> > > > allow the user to pass in a negative number?
> > > >  3. Can you add/update any unit tests related to the DumpLogSegments
> > > > arguments?
> > > >
> > > > Thanks,
> > > > Kirk
> > > >
> > > > On Thu, Mar 3, 2022, at 1:32 PM, Sergio Daniel Troiano wrote:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-824%3A+Allowing+dumping+segmentlogs+limiting+the+batches+in+the+output
> > > > >
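
For reference, the difference between the two options described in Sergio's
reply can be sketched roughly as below. This is only an illustration in
Python, not the DumpLogSegments code: the function names are hypothetical,
the 500-byte batches are assumed to come first in the segment, and stopping
before a batch that would exceed the limit is an assumption (the exact
boundary behaviour is what Kirk's first question asks about).

```python
from typing import List


def filter_by_max_message_size(batch_sizes: List[int], limit: int) -> List[int]:
    """Per-batch filter: skip batches larger than `limit`, scanning everything."""
    return [size for size in batch_sizes if size <= limit]


def take_until_max_batches_size(batch_sizes: List[int], limit: int) -> List[int]:
    """Cumulative cap: emit batches from the start, stop once `limit` bytes are used."""
    printed: List[int] = []
    total = 0
    for size in batch_sizes:
        # Assumption: a batch that would push the total past the limit is not printed.
        if total + size > limit:
            break
        printed.append(size)
        total += size
    return printed


# Example from the thread: 300 batches, 150 of 500 bytes and 150 of 2000 bytes.
segment = [500] * 150 + [2000] * 150

kept = filter_by_max_message_size(segment, 1000)
first = take_until_max_batches_size(segment, 1000)
print(len(kept), first)
```

With these assumptions, the per-batch filter keeps the 150 small batches
after scanning the whole segment, while the cumulative cap stops after the
first two batches.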
