Hi Sergio,

I wonder if « max-bytes » would be a better name than « max-batches-size ». The intent is more explicit. What do you think?

Best,
David

On Sat, Mar 5, 2022 at 10:36, Luke Chen <show...@gmail.com> wrote:

> Hi Sergio,
>
> Thanks for the explanation! Very clear!
> I think we should put this example and explanation into the KIP.
>
> Other comments:
> 1. If the *max-batches-size* is so small that no records are
> output, will we output any information to the user?
> 2. After your explanation, I guess the use of *max-batches-size* won't
> conflict with *max-message-size*, right?
> That is, the user can set the 2 arguments at the same time. Is that correct?
>
> Thank you.
> Luke
>
> On Sat, Mar 5, 2022 at 4:47 PM Sergio Daniel Troiano
> <sergio.troi...@adevinta.com.invalid> wrote:
>
> > hey Luke,
> >
> > thanks for the interest, it is a good question, please let me explain:
> >
> > *max-message-size* is a filter for the size of each batch, so for example if
> > I set --max-message-size 1000 bytes and my segment log has 300 batches, 150
> > of them have a size of 500 bytes and the other 150 have a size of 2000 bytes,
> > then the script will skip the last 150 ones as each batch is heavier than
> > the limit.
> >
> > On the other hand, following the same example above with *max-batches-size*
> > set to 1000 bytes, it will only print out the first 2 batches (500 bytes each)
> > and stop. This will avoid reading the whole file.
> >
> > Also if all of them are smaller than 1000 bytes it will end up printing out
> > all the batches.
> > The idea of my change is to limit the *amount* of batches no matter their
> > size.
> >
> > I hope this reply helps.
> > Best regards.
> >
> > On Sat, 5 Mar 2022 at 08:00, Luke Chen <show...@gmail.com> wrote:
> >
> > > Hi Sergio,
> > >
> > > Thanks for the KIP!
> > >
> > > One question:
> > > I saw there's a `max-message-size` argument that seems to do the same thing
> > > as you want.
> > > Could you help explain what's the difference between `max-message-size` and
> > > `max-batches-size`?
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Sat, Mar 5, 2022 at 3:21 AM Kirk True <k...@mustardgrain.com> wrote:
> > >
> > > > Hi Sergio,
> > > >
> > > > Thanks for the KIP. I don't know anything about the log segment internals,
> > > > but the logic and implementation seem sound.
> > > >
> > > > Three questions:
> > > > 1. Since the --max-batches-size unit is bytes, does it matter if that
> > > > size doesn't align to a record boundary?
> > > > 2. Can you add a check to make sure that --max-batches-size doesn't allow
> > > > the user to pass in a negative number?
> > > > 3. Can you add/update any unit tests related to the DumpLogSegments
> > > > arguments?
> > > >
> > > > Thanks,
> > > > Kirk
> > > >
> > > > On Thu, Mar 3, 2022, at 1:32 PM, Sergio Daniel Troiano wrote:
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-824%3A+Allowing+dumping+segmentlogs+limiting+the+batches+in+the+output
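
For readers following the thread, here is a small Python sketch of the two behaviours as Sergio describes them: a per-batch size filter versus a cumulative byte budget that stops the read early. It is only a model of his example, not the DumpLogSegments implementation. The function names are hypothetical. The sketch also assumes that the 500-byte batches come first in the segment, that reading stops before a batch that would push the total over the budget (Kirk's boundary question is not answered in the thread), and that a negative limit is rejected (the thread only asks for such a check).

```python
from typing import List


def filter_by_max_message_size(batch_sizes: List[int], max_message_size: int) -> List[int]:
    """Per-batch filter: skip every batch larger than the limit, keep scanning."""
    return [size for size in batch_sizes if size <= max_message_size]


def take_until_max_batches_size(batch_sizes: List[int], max_batches_size: int) -> List[int]:
    """Cumulative budget: stop at the first batch that would exceed the total."""
    if max_batches_size < 0:
        # Hypothetical validation, modelled on the check requested in the thread.
        raise ValueError("max_batches_size must not be negative")
    printed: List[int] = []
    total = 0
    for size in batch_sizes:
        if total + size > max_batches_size:
            break  # stop here, so the rest of the segment is never read
        printed.append(size)
        total += size
    return printed


# Sergio's example: 300 batches, 150 of 500 bytes and 150 of 2000 bytes.
# Assumption: the small batches come first in the segment.
segment = [500] * 150 + [2000] * 150

print(len(filter_by_max_message_size(segment, 1000)))  # 150
print(take_until_max_batches_size(segment, 1000))      # [500, 500]
```

In this model a budget smaller than the first batch returns an empty list, which is the situation Luke's first question is about.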