Hi Ismael and all devs, Is there any comments/suggestions to this KIP? If no, I'm going to update the KIP based on my previous mail, and start a vote tomorrow or next week.
Thank you. Luke On Mon, Oct 18, 2021 at 2:40 PM Luke Chen <show...@gmail.com> wrote: > Hi Ismael, > Thanks for your comments. > > 1. Why do we have to reallocate the buffer? We can keep a list of buffers > instead and avoid reallocation. > -> Do you mean we allocate multiple buffers with "buffer.initial.size", > and link them together (with linked list)? > ex: > a. We allocate 4KB initial buffer > | 4KB | > > b. when new records reached and the remaining buffer is not enough for the > records, we create another batch with "batch.initial.size" buffer > ex: we already have 3KB of data in the 1st buffer, and here comes the 2KB > record > > | 4KB (1KB remaining) | > now, record: 2KB coming > We fill the 1st 1KB into 1st buffer, and create new buffer, and linked > together, and fill the rest of data into it > | 4KB (full) | ---> | 4KB (3KB remaining) | > > Is that what you mean? > If so, I think I like this idea! > If not, please explain more detail about it. > Thank you. > > 2. I think we should also consider tweaking the semantics of batch.size so > that the sent batches can be larger if the batch is not ready to be sent > (while still respecting max.request.size and perhaps a new max.batch.size). > > --> In the KIP, I was trying to make the "batch.size" as the upper bound > of the batch size, and introduce a "batch.initial.size" as initial batch > size. > So are you saying that we can let "batch.size" as initial batch size and > introduce a "max.batch.size" as upper bound value? > That's a good suggestion, but that would change the semantics of > "batch.size", which might surprise some users. I think my original proposal > ("batch.initial.size") is safer for users. What do you think? > > Thank you. > Luke > > > On Mon, Oct 18, 2021 at 3:12 AM Ismael Juma <ism...@juma.me.uk> wrote: > >> I think we should also consider tweaking the semantics of batch.size so >> that the sent batches can be larger if the batch is not ready to be sent >> (while still respecting max.request.size and perhaps a new >> max.batch.size). >> >> Ismael >> >> On Sun, Oct 17, 2021, 12:08 PM Ismael Juma <ism...@juma.me.uk> wrote: >> >> > Hi Luke, >> > >> > Thanks for the KIP. Why do we have to reallocate the buffer? We can >> keep a >> > list of buffers instead and avoid reallocation. >> > >> > Ismael >> > >> > On Sun, Oct 17, 2021, 2:02 AM Luke Chen <show...@gmail.com> wrote: >> > >> >> Hi Kafka dev, >> >> I'd like to start the discussion for the proposal: KIP-782: Expandable >> >> batch size in producer. >> >> >> >> The main purpose for this KIP is to have better memory usage in >> producer, >> >> and also save users from the dilemma while setting the batch size >> >> configuration. After this KIP, users can set a higher batch.size >> without >> >> worries, and of course, with an appropriate "batch.initial.size" and >> >> "batch.reallocation.factor". >> >> >> >> Derailed description can be found here: >> >> >> >> >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-782%3A+Expandable+batch+size+in+producer >> >> >> >> Any comments and feedback are welcome. >> >> >> >> Thank you. >> >> Luke >> >> >> > >> >