Hi all,

Seeing that we got all out votes needed with 3 binding votes and 0
nonbinding.
I consider this KIP passed.

Cheers,
Richard

On Fri, Oct 18, 2019 at 9:17 AM Guozhang Wang <wangg...@gmail.com> wrote:

> Thanks Richard, I'm +1 on the KIP
>
> On Thu, Oct 17, 2019 at 3:51 PM Richard Yu <yohan.richard...@gmail.com>
> wrote:
>
> > Hi Guozhang, Jason,
> >
> > I've updated the KIP to include this warning as well.
> > If there is anything else that we need for it, let me know. :)
> > Otherwise, we should vote this KIP in.
> >
> > Cheers,
> > Richard
> >
> > On Thu, Oct 17, 2019 at 10:41 AM Jason Gustafson <ja...@confluent.io>
> > wrote:
> >
> > > Hi Guozhang,
> > >
> > > It's a fair point. For control records, I think it's a non-issue since
> > they
> > > are tiny and not batched. So the only case where this might matter is
> > large
> > > batch deletions. I think the size difference is not a major issue
> itself,
> > > but I think it's worth mentioning in the KIP the risk of exceeding the
> > max
> > > message size. I think the code should probably make this more of a soft
> > > limit when cleaning. We have run into scenarios in the past as well
> where
> > > recompression has actually increased message size. We may also want to
> be
> > > able to upconvert messages to the new format in the future in the
> > cleaner.
> > >
> > > -Jason
> > >
> > >
> > >
> > > On Thu, Oct 17, 2019 at 9:08 AM Guozhang Wang <wangg...@gmail.com>
> > wrote:
> > >
> > > > Here's my understanding: when log compaction kicks in, the system
> time
> > at
> > > > the moment would be larger than the message timestamp to be
> compacted,
> > so
> > > > the modification on the batch timestamp would practically be
> increasing
> > > its
> > > > value, and hence the deltas for each inner message would be negative
> to
> > > > maintain their actual timestamp. Depending on the time diff between
> the
> > > > actual timestamp of the message and the time when log compaction
> > happens,
> > > > this negative delta can be large or small since it not long depends
> on
> > > the
> > > > cleaner thread wakeup frequency but also dirty ratio etc.
> > > >
> > > > With varInt encoding, the num.bytes needed for encode an int varies
> > from
> > > 1
> > > > to 5 bytes; before compaction, the deltas should be relatively small
> > > > positive values compared with the base timestamp, and hence most
> > likely 1
> > > > or 2 bytes needed to encode, after compaction, the deltas could be
> > > > relatively large negative values that may take more bytes to encode.
> > > With a
> > > > record batch of 512 in practice, and suppose after compaction each
> > record
> > > > would take 2 more byte for encoding deltas, that would be 1K more per
> > > > batch. Usually it would not be too big of an issue with reasonable
> > sized
> > > > message, but I just wanted to point out this as a potential
> regression.
> > > >
> > > >
> > > > Guozhang
> > > >
> > > > On Wed, Oct 16, 2019 at 9:36 PM Richard Yu <
> yohan.richard...@gmail.com
> > >
> > > > wrote:
> > > >
> > > > > Hi Guozhang,
> > > > >
> > > > > Your understanding basically is on point.
> > > > >
> > > > > I haven't looked into the details for what happens if we change the
> > > base
> > > > > timestamp and how its calculated, so I'm not clear on how small the
> > > delta
> > > > > (or big) is.
> > > > > To be fair, would the the delta size pose a big problem if it takes
> > up
> > > > more
> > > > > bytes to encode?
> > > > >
> > > > > Cheers,
> > > > > Richard
> > > > >
> > > > > On Wed, Oct 16, 2019 at 7:36 PM Guozhang Wang <wangg...@gmail.com>
> > > > wrote:
> > > > >
> > > > > > Hello Richard,
> > > > > >
> > > > > > Thanks for the KIP, I just have one clarification regarding "So
> the
> > > > idea
> > > > > is
> > > > > > to set the base timestamp to the delete horizon and adjust the
> > deltas
> > > > > > accordingly." My understanding is that during compaction, for
> each
> > > > > > compacted new segment, we would set its base offset of each batch
> > as
> > > > the
> > > > > > delete horizon, which is the "current system time that cleaner
> has
> > > seen
> > > > > so
> > > > > > far", and adjust the delta timestamps of each of the inner
> records
> > of
> > > > the
> > > > > > batch (and practically the deltas will be all negative)?
> > > > > >
> > > > > > If that's case, could we do some back of the envelope calculation
> > on
> > > > > what's
> > > > > > the possible smallest case of deltas? Note that since we use
> varInt
> > > for
> > > > > > delta values for each record, the smaller the negative delta,
> that
> > > > would
> > > > > > take more bytes to encode.
> > > > > >
> > > > > > Guozhang
> > > > > >
> > > > > > On Wed, Oct 16, 2019 at 6:48 PM Jason Gustafson <
> > ja...@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > > > +1. Thanks Richard.
> > > > > > >
> > > > > > > On Wed, Oct 16, 2019 at 10:04 AM Richard Yu <
> > > > > yohan.richard...@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > Want to try to get this KIP wrapped up. So it would be great
> if
> > > we
> > > > > can
> > > > > > > get
> > > > > > > > some votes.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Richard
> > > > > > > >
> > > > > > > > On Tue, Oct 15, 2019 at 12:58 PM Jun Rao <j...@confluent.io>
> > > wrote:
> > > > > > > >
> > > > > > > > > Hi, Richard,
> > > > > > > > >
> > > > > > > > > Thanks for the updated KIP. +1 from me.
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > > On Tue, Oct 15, 2019 at 12:46 PM Richard Yu <
> > > > > > > yohan.richard...@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi Jun,
> > > > > > > > > >
> > > > > > > > > > I've updated the link accordingly. :)
> > > > > > > > > > Here is the updated KIP link:
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-534%3A+Retain+tombstones+and+transaction+markers+for+approximately+delete.retention.ms+milliseconds
> > > > > > > > > >
> > > > > > > > > > Cheers,
> > > > > > > > > > Richard
> > > > > > > > > >
> > > > > > > > > > On Mon, Oct 14, 2019 at 5:12 PM Jun Rao <
> j...@confluent.io>
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Richard,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the KIP. Looks good to me overall. A few
> minor
> > > > > > comments
> > > > > > > > > below.
> > > > > > > > > > >
> > > > > > > > > > > 1. Could you change the title from "Retain tombstones"
> to
> > > > > "Retain
> > > > > > > > > > > tombstones and transaction markers" to make it more
> > > general?
> > > > > > > > > > >
> > > > > > > > > > > 2. Could you document which bit in the batch attribute
> > will
> > > > be
> > > > > > used
> > > > > > > > for
> > > > > > > > > > the
> > > > > > > > > > > new flag? The current format of the batch attribute is
> > the
> > > > > > > following.
> > > > > > > > > > >
> > > > > > > > > > > *
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> -------------------------------------------------------------------------------------------------
> > > > > > > > > > > *  | Unused (6-15) | Control (5) | Transactional (4) |
> > > > > Timestamp
> > > > > > > Type
> > > > > > > > > > > (3) | Compression Type (0-2) |
> > > > > > > > > > >
> > > > > > > > > > > *
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> -------------------------------------------------------------------------------------------------
> > > > > > > > > > >
> > > > > > > > > > > 3. Could you provide the reasons for the rejected
> > > proposals?
> > > > > For
> > > > > > > > > > > proposal 1, one reason is that it doesn't cover the
> > > > transaction
> > > > > > > > > > > markers. For proposal 2, one reason is that the
> interval
> > > > record
> > > > > > > > header
> > > > > > > > > > > could be exposed to the clients.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Oct 14, 2019 at 4:42 PM Richard Yu <
> > > > > > > > yohan.richard...@gmail.com
> > > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi all,
> > > > > > > > > > > >
> > > > > > > > > > > > The discussion for KIP-534 seems to have concluded.
> > > > > > > > > > > > So I wish to vote this in so that we can get it done.
> > > Its a
> > > > > > small
> > > > > > > > bug
> > > > > > > > > > > fix.
> > > > > > > > > > > > :)
> > > > > > > > > > > >
> > > > > > > > > > > > Below is the KIP link:
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-534%3A+Retain+tombstones+for+approximately+delete.retention.ms+milliseconds
> > > > > > > > > > > >
> > > > > > > > > > > > Cheers,
> > > > > > > > > > > > Richard
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > -- Guozhang
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
>
>
> --
> -- Guozhang
>

Reply via email to