Hey Luis,

Thanks for the explanation. I'd suggest adding the use case to the
motivation section.

I think my only hesitation about the header-based compaction is that it is
the first time we are putting a schema requirement on header values. I
wonder if it's better to keep Kafka agnostic about header contents. For
example, maybe the compaction strategy could be a plugin which allows
custom derivation of the compaction key. Say something like this:

class CompactionKey {
  byte[] key;   // the key records are de-duplicated on
  long version; // the higher version wins during compaction
}

interface CompactionStrategy {
  CompactionKey deriveCompactionKey(Record record);
}
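
To make this concrete, a user-side implementation might look roughly like
the following (just a sketch; the Record accessors and the "version" header
name here are illustrative assumptions, not an existing API):

class HeaderVersionStrategy implements CompactionStrategy {
  @Override
  public CompactionKey deriveCompactionKey(Record record) {
    CompactionKey ck = new CompactionKey();
    ck.key = record.key();
    // Illustrative: decode a user-defined "version" header as a big-endian
    // long, falling back to the record offset when the header is absent.
    byte[] version = record.header("version");
    ck.version = (version == null)
        ? record.offset()
        : java.nio.ByteBuffer.wrap(version).getLong();
    return ck;
  }
}

That way the encoding of the header stays entirely in user hands, and Kafka
only ever sees the derived (key, version) pair.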

The idea is to leave schemas in the hands of users. Have you considered
something like that?

Thanks,
Jason

On Sat, Aug 11, 2018 at 2:04 AM, Luís Cabral <luis_cab...@yahoo.com.invalid>
wrote:

> Hi Jason,
>
> The initial (and still only) requirement I wanted out of this KIP was to
> have the header strategy.
> This is because I want to be able to version either by date/time or by an
> (externally provided) sequence. This is especially important when running
> in multiple environments, where the well-known issue of slightly
> de-synchronized clocks can arise.
> The timestamp strategy was added to the KIP as a result of the
> discussions, where it was seen as a potential benefit for other users who
> may prefer it.
>
> Cheers,
> Luís
>
> From: Jason Gustafson
> Sent: 10 August 2018 22:38
> To: dev
> Subject: Re: [VOTE] KIP-280: Enhanced log compaction
>
> Hi Luis,
>
> It's still not very clear to me why we need the header-based strategy. Can
> you elaborate on why the timestamp-based approach alone is not sufficient?
> The motivation section just describes a "most recent snapshot" use case.
>
> Thanks,
> Jason
>
> On Thu, Aug 9, 2018 at 4:36 AM, Luís Cabral <luis_cab...@yahoo.com.invalid> wrote:
>
> > Hi,
> >
> >
> > So, after a "short" break, I've finally managed to find time to resume
> > this KIP. Sorry to all for the delay.
> >
> > Continuing the conversation about the configurations being global vs.
> > per-topic, I've checked this and it seems that they are only available
> > globally.
> >
> > This configuration is passed to the log cleaner via "CleanerConfig.scala",
> > which only accepts global configurations. This seems intentional, as the
> > log cleaner is not mutable and doesn't get instantiated that often. I
> > think that changing this to accept per-topic configuration would be very
> > nice, but perhaps as part of a different KIP.
> >
> >
> > Following the Kafka documentation, these are the settings I'm referring to:
> >
> > -- --
> >
> >     Updating Log Cleaner Configs
> >
> >     Log cleaner configs may be updated dynamically at the cluster-default
> >     level used by all brokers. The changes take effect on the next
> >     iteration of log cleaning. One or more of these configs may be updated:
> >
> >         * log.cleaner.threads
> >
> >         * log.cleaner.io.max.bytes.per.second
> >
> >         * log.cleaner.dedupe.buffer.size
> >
> >         * log.cleaner.io.buffer.size
> >
> >         * log.cleaner.io.buffer.load.factor
> >
> >         * log.cleaner.backoff.ms
> >
> > -- --
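> >
> > For reference, a dynamic update of one of these settings is applied with
> > the kafka-configs tool, along these lines (the broker address here is
> > illustrative):
> >
> >     bin/kafka-configs.sh --bootstrap-server localhost:9092 \
> >         --entity-type brokers --entity-default --alter \
> >         --add-config log.cleaner.threads=2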
> >
> >
> >
> > Please feel free to confirm; otherwise, I will update the KIP to reflect
> > these configuration nuances in the next few days.
> >
> >
> > Best Regards,
> >
> > Luis
> >
> >
> >
> > On Monday, July 9, 2018, 1:57:38 PM GMT+2, Andras Beni <andrasb...@cloudera.com.INVALID> wrote:
> >
> > Hi Luís,
> >
> > Can you please clarify how the header value has to be encoded when the
> > log compaction strategy is 'header'? As far as I can see, the current PR
> > reads a varLong in CleanerCache.extractVersion, but reads a String and
> > uses toLong in Cleaner.extractVersion, while the KIP says no more than
> > 'the header value (which must be of type "long")'.
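> >
> > (For illustration, those two readings would decode the same header value
> > along these lines; this is not the PR's actual code:
> >
> >     long v1 = ByteUtils.readVarlong(ByteBuffer.wrap(value));             // varLong
> >     long v2 = Long.parseLong(new String(value, StandardCharsets.UTF_8)); // String + toLong
> >
> > and the two clearly cannot both be correct for the same bytes.)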
> >
> > Otherwise +1 for the KIP
> >
> > As for the current implementation: it seems that in the Cleaner class
> > the header key "version" is hardwired.
> >
> > Andras
> >
> > On Fri, Jul 6, 2018 at 10:36 PM Jun Rao <j...@confluent.io> wrote:
> >
> > > Hi, Guozhang,
> > >
> > > For #4, what you suggested could make sense for timestamp-based de-dup,
> > > but I'm not sure how general it is, since the KIP also supports de-dup
> > > based on headers.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Fri, Jul 6, 2018 at 1:12 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> > >
> > > > Hello Jun,
> > > >
> > > > Thanks for your feedback. I'd agree on #3 that it's worth adding a
> > > > special check to not delete the last message, since, although unlikely,
> > > > it is still possible that a new active segment gets rolled out but
> > > > contains no data yet, and hence the actual last message would then be
> > > > in a "compact-able" segment.
> > > >
> > > > For the second part of #4 you raised, maybe we could educate users to
> > > > set "message.timestamp.difference.max.ms" to be no larger than
> > > > "log.cleaner.delete.retention.ms" (the former's default value is
> > > > Long.MAX_VALUE)? A more aggressive approach would be to change the
> > > > default value of the former to the value of the latter if:
> > > >
> > > > 1. cleanup.policy = compact OR compact,delete
> > > > 2. log.cleaner.compaction.strategy != offset
> > > >
> > > > Because in this case, does it really make sense to allow users to send
> > > > data older than "log.cleaner.delete.retention.ms"? WDYT?
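> > > >
> > > > As a sketch of what that suggested setup would look like (the values
> > > > here are purely illustrative):
> > > >
> > > >     cleanup.policy=compact
> > > >     log.cleaner.compaction.strategy=timestamp
> > > >     log.cleaner.delete.retention.ms=86400000
> > > >     message.timestamp.difference.max.ms=86400000
> > > >
> > > > i.e. the timestamp-difference bound is capped at the tombstone
> > > > retention, so no accepted record can be older than the retention
> > > > window.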
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Fri, Jul 6, 2018 at 11:51 AM, Jun Rao <j...@confluent.io> wrote:
> > > >
> > > > > Hi, Luis,
> > > > >
> > > > > 1. The cleaning policy is configurable at both the global and topic
> > > > > levels. The global one has the name log.cleanup.policy and the topic
> > > > > level has the name cleanup.policy, by just stripping the "log."
> > > > > prefix. We can probably do the same for the new configs.
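> > > > > (For instance, a hypothetical broker-level
> > > > > log.cleaner.compaction.strategy could surface at the topic level as
> > > > > compaction.strategy; the names here are illustrative.)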
> > > > >
> > > > > 2. Since this KIP may require an admin to configure a larger dedup
> > > > > buffer size, it would be useful to document this impact in the wiki
> > > > > and the release notes.
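> > > > > (To put a number on it, assuming the current map layout of a 16-byte
> > > > > key hash plus an 8-byte offset per entry: adding another 8-byte long
> > > > > grows an entry from 24 to 32 bytes, so a 128 MB dedupe buffer drops
> > > > > from roughly 5.6 million to about 4.2 million keys per cleaning pass.
> > > > > Figures are illustrative.)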
> > > > >
> > > > > 3. Yes, it's unlikely for the last message to be removed in the
> > > > > current implementation, since we never clean the active segment.
> > > > > However, in theory this can happen, so it would be useful to guard
> > > > > against it explicitly.
> > > > >
> > > > > 4. Just thought about another issue. We probably want to be a bit
> > > > > careful with key deletion. Currently, one can delete a key by sending
> > > > > a message with a delete tombstone (a null payload). To prevent a
> > > > > reader from missing a deletion if it's removed too quickly, we depend
> > > > > on the configuration log.cleaner.delete.retention.ms (defaults to 1
> > > > > day). The delete tombstone will only be physically removed from the
> > > > > log after that amount of time. The expectation is that a reader
> > > > > should finish reading to the end of the log within that configured
> > > > > time after consuming a message. With the new strategy, we have
> > > > > similar, but slightly different, problems. The first problem is that
> > > > > the delete tombstone may be delivered to a consumer earlier, in
> > > > > offset order, than an outdated record. In order for the consumer not
> > > > > to take the outdated record, the consumer should cache the deletion
> > > > > tombstone for some configured amount of time. We can probably
> > > > > piggyback this on log.cleaner.delete.retention.ms, but we need to
> > > > > document it. The second problem is that once the delete tombstone is
> > > > > physically removed from the log, how can we prevent outdated records
> > > > > from being added (otherwise, they will never be garbage collected)?
> > > > > Not sure what's the best way to do this. One possible way is to push
> > > > > this back to the application and require the user not to publish
> > > > > outdated records after log.cleaner.delete.retention.ms.
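> > > > >
> > > > > (A rough consumer-side sketch of the tombstone caching described
> > > > > above; all names here are hypothetical, and a real implementation
> > > > > would compare record versions instead of skipping every cached key:
> > > > >
> > > > >     Map<Bytes, Long> tombstoneSeenAt = new HashMap<>();
> > > > >     long retentionMs = 86_400_000L; // mirrors log.cleaner.delete.retention.ms
> > > > >     for (ConsumerRecord<Bytes, Bytes> rec : records) {
> > > > >         long now = System.currentTimeMillis();
> > > > >         tombstoneSeenAt.values().removeIf(t -> now - t > retentionMs);
> > > > >         if (rec.value() == null) {             // delete tombstone
> > > > >             tombstoneSeenAt.put(rec.key(), now);
> > > > >             store.delete(rec.key());
> > > > >         } else if (!tombstoneSeenAt.containsKey(rec.key())) {
> > > > >             store.put(rec.key(), rec.value()); // not recently deleted
> > > > >         }                                      // else: possibly outdated, skip
> > > > >     }
> > > > > )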
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Wed, Jul 4, 2018 at 11:11 AM, Luís Cabral <luis_cab...@yahoo.com.invalid> wrote:
> > > > >
> > > > > > Hi Jun,
> > > > > >
> > > > > > -: 1. I guess both new configurations will be at the topic level?
> > > > > >
> > > > > > They will exist in the global configuration, at the very least.
> > > > > > I would like to have them at the topic level as well, but there is
> > > > > > an inconsistency between the cleanup/compaction properties that
> > > > > > exist "only globally" vs "globally + per topic".
> > > > > > I haven't gotten around to investigating why, and whether that
> > > > > > reason would also impact the properties I'm suggesting. At first
> > > > > > glance they seem to belong with the properties that are "only
> > > > > > globally" configured, but Guozhang wrote earlier with a suggestion
> > > > > > of a compaction property that works for both (though I haven't had
> > > > > > the time to look into it yet, unfortunately).
> > > > > >
> > > > > > -: 2. Since the log cleaner now needs to keep both the offset and
> > > > > > another long (say timestamp) in the de-dup map, it reduces the
> > > > > > number of keys that we can keep in the map and thus may require
> > > > > > more rounds of cleaning. This is probably not a big issue, but it
> > > > > > will be useful to document this impact in the KIP.
> > > > > >
> > > > > > As a reader, I tend to prefer brief documentation on new features
> > > > > > (there tend to be too many of them for me to find the willpower to
> > > > > > read a 200-page essay about each one), so that influences me to
> > > > > > avoid writing about every micro-impact that may exist, and to
> > > > > > simply leave it implied (as you have just done).
> > > > > > But I also don't feel strongly enough about it to argue either way.
> > > > > > So, after reading my argument, if you still insist, I'll happily
> > > > > > add this there.
> > > > > >
> > > > > > -: 3. With the new cleaning strategy, we want to be a bit careful
> > > > > > with removing the last message in a partition (which is possible
> > > > > > now). We need to preserve the offset of the last message so that we
> > > > > > don't reuse the offset for a different message. One way is to
> > > > > > simply never remove the last message. Another way (suggested by
> > > > > > Jason) is to create an empty message batch.
> > > > > >
> > > > > > That is a good point, but isn't the last message always kept
> > > > > > regardless? In all of my tests with this approach, I have never
> > > > > > seen it being removed. This is not because I made it so while
> > > > > > changing the code; it was simply like this before, which made me
> > > > > > smile!
> > > > > > Given these results, I just *assumed* (oops) that these scenarios
> > > > > > represented the reality, so that compaction would only affect the
> > > > > > "tail", while the "head" remained untouched. Now that you say it's
> > > > > > possible that the last message actually gets overwritten somehow, I
> > > > > > guess a new bullet point will have to be added to the KIP for this
> > > > > > (after I've found the time to review the portion of the code that
> > > > > > enacts this behaviour).
> > > > > >
> > > > > > Kind Regards,
> > > > > > Luís Cabral
> > > > > >
> > > > > > From: Jun Rao
> > > > > > Sent: 03 July 2018 23:58
> > > > > > To: dev
> > > > > > Subject: Re: [VOTE] KIP-280: Enhanced log compaction
> > > > > >
> > > > > > Hi, Luis,
> > > > > >
> > > > > > Thanks for the KIP. Overall, this seems like a useful KIP. A few
> > > > > > comments below.
> > > > > >
> > > > > > 1. I guess both new configurations will be at the topic level?
> > > > > > 2. Since the log cleaner now needs to keep both the offset and
> > > > > > another long (say timestamp) in the de-dup map, it reduces the
> > > > > > number of keys that we can keep in the map and thus may require
> > > > > > more rounds of cleaning. This is probably not a big issue, but it
> > > > > > will be useful to document this impact in the KIP.
> > > > > > 3. With the new cleaning strategy, we want to be a bit careful with
> > > > > > removing the last message in a partition (which is possible now).
> > > > > > We need to preserve the offset of the last message so that we don't
> > > > > > reuse the offset for a different message. One way is to simply
> > > > > > never remove the last message. Another way (suggested by Jason) is
> > > > > > to create an empty message batch.
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Sat, Jun 9, 2018 at 12:39 AM, Luís Cabral <luis_cab...@yahoo.com.invalid> wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > Any takers on having a look at this KIP and voting on it?
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Luis
> > > > > > >
> > > >
> > > > --
> > > > -- Guozhang
