Hi,

@Guozhang & @Jun:
There were some leftover comments from when the topic was still fresh (you may
have to read the email chain to refresh your memory). Are these now clarified
for you?

@Jason:
Was the reason opaque to you? I think we can avoid adding something so simple 
to the description.
As for the suggestion, that is indeed a wonderful idea. I'd recommend that you
create a KIP and address the topic with the rest of the community.

Kind Regards,
Luis

From: Jason Gustafson
Sent: 14 August 2018 01:25
To: dev
Subject: Re: [VOTE] KIP-280: Enhanced log compaction

Hey Luis,

Thanks for the explanation. I'd suggest adding the use case to the
motivation section.

I think my only hesitation about the header-based compaction is that it is
the first time we are putting a schema requirement on header values. I
wonder if it's better to leave Kafka agnostic. For example, maybe the
compaction strategy could be a plugin which allows custom derivation of the
compaction key. Say something like this:

class CompactionKey {
  byte[] key;    // the de-duplication key (today this is just the record key)
  long version;  // higher version wins when the cleaner de-duplicates
}

interface CompactionStrategy {
  CompactionKey deriveCompactionKey(Record record);
}

The idea is to leave schemas in the hands of users. Have you considered
something like that?
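
Just to make that concrete, a user-supplied header-based strategy could then
look roughly like this under the interface sketched above (only a rough sketch
on my side: I'm assuming Record/Header are the usual org.apache.kafka.common
ones, ByteBuffer is java.nio.ByteBuffer, and the producer writes the version
as an 8-byte big-endian long into a "version" header):

class HeaderVersionStrategy implements CompactionStrategy {
  public CompactionKey deriveCompactionKey(Record record) {
    CompactionKey ck = new CompactionKey();
    // copy out the record key bytes
    ByteBuffer keyBuffer = record.key();
    ck.key = new byte[keyBuffer.remaining()];
    keyBuffer.get(ck.key);
    // the version encoding is entirely up to the user, not the broker
    for (Header header : record.headers()) {
      if ("version".equals(header.key()))
        ck.version = ByteBuffer.wrap(header.value()).getLong();
    }
    return ck;
  }
}

That way the broker only ever sees the derived key and long, and never needs
to understand how the version was encoded.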

Thanks,
Jason

On Sat, Aug 11, 2018 at 2:04 AM, Luís Cabral <luis_cab...@yahoo.com.invalid>
wrote:

> Hi Jason,
>
> The initial (and still only) requirement I wanted out of this KIP was to
> have the header strategy.
> This is because I want to be able to version either by date/time or by an
> (externally provided) sequence. This is especially important if you are
> running in multiple environments, which may cause the commonly known issue
> of the clocks being slightly out of sync.
> The timestamp strategy was added to the KIP as a result of the discussions,
> where it was seen as a potential benefit for other clients who may prefer it.
>
> Cheers,
> Luís
>
> From: Jason Gustafson
> Sent: 10 August 2018 22:38
> To: dev
> Subject: Re: [VOTE] KIP-280: Enhanced log compaction
>
> Hi Luis,
>
> It's still not very clear to me why we need the header-based strategy. Can
> you elaborate why having the timestamp-based approach alone is not
> sufficient? The use case in the motivation just describes a "most recent
> snapshot" use case.
>
> Thanks,
> Jason
>
> On Thu, Aug 9, 2018 at 4:36 AM, Luís Cabral <luis_cab...@yahoo.com.invalid
> >
> wrote:
>
> > Hi,
> >
> >
> > So, after a "short" break, I've finally managed to find time to resume
> > this KIP. Sorry to all for the delay.
> >
> > Continuing the conversation about the configurations being global vs per-topic,
> > I've checked this and it seems that they are only available globally.
> >
> > This configuration is passed to the log cleaner via "CleanerConfig.scala",
> > which only accepts global configurations. This seems intentional, as the
> > log cleaner is not mutable and doesn't get instantiated that often. I think
> > that changing this to accept per-topic configuration would be very nice,
> > but perhaps as a part of a different KIP.
> >
> >
> > Following the Kafka documentation, these are the settings I'm referring to:
> >
> > -- --
> >
> >     Updating Log Cleaner Configs
> >
> >     Log cleaner configs may be updated dynamically at cluster-default
> >     level used by all brokers. The changes take effect on the next
> >     iteration of log cleaning. One or more of these configs may be updated:
> >
> >         * log.cleaner.threads
> >
> >         * log.cleaner.io.max.bytes.per.second
> >
> >         * log.cleaner.dedupe.buffer.size
> >
> >         * log.cleaner.io.buffer.size
> >
> >         * log.cleaner.io.buffer.load.factor
> >
> >         * log.cleaner.backoff.ms
> >
> > -- --
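> >
> > For what it's worth, this kind of cluster-default update can also be done
> > programmatically. Just a sketch with the AdminClient (imports omitted; I'm
> > assuming an empty resource name addresses the cluster-default broker config,
> > and I'm simply reusing "log.cleaner.threads" from the list above):
> >
> > static void setDefaultCleanerThreads(String bootstrapServers) throws Exception {
> >     Properties props = new Properties();
> >     props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, bootstrapServers);
> >     try (AdminClient admin = AdminClient.create(props)) {
> >         // "" as the resource name targets the cluster-default broker config
> >         ConfigResource cluster = new ConfigResource(ConfigResource.Type.BROKER, "");
> >         Config update = new Config(Collections.singleton(
> >             new ConfigEntry("log.cleaner.threads", "2")));
> >         admin.alterConfigs(Collections.singletonMap(cluster, update)).all().get();
> >     }
> > }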
> >
> >
> >
> > Please feel free to confirm, otherwise I will update the KIP to reflect
> > these configuration nuances in the next few days.
> >
> >
> > Best Regards,
> >
> > Luis
> >
> >
> >
> > On Monday, July 9, 2018, 1:57:38 PM GMT+2, Andras Beni <
> > andrasb...@cloudera.com.INVALID> wrote:
> >
> >
> >
> >
> >
> > Hi Luís,
> >
> > Can you please clarify how the header value has to be encoded when the log
> > compaction strategy is 'header'? As I see it, the current PR reads a varLong
> > in CleanerCache.extractVersion, but reads a String and uses toLong in
> > Cleaner.extractVersion, while the KIP says no more than 'the header value
> > (which must be of type "long")'.
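> >
> > Just to illustrate the ambiguity from the producer side, the two readings
> > would require different client code. A sketch (imports omitted; the header
> > key "version" is taken from the current PR):
> >
> > ProducerRecord<String, String> record =
> >     new ProducerRecord<>("my-topic", "some-key", "some-value");
> >
> > // reading 1: the header value is a raw 8-byte big-endian long
> > record.headers().add("version",
> >     ByteBuffer.allocate(Long.BYTES).putLong(42L).array());
> >
> > // reading 2 (what Cleaner.extractVersion seems to do): a decimal string
> > // record.headers().add("version", "42".getBytes(StandardCharsets.UTF_8));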
> >
> > Otherwise +1 for the KIP
> >
> > As for the current implementation: it seems that in the Cleaner class the
> > header key "version" is hard-coded.
> >
> > Andras
> >
> >
> >
> > On Fri, Jul 6, 2018 at 10:36 PM Jun Rao <j...@confluent.io> wrote:
> >
> > > Hi, Guozhang,
> > >
> > > For #4, what you suggested could make sense for timestamp based de-dup,
> > but
> > > not sure how general it is since the KIP also supports de-dup based on
> > > header.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Fri, Jul 6, 2018 at 1:12 PM, Guozhang Wang <wangg...@gmail.com>
> > wrote:
> > >
> > > > Hello Jun,
> > > > Thanks for your feedbacks. I'd agree on #3 that it's worth adding a
> > > special
> > > > check to not delete the last message, since although unlikely, it is
> > > still
> > > > possible that a new active segment gets rolled out but contains no
> data
> > > > yet, and hence the actual last message in this case would be in a
> > > > "compact-able" segment.
> > > >
> > > > For the second part of #4 you raised, maybe we could educate users to set
> > > > "message.timestamp.difference.max.ms" to be no larger than
> > > > "log.cleaner.delete.retention.ms" (its default value is Long.MAX_VALUE)?
> > > > A more aggressive approach would be changing the default value of the
> > > > former to be the value of the latter if:
> > > >
> > > > 1. cleanup.policy = compact OR compact,delete
> > > > 2. log.cleaner.compaction.strategy != offset
> > > >
> > > > Because in this case I think it makes sense to really allow users to send
> > > > any data older than "log.cleaner.delete.retention.ms", WDYT?
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Fri, Jul 6, 2018 at 11:51 AM, Jun Rao <j...@confluent.io> wrote:
> > > >
> > > > > Hi, Luis,
> > > > >
> > > > > 1. The cleaning policy is configurable at both global and topic
> > level.
> > > > The
> > > > > global one has the name log.cleanup.policy and the topic level has
> > the
> > > > name
> > > > > cleanup.policy by just stripping the log prefix. We can probably do
> > the
> > > > > same for the new configs.
> > > > >
> > > > > 2. Since this KIP may require an admin to configure a larger dedup
> > > buffer
> > > > > size, it would be useful to document this impact in the wiki and
> the
> > > > > release notes.
> > > > >
> > > > > 3. Yes, it's unlikely for the last message to be removed in the
> > current
> > > > > implementation since we never clean the active segment. However, in
> > > > theory,
> > > > > this can happen. So it would be useful to guard this explicitly.
> > > > >
> > > > > 4. Just thought about another issue. We probably want to be a bit
> > > careful
> > > > > with key deletion. Currently, one can delete a key by sending a
> > message
> > > > > with a delete tombstone (a null payload). To prevent a reader from
> > > > missing
> > > > > a deletion if it's removed too quickly, we depend on a
> configuration
> > > > > log.cleaner.delete.retention.ms (defaults to 1 day). The delete
> > > > tombstone
> > > > > will only be physically removed from the log after that amount of
> > time.
> > > > The
> > > > > expectation is that a reader should finish reading to the end of
> the
> > > log
> > > > > after consuming a message within that configured time. With the new
> > > > > strategy, we have similar, but slightly different problems. The
> first
> > > > > problem is that the delete tombstone may be delivered earlier than
> an
> > > > > outdated record in offset order to a consumer. In order for the
> > > consumer
> > > > > not to take the outdated record, the consumer should cache the
> > deletion
> > > > > tombstone for some configured amount of time. We can probably
> > piggyback
> > > > this
> > > > > on log.cleaner.delete.retention.ms, but we need to document this.
> > The
> > > > > second problem is that once the delete tombstone is physically
> > removed
> > > > from
> > > > > the log, how can we prevent outdated records from being added
> (otherwise,
> > > they
> > > > > will never be garbage collected)? Not sure what's the best way to
> do
> > > > this.
> > > > > One possible way is to push this back to the application and
> require
> > > the
> > > > > user not to publish outdated records after
> > > log.cleaner.delete.retention.
> > > > ms
> > > > > .
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Wed, Jul 4, 2018 at 11:11 AM, Luís Cabral
> > > > <luis_cab...@yahoo.com.invalid
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Jun,
> > > > > >
> > > > > > -:  1. I guess both new configurations will be at the topic
> level?
> > > > > >
> > > > > > They will exist in the global configuration, at the very least.
> > > > > > I would like to have them on the topic level as well, but there
> is
> > an
> > > > > > inconsistency between the cleanup/compaction properties that
> exist
> > > > “only
> > > > > > globally” vs “globally + per topic”.
> > > > > > I haven’t gotten around to investigating why, and if that reason
> > > would
> > > > > > then also impact the properties I’m suggesting. At first glance
> > they
> > > > seem
> > > > > > to belong with the properties that are "only globally”
> configured,
> > > but
> > > > > > Guozhang has written earlier with a suggestion of a compaction
> > > property
> > > > > > that works for both (though I haven’t had the time to look into
> it
> > > yet,
> > > > > > unfortunately).
> > > > > >
> > > > > > -:  2. Since the log cleaner now needs to keep both the offset
> and
> > > > > another
> > > > > > long (say timestamp) in the de-dup map, it reduces the number of
> > keys
> > > > > that
> > > > > > we can keep in the map and thus may require more rounds of
> > cleaning.
> > > > This
> > > > > > is probably not a big issue, but it will be useful to document
> this
> > > > > impact
> > > > > > in the KIP.
> > > > > >
> > > > > > As a reader, I tend to prefer brief documentation on new features
> > > (they
> > > > > > tend to be too many for me to find the willpower to read a
> 200-page
> > > > essay
> > > > > > about each one), so that influences me to avoid writing about
> every
> > > > > > micro-impact that may exist, and simply leave it inferred (as you
> > > have
> > > > > just
> > > > > > done).
> > > > > > But I also don’t feel strongly enough about it to argue either
> way.
> > > So,
> > > > > > after reading my argument, if you still insist, I’ll happily add
> > this
> > > > > there.
> > > > > >
> > > > > > -: 3. With the new cleaning strategy, we want to be a bit careful
> > > with
> > > > > > removing the last message in a partition (which is possible now).
> > We
> > > > need
> > > > > > to preserve the offset of the last message so that we don't reuse
> > the
> > > > > > offset for a different message. One way is to simply never remove
> the
> > > last
> > > > > > message. Another way (suggested by Jason) is to create an empty
> > > message
> > > > > > batch.
> > > > > >
> > > > > > That is a good point, but isn’t the last message always kept
> > > > > > regardless? In all of my tests with this approach, never have I seen
> > > > > > it being removed. This is not because I made it so while changing the
> > > > > > code, it was simply like this before, which made me smile!
> > > > > > Given these results, I just *assumed* (oops) that these scenarios
> > > > > > represented the reality, so the compaction would only affect the
> > > > > > “tail”, while the “head” remained untouched. Now that you say it's
> > > > > > possible that the last message actually gets overwritten somehow, I
> > > > > > guess a new bullet point will have to be added to the KIP for this
> > > > > > (after I’ve found the time to review the portion of the code that
> > > > > > enacts this behaviour).
> > > > > >
> > > > > > Kind Regards,
> > > > > > Luís Cabral
> > > > > >
> > > > > > From: Jun Rao
> > > > > > Sent: 03 July 2018 23:58
> > > > > > To: dev
> > > > > > Subject: Re: [VOTE] KIP-280: Enhanced log compaction
> > > > > >
> > > > > > Hi, Luis,
> > > > > >
> > > > > > Thanks for the KIP. Overall, this seems a useful KIP. A few
> > comments
> > > > > below.
> > > > > >
> > > > > > 1. I guess both new configurations will be at the topic level?
> > > > > > 2. Since the log cleaner now needs to keep both the offset and
> > > another
> > > > > long
> > > > > > (say timestamp) in the de-dup map, it reduces the number of keys
> > that
> > > > we
> > > > > > can keep in the map and thus may require more rounds of cleaning.
> > > This
> > > > is
> > > > > > probably not a big issue, but it will be useful to document this
> > > impact
> > > > > in
> > > > > > the KIP.
> > > > > > 3. With the new cleaning strategy, we want to be a bit careful
> with
> > > > > > removing the last message in a partition (which is possible now).
> > We
> > > > need
> > > > > > to preserve the offset of the last message so that we don't reuse
> > the
> > > > > > offset for a different message. One way is to simply never remove
> the
> > > last
> > > > > > message. Another way (suggested by Jason) is to create an empty
> > > message
> > > > > > batch.
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Sat, Jun 9, 2018 at 12:39 AM, Luís Cabral
> > > > > <luis_cab...@yahoo.com.invalid
> > > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > Any takers on having a look at this KIP and voting on it?
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Luis
> > > > > > >
> > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
>
>
