RE: [DISCUSS] KIP-280: Enhanced log compaction

Luís Cabral Tue, 10 Apr 2018 01:23:01 -0700

Hi Guozhang,

Thank you for the feedback!



bq. I'm also in favor of making the `timestamp` a preserved config value along 
with `offset`, for which we would not go into the headers to look for the 
matching key, but directly look into the timestamp field of the message.

Do you mean using an automatically filled timestamp, or a timestamp set by the 
record producer client?
Conceptually, I would also prefer the record version to be on the same level as 
“key”. Sadly, however, that means a huge non-backwards compatible change in the 
API, as the stream between the client and the server is done via 
serialization/deserialization -- we would be moving the byte positions in the 
new release of Kafka, and possibly corrupt pre-existing topics.


bq. About the `long` conversion from `byte[]` values, why couldn't we just 
compare on the byte arrays directly, than enforcing it to be encoded as a long 
type, then comparing on two longs?

Sadly, byte arrays are not comparable. You can use them to check for equality, 
but they don’t give you any representation of being greater or lesser than 
another byte array until they are converted into an actually comparable value.


Cheers


From: Guozhang Wang
Sent: 09 April 2018 22:19
To: [email protected]
Cc: Konstantin Chukhlomin
Subject: Re: [DISCUSS] KIP-280: Enhanced log compaction

Thanks for the KIP.

1) I'm also in favor of making the `timestamp` a preserved config value
along with `offset`, for which we would not go into the headers to look for
the matching key, but directly look into the timestamp field of the message.

2) About the `long` conversion from `byte[]` values, why couldn't we just
compare on the byte arrays directly, than enforcing it to be encoded as a
long type, then comparing on two longs? If we directly compare the bytes,
we can 2.a) allow flexible types as the compaction keys, 2.b) even for long
typed compaction key, comparing their encoded bytes directly should still
work for positive values (for negative timestamp support we already have a
KIP here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-
228+Negative+record+timestamp+support, cc'ing the proposer as well, in
which the encoded bytes could still be correctly compared lexichronically).


Guozhang


On Mon, Apr 9, 2018 at 11:34 AM, Matthias J. Sax <[email protected]>
wrote:

> Thanks for clarification.
>
> I understand the config name now. Makes sense.
>
>
> > bq. It might also be good, to elaborate why you suggest "long" for the
> compaction value is the KIP itself.
> >
> > I would prefer that the definition simply be "long", to keep the code
> contract cleaner, and leave the clients to infer the usages according to
> their scenarios.
> > But if its important, then I can add it (let me know).
>
> I think adding couple of examples would be good enough.
>
>
> > bq. One more though: the KIP basically allows, that a record with larger
> offset is deleted while a record with smaller offset is preserved (if the
> record with smaller offset has a larger "compaction value" than the record
> with the larger offset). I don't see a issue with this atm, just wanted to
> point it out, as it seems to be an important change in behavior (compaction
> does not strictly "move forward" any longer if you wish).
> >
> > This is the whole purpose of the KIP, though. Is it not clear that this
> is the intention?
>
> The KIP is clear. Just wanted to point it out. Not sure if others have
> concerns about this change in behavior.
>
> > bq. Do you effectively mean, that the value has exactly 8 bytes?
> >
> > Not exactly... At the moment, the code in the pull request expects that
> the header value supports "byte[] => String => Long" as a valid conversion,
> so having a direct "long-as-byte[]" is not considered to be valid.
> > I am not sure which practice is preferred in Kafka, but given the base
> approach of "byte[]" as the header value, it seemed saner to support a
> String value rather than a Long.
> > Sadly, since Kafka needs to comprehend and process the actual value, I
> could not find a way to support both approaches.
> >
> > This last part is tricky, and where I expected more debates to arise,
> though in the end it can be boiled down to a "Long vs String" thing, so I
> hope it doesn't become a blocker...
>
> That's a tricky question... Personally, I think a direct byte[]->long
> conversion would be straight forward. What is the advantage of the
> intermediate "String" type?
>
> We could also introduce a magic byte that indicate the type of the value
> -- but I am not sure if we would need this flexibility. It also make
> serializing the value on the client side more complex.
>
> Whatever the decision is, the KIP should explain that value format is
> expected in detail.
>
>
>
> -Matthias
>
>
> On 4/9/18 2:20 AM, Luís Cabral wrote:
> > Hi,
> >
> >
> > bq. About naming: the broker config has `cleaner` in it, while the topic
> config does not. Is might be more consistent if either both have `cleaner`
> or none of them? (Personally, I would prefer to strip `cleaner` for the
> broker config.)
> >
> >
> > Do you mean the "log.cleaner." prefix attached to the global config?
> > If so, then this seems to be the naming approach used in this project,
> so I would rather stick to it (changing this will likely lead to a bigger
> --off topic-- discussion).
> >
> >
> >
> > bq. The sentence "toggle the compaction strategy to this approach" does
> not make clear that is should be the default -- even if you follow a common
> Kafka pattern, the KIP should make it explicit (new people might not be
> familiar with the pattern and cannot infer from the name itself that it is
> supposed to be a global default setting that can be overwritten on a
> per-topic bases with the second config).
> >
> >
> > Sorry, I'm having some problems understanding this part.
> > If it is about the meaning of the properties (global vs local), then
> isn't this explained in the Kafka Docs?
> >   https://kafka.apache.org/documentation
> >
> > Otherwise, could you kindly elaborate?
> >
> >
> >
> > bq. It might also be good, to elaborate why you suggest "long" for the
> compaction value is the KIP itself.
> >
> >
> > I would prefer that the definition simply be "long", to keep the code
> contract cleaner, and leave the clients to infer the usages according to
> their scenarios.
> > But if its important, then I can add it (let me know).
> >
> >
> >
> > bq. One more though: the KIP basically allows, that a record with larger
> offset is deleted while a record with smaller offset is preserved (if the
> record with smaller offset has a larger "compaction value" than the record
> with the larger offset). I don't see a issue with this atm, just wanted to
> point it out, as it seems to be an important change in behavior (compaction
> does not strictly "move forward" any longer if you wish).
> >
> >
> > This is the whole purpose of the KIP, though. Is it not clear that this
> is the intention?
> >
> >
> >
> > bq. Think you forgot to actually add this case. :)
> >
> >
> > Its added there, let me know if you have trouble getting to it.
> > Quoting: "When both records being compacted contain a matching
> "compaction value", then the record with the highest offset will be kept;"
> >
> >
> >
> > bq. Do you effectively mean, that the value has exactly 8 bytes?
> >
> >
> > Not exactly... At the moment, the code in the pull request expects that
> the header value supports "byte[] => String => Long" as a valid conversion,
> so having a direct "long-as-byte[]" is not considered to be valid.
> > I am not sure which practice is preferred in Kafka, but given the base
> approach of "byte[]" as the header value, it seemed saner to support a
> String value rather than a Long.
> > Sadly, since Kafka needs to comprehend and process the actual value, I
> could not find a way to support both approaches.
> >
> > This last part is tricky, and where I expected more debates to arise,
> though in the end it can be boiled down to a "Long vs String" thing, so I
> hope it doesn't become a blocker...
> >
> >
> >
> > Thanks again for having a look and for the valuable feedback!
> > Cheers
> >
> >
> >
> > On Sunday, April 8, 2018, 10:41:39 PM GMT+2, Matthias J. Sax <
> [email protected]> wrote:
> >
> >
> >
> >
> >
> > Thanks for clarification. This make sense to me.
> >
> > About naming: the broker config has `cleaner` in it, while the topic
> > config does not. Is might be more consistent if either both have
> > `cleaner` or none of them? (Personally, I would prefer to strip
> > `cleaner` for the broker config.)
> >
> > The sentence "toggle the compaction strategy to this approach" does not
> > make clear that is should be the default -- even if you follow a common
> > Kafka pattern, the KIP should make it explicit (new people might not be
> > familiar with the pattern and cannot infer from the name itself that it
> > is supposed to be a global default setting that can be overwritten on a
> > per-topic bases with the second config).
> >
> > For the timestamp compaction, long is good enough. The idea about my
> > comment was really to add "timestamp" as reserved value for the
> > parameter and this should use the record's metadata timestamp (instead
> > of a header key named 'timestamp'). Before you update the KIP, other
> > might want to share their opinion about this.
> >
> > It might also be good, to elaborate why you suggest "long" for the
> > compaction value is the KIP itself.
> >
> > One more though: the KIP basically allows, that a record with larger
> > offset is deleted while a record with smaller offset is preserved (if
> > the record with smaller offset has a larger "compaction value" than the
> > record with the larger offset). I don't see a issue with this atm, just
> > wanted to point it out, as it seems to be an important change in
> > behavior (compaction does not strictly "move forward" any longer if you
> > wish).
> >
> >>> bq. With the header approach it is not ensured that each record uses a
> unique "compaction value" (in contrast to offsets). Thus, what should the
> behaviour be, if two messages have the same "compaction value" in the
> header? (For timestamps, there is the same issue, and one idea was to use
> the offset as tie-breaker)
> >>>
> >>> Sorry, I forgot to mention that in the KIP. In the pull request used
> with the KIP you can see that it is indeed using the offset as a
> tie-breaker in case the header values are the same.
> >>> I’ll make this clear by adding it as part of the proposed changes.
> >
> > Think you forgot to actually add this case. :)
> >
> >
> > One more question:
> >
> >> cast-able to "long"
> >
> > Do you effectively mean, that the value has exactly 8 bytes?
> >
> >
> > -Matthias
> >
> > On 4/8/18 3:44 AM, Luís Cabral wrote:
> >> Hi Matthias,
> >>
> >>
> >> bq. Why do we need two new configs? Why is the topic config
> `compaction.strategy` not sufficient?
> >>
> >> As I understand these configurations, one allows you to configure the
> default for all topics while the other allows you to configure a single
> topic directly.
> >> If this is incorrect, or if having a global toggle is not desired, then
> I have no issues with having only the topic-relevant configuration.
> >>
> >>
> >> bq. For Kafka Streams we did think about a timestamp base compaction at
> some point (internal brain storming)---we never thought this through in
> details, but it might be a good idea to discuss it in this KIP and maybe
> piggy-back it if we want it (as a second pre-defined strategy "timestamp"
> next to "offset"?)
> >>
> >> The reason why I went for a “long” value here was mainly to support the
> 2 most common versioning patterns around: incremental numerals and
> timestamp (long representing milliseconds since 0h, January 1, 1970 GMT).
> >> Is this not enough to represent the strategy you guys had in mind? I
> would love to hear more about those discussions so this KIP can fulfil some
> more requirements that I am not aware of at the moment.
> >>
> >>
> >> bq. With the header approach it is not ensured that each record uses a
> unique "compaction value" (in contrast to offsets). Thus, what should the
> behaviour be, if two messages have the same "compaction value" in the
> header? (For timestamps, there is the same issue, and one idea was to use
> the offset as tie-breaker)
> >>
> >> Sorry, I forgot to mention that in the KIP. In the pull request used
> with the KIP you can see that it is indeed using the offset as a
> tie-breaker in case the header values are the same.
> >> I’ll make this clear by adding it as part of the proposed changes.
> >>
> >>
> >> bq. What should the behaviour be, if a message does not encode the
> "compaction key" in the header?
> >>
> >> The intention is that if both records being compared don’t have this
> value, then the offset is used instead. However, if only one of these
> records doesn’t have it, then whichever record has a “compaction key” is
> kept (as the other is considered to be anomalous).
> >> I’ll also add this to the proposed changes in the KIP to highlight
> these fall-back behaviours.
> >>
> >>
> >> Thank you for the feedback and looking forward for more replies!
> >>
> >> Cheers
> >>
> >>
> >> From: Matthias J. Sax
> >> Sent: 08 April 2018 05:29
> >> To: [email protected]
> >> Subject: Re: [DISCUSS] KIP-280: Enhanced log compaction
> >>
> >> Luís,
> >>
> >> thanks a lot for this KIP. Very interesting idea.
> >>
> >> Couple of questions:
> >>
> >>   - Why do we need two new configs? Why is the topic config
> >> `compaction.strategy` not sufficient?
> >>
> >>   - For Kafka Streams we did think about a timestamp base compaction at
> >> some point (internal brain storming)---we never thought this through in
> >> details, but it might be a good idea to discuss it in this KIP and maybe
> >> piggy-back it if we want it (as a second pre-defined strategy
> >> "timestamp" next to "offset"?)
> >>
> >>   - With the header approach it is not ensured that each record uses a
> >> unique "compaction value" (in contrast to offsets). Thus, what should
> >> the behavior be, if two messages have the same "compaction value" in the
> >> header? (For timestamps, there is the same issue, and one idea was to
> >> use the offset as tie-breaker)
> >>
> >>   - What should the behavior be, if a message does not encode the
> >> "compaction key" in the header?
> >>
> >>
> >> -Matthias
> >>
> >>
> >> On 4/5/18 11:59 PM, Luís Cabral wrote:
> >>>
> >>> Thank you very much for taking the time to read it.
> >>>
> >>> bq. In the 'Proposed Changes' section, can you expand 'OCC' ?
> >>> I've made the 'OCC' into a link pointing to the appropriate Wiki page
> explaining what it is. This is not a particularly important part of the
> change, it is just to reference the similarity between this proposal and
> the version control offered by OCC.
> >>>
> >>> bq. Is it possible to enumerate the keys ?
> >>> Do you mean hard-coding the header key used, rather than using a
> free-text solution? If I were to hard-code header with key "version", for
> example, then this may conflict with other clients that already use this
> header for something else, making it cumbersome for them to try and use
> this strategy, should they want it.
> >>> If I misunderstood your points, then please correct me. I appreciate
> the feedback!    On Thursday, April 5, 2018, 5:13:47 PM GMT+2, Ted Yu <
> [email protected]> wrote:
> >>>
> >>>   In the 'Proposed Changes' section, can you expand 'OCC' ?
> >>>
> >>> bq. Specifically changing this to anything other than "*offset*"
> >>>
> >>> Is it possible to enumerate the keys ? In the future, more metadata
> would
> >>> be defined in record header - it is better to avoid collision.
> >>>
> >>> Cheers
> >>>
> >>> On Thu, Apr 5, 2018 at 2:05 AM, Luís Cabral
> <[email protected]>
> >>> wrote:
> >>>
> >>>>
> >>>> This is embarassingly hard to fix... going again...
> >>>> ----
> >>>> KIP-280:  https://cwiki.apache.org/confluence/display/
> >>>> KAFKA/KIP-280%3A+Enhanced+log+compaction
> >>>> -----
> >>>> Pull-4822:  https://github.com/apache/kafka/pull/4822
> >>>>
> >>>>
> >>>>     On Thursday, April 5, 2018, 11:03:22 AM GMT+2, Luís Cabral
> >>>> <[email protected]> wrote:
> >>>>
> >>>>   Fixing the links:KIP-280:  https://cwiki.apache.org/
> confluence/display/
> >>>> KAFKA/KIP-280%3A+Enhanced+log+compactionPull-4822:  https://
> >>>> github.com/apache/kafka/pull/4822
> >>>>
> >>>>
> >>>> On 2018/04/0508:44:00, Luís Cabral <[email protected]> wrote:
> >>>>> Helloall,>
> >>>>> Starting adiscussion for this feature.>
> >>>>> KIP-280  :  https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>> 280%3A+Enhanced+log+compactionPull-4822:  https://github.com/apache/
> >>>> kafka/pull/4822>
> >>>>
> >>>>> KindRegards,Luís>
> >>>>
> >>>>
> >>
> >>
> >>
>
>


-- 
-- Guozhang

RE: [DISCUSS] KIP-280: Enhanced log compaction

Reply via email to