Let me rephrase your answer to make sure I understand what you suggest:

If the compaction strategy is configured to use "offset", and there is a
header in the record with `key == offset`, then we should use the value
of the record header instead of the actual record offset?

Do I understand this correctly? If yes, what is the advantage of doing
this? From my point of view, it might be problematic: if user A creates
a topic and configures "offset" compaction (with the intent that the
record offset should be used), then a second user B can add a header
with key "offset" and thus break the intention of user A.

Also, existing topics might already contain records with a header key
"offset", so the change would not be backward compatible either.

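To make the concern concrete, here is a rough sketch (Python only for
brevity, not actual broker code) of how the "header=<header-key>" format
from my previous mail keeps the two cases apart. The function names, the
plain dict used for headers, and the fallback to the record offset when
the header is missing are all assumptions for illustration; none of this
is taken from the KIP or from the Kafka code base.

```python
from typing import Dict, Optional, Tuple


def parse_strategy(config_value: str) -> Tuple[str, Optional[str]]:
    """Split a hypothetical compaction-strategy value into (kind, header_key)."""
    if config_value in ("offset", "timestamp"):
        return (config_value, None)
    prefix = "header="
    if config_value.startswith(prefix) and len(config_value) > len(prefix):
        return ("header", config_value[len(prefix):])
    raise ValueError(f"unknown compaction strategy: {config_value!r}")


def compaction_value(config_value: str,
                     record_offset: int,
                     record_timestamp: int,
                     headers: Dict[str, int]) -> int:
    """Return the value used to decide which record survives compaction."""
    kind, header_key = parse_strategy(config_value)
    if kind == "offset":
        # Headers are never consulted, so a header named "offset"
        # written by user B cannot override user A's configuration.
        return record_offset
    if kind == "timestamp":
        return record_timestamp
    # Assumption: fall back to the record offset if the header is absent.
    return headers.get(header_key, record_offset)
```

With this, "offset" always means the record offset, even if a record
carries a header with key "offset"; only an explicit "header=offset"
would read that header.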

-Matthias

On 6/16/18 6:59 PM, Ted Yu wrote:
> Pardon the brevity in my previous reply.
> I was talking about this bullet:
> 
> bq. When this configuration is set to anything other than "*offset*" or "
> *timestamp*", then the record headers are scanned for a key matching this
> value.
> 
> My point is that if matching key in the header is found, its value should
> take precedence over the value of the configuration.
> I understand that such interpretation may have slight performance cost.
> 
> Cheers
> 
> On Sat, Jun 16, 2018 at 6:29 PM, Matthias J. Sax <matth...@confluent.io>
> wrote:
> 
>> Ted,
>>
>> I am also not sure what you mean by "Shouldn't the selection in header
>> have higher precedence over the configuration"? What selection do you
>> mean? And what configuration?
>>
>>
>> About the first point, I think this is actually a valid concern: To
>> address this issue, it seems that we would need to change the accepted
>> format of the config. Instead of "offset", "timestamp", "<header-key>",
>> we could replace the last one with "header=<header-key>".
>>
>> WDYT?
>>
>>
>> -Matthias
>>
>> On 6/15/18 3:06 AM, Ted Yu wrote:
>>> If selection exists in header, the selection should override the config
>> value.
>>> Cheers
>>> -------- Original message --------From: Luis Cabral
>> <luis_cab...@yahoo.com.INVALID> Date: 6/15/18  1:40 AM  (GMT-08:00) To:
>> dev@kafka.apache.org Subject: Re: [VOTE] KIP-280: Enhanced log compaction
>>> Hi,
>>>
>>> bq. Can the value be determined now ? My thinking is that what if there
>> is a third compaction strategy proposed in the future ? We should guard
>> against user unknowingly choosing the 'future' strategy.
>>>
>>> The idea is that the header name to use is flexible, which protects
>> current clients that may want to use this from having to adapt their
>> already existing header names (they can just specify a new name).
>>>
>>> bq. Shouldn't the selection in header have higher precedence over the
>> configuration ?
>>>
>>> Not sure what you mean here, could you clarify?
>>>
>>> bq. Please create JIRA if you haven't already.
>>>
>>> Done: https://issues.apache.org/jira/browse/KAFKA-7061
>>>
>>> Cheers,
>>> Luís
>>>
>>>> On 11 Jun 2018, at 01:50, Ted Yu <yuzhih...@gmail.com> wrote:
>>>>
>>>> bq. When this configuration is set to anything other than "*offset*" or
>> "
>>>> *timestamp*", then the record headers are scanned for a key matching
>> this
>>>> value.
>>>>
>>>> Can the value be determined now ? My thinking is that what if there is a
>>>> third compaction strategy proposed in the future ? We should guard
>> against
>>>> user unknowingly choosing the 'future' strategy.
>>>>
>>>> bq. If this header is found
>>>>
>>>> Shouldn't the selection in header have higher precedence over the
>> configuration
>>>> ?
>>>>
>>>> Please create JIRA if you haven't already.
>>>>
>>>> Thanks
>>>>
>>>> On Sat, Jun 9, 2018 at 12:39 AM, Luís Cabral
>> <luis_cab...@yahoo.com.invalid>
>>>> wrote:
>>>>
>>>>> Hi all,
>>>>>
>>>>> Any takers on having a look at this KIP and voting on it?
>>>>>
>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction
>>>>>
>>>>> Cheers,
>>>>> Luis
>>>>>
>>>
>>
>>
> 
