Since this is not moving forward, how about I proceed with the currently 
documented changes, and any improvements (such as configuration changes) can 
then be taken up under a different KIP by whoever wants them?
    On Thursday, October 11, 2018, 9:47:12 AM GMT+2, Luís Cabral 
<luis_cab...@yahoo.com> wrote:  
 
  Hi Matthias,

How can this be done?

Kind Regards,
Luis
 
    On Sunday, September 30, 2018, 9:10:01 PM GMT+2, Matthias J. Sax 
<matth...@confluent.io> wrote:  
 
 Luis,

What is the status of this KIP?

I tend to agree that introducing the feature only globally might be
less useful (I would assume that people want to turn it on on a
per-topic basis). As I am not familiar with the corresponding code, I
cannot judge the complexity of adding topic-level configs; however, it
seems worth including in the KIP.


-Matthias



On 9/21/18 1:59 PM, Bertus Greeff wrote:
> Someone pointed out to me that my scenario is also resolved by using Kafka 
> transactions.  Zombie fencing, which is essentially my scenario, was one of the 
> scenarios that transactions were designed to solve.  I was going to use the 
> ideas of this KIP to solve it, but using transactions seems even better 
> because out-of-order messages never even make it into the topic.  They are 
> blocked by the broker.
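> 
> (For reference, a minimal sketch of the transactional-producer flow that
> provides this fencing, using the standard Java client; the ids, topic, and
> serializer choices are placeholders:)
> 
>     import java.util.Properties;
>     import org.apache.kafka.clients.producer.KafkaProducer;
>     import org.apache.kafka.clients.producer.ProducerRecord;
>     import org.apache.kafka.common.errors.ProducerFencedException;
> 
>     Properties props = new Properties();
>     props.put("bootstrap.servers", "localhost:9092");
>     props.put("key.serializer",
>         "org.apache.kafka.common.serialization.StringSerializer");
>     props.put("value.serializer",
>         "org.apache.kafka.common.serialization.StringSerializer");
>     // A stable transactional.id is what enables zombie fencing: a restarted
>     // instance with the same id bumps the producer epoch, and the broker
>     // rejects writes from the older (zombie) instance.
>     props.put("transactional.id", "my-app-instance-1");
> 
>     KafkaProducer<String, String> producer = new KafkaProducer<>(props);
>     producer.initTransactions();
>     try {
>         producer.beginTransaction();
>         producer.send(new ProducerRecord<>("my-topic", "key", "value"));
>         producer.commitTransaction();
>     } catch (ProducerFencedException e) {
>         // A newer producer with the same transactional.id exists; this
>         // instance is the zombie and must close.
>         producer.close();
>     }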
> 
> -----Original Message-----
> From: Guozhang Wang <wangg...@gmail.com> 
> Sent: Saturday, September 1, 2018 11:33 AM
> To: dev <dev@kafka.apache.org>
> Subject: Re: [DISCUSS] KIP-280: Enhanced log compaction
> 
> Hello Luis,
> 
> Thanks for your thoughtful responses, here are my two cents:
> 
> 1) I think having the new configs with per-topic granularity would not 
> introduce much memory overhead or logic complexity, as all you need is to 
> remember this in the topic metadata cache. If I've missed some details about 
> the complexity, could you elaborate a bit more?
> 
> 2) I agree with you: the current `ConfigDef.Validator` only scopes over the 
> validated config value itself, and hence cannot depend on another 
> config.
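> 
> (To make the limitation concrete: Kafka's ConfigDef.Validator is handed a
> single name/value pair, so a cross-config check has nowhere to live:)
> 
>     // In org.apache.kafka.common.config.ConfigDef, the Validator callback
>     // only sees one config's name and value, with no view of the other
>     // configs, so "X is invalid unless Y is set" cannot be expressed here.
>     public interface Validator {
>         void ensureValid(String name, Object value);
>     }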
> 
> 4) I think Jun's point is that since we need the latest message in the log 
> segment for timestamp tracking, we cannot actually delete it: with the 
> offset-based policy this is naturally guaranteed; but with another policy it 
> is no longer guaranteed to survive compaction, and hence we need to 
> special-handle this case and not delete it.
> 
> 
> 
> Guozhang
> 
> 
> 
> On Wed, Aug 29, 2018 at 9:25 AM, Luís Cabral <luis_cab...@yahoo.com.invalid>
> wrote:
> 
>> Hi all,
>>
>> Since there has been renewed interest in this KIP, it felt 
>> better to move it back to [DISCUSS], as we aren't really 
>> voting on it anymore.
>> I'll try to address the currently pending questions on the following 
>> points, so please bear with me while we go through them all:
>>
>> 1) Configuration: Topic vs Global
>>
>> Here we all agree that having a per-topic configuration is the best 
>> option. However, this is not possible with the current design of the 
>> compaction solution. Yes, it is true that "some" properties that 
>> relate to compaction are configurable per-topic, but those are only 
>> the properties that act outside(!) of the actual compaction logic, 
>> such as configuring the compaction trigger with "ratio" or the 
>> compaction delay with "lag.ms".
>> This logic can, of course, be re-designed to suit our wishes, but that 
>> is no small effort, and if we have spent months arguing about the 
>> extra 8 bytes per record, we would surely spend another few dozen 
>> months discussing the memory impact that such a change to the 
>> properties would invariably have.
>> As such, I will limit this KIP to ONLY have these configurations globally.
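>>
>> (For illustration only: a sketch of how the proposed global settings
>> would sit in server.properties; "version" is a placeholder header name,
>> not something the KIP mandates:)
>>
>>     # Proposed broker-wide settings discussed in this thread;
>>     # "offset" stays the default, preserving current behaviour.
>>     log.cleaner.compaction.strategy=header
>>     log.cleaner.compaction.strategy.header=version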
>>
>> 2) Configuration: Fail-fast vs Fallback
>>
>>
>> Ideally, I would also like to prevent the application from starting if the 
>> configuration is somehow invalid.
>> However (another 'however'), the way the configuration is handled 
>> prevents adding dependencies between settings, so we can't add logic that 
>> says "configuration X is invalid if configuration Y is so-and-such".
>> Again, this can be re-designed to add this feature to the 
>> configuration logic, but it would again be a big change just by 
>> itself, so this KIP is again limited to using ONLY what is already in place.
>>
>> 3) Documenting the memory impact on the KIP
>>
>> This is now added to the KIP, though the topic is more complicated 
>> than 'memory impact'. E.g., this change doesn't translate into an actual 
>> memory impact; it just means that each compaction run will potentially 
>> encompass fewer records.
>>
>> 4) Documenting how we deal with the last message in the log
>>
>> I have 2 interpretations of this request: "the last message in the log"
>> or "the last message with a shared key in the log".
>> For the former: there is no change to the logic for how the last 
>> message is handled. Only the "tail" gets compacted, so the "head" 
>> (which includes the last message) is never compacted.
>>
>> 5) Documenting how the key deletion will be handled
>>
>> I'm having some trouble understanding this one; do you mean how keys 
>> are deleted in general, or something else?
>>
>> Cheers,
>> Luis Cabral
>>
>>    On Friday, August 24, 2018, 1:54:54 AM GMT+2, Jun Rao 
>> <j...@confluent.io>
>> wrote:
>>
>>  Hi, Luis,
>>
>> Thanks for the reply. A few more comments below.
>>
>> 1. About the topic-level configuration: it seems useful for the new 
>> configs to be at the topic level. Currently, the following configs related 
>> to compaction are already at the topic level (see the example after this list).
>>
>> min.cleanable.dirty.ratio
>> min.compaction.lag.ms
>> cleanup.policy
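>>
>> (For context, these existing settings can already be changed per topic
>> with the standard tooling; command shape from memory, the broker address,
>> topic name, and values are placeholders:)
>>
>>     bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
>>         --entity-type topics --entity-name my-topic \
>>         --add-config min.cleanable.dirty.ratio=0.2,min.compaction.lag.ms=60000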
>>
>> 2. Have you documented the memory impact in the KIP?
>>
>> 3. Could you document how we deal with the last message in the log, 
>> which is potentially cleanable now?
>>
>> 4. Could you document how key deletion will be handled?
>>
>> 10. As for Jason's proposal on CompactionStrategy, it does make the 
>> feature more general. On the other hand, it will be useful not to 
>> require user-level code if the compaction value only comes from the header.
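>>
>> (For reference, attaching the compaction value as a record header needs
>> only the standard producer API; "version" is a placeholder for whatever
>> name the header config would point at, and this assumes an
>> already-configured KafkaProducer<String, String> named 'producer':)
>>
>>     import java.nio.ByteBuffer;
>>     import org.apache.kafka.clients.producer.ProducerRecord;
>>
>>     long version = 42L;
>>     ProducerRecord<String, String> record =
>>         new ProducerRecord<>("my-topic", "key", "value");
>>     // The proposed strategy reads a long from a configured header; here
>>     // it is encoded as 8 big-endian bytes (an assumption on my part).
>>     record.headers().add("version",
>>         ByteBuffer.allocate(Long.BYTES).putLong(version).array());
>>     producer.send(record);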
>>
>> 20. "If compaction.strategy.header is chosen and 
>> compaction.strategy.header is not set, the KIP falls back to offset." 
>> I am wondering if it's better to just fail the configuration in the case.
>>
>> Jun
>>
>>
>>
>> On Thu, Aug 16, 2018 at 1:33 PM, Guozhang Wang <wangg...@gmail.com> wrote:
>>
>>> Regarding "broker-agnostic of headers": there are some KIPs from 
>>> Streams
>> to
>>> use headers for internal purposes as well, e.g. KIP-258 and KIP-213 
>>> (I admit there may be a conflict with user space, but practically I 
>>> think it is very rare). So I think we are very likely going to make 
>>> Kafka
>> internals
>>> to be "headers-aware" anyways.
>>>
>>> Regarding the general API: I think it is a good idea in general, but it 
>>> may still have limits: note that right now our KIP enforces the header 
>>> value type to be long, and we have had a very careful discussion about 
>>> the fall-back policy if the header does not have the specified key or if 
>>> the value is not long-typed; but if we enforce the long type in the 
>>> interface, it would require users trying to customize their compaction 
>>> logic (think: based on some value payload field) to transform their 
>>> fields to long as well. So I feel we can still go with the currently 
>>> proposed approach, and only consider this general API if we observe a 
>>> genuine requirement for it. By that time we can still extend the config 
>>> values of "log.cleaner.compaction.strategy" to "offset, timestamp, 
>>> header, myFuncName".
>>>
>>> @Bertus
>>>
>>> Thanks for your feedback. I believe the proposed config is indeed for 
>>> both global (for the whole broker) and per-topic use; Luís can confirm 
>>> if this is the case, and update the wiki page to make it clear.
>>>
>>>
>>> Guozhang
>>>
>>>
>>> On Thu, Aug 16, 2018 at 9:09 AM, Bertus Greeff < 
>>> bgre...@microsoft.com.invalid> wrote:
>>>
>>>> I'm interested to know the status of this KIP.  I see that the status 
>>>> is "Voting".  How long does this normally take?
>>>>
>>>> We want to use Kafka and this KIP provides exactly the log 
>>>> compaction logic that we want for many of our projects.
>>>>
>>>> One piece of feedback that I have is that log.cleaner.compaction.strategy 
>>>> and log.cleaner.compaction.strategy.header need to be per-topic.  The 
>>>> text of the KIP makes it sound that the config is only available for 
>>>> all topics, but this makes no sense.  Different topics will need 
>>>> different strategies and/or headers.
>>>>
>>>> From the KIP:
>>>> "Provide the configuration for the individual topics: None of the 
>>>> configurations for log compaction are available at topic level, so 
>>>> adding it there is not a part of this KIP."
>>>>
>>>>
>>>>
>>>> On 2018/04/05 08:44:00, Luís Cabral <l...@yahoo.com.INVALID> wrote:
>>>>> Hello all,
>>>>> Starting a discussion for this feature.
>>>>> KIP-280:
>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-280%3A+Enhanced+log+compaction
>>>>> Pull-4822:
>>>>> https://github.com/apache/kafka/pull/4822
>>>>>
>>>>> Kind Regards,
>>>>> Luís
>>>>
>>>
>>>
>>>
>>> --
>>> -- Guozhang
>>>
>>
> 
> 
> 
> --
> -- Guozhang
> 
    
