Hi,

The only concrete example I can think of is a case for limiting disk usage.
Say I had something like Connect running that was tracking changes in a
database. Downstream I don't really care about every change; I just want
the latest values, so compaction could be enabled. However, the Kafka
cluster has limited disk space, so we need to limit the size of each
partition.
In a previous life I have done the same, just without compaction turned on.
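
As a rough sketch of the sort of topic configuration I have in mind
(assuming the combined cleanup policy this KIP proposes is available; the
topic name and values are purely illustrative):

  bin/kafka-topics.sh --create --zookeeper localhost:2181 \
    --topic connect-changelog --partitions 8 --replication-factor 3 \
    --config cleanup.policy=compact,delete \
    --config retention.bytes=1073741824

  # compaction keeps only the latest value per key, while retention.bytes
  # (~1GB here) caps how large each partition's log can grow before old
  # segments are deleted.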

Besides, I don't think it costs us anything in terms of added complexity to
enable it for both time- and size-based retention - the code already does
this for us.

Thanks,
Damian

On Fri, 12 Aug 2016 at 05:30 Neha Narkhede <n...@confluent.io> wrote:

> Jun,
>
> The motivation for this KIP is to handle joins and windows in Kafka
> Streams better, and since Streams supports time-based windows, the KIP
> suggests combining time-based deletion and compaction.
>
> It might make sense to do the same for size-based windows, but can you
> think of a concrete use case? If not, perhaps we can come back to it.
> On Thu, Aug 11, 2016 at 3:08 PM Jun Rao <j...@confluent.io> wrote:
>
>> Hi, Damian,
>>
>> Thanks for the proposal. It makes sense to use time-based deletion
>> retention and compaction together, as you mentioned in the KStreams use
>> case.
>>
>> Is there a use case where we want to combine size-based deletion retention
>> and compaction together?
>>
>> Jun
>>
>> On Thu, Aug 11, 2016 at 2:00 AM, Damian Guy <damian....@gmail.com> wrote:
>>
>> > Hi Jason,
>> >
>> > Thanks for your input - appreciated.
>> >
>> > > 1. Would it make sense to use this KIP in the consumer coordinator
>> > > to expire offsets based on the topic's retention time? Currently, we
>> > > have a periodic task which scans the full cache to check which
>> > > offsets can be expired, but we might be able to get rid of this if
>> > > we had a callback to update the cache when a segment was deleted.
>> > > Technically offsets can be given their own expiration time, but it
>> > > seems questionable whether we need this going forward (the new
>> > > consumer doesn't even expose it at the moment).
>> > >
>> >
>> > The KIP in its current form isn't adding a callback, so you'd still
>> > need to scan the cache and remove any expired offsets; however, you
>> > wouldn't send the tombstone messages.
>> > Having a callback sounds useful, though it isn't clear to me how you
>> > would know which offsets to remove from the cache on segment deletion.
>> > I will look into it.
>> >
>> >
>> > > 2. This KIP could also be useful for expiration in the case of a
>> > > cache maintained on the client, but I don't see an obvious way that
>> > > we'd be able to leverage it since there's no indication to the
>> > > client when a segment has been deleted (unless they reload the cache
>> > > from the beginning of the log). One approach I can think of would be
>> > > to write corresponding tombstones as necessary when a segment is
>> > > removed, but that seems pretty heavy. Have you considered this
>> > > problem?
>> > >
>> > >
>> > We've not considered this and I'm not sure we want to as part of this
>> > KIP.
>> >
>> > Thanks,
>> > Damian
>> >
>> >
>> > >
>> > >
>> > > On Mon, Aug 8, 2016 at 12:41 AM, Damian Guy <damian....@gmail.com> wrote:
>> > >
>> > > > Hi,
>> > > >
>> > > > We have created KIP 71: Enable log compaction and deletion to
>> > > > co-exist
>> > > >
>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-71%3A+Enable+log+compaction+and+deletion+to+co-exist
>> > > >
>> > > > Please take a look. Feedback is appreciated.
>> > > >
>> > > > Thank you
>> > > >
>> > >
>> >
>>
>
