Hi Damian,

Thanks for the KIP. We have a number of use cases in which we maintain a materialized cache of a compacted topic. The consumer coordinator, for example, has a cache of consumer offsets which is populated from the __consumer_offsets topic. Kafka Connect also uses this pattern for its own offset and config storage. The key distinction in the latter case is that the cache is maintained on the client.

So a couple questions about the potential impact of this KIP on these use cases:
1. Would it make sense to use this KIP in the consumer coordinator to expire offsets based on the topic's retention time? Currently, we have a periodic task which scans the full cache to check which offsets can be expired, but we might be able to get rid of this if we had a callback to update the cache when a segment was deleted. Technically, offsets can be given their own expiration time, but it seems questionable whether we need this going forward (the new consumer doesn't even expose it at the moment).

2. This KIP could also be useful for expiration in the case of a cache maintained on the client, but I don't see an obvious way that we'd be able to leverage it, since there's no indication to the client when a segment has been deleted (unless it reloads the cache from the beginning of the log). One approach I can think of would be to write corresponding tombstones as necessary when a segment is removed, but that seems pretty heavy. Have you considered this problem?

It may not be necessary to address this problem in this KIP, but since the need for expiration seems very common for this use case, it could save a lot of duplicated effort if the broker provided a built-in mechanism for it.

Thanks,
Jason

On Mon, Aug 8, 2016 at 12:41 AM, Damian Guy <damian....@gmail.com> wrote:

> Hi,
>
> We have created KIP 71: Enable log compaction and deletion to co-exist
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-71%3A+Enable+log+compaction+and+deletion+to+co-exist
>
> Please take a look. Feedback is appreciated.
>
> Thank you
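
P.S. To make question 1 a bit more concrete, here is a rough sketch of the two expiration styles for a materialized cache of a compacted topic. It is plain Python rather than actual Kafka code, and every name in it (CompactedTopicCache, expire_by_scan, on_segment_deleted) is hypothetical. In particular, the segment-deletion callback is an assumption about what such a hook might look like, not something the KIP defines.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    value: str
    offset: int        # log offset of the record that produced this entry
    timestamp_ms: int  # timestamp of that record


class CompactedTopicCache:
    """Latest value per key, as it would be read from a compacted log."""

    def __init__(self):
        self.entries = {}

    def apply(self, key, value, offset, timestamp_ms):
        # A record with a None value is treated as a tombstone: drop the key.
        if value is None:
            self.entries.pop(key, None)
        else:
            self.entries[key] = Entry(value, offset, timestamp_ms)

    def expire_by_scan(self, now_ms, retention_ms):
        # Timer-driven style: a periodic task scans the full cache.
        expired = [k for k, e in self.entries.items()
                   if now_ms - e.timestamp_ms >= retention_ms]
        for k in expired:
            del self.entries[k]
        return expired

    def on_segment_deleted(self, base_offset, last_offset):
        # Hypothetical event-driven style: invoked when the segment covering
        # [base_offset, last_offset] is deleted. Entries whose latest record
        # lived in that segment are dropped. This sketch still iterates over
        # the cache; an index keyed by offset could avoid that.
        expired = [k for k, e in self.entries.items()
                   if base_offset <= e.offset <= last_offset]
        for k in expired:
            del self.entries[k]
        return expired
```

With the callback style there is no timer at all: the cache only changes when records are applied or when a segment goes away. That is the part a client-side cache cannot do today, since it never learns about the deletion.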