[ 
https://issues.apache.org/jira/browse/KAFKA-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878024#comment-16878024
 ] 

James Ritt commented on KAFKA-4212:
-----------------------------------

Hi [~mjsax], thanks for your input! I'm not the original requestor, but my 
understanding of this JIRA is that it would be useful to have a TTL KV cache 
within streams: something I'd agree with, and it appears a commenter or two 
above also agree. The main draw for me is that the TTL cache, instead of 
growing unbounded, would then essentially mimic the underlying topic which we 
created with `cleanup.policy=delete` & `delete.retention.ms` set.

I could very well be wrong, but I think RocksDB and a topic set up with 
`cleanup.policy=delete` & `delete.retention.ms` both work off wall-clock time, 
so they would be congruent in that respect? It's true that not being 
event-based could introduce discrepancies between the two. In particular, 
imagine the cache is configured with the same TTL as the topic but is offline 
for a couple of hours: when it comes back online, it will hold onto the values 
for an extra couple of hours. That could be fine, though, as long as 
application semantics don't depend upon cache eviction.

An example might help: our current use case is to store a cache of revoked auth 
tokens. These tokens contain an expiration and are relatively short-lived, so 
we set up the containing topic with `delete.retention.ms` equal to their 
lifetime. We were then hoping to use Streams' `GlobalKTable` cache on this 
topic. With my PR, we could use the newly-added TTL KV cache with the same TTL 
as the underlying topic. In this situation, wall-clock skew is fine, as there 
is no harm in the tokens persisting in the cache for extra time. Without this 
change, I believe our underlying RocksDB cache would grow unbounded.
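
To make the intended semantics concrete, here is a minimal in-memory sketch of 
a wall-clock TTL cache. It is not the code from my PR and not a Kafka Streams 
API: the class `TtlCache`, its methods, and the injectable clock are all 
hypothetical names chosen for illustration. An entry's age is measured from 
the moment it was written into the cache, not from the record's event time.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical wall-clock TTL cache; an illustration, not a Kafka Streams class. */
final class TtlCache<K, V> {
    private static final class Entry<V> {
        final V value;
        final long writtenAtMs; // clock reading at write time, not event time

        Entry(V value, long writtenAtMs) {
            this.value = value;
            this.writtenAtMs = writtenAtMs;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMs;
    private final LongSupplier clock; // e.g. System::currentTimeMillis

    TtlCache(long ttlMs, LongSupplier clock) {
        this.ttlMs = ttlMs;
        this.clock = clock;
    }

    void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong()));
    }

    /** Returns the value, or null if the key is absent or its TTL has elapsed. */
    V get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (isExpired(entry)) {
            entries.remove(key); // lazy eviction on read
            return null;
        }
        return entry.value;
    }

    /** Sweeps out every expired entry so the cache cannot grow without bound. */
    void evictExpired() {
        entries.values().removeIf(this::isExpired);
    }

    int size() {
        return entries.size();
    }

    private boolean isExpired(Entry<V> entry) {
        return clock.getAsLong() - entry.writtenAtMs >= ttlMs;
    }
}
```

For the revoked-token case, one would construct something like 
`new TtlCache<String, Boolean>(tokenLifetimeMs, System::currentTimeMillis)`. 
Because the timestamp is taken at write time, a cache that was offline and 
later repopulated from the topic keeps those entries for a full TTL from the 
reload, which is the "extra couple of hours" skew described above.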

> Add a key-value store that is a TTL persistent cache
> ----------------------------------------------------
>
>                 Key: KAFKA-4212
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4212
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 0.10.0.1
>            Reporter: Elias Levy
>            Priority: Major
>              Labels: api
>
> Some jobs need to maintain as state a large set of key-values for some 
> period of time.  I.e. they need to maintain a TTL cache of values potentially 
> larger than memory. 
> Currently Kafka Streams provides non-windowed and windowed key-value stores.  
> Neither is an exact fit to this use case.  
> The {{RocksDBStore}}, a {{KeyValueStore}}, stores one value per key as 
> required, but does not support expiration.  The TTL option of RocksDB is 
> explicitly not used.
> The {{RocksDBWindowStore}}, a {{WindowStore}}, can expire items via segment 
> dropping, but it stores multiple items per key, based on their timestamp.  
> But this store can be repurposed as a cache by fetching the items in reverse 
> chronological order and returning the first item found.
> KAFKA-2594 introduced a fixed-capacity in-memory LRU caching store, but here 
> we desire a variable-capacity memory-overflowing TTL caching store.
> Although {{RocksDBWindowStore}} can be repurposed as a cache, it would be 
> useful to have an official and proper TTL cache API and implementation.
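
The workaround mentioned in the quoted description (read a key's items in 
reverse chronological order and return the first one found) can be sketched 
with plain collections. This is only a stand-in under stated assumptions: 
`WindowedCacheSketch`, `latest`, and `dropExpired` are hypothetical names, the 
class does not use the Kafka Streams window store API, and `dropExpired` 
merely imitates the effect of segment dropping.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

/** Hypothetical stand-in for a windowed store repurposed as a TTL cache. */
final class WindowedCacheSketch<K, V> {
    // For each key, every stored value indexed by its timestamp.
    private final Map<K, NavigableMap<Long, V>> byKey = new HashMap<>();
    private final long retentionMs;

    WindowedCacheSketch(long retentionMs) {
        this.retentionMs = retentionMs;
    }

    void put(K key, V value, long timestampMs) {
        byKey.computeIfAbsent(key, k -> new TreeMap<>()).put(timestampMs, value);
    }

    /** Scans [nowMs - retentionMs, nowMs] newest-first and returns the first hit. */
    V latest(K key, long nowMs) {
        NavigableMap<Long, V> versions = byKey.get(key);
        if (versions == null) {
            return null;
        }
        Map.Entry<Long, V> newest = versions
                .subMap(nowMs - retentionMs, true, nowMs, true)
                .descendingMap()
                .firstEntry();
        return newest == null ? null : newest.getValue();
    }

    /** Imitates segment dropping: discards items older than the retention window. */
    void dropExpired(long nowMs) {
        long cutoff = nowMs - retentionMs;
        byKey.values().removeIf(versions -> {
            versions.headMap(cutoff, false).clear();
            return versions.isEmpty();
        });
    }

    /** Number of keys that still hold at least one item. */
    int size() {
        return byKey.size();
    }
}
```

The sketch also shows why this is an awkward fit: every write for a key is 
retained until it ages out, whereas a purpose-built TTL cache would keep only 
one value per key.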



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
