[ https://issues.apache.org/jira/browse/KAFKA-4212?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16878024#comment-16878024 ]
James Ritt commented on KAFKA-4212:
-----------------------------------

Hi [~mjsax], thanks for your input! I'm not the original requestor, but my understanding of this JIRA is that it would be useful to have a TTL KV cache within Streams: something I'd agree with, and it appears a commenter or two above also agree.

The main draw for me is that the TTL cache, instead of growing unbounded, would essentially mimic the underlying topic, which we created with `cleanup.policy=delete` & `delete.retention.ms` set. I very much could be wrong, but I think RocksDB and a topic set up with `cleanup.policy=delete` & `delete.retention.ms` both work off wall-clock time, so they would be congruent in that respect?

It's true that not being event-based could introduce discrepancies between the two. In particular, imagine the cache is configured with the same TTL as the topic but is offline for a couple of hours: when it comes back online, it will hold onto the values for an extra couple of hours. That could be fine as long as application semantics don't depend upon cache eviction.

An example might help: our current use case is to store a cache of revoked auth tokens. These tokens contain an expiration and are relatively short-lived, so we set up the containing topic with `delete.retention.ms` equal to their lifetime. We were then hoping to use Kafka Streams' `GlobalKTable` cache on this topic. With my PR, we could use the newly-added TTL KV cache with the same TTL as the underlying topic. In this situation, wall-clock skew is fine, as there is no harm in the tokens persisting in the cache for extra time. Without this change, I believe our underlying RocksDB cache would grow unbounded.
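
To make the wall-clock semantics above concrete, here is a minimal in-memory sketch in plain Java. It is not the code from the PR and not a Kafka Streams API: the class name `TtlCache`, its methods, and the injected clock are all hypothetical, chosen only for illustration. It assumes the TTL is counted from the moment an entry is written into the cache, which is why an entry loaded late (for example after the cache was offline) lives correspondingly longer than the matching record in the topic.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.LongSupplier;

/** Hypothetical wall-clock TTL cache; an illustration only, not a Kafka Streams store. */
class TtlCache<K, V> {
    /** A cached value together with the wall-clock time at which it expires. */
    private static final class Entry<V> {
        final V value;
        final long expiresAtMs;

        Entry(V value, long expiresAtMs) {
            this.value = value;
            this.expiresAtMs = expiresAtMs;
        }
    }

    private final Map<K, Entry<V>> entries = new HashMap<>();
    private final long ttlMs;
    private final LongSupplier clock; // e.g. System::currentTimeMillis in production

    TtlCache(long ttlMs, LongSupplier clock) {
        this.ttlMs = ttlMs;
        this.clock = clock;
    }

    /** Stores the value; its TTL starts at the time of this write, not at the event time. */
    void put(K key, V value) {
        entries.put(key, new Entry<>(value, clock.getAsLong() + ttlMs));
    }

    /** Returns the value, or null if the key is absent or its TTL has elapsed. */
    V get(K key) {
        Entry<V> entry = entries.get(key);
        if (entry == null) {
            return null;
        }
        if (clock.getAsLong() >= entry.expiresAtMs) {
            entries.remove(key); // lazy eviction on read
            return null;
        }
        return entry.value;
    }

    /** Removes every expired entry and returns how many were dropped. */
    int evictExpired() {
        long now = clock.getAsLong();
        int removed = 0;
        Iterator<Entry<V>> it = entries.values().iterator();
        while (it.hasNext()) {
            if (now >= it.next().expiresAtMs) {
                it.remove();
                removed++;
            }
        }
        return removed;
    }

    int size() {
        return entries.size();
    }
}
```

For the revoked-token case, one could construct `new TtlCache<String, Boolean>(tokenLifetimeMs, System::currentTimeMillis)`, where `tokenLifetimeMs` is a placeholder for the token lifetime. A token that lingers past its own expiration is harmless there, so the skew described above does not affect correctness.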

> Add a key-value store that is a TTL persistent cache
> ----------------------------------------------------
>
>                 Key: KAFKA-4212
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4212
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>    Affects Versions: 0.10.0.1
>            Reporter: Elias Levy
>            Priority: Major
>              Labels: api
>
> Some jobs need to maintain, as state, a large set of key-values for some
> period of time. That is, they need to maintain a TTL cache of values
> potentially larger than memory.
> Currently Kafka Streams provides non-windowed and windowed key-value stores.
> Neither is an exact fit for this use case.
> The {{RocksDBStore}}, a {{KeyValueStore}}, stores one value per key as
> required, but does not support expiration. The TTL option of RocksDB is
> explicitly not used.
> The {{RocksDBWindowsStore}}, a {{WindowsStore}}, can expire items via segment
> dropping, but it stores multiple items per key, based on their timestamp.
> This store can nevertheless be repurposed as a cache by fetching the items in
> reverse chronological order and returning the first item found.
> KAFKA-2594 introduced a fixed-capacity in-memory LRU caching store, but here
> we desire a variable-capacity memory-overflowing TTL caching store.
> Although {{RocksDBWindowsStore}} can be repurposed as a cache, it would be
> useful to have an official and proper TTL cache API and implementation.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)