showuon commented on code in PR #21089:
URL: https://github.com/apache/kafka/pull/21089#discussion_r2641816522
##########
storage/src/test/java/org/apache/kafka/storage/internals/log/RemoteIndexCacheTest.java:
##########
@@ -1309,4 +1309,103 @@ private Set<Thread> getRunningCleanerThread() {
.filter(t -> t.isAlive() &&
t.getName().startsWith(REMOTE_LOG_INDEX_CACHE_CLEANER_THREAD))
.collect(Collectors.toSet());
}
+
+ @Test
+ public void testCacheTtlCanBeDisabled() throws IOException {
+ // Test that TTL can be disabled by setting it to -1
+ long ttlMs = -1L;
+ FakeTicker fakeTicker = new FakeTicker();
+ RemoteIndexCache ttlCache = new RemoteIndexCache(1024 * 1024L, ttlMs,
rsm, logDir.toString(), fakeTicker);
+ try {
+ RemoteIndexCache.Entry entry = ttlCache.getIndexEntry(rlsMetadata);
+ assertNotNull(entry);
+
+ fakeTicker.advance(TimeUnit.HOURS.toNanos(24));
+ ttlCache.internalCache().cleanUp();
+
+ RemoteIndexCache.Entry cachedEntry =
ttlCache.internalCache().getIfPresent(rlsMetadata.remoteLogSegmentId().id());
+ assertNotNull(cachedEntry, "Entry should remain cached when TTL
disabled");
+ } finally {
+ Utils.closeQuietly(ttlCache, "RemoteIndexCache");
+ }
+ }
+
+ @Test
+ public void testCacheTtlEviction() throws IOException {
Review Comment:
Nice tests! Thanks for adding them! Could we add one more test to verify
that both size-based and time-based eviction strategy work well? Currently we
have `testCacheEntryExpiry` for size-based eviction test. So, maybe mix them
together with scenario like:
1. create a RemoteIndexCache with sizeLimit and ttlMs set.
2. add entries over sizeLimit, and make sure eviction happened
3. ticker advanced over ttlMs, and make sure eviction happened
something like that. Thanks.
##########
storage/src/main/java/org/apache/kafka/storage/internals/log/RemoteIndexCache.java:
##########
@@ -138,10 +166,21 @@ public void resizeCacheSize(long
remoteLogIndexFileCacheSize) {
}
}
- private Cache<Uuid, Entry> initEmptyCache(long maxSize) {
- return Caffeine.newBuilder()
+ private Cache<Uuid, Entry> initEmptyCache(long maxSize, long ttlMs, Ticker
ticker) {
+ Caffeine<Uuid, Entry> builder = Caffeine.newBuilder()
.maximumWeight(maxSize)
- .weigher((Uuid key, Entry entry) -> (int) entry.entrySizeBytes)
+ .recordStats()
Review Comment:
Is this necessary?
> Note that recording statistics requires bookkeeping to be performed with
each operation, and thus imposes a performance penalty on cache operations.
If possible, I think we should also make it configurable and by default, it
should be disabled. WDYT?
##########
storage/src/main/java/org/apache/kafka/server/log/remote/storage/RemoteLogManagerConfig.java:
##########
@@ -92,6 +92,15 @@ public final class RemoteLogManagerConfig {
"from remote storage in the local storage.";
public static final long
DEFAULT_REMOTE_LOG_INDEX_FILE_CACHE_TOTAL_SIZE_BYTES = 1024 * 1024 * 1024L;
+ public static final String REMOTE_LOG_INDEX_FILE_CACHE_TTL_MS_PROP =
"remote.log.index.file.cache.ttl.ms";
+ public static final String REMOTE_LOG_INDEX_FILE_CACHE_TTL_MS_DOC = "The
maximum time in milliseconds an index file entry can remain in the cache " +
+ "after its last access. After this duration, the entry will be
evicted even if there is available space. " +
+ "This helps prevent stale index files from remaining in cache
indefinitely, particularly when a broker is no longer the leader " +
+ "for a partition or when read-from-replica is enabled. Evicted
index files are automatically re-fetched from remote storage when needed. " +
+ "Default is 1 hour (3600000 ms), which provides sufficient time
for clients to read a partition/segment while ensuring stale entries " +
+ "don't accumulate. Set to -1 to disable time-based eviction and
use only size-based eviction.";
+ public static final long DEFAULT_REMOTE_LOG_INDEX_FILE_CACHE_TTL_MS =
3600000L; // 1 hour
Review Comment:
Default value is 1 hour or 15 mins are both fine for me.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]