Nandini Singhal created KAFKA-19970:
---------------------------------------

             Summary: Add time-based eviction to tiered storage index cache to 
prevent stale entries from accumulating
                 Key: KAFKA-19970
                 URL: https://issues.apache.org/jira/browse/KAFKA-19970
             Project: Kafka
          Issue Type: Improvement
          Components: Tiered-Storage
            Reporter: Nandini Singhal


The remote log index cache (RemoteIndexCache) currently uses only weight-based 
eviction. As a result, old, smaller index files can remain cached indefinitely 
while newer indices thrash the cache, leading to inefficient cache utilization 
and increased remote fetch failures.

Current cache configuration (RemoteIndexCache.java:142-144):
{code:java}
  return Caffeine.newBuilder()
          .maximumWeight(maxSize)
          .weigher((Uuid key, Entry entry) -> (int) entry.entrySizeBytes)
          .evictionListener(...)
          .build();
{code}

  In environments with:
  - Heavy backfill workloads (reading old data once, then moving to newer data)
  - Sequential read patterns through tiered storage
  - Variable index file sizes

  The cache can end up in a state where:
  - Old index files from completed backfills remain cached (small, low access 
frequency)
  - Newer index files thrash continuously (larger, similar access frequency)
  - Cache hit rate degrades over time
  - Remote storage fetch errors increase due to cache misses

 

Add time-based eviction using Caffeine's expireAfterAccess to the cache 
configuration so that even if an entry remains in a favorable frequency bucket, 
it will be evicted after not being accessed for the configured duration.
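
A minimal sketch of what the combined configuration could look like, not the 
actual patch. The class name, the build() helper, its parameters, and the 
String/byte[] types (standing in for Uuid and Entry) are all hypothetical; only 
the Caffeine builder calls are real API.

{code:java}
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.RemovalCause;
import java.time.Duration;

// Illustration only: String/byte[] stand in for the Uuid/Entry types.
public final class IndexCacheSketch {

    // Hypothetical helper; name and parameters are not taken from Kafka.
    static Cache<String, byte[]> build(long maxWeightBytes, Duration idleTimeout) {
        return Caffeine.newBuilder()
                // Existing behaviour: bound the cache by total entry weight.
                .maximumWeight(maxWeightBytes)
                .weigher((String key, byte[] value) -> value.length)
                // Proposed addition: expire entries that have not been read
                // or written for idleTimeout, regardless of their frequency.
                .expireAfterAccess(idleTimeout)
                // Called for size-based and expiry-based evictions alike.
                .evictionListener((String key, byte[] value, RemovalCause cause) ->
                        System.out.println("evicted " + key + " cause=" + cause))
                .build();
    }
}
{code}

Two points worth checking in a real change: Caffeine reports expired entries to 
the eviction listener with RemovalCause.EXPIRED, so existing cleanup logic 
would presumably also run for them; and expired entries are removed during 
cache maintenance triggered by reads and writes, so on an idle cache they may 
linger past the timeout unless a Scheduler is configured on the builder.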



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
