Copilot commented on code in PR #4732:
URL: https://github.com/apache/bookkeeper/pull/4732#discussion_r2983682736


##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java:
##########
@@ -171,6 +199,18 @@ public void addLocation(Batch batch, long ledgerId, long entryId, long location)
             key.recycle();
             value.recycle();
         }
+
+        // Update the last entry cache if this entry is newer.
+        // ConcurrentLongLongHashMap.get() returns -1 if not found.
+        long cachedLastEntry = lastEntryCache.get(ledgerId);
+        if (cachedLastEntry < entryId) {
+            // Clear the cache if it exceeds the max size to bound memory usage.
+            // The cache will quickly repopulate with hot ledgers on subsequent reads.
+            if (cachedLastEntry < 0 && lastEntryCache.size() >= lastEntryCacheMaxSize) {
+                lastEntryCache.clear();
+            }
+            lastEntryCache.put(ledgerId, entryId);
+        }

Review Comment:
   `addLocation()` can repopulate `lastEntryCache` even after 
`delete(ledgerId)` has been called (the cache update here is not guarded by 
`deletedLedgers`). That creates a correctness risk when combined with existing behavior: 
`removeOffsetFromDeletedLedgers()` later removes `ledgerId` from 
`deletedLedgers` *and* deletes the DB range, but it does not clear 
`lastEntryCache` again. If entries were added after `delete()` (see 
`addLedgerAfterDeleteTest`), `getLastEntryInLedger()` may return a stale cached 
entryId for a ledger whose indexes were deleted. Consider skipping cache 
updates when `deletedLedgers.contains(ledgerId)` is true, and/or clearing cache 
entries for `ledgersToDelete` inside `removeOffsetFromDeletedLedgers()` before 
removing them from `deletedLedgers`.
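
A minimal stand-alone sketch of the suggested guard, using `java.util.concurrent` collections as stand-ins for `ConcurrentLongHashSet`/`ConcurrentLongLongHashMap` (all names here are illustrative, not the actual BookKeeper code):

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative stand-ins for deletedLedgers / lastEntryCache; the real class
// uses ConcurrentLongHashSet and ConcurrentLongLongHashMap.
public class DeleteAwareCacheSketch {
    static final Set<Long> deletedLedgers = ConcurrentHashMap.newKeySet();
    static final Map<Long, Long> lastEntryCache = new ConcurrentHashMap<>();

    static void addLocation(long ledgerId, long entryId) {
        // Skip cache updates for ledgers pending deletion, so addLocation()
        // cannot repopulate the cache between delete() and the purge.
        if (deletedLedgers.contains(ledgerId)) {
            return;
        }
        // Keep the largest entryId seen for the ledger.
        lastEntryCache.merge(ledgerId, entryId, Math::max);
    }

    static void removeOffsetFromDeletedLedgers() {
        for (Long ledgerId : deletedLedgers) {
            lastEntryCache.remove(ledgerId); // drop the cache entry first...
            deletedLedgers.remove(ledgerId); // ...then forget the ledger
        }
    }
}
```

With this ordering, a stale last-entry value can never survive past the purge, and caching resumes normally once the ledger is no longer in `deletedLedgers`.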



##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java:
##########
@@ -115,6 +126,18 @@ public long getLastEntryInLedger(long ledgerId) throws IOException {
     }
 
     private long getLastEntryInLedgerInternal(long ledgerId) throws IOException {
+        // Check in-memory cache first to avoid expensive getFloor() calls.
+        // ConcurrentLongLongHashMap.get() returns -1 if not found.
+        long cachedLastEntry = lastEntryCache.get(ledgerId);
+        if (cachedLastEntry >= 0) {
+            if (log.isDebugEnabled()) {
+                log.debug("Found last entry for ledger {} in cache: {}", ledgerId, cachedLastEntry);
+            }
+            stats.getGetLastEntryInLedgerStats()
+                    .registerSuccessfulEvent(0, TimeUnit.NANOSECONDS);
+            return cachedLastEntry;

Review Comment:
   The cache hit path records a latency of `0` via `registerSuccessfulEvent(0, 
...)`. This will skew `getLastEntryInLedger` latency metrics (and can make 
averages look artificially perfect under high cache hit rates). Please record 
the actual elapsed time for cache hits (e.g. measure from method entry) or use a 
separate counter/metric for cache hits if you want to distinguish them.
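
A sketch of the "measure from method entry" approach, with illustrative stand-ins for the cache and the stats recorder (the real code would feed the elapsed nanos into `registerSuccessfulEvent(...)`):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicLong;

// Illustrative sketch: record the real elapsed time on the cache-hit path
// instead of a constant 0, so latency metrics are not skewed.
public class CacheHitLatencySketch {
    static final Map<Long, Long> cache = new ConcurrentHashMap<>();
    static final AtomicLong lastRecordedNanos = new AtomicLong(-1); // stand-in for the stats logger

    static long getLastEntry(long ledgerId) {
        long startNanos = System.nanoTime(); // measure from method entry
        Long cached = cache.get(ledgerId);
        if (cached != null) {
            // Record the actual elapsed time, not 0, so the latency
            // distribution stays honest under high cache hit rates.
            lastRecordedNanos.set(System.nanoTime() - startNanos);
            return cached;
        }
        // Cache miss: a real implementation would fall through to the DB
        // lookup here and record its latency the same way.
        lastRecordedNanos.set(System.nanoTime() - startNanos);
        return -1;
    }
}
```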



##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java:
##########
@@ -139,6 +162,11 @@ private long getLastEntryInLedgerInternal(long ledgerId) throws IOException {
                 if (log.isDebugEnabled()) {
                    log.debug("Found last page in storage db for ledger {} - last entry: {}", ledgerId, lastEntryId);
                 }
+                // Populate cache for future lookups
+                if (lastEntryCache.size() >= lastEntryCacheMaxSize) {
+                    lastEntryCache.clear();
+                }
+                lastEntryCache.put(ledgerId, lastEntryId);

Review Comment:
   `ConcurrentLongLongHashMap` requires *values* to be `>= 0` (it uses `-1` as 
a sentinel). `lastEntryId` can be `-1` when the last key for a ledger is the 
special entry (see existing tests that write entryId `-1`), so 
`lastEntryCache.put(ledgerId, lastEntryId)` can throw 
`IllegalArgumentException` and break `getLastEntryInLedger()`. Consider 
guarding against negative `lastEntryId` or storing an encoded value (e.g. 
`entryId + 1`) so `-1` can be cached safely.
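
A minimal sketch of the `entryId + 1` encoding, using a plain `HashMap` as a stand-in for `ConcurrentLongLongHashMap` (which returns `-1` for absent keys and rejects negative values):

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative sketch: shift stored values by one so the special entryId -1
// becomes 0 and can be cached in a map that requires values >= 0.
public class EncodedLastEntryCacheSketch {
    static final long MISS = Long.MIN_VALUE; // distinct from any valid entryId
    private final Map<Long, Long> map = new HashMap<>();

    void put(long ledgerId, long entryId) {
        long encoded = entryId + 1; // -1 -> 0, 0 -> 1, ... always >= 0
        if (encoded < 0) {
            throw new IllegalArgumentException("entryId must be >= -1");
        }
        map.put(ledgerId, encoded);
    }

    long getLastEntry(long ledgerId) {
        long raw = map.getOrDefault(ledgerId, -1L); // -1 mimics the map's absent-key sentinel
        return raw < 0 ? MISS : raw - 1;            // decode on hit
    }
}
```

Because the absent-key sentinel `-1` can never collide with an encoded value, `-1` entries round-trip through the cache safely.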



##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java:
##########
@@ -45,15 +46,25 @@
  */
 public class EntryLocationIndex implements Closeable {
 
+    static final String LAST_ENTRY_CACHE_MAX_SIZE = "dbStorage_lastEntryCacheMaxSize";
+    private static final long DEFAULT_LAST_ENTRY_CACHE_MAX_SIZE = 10_000;
+
     private final KeyValueStorage locationsDb;
     private final ConcurrentLongHashSet deletedLedgers = ConcurrentLongHashSet.newBuilder().build();
+    private final ConcurrentLongLongHashMap lastEntryCache;
+    private final long lastEntryCacheMaxSize;
     private final EntryLocationIndexStats stats;
     private boolean isCompacting;
 
     public EntryLocationIndex(ServerConfiguration conf, KeyValueStorageFactory storageFactory, String basePath,
             StatsLogger stats) throws IOException {
         locationsDb = storageFactory.newKeyValueStorage(basePath, "locations", DbConfigType.EntryLocation, conf);
 
+        this.lastEntryCacheMaxSize = conf.getLong(LAST_ENTRY_CACHE_MAX_SIZE, DEFAULT_LAST_ENTRY_CACHE_MAX_SIZE);
+        this.lastEntryCache = ConcurrentLongLongHashMap.newBuilder()
+                .expectedItems((int) Math.min(lastEntryCacheMaxSize, Integer.MAX_VALUE))
+                .build();

Review Comment:
   `lastEntryCacheMaxSize` is used directly as `expectedItems(...)` with the 
default `ConcurrentLongLongHashMap` concurrency level (16). 
`ConcurrentLongLongHashMap` enforces `expectedItems > 0` and `expectedItems >= 
concurrencyLevel`, so setting `dbStorage_lastEntryCacheMaxSize` to a small 
value (e.g., 1–15) or 0 will throw `IllegalArgumentException` during 
`EntryLocationIndex` construction. Please clamp/validate the config and/or 
explicitly set the cache map concurrency level (e.g. `concurrencyLevel(min(16, 
expectedItems))`) and treat `<= 0` as “cache disabled”.
   ```suggestion
           long configuredLastEntryCacheMaxSize =
                   conf.getLong(LAST_ENTRY_CACHE_MAX_SIZE, DEFAULT_LAST_ENTRY_CACHE_MAX_SIZE);
           if (configuredLastEntryCacheMaxSize <= 0) {
               // Treat non-positive values as "cache disabled", but still create a minimal map
               this.lastEntryCacheMaxSize = 0;
               this.lastEntryCache = ConcurrentLongLongHashMap.newBuilder()
                       .expectedItems(1)
                       .concurrencyLevel(1)
                       .build();
           } else {
               this.lastEntryCacheMaxSize = configuredLastEntryCacheMaxSize;
               int expectedItems = (int) Math.min(this.lastEntryCacheMaxSize, Integer.MAX_VALUE);
               int concurrencyLevel = Math.min(16, expectedItems);
               this.lastEntryCache = ConcurrentLongLongHashMap.newBuilder()
                       .expectedItems(expectedItems)
                       .concurrencyLevel(concurrencyLevel)
                       .build();
           }
   ```



##########
bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndexTest.java:
##########
@@ -205,6 +207,52 @@ public void testDeleteSpecialEntry() throws IOException {
         assertEquals(0, idx.getLocation(40312, 10));
     }
 
+    @Test
+    public void testGetLastEntryInLedgerCache() throws Exception {
+        File tmpDir = File.createTempFile("bkTest", ".dir");
+        tmpDir.delete();
+        tmpDir.mkdir();
+        tmpDir.deleteOnExit();
+
+        EntryLocationIndex idx = new EntryLocationIndex(serverConfiguration, KeyValueStorageRocksDB.factory,
+                tmpDir.getAbsolutePath(), NullStatsLogger.INSTANCE);
+
+        // Add entries for ledger 1
+        idx.addLocation(1, 0, 100);
+        idx.addLocation(1, 1, 101);
+        idx.addLocation(1, 2, 102);
+
+        // Add entries for ledger 2
+        idx.addLocation(2, 0, 200);
+        idx.addLocation(2, 5, 205);
+
+        // First call should hit RocksDB and populate cache
+        assertEquals(2, idx.getLastEntryInLedger(1));
+        assertEquals(5, idx.getLastEntryInLedger(2));
+
+        // Second call should hit cache and return same result
+        assertEquals(2, idx.getLastEntryInLedger(1));
+        assertEquals(5, idx.getLastEntryInLedger(2));
+
+        // Adding a newer entry should update cache
+        idx.addLocation(1, 10, 110);
+        assertEquals(10, idx.getLastEntryInLedger(1));
+
+        // Delete should invalidate cache
+        idx.delete(1);
+        try {
+            idx.getLastEntryInLedger(1);
+            fail("Should have thrown NoEntryException");
+        } catch (Bookie.NoEntryException e) {
+            // expected
+        }

Review Comment:
   This test’s “cache invalidation on delete” assertion currently relies on 
`deletedLedgers.contains(ledgerId)` throwing before the cache is consulted, so 
it doesn’t actually verify that the cache entry was cleared (or that stale 
cache entries can’t leak after `removeOffsetFromDeletedLedgers()` removes the 
ledger from `deletedLedgers`). To validate cache invalidation, consider calling 
`idx.removeOffsetFromDeletedLedgers()` after `idx.delete(1)` and then asserting 
`getLastEntryInLedger(1)` throws because the DB entries are gone (which would 
fail if the cache still returned a stale value).


