Copilot commented on code in PR #4732:
URL: https://github.com/apache/bookkeeper/pull/4732#discussion_r2983682736
##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java:
##########
@@ -171,6 +199,18 @@ public void addLocation(Batch batch, long ledgerId, long entryId, long location)
         key.recycle();
         value.recycle();
     }
+
+        // Update the last entry cache if this entry is newer.
+        // ConcurrentLongLongHashMap.get() returns -1 if not found.
+        long cachedLastEntry = lastEntryCache.get(ledgerId);
+        if (cachedLastEntry < entryId) {
+            // Clear the cache if it exceeds the max size to bound memory usage.
+            // The cache will quickly repopulate with hot ledgers on subsequent reads.
+            if (cachedLastEntry < 0 && lastEntryCache.size() >= lastEntryCacheMaxSize) {
+                lastEntryCache.clear();
+            }
+            lastEntryCache.put(ledgerId, entryId);
+        }
Review Comment:
`addLocation()` can repopulate `lastEntryCache` even after
`delete(ledgerId)` has been called (the cache update happens unconditionally
here). That creates a correctness risk with existing behavior:
`removeOffsetFromDeletedLedgers()` later removes `ledgerId` from
`deletedLedgers` *and* deletes the DB range, but it does not clear
`lastEntryCache` again. If entries were added after `delete()` (see
`addLedgerAfterDeleteTest`), `getLastEntryInLedger()` may return a stale cached
entryId for a ledger whose indexes were deleted. Consider skipping cache
updates when `deletedLedgers.contains(ledgerId)` is true, and/or clearing cache
entries for `ledgersToDelete` inside `removeOffsetFromDeletedLedgers()` before
removing them from `deletedLedgers`.
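
The suggested guard could look like the sketch below, using `java.util.concurrent` collections as stand-ins for BookKeeper's `ConcurrentLongHashSet`/`ConcurrentLongLongHashMap` (class and method names here are illustrative, not the PR's code):

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative sketch: skip the last-entry cache update for ledgers that are
// pending deletion, so addLocation() cannot repopulate a stale cache entry
// between delete() and removeOffsetFromDeletedLedgers().
public class LastEntryCacheSketch {
    private final Set<Long> deletedLedgers = ConcurrentHashMap.newKeySet();
    private final ConcurrentHashMap<Long, Long> lastEntryCache = new ConcurrentHashMap<>();

    void delete(long ledgerId) {
        deletedLedgers.add(ledgerId);
        lastEntryCache.remove(ledgerId);
    }

    void onAddLocation(long ledgerId, long entryId) {
        if (deletedLedgers.contains(ledgerId)) {
            return; // ledger is pending deletion; do not repopulate the cache
        }
        // Keep the largest entryId seen for the ledger.
        lastEntryCache.merge(ledgerId, entryId, Math::max);
    }

    Long cachedLastEntry(long ledgerId) {
        return lastEntryCache.get(ledgerId);
    }
}
```

With this shape, an `addLocation()` that races with a pending delete leaves no cache entry behind, so a later `removeOffsetFromDeletedLedgers()` cannot expose a stale value.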
##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java:
##########
@@ -115,6 +126,18 @@ public long getLastEntryInLedger(long ledgerId) throws IOException {
     }
     private long getLastEntryInLedgerInternal(long ledgerId) throws IOException {
+        // Check in-memory cache first to avoid expensive getFloor() calls.
+        // ConcurrentLongLongHashMap.get() returns -1 if not found.
+        long cachedLastEntry = lastEntryCache.get(ledgerId);
+        if (cachedLastEntry >= 0) {
+            if (log.isDebugEnabled()) {
+                log.debug("Found last entry for ledger {} in cache: {}", ledgerId, cachedLastEntry);
+            }
+            stats.getGetLastEntryInLedgerStats()
+                    .registerSuccessfulEvent(0, TimeUnit.NANOSECONDS);
+            return cachedLastEntry;
Review Comment:
The cache-hit path records a latency of `0` via `registerSuccessfulEvent(0,
...)`. This will skew `getLastEntryInLedger` latency metrics (and can make
averages look artificially perfect under high cache-hit rates). Please record
the actual elapsed time for cache hits (e.g., measure from method entry), or use
a separate counter/metric for cache hits if you want to distinguish them.
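
A minimal sketch of the measure-from-method-entry approach (the `LatencyRecorder` interface below is a stand-in for BookKeeper's stats logger; in the PR this would be `stats.getGetLastEntryInLedgerStats().registerSuccessfulEvent(...)`):

```java
import java.util.concurrent.TimeUnit;

// Illustrative sketch: measure elapsed time from method entry so the cache-hit
// path reports its real (small, but non-zero) latency instead of a constant 0.
public class CacheHitTimingSketch {
    interface LatencyRecorder {
        void registerSuccessfulEvent(long duration, TimeUnit unit);
    }

    static long getLastEntryWithTiming(long cachedLastEntry, LatencyRecorder recorder) {
        long startNanos = System.nanoTime();
        if (cachedLastEntry >= 0) {
            // Record the actual elapsed time for the cache hit.
            recorder.registerSuccessfulEvent(System.nanoTime() - startNanos, TimeUnit.NANOSECONDS);
            return cachedLastEntry;
        }
        return -1; // fall through to the getFloor() path in the real code
    }
}
```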
##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java:
##########
@@ -139,6 +162,11 @@ private long getLastEntryInLedgerInternal(long ledgerId) throws IOException {
         if (log.isDebugEnabled()) {
             log.debug("Found last page in storage db for ledger {} - last entry: {}", ledgerId, lastEntryId);
         }
+        // Populate cache for future lookups
+        if (lastEntryCache.size() >= lastEntryCacheMaxSize) {
+            lastEntryCache.clear();
+        }
+        lastEntryCache.put(ledgerId, lastEntryId);
Review Comment:
`ConcurrentLongLongHashMap` requires *values* to be `>= 0` (it uses `-1` as
a sentinel). `lastEntryId` can be `-1` when the last key for a ledger is the
special entry (see existing tests that write entryId `-1`), so
`lastEntryCache.put(ledgerId, lastEntryId)` can throw
`IllegalArgumentException` and break `getLastEntryInLedger()`. Consider
guarding against negative `lastEntryId`, or storing an encoded value (e.g.,
`entryId + 1`) so `-1` can be cached safely.
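
The encoded-value option could be as simple as the sketch below (method names are illustrative): shift every cached value up by one so the special entryId `-1` is stored as `0`, which satisfies the map's non-negative value requirement, and shift back on lookup.

```java
// Illustrative sketch of the suggested encoding: store entryId + 1 so the
// special entryId -1 maps to 0, which ConcurrentLongLongHashMap accepts
// (its values must be >= 0 because -1 is the "not found" sentinel).
public class LastEntryEncodingSketch {
    static long encode(long entryId) {
        return entryId + 1; // -1 -> 0, 0 -> 1, ...
    }

    static long decode(long cachedValue) {
        return cachedValue - 1; // inverse mapping applied on cache hits
    }
}
```

A cache miss is then detected by `get()` returning the unshifted sentinel `-1`, which can never collide with an encoded value.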
##########
bookkeeper-server/src/main/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndex.java:
##########
@@ -45,15 +46,25 @@
  */
 public class EntryLocationIndex implements Closeable {
+    static final String LAST_ENTRY_CACHE_MAX_SIZE = "dbStorage_lastEntryCacheMaxSize";
+    private static final long DEFAULT_LAST_ENTRY_CACHE_MAX_SIZE = 10_000;
+
     private final KeyValueStorage locationsDb;
     private final ConcurrentLongHashSet deletedLedgers = ConcurrentLongHashSet.newBuilder().build();
+    private final ConcurrentLongLongHashMap lastEntryCache;
+    private final long lastEntryCacheMaxSize;
     private final EntryLocationIndexStats stats;
     private boolean isCompacting;
     public EntryLocationIndex(ServerConfiguration conf, KeyValueStorageFactory storageFactory, String basePath,
             StatsLogger stats) throws IOException {
         locationsDb = storageFactory.newKeyValueStorage(basePath, "locations", DbConfigType.EntryLocation, conf);
+        this.lastEntryCacheMaxSize = conf.getLong(LAST_ENTRY_CACHE_MAX_SIZE, DEFAULT_LAST_ENTRY_CACHE_MAX_SIZE);
+        this.lastEntryCache = ConcurrentLongLongHashMap.newBuilder()
+                .expectedItems((int) Math.min(lastEntryCacheMaxSize, Integer.MAX_VALUE))
+                .build();
Review Comment:
   `lastEntryCacheMaxSize` is used directly as `expectedItems(...)` with the
default `ConcurrentLongLongHashMap` concurrency level (16).
`ConcurrentLongLongHashMap` enforces `expectedItems > 0` and `expectedItems >=
concurrencyLevel`, so setting `dbStorage_lastEntryCacheMaxSize` to a small
value (e.g., 1–15) or 0 will throw `IllegalArgumentException` during
`EntryLocationIndex` construction. Please clamp/validate the config and/or
explicitly set the cache map's concurrency level (e.g., `concurrencyLevel(min(16,
expectedItems))`), and treat `<= 0` as "cache disabled".
```suggestion
        long configuredLastEntryCacheMaxSize = conf.getLong(LAST_ENTRY_CACHE_MAX_SIZE, DEFAULT_LAST_ENTRY_CACHE_MAX_SIZE);
        if (configuredLastEntryCacheMaxSize <= 0) {
            // Treat non-positive values as "cache disabled", but still create a minimal map
            this.lastEntryCacheMaxSize = 0;
            this.lastEntryCache = ConcurrentLongLongHashMap.newBuilder()
                    .expectedItems(1)
                    .concurrencyLevel(1)
                    .build();
        } else {
            this.lastEntryCacheMaxSize = configuredLastEntryCacheMaxSize;
            int expectedItems = (int) Math.min(this.lastEntryCacheMaxSize, Integer.MAX_VALUE);
            int concurrencyLevel = Math.min(16, expectedItems);
            this.lastEntryCache = ConcurrentLongLongHashMap.newBuilder()
                    .expectedItems(expectedItems)
                    .concurrencyLevel(concurrencyLevel)
                    .build();
        }
```
##########
bookkeeper-server/src/test/java/org/apache/bookkeeper/bookie/storage/ldb/EntryLocationIndexTest.java:
##########
@@ -205,6 +207,52 @@ public void testDeleteSpecialEntry() throws IOException {
assertEquals(0, idx.getLocation(40312, 10));
}
+ @Test
+ public void testGetLastEntryInLedgerCache() throws Exception {
+ File tmpDir = File.createTempFile("bkTest", ".dir");
+ tmpDir.delete();
+ tmpDir.mkdir();
+ tmpDir.deleteOnExit();
+
+        EntryLocationIndex idx = new EntryLocationIndex(serverConfiguration, KeyValueStorageRocksDB.factory,
+                tmpDir.getAbsolutePath(), NullStatsLogger.INSTANCE);
+
+ // Add entries for ledger 1
+ idx.addLocation(1, 0, 100);
+ idx.addLocation(1, 1, 101);
+ idx.addLocation(1, 2, 102);
+
+ // Add entries for ledger 2
+ idx.addLocation(2, 0, 200);
+ idx.addLocation(2, 5, 205);
+
+ // First call should hit RocksDB and populate cache
+ assertEquals(2, idx.getLastEntryInLedger(1));
+ assertEquals(5, idx.getLastEntryInLedger(2));
+
+ // Second call should hit cache and return same result
+ assertEquals(2, idx.getLastEntryInLedger(1));
+ assertEquals(5, idx.getLastEntryInLedger(2));
+
+ // Adding a newer entry should update cache
+ idx.addLocation(1, 10, 110);
+ assertEquals(10, idx.getLastEntryInLedger(1));
+
+ // Delete should invalidate cache
+ idx.delete(1);
+ try {
+ idx.getLastEntryInLedger(1);
+ fail("Should have thrown NoEntryException");
+ } catch (Bookie.NoEntryException e) {
+ // expected
+ }
Review Comment:
This test’s “cache invalidation on delete” assertion currently relies on
`deletedLedgers.contains(ledgerId)` throwing before the cache is consulted, so
it doesn’t actually verify that the cache entry was cleared (or that stale
cache entries can’t leak after `removeOffsetFromDeletedLedgers()` removes the
ledger from `deletedLedgers`). To validate cache invalidation, consider calling
`idx.removeOffsetFromDeletedLedgers()` after `idx.delete(1)` and then asserting
`getLastEntryInLedger(1)` throws because the DB entries are gone (which would
fail if the cache still returned a stale value).
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]