gnodet commented on PR #22311:
URL: https://github.com/apache/camel/pull/22311#issuecomment-4148963092
_Claude Code on behalf of Guillaume Nodet_
## Follow-up: What about fixing `SimpleLRUCache` directly?
### The root cause
`SimpleLRUCache.remove(Object key)` (line 183) does `delegate.remove(key)`
but **doesn't decrement `totalChanges`** and **doesn't remove the key's entry
from `lastChanges`**. So the deque accumulates "ghost" entries — keys that were
added and then removed from the delegate but still sit in the deque.
When `totalChanges` exceeds `2 * maximumCacheSize` (100,000 for the file
component), `compressChangesIfNeeded()` fires on every subsequent `put()`. It
iterates the entire deque, deduplicates by key, but ghost entries for removed
keys still survive deduplication (they're unique keys). The deque never shrinks
below the number of distinct ghost keys.
### What a fix would look like
1. Decrement `totalChanges` in `remove()` to prevent unbounded growth:
```java
@Override
public V remove(Object key) {
ValueHolder<V> removed = delegate.remove(key);
if (removed != null) {
totalChanges.decrementAndGet();
}
return extractValue(removed);
}
```
2. Filter out ghost entries in `compressChangesIfNeeded()`:
```java
private void compressChangesIfNeeded() {
if (isQueueFull()) {
Deque<Entry<K, ValueHolder<V>>> newChanges = new
ConcurrentLinkedDeque<>();
Deque<Entry<K, ValueHolder<V>>> currentChanges =
lastChanges.getAndSet(newChanges);
Set<K> keys = new HashSet<>();
Entry<K, ValueHolder<V>> entry;
while ((entry = currentChanges.pollLast()) != null) {
// skip ghost entries for keys no longer in the cache
if (keys.add(entry.getKey()) &&
delegate.containsKey(entry.getKey())) {
newChanges.addFirst(entry);
}
}
totalChanges.set(newChanges.size());
}
}
```
### Benefits vs. drawbacks
| | Fix `SimpleLRUCache` | This PR's `LinkedHashMap` approach |
|---|---|---|
| **Scope** | Fixes all callers (file, AWS S3, IBM COS, any future use) |
Only fixes the file component |
| **Complexity** | Small, targeted change in existing code | Adds a new
factory method + new code path |
| **LRU semantics** | Preserves LRU eviction (access-order) | Switches to
insertion-order eviction (FIFO) |
| **Thread safety** | Already thread-safe (`ConcurrentHashMap` + locks) |
Relies on `MemoryIdempotentRepository`'s lock; `getCacheSize()`/`doStop()` are
unguarded |
| **Risk** | Higher — `SimpleLRUCache` is used broadly, subtle concurrency
bugs possible | Lower — isolated change, only affects file component default |
| **Performance** | Slightly heavier (deque + ConcurrentHashMap overhead) |
Lighter (plain `LinkedHashMap`, no deque) |
### Recommendation
Ideally do both: fix `SimpleLRUCache` because the ghost entry bug is a real
defect that will bite other callers (AWS S3 and IBM COS endpoints have the same
in-progress repository pattern). But also keep the `LinkedHashMap` approach for
the in-progress repository — it's simpler, lighter, and a better semantic fit
since in-progress files don't need LRU reordering (they're added, processed,
then removed). The `SimpleLRUCache` fix could be tracked as a separate JIRA
ticket.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]