rishabhdaim commented on PR #2904:
URL: https://github.com/apache/jackrabbit-oak/pull/2904#issuecomment-4475600231
thanks @ChlineSaurus for the review. I have asked re-review for the AI
suggested review and below are the findings:
Verdict on each AI claim
> Claim 1: "Cleanup already clears the cache — this adds a redundant second
cold-cache window"
Mostly correct, but overstated. _DefaultCleanupStrategy.cleanup()_ (line 41)
does unconditionally call _context.getSegmentCache().clear()_. The PR adds a
second clear in _notifyCompactionSucceeded →
AbstractCompactionStrategy.java:82_. So there are now two clears per GC
cycle.
However "strictly worse" overstates it. The first clear's intent is
legitimate: evict old-gen segments before new-gen reads start filling the
cache, so new-gen content doesn't compete with stale entries. The
problem is that cleanup immediately follows and clears everything again
anyway — so anything loaded between the two clears is thrown away. This is a
real but mild design issue, not catastrophic.
> Claim 2: "Contradicts CleanupFirstCompactionStrategy's 'transient
segments' comment"
**Misleading**. That comment is in
_CleanupFirstCompactionStrategy.java:60_ and refers to the pre-compaction
cleanup phase — when the new generation hasn't been committed yet and transient
(in-flight) segments exist.
The PR's clear happens after _setHead()_ committed the new generation.
These are different phases with different semantics. The "transient segments"
concern simply doesn't apply post-commit.
> Claim 3: "Wrong order — clears L2 before L1 old-gen eviction"
Technically true but practically negligible. The
_cacheManager.evictOldGeneration()_ call happens inside
_compactionSucceeded()_, which fires after the new code's cache clear. But
after _setHead()_ commits the new
generation, concurrent writers already use the new generation. The window
between the two operations where old-gen segments could sneak back into L2 is
real but vanishingly small.
> Claim 4: "Partial-success path clears too"
**Valid**. _notifyCompactionSucceeded()_ is called from both
_compactionSucceeded()_ (line 288) and _compactionPartiallySucceeded()_ (line
292). For partial success, segments from the new partial generation were just
committed and are about to be read — clearing them from L2 hurts without
clear benefit. This is analogous to why _CleanupFirstCompactionStrategy_
explicitly avoids clearing: those partially-compacted segments are
live data. This is the strongest point in the review.
> Claim 5: "Race between clear() and concurrent putSegment()"
Overstated. The _putSegment()_ implementation calls _id.loaded(segment)_
before _cache.put(id, segment)_. If a concurrent _clear()_ fires between those
two calls, the L1 memoization _(SegmentId.segment)_ gets cleared by
_onRemove() → id.unloaded()_. This results in an L1 miss on the next
access, not data corruption — the segment is still loaded from L2 or disk
correctly. Moreover, this same race already existed before this PR from
the _cleanup's clear()_. Not a new issue.
P.S. : The one claim worth acting on is _#4: notifyCompactionSucceeded()_
should probably skip the cache clear when _!compacted.isComplete() (partial
success)_.
The double-clear concern (#1) is also worth discussing with reviewers —
either justify why the early clear adds value despite cleanup doing the same,
or remove it and rely solely on cleanup's existing clear.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]