[ 
https://issues.apache.org/jira/browse/OAK-12212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Joerg Hoh resolved OAK-12212.
-----------------------------
    Resolution: Fixed

> Drifts in PersistentDiskCache.cacheSize counter
> -----------------------------------------------
>
>                 Key: OAK-12212
>                 URL: https://issues.apache.org/jira/browse/OAK-12212
>             Project: Jackrabbit Oak
>          Issue Type: Bug
>          Components: segment-azure
>    Affects Versions: 2.0.0
>            Reporter: Joerg Hoh
>            Assignee: Joerg Hoh
>            Priority: Major
>             Fix For: 2.2.0
>
>
> h3. Problem
> The page rendering of an AEM instance got very slow; inspection of the 
> metrics showed that the element count in the SegmentDiskCache was 
> consistently close to 0 (instead of the tens of thousands that is normal), 
> and there was a very high rate of evictions. Thread dumps showed that 
> requests were constantly reaching out to the blobstore.
> h3. Observation
> A heap dump of a long-running instance shows:
> * PersistentDiskCache.maxCacheSizeBytes ≈ 20 GiB (matches the configured 
> value)
> * AbstractPersistentCache.cacheSize (an AtomicLong, inherited) ≈ 80 GiB — 
> roughly 4× the configured maximum
> The actual cache directory on disk stays at or below the configured limit; 
> only the in-memory counter has run away.
> h3. Root cause
> {{PersistentDiskCache.writeSegment(...)}} adds {{fileSize}} to the in-memory 
> {{cacheSize}} on every invocation that reaches the write body, but the 
> corresponding file on disk is replaced — not added — when the same segment id 
> is written more than once. The writesPending guard inside {{writeSegment}} 
> only prevents concurrently running tasks for the same id; it does not prevent 
> sequentially submitted tasks. On POSIX file systems, {{Files.move(..., 
> ATOMIC_MOVE)}} maps to rename(2) and silently replaces the destination, so 
> the second (and subsequent) writes leave the directory unchanged in size 
> while still incrementing the counter.
> The eviction loop ({{cleanUpInternal}}) walks the directory and subtracts the 
> actual length of each deleted file once. The "phantom" bytes contributed by 
> redundant writes are therefore never repaid and accumulate monotonically over 
> the lifetime of the JVM.
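
The drift described above can be reproduced in isolation. The following is a minimal sketch, not the Oak code: the class name, the static counter, and the simplified writeSegment are hypothetical stand-ins that only mirror the accounting pattern described in the report (add the file size on every write, subtract the actual file length once on eviction). It assumes a file system where an atomic move replaces an existing destination, as rename(2) does on POSIX.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Stream;

class CounterDriftSketch {
    // Hypothetical stand-in for the inherited in-memory cacheSize counter.
    static final AtomicLong cacheSize = new AtomicLong();

    // Simplified write path: write a temp file, move it over the final name,
    // then add the size to the counter unconditionally.
    static void writeSegment(Path dir, String segmentId, byte[] data) {
        try {
            Path tmp = Files.createTempFile(dir, segmentId, ".part");
            Files.write(tmp, data);
            // Assumption: the atomic move replaces an existing destination.
            Files.move(tmp, dir.resolve(segmentId), StandardCopyOption.ATOMIC_MOVE);
            cacheSize.addAndGet(data.length);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Sum of the lengths of all files currently in the directory.
    static long sizeOnDisk(Path dir) {
        try (Stream<Path> files = Files.list(dir)) {
            return files.mapToLong(p -> p.toFile().length()).sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns {counter after writes, bytes on disk after writes, counter after eviction}.
    static long[] run() {
        try {
            cacheSize.set(0);
            Path dir = Files.createTempDirectory("cache-drift");
            byte[] data = new byte[1024];
            // Four sequential writes of the same segment id.
            for (int i = 0; i < 4; i++) {
                writeSegment(dir, "segment-1", data);
            }
            long counterAfterWrites = cacheSize.get();
            long diskAfterWrites = sizeOnDisk(dir);

            // Eviction: delete the file and subtract its actual length once.
            Path segment = dir.resolve("segment-1");
            long length = Files.size(segment);
            Files.delete(segment);
            cacheSize.addAndGet(-length);
            Files.delete(dir);
            return new long[] {counterAfterWrites, diskAfterWrites, cacheSize.get()};
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        long[] r = run();
        System.out.println("counter after writes:   " + r[0]);
        System.out.println("on disk after writes:   " + r[1]);
        System.out.println("counter after eviction: " + r[2]);
    }
}
```

In this sketch the directory never holds more than one 1024-byte file, yet the counter still carries the bytes of the three redundant writes after the only file has been evicted.
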
> In addition, two smaller contributing factors keep the drift unidirectional 
> (upward):
> * cacheSize is initialized to 0 and is never reconciled against the existing 
> cache directory at startup; it relies entirely on incremental accounting 
> being correct.
> * The error branch of {{writeSegment}} deletes segmentFile on any 
> {{Files.move}} failure but does not decrement the counter for whatever 
> contribution that file previously made.
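
For the first bullet, one conceivable way to reconcile the counter at startup is to seed it from the actual directory contents. This is only an illustrative sketch; the issue text does not say how the fix in 2.2.0 was implemented, and the class and method names below are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicLong;
import java.util.stream.Stream;

class CacheSizeReconcileSketch {
    // Sum of the lengths of all regular files directly inside the cache directory.
    static long computeDirectorySize(Path dir) {
        try (Stream<Path> files = Files.list(dir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(p -> p.toFile().length())
                        .sum();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical initialization: start from what is on disk instead of 0.
    static AtomicLong initCounter(Path dir) {
        return new AtomicLong(computeDirectorySize(dir));
    }

    // Small self-check: two files of 100 and 250 bytes should yield 350.
    static long demo() {
        try {
            Path dir = Files.createTempDirectory("cache-reconcile");
            Path a = Files.write(dir.resolve("segment-a"), new byte[100]);
            Path b = Files.write(dir.resolve("segment-b"), new byte[250]);
            long size = initCounter(dir).get();
            Files.delete(a);
            Files.delete(b);
            Files.delete(dir);
            return size;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("initial counter: " + demo());
    }
}
```

Seeding the counter this way only corrects the starting value; it does not by itself address the double counting of repeated writes.
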
> h3. Triggering workloads
> Any workload that produces multiple writes for the same segment id over 
> time: concurrent cache misses on the same segment (e.g. during compaction, 
> online GC, indexing, mass traversal, standby replication, warm-up after 
> restart). The probability per workload determines the rate at which the 
> counter diverges; instances that run for weeks or months will drift by tens 
> of GiB regardless of how the workload looks at any given moment.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
