[ https://issues.apache.org/jira/browse/OAK-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16244909#comment-16244909 ]
Michael Dürig commented on OAK-6915: ------------------------------------ Wow, what an amazing find! Thanks [~dulceanu] and [~frm] for so much endurance. AFAIU the core of the problem is {{SegmentCache.getSegment()}} always reading from disk instead of trying the Guava cache first. A bit frightening that we didn't notice this for so long in any of our tests. The reason why reversing the statements in {{SegmentCache.put()}} (basically undoing the [changes|http://svn.apache.org/viewvc/jackrabbit/oak/trunk/oak-segment-tar/src/main/java/org/apache/jackrabbit/oak/segment/SegmentCache.java?r1=1793527&r2=1793526&pathrev=1793527] from r1793527) seemingly fixes the problem is that in that case {{SegmentId.unloaded()}} is not called for some segments that are evicted from the Guava cache. Those segments will then stay memoized in the {{SegmentId.segment}} field and thus will not need to be loaded from there on. I say seemingly fixes the problem as reversing those statements back is not the right solution IMO as in this case we introduce a memory leak in the {{SegmentId}} class. This is what the comment in {{SegmentCache.put()}} hints at: putting an item into the Guava cache can cause it to be evicted right away. Even before the put method returns. If we call {{SegmentId.loaded()}} after the call to {{SegmentCache.put()}}, we'll never ever get the corresponding call to {{SegmentId.unloaded()}} any more, hence a leak. That behaviour of the Guava cache can easily be reproduced in the debugger by setting the cache size to 1M and observing the sequence of the calls to {{SegmentCache.put()}} and {{SegmentCache.onRemove()}}. The striping of the cache (16.stripes) actually exacerbates the issue as each stripe causes an eviction once its size overflows 1/16th of the total size of the cache. > Minimize the amount of uncached segment reads > --------------------------------------------- > > Key: OAK-6915 > URL: https://issues.apache.org/jira/browse/OAK-6915 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar > Reporter: Francesco Mari > Assignee: Francesco Mari > Fix For: 1.8, 1.7.12 > > Attachments: OAK-6915-01.patch, OAK-6915-02.patch, > OAK-6915-diagnostics.patch > > > The current implementation of {{SegmentCache}} should make better use of the > underlying Guava cache by relying on the cached segments instead of > unconditionally performing an uncached segment read via the > {{Callable<Segment>}} passed to {{SegmentCache#getSegment}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)