[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update
[ https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699407#comment-14699407 ] Michael Dürig commented on OAK-3168: OAK-3055 needs to be merged first should we decide to merge this. > SegmentCache flushes Segment on update > -- > > Key: OAK-3168 > URL: https://issues.apache.org/jira/browse/OAK-3168 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu > Fix For: 1.3.4 > > > The SegmentCache currently uses the cache eviction call to remove the Segment > instance from memory to help keep the cache memory requirements under control > [0]. > What I've noticed though, is that for a cache update (existing key) there > will also be an eviction call happening, which results in a lot of extra IO > pressure on the SegmentStore which not only is not able to cache the segment, > but is forced to reload it multiple times as the reference gets nullified > after each load. > This comes from the sampling behavior of the SegmentId: it will not hit the > cache each time it needs to load a new Segment, but rather load it from IO > and (re)place it in the cache, based on a sampling rate [1]. > Now I see 2 options: > * change the cache code to _not_ call the eviction callback on updates (or > allow disabling this call on updates) > * change the SegmentTracker code to add the value to the cache only if it's > not there as Segments are immutable, so no harm done. > Raised this issue offline with [~tmueller], [~mduerig] first and as I > understand [~mduerig] is in favor of option one, while [~tmueller] proposed > that the Lirs cache impl should be inline with what the guava cache does, and > depending on that we could choose the right fix here. > Hope this covers everything. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133 > [1] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update
[ https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653441#comment-14653441 ] Alex Parvulescu commented on OAK-3168: -- bq. AFAICS the simplest fix is to do nothing in the cache eviction handler when the passed segment is null I would not rely on impl details :) Let's have a proper contract for the eviction notification that will explicitly distinguish between different types of evictions, as noted earlier. > SegmentCache flushes Segment on update > -- > > Key: OAK-3168 > URL: https://issues.apache.org/jira/browse/OAK-3168 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu > Fix For: 1.3.5 > > > The SegmentCache currently uses the cache eviction call to remove the Segment > instance from memory to help keep the cache memory requirements under control > [0]. > What I've noticed though, is that for a cache update (existing key) there > will also be an eviction call happening, which results in a lot of extra IO > pressure on the SegmentStore which not only is not able to cache the segment, > but is forced to reload it multiple times as the reference gets nullified > after each load. > This comes from the sampling behavior of the SegmentId: it will not hit the > cache each time it needs to load a new Segment, but rather load it from IO > and (re)place it in the cache, based on a sampling rate [1]. > Now I see 2 options: > * change the cache code to _not_ call the eviction callback on updates (or > allow disabling this call on updates) > * change the SegmentTracker code to add the value to the cache only if it's > not there as Segments are immutable, so no harm done. > Raised this issue offline with [~tmueller], [~mduerig] first and as I > understand [~mduerig] is in favor of option one, while [~tmueller] proposed > that the Lirs cache impl should be inline with what the guava cache does, and > depending on that we could choose the right fix here. > Hope this covers everything. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133 > [1] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update
[ https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653432#comment-14653432 ] Michael Dürig commented on OAK-3168: Just hit this while working on something else. AFAICS the simplest fix is to do nothing in the cache eviction handler when the passed segment is {{null}}. This avoids fresh entries to the cache to be evicted immediately again. > SegmentCache flushes Segment on update > -- > > Key: OAK-3168 > URL: https://issues.apache.org/jira/browse/OAK-3168 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu > Fix For: 1.3.5 > > > The SegmentCache currently uses the cache eviction call to remove the Segment > instance from memory to help keep the cache memory requirements under control > [0]. > What I've noticed though, is that for a cache update (existing key) there > will also be an eviction call happening, which results in a lot of extra IO > pressure on the SegmentStore which not only is not able to cache the segment, > but is forced to reload it multiple times as the reference gets nullified > after each load. > This comes from the sampling behavior of the SegmentId: it will not hit the > cache each time it needs to load a new Segment, but rather load it from IO > and (re)place it in the cache, based on a sampling rate [1]. > Now I see 2 options: > * change the cache code to _not_ call the eviction callback on updates (or > allow disabling this call on updates) > * change the SegmentTracker code to add the value to the cache only if it's > not there as Segments are immutable, so no harm done. > Raised this issue offline with [~tmueller], [~mduerig] first and as I > understand [~mduerig] is in favor of option one, while [~tmueller] proposed > that the Lirs cache impl should be inline with what the guava cache does, and > depending on that we could choose the right fix here. > Hope this covers everything. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133 > [1] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update
[ https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651599#comment-14651599 ] Michael Dürig commented on OAK-3168: Couldn't we add a {code} putIfAbsent(K key, V value, int memory) {code} method to the cache? This would be cleanest solution IMO. > SegmentCache flushes Segment on update > -- > > Key: OAK-3168 > URL: https://issues.apache.org/jira/browse/OAK-3168 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu > Fix For: 1.3.5 > > > The SegmentCache currently uses the cache eviction call to remove the Segment > instance from memory to help keep the cache memory requirements under control > [0]. > What I've noticed though, is that for a cache update (existing key) there > will also be an eviction call happening, which results in a lot of extra IO > pressure on the SegmentStore which not only is not able to cache the segment, > but is forced to reload it multiple times as the reference gets nullified > after each load. > This comes from the sampling behavior of the SegmentId: it will not hit the > cache each time it needs to load a new Segment, but rather load it from IO > and (re)place it in the cache, based on a sampling rate [1]. > Now I see 2 options: > * change the cache code to _not_ call the eviction callback on updates (or > allow disabling this call on updates) > * change the SegmentTracker code to add the value to the cache only if it's > not there as Segments are immutable, so no harm done. > Raised this issue offline with [~tmueller], [~mduerig] first and as I > understand [~mduerig] is in favor of option one, while [~tmueller] proposed > that the Lirs cache impl should be inline with what the guava cache does, and > depending on that we could choose the right fix here. > Hope this covers everything. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133 > [1] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update
[ https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647779#comment-14647779 ] Alex Parvulescu commented on OAK-3168: -- bq. Would it be OK for you if I try that? yes sure! I didn't want to put too much pressure on you to change the cache code :) bq. Possibly patch the removal clause changes look good, I can't say much about the loader related stuff > SegmentCache flushes Segment on update > -- > > Key: OAK-3168 > URL: https://issues.apache.org/jira/browse/OAK-3168 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu > Fix For: 1.3.5 > > > The SegmentCache currently uses the cache eviction call to remove the Segment > instance from memory to help keep the cache memory requirements under control > [0]. > What I've noticed though, is that for a cache update (existing key) there > will also be an eviction call happening, which results in a lot of extra IO > pressure on the SegmentStore which not only is not able to cache the segment, > but is forced to reload it multiple times as the reference gets nullified > after each load. > This comes from the sampling behavior of the SegmentId: it will not hit the > cache each time it needs to load a new Segment, but rather load it from IO > and (re)place it in the cache, based on a sampling rate [1]. > Now I see 2 options: > * change the cache code to _not_ call the eviction callback on updates (or > allow disabling this call on updates) > * change the SegmentTracker code to add the value to the cache only if it's > not there as Segments are immutable, so no harm done. > Raised this issue offline with [~tmueller], [~mduerig] first and as I > understand [~mduerig] is in favor of option one, while [~tmueller] proposed > that the Lirs cache impl should be inline with what the guava cache does, and > depending on that we could choose the right fix here. > Hope this covers everything. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133 > [1] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update
[ https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647676#comment-14647676 ] Thomas Mueller commented on OAK-3168: - Possibly patch (just for the cache; no tests yet, and SegmentCache not yet changed): {noformat} --- src/main/java/org/apache/jackrabbit/oak/cache/CacheLIRS.java (revision 1692465) +++ src/main/java/org/apache/jackrabbit/oak/cache/CacheLIRS.java (working copy) @@ -34,6 +34,7 @@ import com.google.common.cache.CacheLoader; import com.google.common.cache.CacheStats; import com.google.common.cache.LoadingCache; +import com.google.common.cache.RemovalCause; import com.google.common.cache.Weigher; import com.google.common.collect.ImmutableMap; import com.google.common.util.concurrent.ListenableFuture; @@ -93,10 +94,11 @@ * The method may be called twice for the same key (first if the entry * is resident, and later if the entry is non-resident). * + * @param cause the removal cause * @param key the evicted item's key * @param value the evicted item's value or {@code null} if non-resident */ -void evicted(@Nonnull K key, @Nullable V value); +void evicted(RemovalCause cause, @Nonnull K key, @Nullable V value); } /** @@ -205,13 +207,13 @@ } } -void evicted(Entry entry) { +void evicted(RemovalCause cause, Entry entry) { if (evicted == null) { return; } K key = entry.key; if (key != null) { -evicted.evicted(key, entry.value); +evicted.evicted(cause, key, entry.value); } } @@ -300,9 +302,18 @@ * @throws ExecutionException */ @Override -public V get(K key) throws ExecutionException { +public V get(final K key) throws ExecutionException { int hash = getHash(key); -return getSegment(hash).get(key, hash, loader); +if (loader != null) { +return getSegment(hash).get(key, hash, +new Callable() { +@Override +public V call() throws Exception { +return loader.load(key); +} +}); +} +return getSegment(hash).get(key, hash); } /** @@ -383,7 +394,7 @@ @Override public void invalidate(Object key) { int hash = getHash(key); -getSegment(hash).invalidate(key, hash); +getSegment(hash).invalidate(RemovalCause.EXPLICIT, key, hash); } /** @@ -777,16 +788,16 @@ public void evictedAll() { for (Entry e = stack.stackNext; e != stack; e = e.stackNext) { if (e.value != null) { -cache.evicted(e); +cache.evicted(RemovalCause.EXPLICIT, e); } } for (Entry e = queue.queueNext; e != queue; e = e.queueNext) { if (e.stackNext == null) { -cache.evicted(e); +cache.evicted(RemovalCause.EXPLICIT, e); } } for (Entry e = queue2.queueNext; e != queue2; e = e.queueNext) { -cache.evicted(e); +cache.evicted(RemovalCause.EXPLICIT, e); } } @@ -993,36 +1004,6 @@ return value; } -V get(K key, int hash, CacheLoader loader) throws ExecutionException { -// avoid synchronization if it's in the cache -V value = get(key, hash); -if (value != null) { -return value; -} -if (loader == null) { -return null; -} -synchronized (this) { -value = get(key, hash); -if (value != null) { -return value; -} -long start = System.nanoTime(); -try { -value = loader.load(key); -loadSuccessCount++; -} catch (Exception e) { -loadExceptionCount++; -throw new ExecutionException(e); -} finally { -long time = System.nanoTime() - start; -totalLoadTime += time; -} -put(key, hash, value, cache.sizeOf(key, value)); -return value; -} -} - synchronized V replace(K key, int hash, V value, int memory) { if (containsKey(key, hash)) { return put(key, hash, value, memory); @@ -1042,7 +1023,7 @@ synchronized boolean remove(Object key, int hash, Object value) { V old = get(key, hash); if (old != null && old.equals(value)) { -invalid
[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update
[ https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647668#comment-14647668 ] Thomas Mueller commented on OAK-3168: - > what Guava Cache is doing ... > the notification itself also contains the reason, of which 'REPLACED' can be > a possible value. Would it be OK for you if I try that? > SegmentCache flushes Segment on update > -- > > Key: OAK-3168 > URL: https://issues.apache.org/jira/browse/OAK-3168 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu > Fix For: 1.3.5 > > > The SegmentCache currently uses the cache eviction call to remove the Segment > instance from memory to help keep the cache memory requirements under control > [0]. > What I've noticed though, is that for a cache update (existing key) there > will also be an eviction call happening, which results in a lot of extra IO > pressure on the SegmentStore which not only is not able to cache the segment, > but is forced to reload it multiple times as the reference gets nullified > after each load. > This comes from the sampling behavior of the SegmentId: it will not hit the > cache each time it needs to load a new Segment, but rather load it from IO > and (re)place it in the cache, based on a sampling rate [1]. > Now I see 2 options: > * change the cache code to _not_ call the eviction callback on updates (or > allow disabling this call on updates) > * change the SegmentTracker code to add the value to the cache only if it's > not there as Segments are immutable, so no harm done. > Raised this issue offline with [~tmueller], [~mduerig] first and as I > understand [~mduerig] is in favor of option one, while [~tmueller] proposed > that the Lirs cache impl should be inline with what the guava cache does, and > depending on that we could choose the right fix here. > Hope this covers everything. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133 > [1] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update
[ https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647592#comment-14647592 ] Alex Parvulescu commented on OAK-3168: -- Unfortunately using 'putIfAbsent' doesn't look like a good option, as we also pass in the size of the segment: {code} put(K key, V value, int memory) {code} [~tmueller] what options do we have here? > SegmentCache flushes Segment on update > -- > > Key: OAK-3168 > URL: https://issues.apache.org/jira/browse/OAK-3168 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu > Fix For: 1.3.5 > > > The SegmentCache currently uses the cache eviction call to remove the Segment > instance from memory to help keep the cache memory requirements under control > [0]. > What I've noticed though, is that for a cache update (existing key) there > will also be an eviction call happening, which results in a lot of extra IO > pressure on the SegmentStore which not only is not able to cache the segment, > but is forced to reload it multiple times as the reference gets nullified > after each load. > This comes from the sampling behavior of the SegmentId: it will not hit the > cache each time it needs to load a new Segment, but rather load it from IO > and (re)place it in the cache, based on a sampling rate [1]. > Now I see 2 options: > * change the cache code to _not_ call the eviction callback on updates (or > allow disabling this call on updates) > * change the SegmentTracker code to add the value to the cache only if it's > not there as Segments are immutable, so no harm done. > Raised this issue offline with [~tmueller], [~mduerig] first and as I > understand [~mduerig] is in favor of option one, while [~tmueller] proposed > that the Lirs cache impl should be inline with what the guava cache does, and > depending on that we could choose the right fix here. > Hope this covers everything. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133 > [1] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update
[ https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647589#comment-14647589 ] Alex Parvulescu commented on OAK-3168: -- To followup with what Guava Cache is doing I see the removal listeners are a bit more complex [0], and the notification itself also contains the reason, of which 'REPLACED' [2] can be a possible value. Adding this level of info to the eviction callbacks might be a bit more involved, so I'm going to look into fixing the segment code. Along these lines I forgot to add Thomas suggestion to use 'putIfAbsent'. {code} cache.asMap().putIfAbsent(..) {code} [0] https://code.google.com/p/guava-libraries/wiki/CachesExplained#Removal_Listeners [1] http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/cache/RemovalCause.html [2] http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/cache/RemovalCause.html#REPLACED > SegmentCache flushes Segment on update > -- > > Key: OAK-3168 > URL: https://issues.apache.org/jira/browse/OAK-3168 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segmentmk >Reporter: Alex Parvulescu >Assignee: Alex Parvulescu > Fix For: 1.3.5 > > > The SegmentCache currently uses the cache eviction call to remove the Segment > instance from memory to help keep the cache memory requirements under control > [0]. > What I've noticed though, is that for a cache update (existing key) there > will also be an eviction call happening, which results in a lot of extra IO > pressure on the SegmentStore which not only is not able to cache the segment, > but is forced to reload it multiple times as the reference gets nullified > after each load. > This comes from the sampling behavior of the SegmentId: it will not hit the > cache each time it needs to load a new Segment, but rather load it from IO > and (re)place it in the cache, based on a sampling rate [1]. > Now I see 2 options: > * change the cache code to _not_ call the eviction callback on updates (or > allow disabling this call on updates) > * change the SegmentTracker code to add the value to the cache only if it's > not there as Segments are immutable, so no harm done. > Raised this issue offline with [~tmueller], [~mduerig] first and as I > understand [~mduerig] is in favor of option one, while [~tmueller] proposed > that the Lirs cache impl should be inline with what the guava cache does, and > depending on that we could choose the right fix here. > Hope this covers everything. > [0] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133 > [1] > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135 -- This message was sent by Atlassian JIRA (v6.3.4#6332)