[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update

2015-08-17 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14699407#comment-14699407
 ] 

Michael Dürig commented on OAK-3168:


OAK-3055 needs to be merged first should we decide to merge this.

> SegmentCache flushes Segment on update
> --
>
> Key: OAK-3168
> URL: https://issues.apache.org/jira/browse/OAK-3168
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
> Fix For: 1.3.4
>
>
> The SegmentCache currently uses the cache eviction call to remove the Segment 
> instance from memory to help keep the cache memory requirements under control 
> [0].
> What I've noticed though, is that for a cache update (existing key) there 
> will also be an eviction call happening, which results in a lot of extra IO 
> pressure on the SegmentStore which not only is not able to cache the segment, 
> but is forced to reload it multiple times as the reference gets nullified 
> after each load.
> This comes from the sampling behavior of the SegmentId: it will not hit the 
> cache each time it needs to load a new Segment, but rather load it from IO 
> and (re)place it in the cache, based on a sampling rate [1].
> Now I see 2 options:
>  * change the cache code to _not_ call the eviction callback on updates (or 
> allow disabling this call on updates)
>  * change the SegmentTracker code to add the value to the cache only if it's 
> not there as Segments are immutable, so no harm done.
> Raised this issue offline with [~tmueller], [~mduerig] first and as I 
> understand [~mduerig] is in favor of option one, while [~tmueller] proposed 
> that the Lirs cache impl should be inline with what the guava cache does, and 
> depending on that we could choose the right fix here. 
> Hope this covers everything.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update

2015-08-04 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653441#comment-14653441
 ] 

Alex Parvulescu commented on OAK-3168:
--

bq. AFAICS the simplest fix is to do nothing in the cache eviction handler when 
the passed segment is null
I would not rely on impl details :) Let's have a proper contract for the 
eviction notification that will explicitly distinguish between different types 
of evictions, as noted earlier.

> SegmentCache flushes Segment on update
> --
>
> Key: OAK-3168
> URL: https://issues.apache.org/jira/browse/OAK-3168
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
> Fix For: 1.3.5
>
>
> The SegmentCache currently uses the cache eviction call to remove the Segment 
> instance from memory to help keep the cache memory requirements under control 
> [0].
> What I've noticed though, is that for a cache update (existing key) there 
> will also be an eviction call happening, which results in a lot of extra IO 
> pressure on the SegmentStore which not only is not able to cache the segment, 
> but is forced to reload it multiple times as the reference gets nullified 
> after each load.
> This comes from the sampling behavior of the SegmentId: it will not hit the 
> cache each time it needs to load a new Segment, but rather load it from IO 
> and (re)place it in the cache, based on a sampling rate [1].
> Now I see 2 options:
>  * change the cache code to _not_ call the eviction callback on updates (or 
> allow disabling this call on updates)
>  * change the SegmentTracker code to add the value to the cache only if it's 
> not there as Segments are immutable, so no harm done.
> Raised this issue offline with [~tmueller], [~mduerig] first and as I 
> understand [~mduerig] is in favor of option one, while [~tmueller] proposed 
> that the Lirs cache impl should be inline with what the guava cache does, and 
> depending on that we could choose the right fix here. 
> Hope this covers everything.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update

2015-08-04 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14653432#comment-14653432
 ] 

Michael Dürig commented on OAK-3168:


Just hit this while working on something else. AFAICS the simplest fix is to do 
nothing in the cache eviction handler when the passed segment is {{null}}. This 
avoids fresh entries to the cache to be evicted immediately again. 

> SegmentCache flushes Segment on update
> --
>
> Key: OAK-3168
> URL: https://issues.apache.org/jira/browse/OAK-3168
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
> Fix For: 1.3.5
>
>
> The SegmentCache currently uses the cache eviction call to remove the Segment 
> instance from memory to help keep the cache memory requirements under control 
> [0].
> What I've noticed though, is that for a cache update (existing key) there 
> will also be an eviction call happening, which results in a lot of extra IO 
> pressure on the SegmentStore which not only is not able to cache the segment, 
> but is forced to reload it multiple times as the reference gets nullified 
> after each load.
> This comes from the sampling behavior of the SegmentId: it will not hit the 
> cache each time it needs to load a new Segment, but rather load it from IO 
> and (re)place it in the cache, based on a sampling rate [1].
> Now I see 2 options:
>  * change the cache code to _not_ call the eviction callback on updates (or 
> allow disabling this call on updates)
>  * change the SegmentTracker code to add the value to the cache only if it's 
> not there as Segments are immutable, so no harm done.
> Raised this issue offline with [~tmueller], [~mduerig] first and as I 
> understand [~mduerig] is in favor of option one, while [~tmueller] proposed 
> that the Lirs cache impl should be inline with what the guava cache does, and 
> depending on that we could choose the right fix here. 
> Hope this covers everything.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update

2015-08-03 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14651599#comment-14651599
 ] 

Michael Dürig commented on OAK-3168:


Couldn't we add a 

{code}
putIfAbsent(K key, V value, int memory)
{code}

method to the cache? This would be cleanest solution IMO.

> SegmentCache flushes Segment on update
> --
>
> Key: OAK-3168
> URL: https://issues.apache.org/jira/browse/OAK-3168
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
> Fix For: 1.3.5
>
>
> The SegmentCache currently uses the cache eviction call to remove the Segment 
> instance from memory to help keep the cache memory requirements under control 
> [0].
> What I've noticed though, is that for a cache update (existing key) there 
> will also be an eviction call happening, which results in a lot of extra IO 
> pressure on the SegmentStore which not only is not able to cache the segment, 
> but is forced to reload it multiple times as the reference gets nullified 
> after each load.
> This comes from the sampling behavior of the SegmentId: it will not hit the 
> cache each time it needs to load a new Segment, but rather load it from IO 
> and (re)place it in the cache, based on a sampling rate [1].
> Now I see 2 options:
>  * change the cache code to _not_ call the eviction callback on updates (or 
> allow disabling this call on updates)
>  * change the SegmentTracker code to add the value to the cache only if it's 
> not there as Segments are immutable, so no harm done.
> Raised this issue offline with [~tmueller], [~mduerig] first and as I 
> understand [~mduerig] is in favor of option one, while [~tmueller] proposed 
> that the Lirs cache impl should be inline with what the guava cache does, and 
> depending on that we could choose the right fix here. 
> Hope this covers everything.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update

2015-07-30 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647779#comment-14647779
 ] 

Alex Parvulescu commented on OAK-3168:
--

bq. Would it be OK for you if I try that?
yes sure! I didn't want to put too much pressure on you to change the cache 
code :)

bq. Possibly patch
the removal clause changes look good, I can't say much about the loader related 
stuff

> SegmentCache flushes Segment on update
> --
>
> Key: OAK-3168
> URL: https://issues.apache.org/jira/browse/OAK-3168
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
> Fix For: 1.3.5
>
>
> The SegmentCache currently uses the cache eviction call to remove the Segment 
> instance from memory to help keep the cache memory requirements under control 
> [0].
> What I've noticed though, is that for a cache update (existing key) there 
> will also be an eviction call happening, which results in a lot of extra IO 
> pressure on the SegmentStore which not only is not able to cache the segment, 
> but is forced to reload it multiple times as the reference gets nullified 
> after each load.
> This comes from the sampling behavior of the SegmentId: it will not hit the 
> cache each time it needs to load a new Segment, but rather load it from IO 
> and (re)place it in the cache, based on a sampling rate [1].
> Now I see 2 options:
>  * change the cache code to _not_ call the eviction callback on updates (or 
> allow disabling this call on updates)
>  * change the SegmentTracker code to add the value to the cache only if it's 
> not there as Segments are immutable, so no harm done.
> Raised this issue offline with [~tmueller], [~mduerig] first and as I 
> understand [~mduerig] is in favor of option one, while [~tmueller] proposed 
> that the Lirs cache impl should be inline with what the guava cache does, and 
> depending on that we could choose the right fix here. 
> Hope this covers everything.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update

2015-07-30 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647676#comment-14647676
 ] 

Thomas Mueller commented on OAK-3168:
-

Possibly patch (just for the cache; no tests yet, and SegmentCache not yet 
changed):


{noformat}
--- src/main/java/org/apache/jackrabbit/oak/cache/CacheLIRS.java
(revision 1692465)
+++ src/main/java/org/apache/jackrabbit/oak/cache/CacheLIRS.java
(working copy)
@@ -34,6 +34,7 @@
 import com.google.common.cache.CacheLoader;
 import com.google.common.cache.CacheStats;
 import com.google.common.cache.LoadingCache;
+import com.google.common.cache.RemovalCause;
 import com.google.common.cache.Weigher;
 import com.google.common.collect.ImmutableMap;
 import com.google.common.util.concurrent.ListenableFuture;
@@ -93,10 +94,11 @@
  * The method may be called twice for the same key (first if the entry
  * is resident, and later if the entry is non-resident).
  * 
+ * @param cause the removal cause
  * @param key the evicted item's key
  * @param value the evicted item's value or {@code null} if 
non-resident
  */
-void evicted(@Nonnull K key, @Nullable V value);
+void evicted(RemovalCause cause, @Nonnull K key, @Nullable V value);
 }
 
 /**
@@ -205,13 +207,13 @@
 }
 }
 
-void evicted(Entry entry) {
+void evicted(RemovalCause cause, Entry entry) {
 if (evicted == null) {
 return;
 }
 K key = entry.key;
 if (key != null) {
-evicted.evicted(key, entry.value);
+evicted.evicted(cause, key, entry.value);
 }
 }
 
@@ -300,9 +302,18 @@
  * @throws ExecutionException
  */
 @Override
-public V get(K key) throws ExecutionException {
+public V get(final K key) throws ExecutionException {
 int hash = getHash(key);
-return getSegment(hash).get(key, hash, loader);
+if (loader != null) {
+return getSegment(hash).get(key, hash, 
+new Callable() {
+@Override
+public V call() throws Exception {
+return loader.load(key);
+}
+});
+}
+return getSegment(hash).get(key, hash);
 }
 
 /**
@@ -383,7 +394,7 @@
 @Override
 public void invalidate(Object key) {
 int hash = getHash(key);
-getSegment(hash).invalidate(key, hash);
+getSegment(hash).invalidate(RemovalCause.EXPLICIT, key, hash);
 }
 
 /**
@@ -777,16 +788,16 @@
 public void evictedAll() {
 for (Entry e = stack.stackNext; e != stack; e = e.stackNext) 
{
 if (e.value != null) {
-cache.evicted(e);
+cache.evicted(RemovalCause.EXPLICIT, e);
 }
 }
 for (Entry e = queue.queueNext; e != queue; e = e.queueNext) 
{
 if (e.stackNext == null) {
-cache.evicted(e);
+cache.evicted(RemovalCause.EXPLICIT, e);
 }
 }
 for (Entry e = queue2.queueNext; e != queue2; e = 
e.queueNext) {
-cache.evicted(e);
+cache.evicted(RemovalCause.EXPLICIT, e);
 }
 }
 
@@ -993,36 +1004,6 @@
 return value;
 }
 
-V get(K key, int hash, CacheLoader loader) throws 
ExecutionException {
-// avoid synchronization if it's in the cache
-V value = get(key, hash);
-if (value != null) {
-return value;
-}
-if (loader == null) {
-return null;
-}
-synchronized (this) {
-value = get(key, hash);
-if (value != null) {
-return value;
-}
-long start = System.nanoTime();
-try {
-value = loader.load(key);
-loadSuccessCount++;
-} catch (Exception e) {
-loadExceptionCount++;
-throw new ExecutionException(e);
-} finally {
-long time = System.nanoTime() - start;
-totalLoadTime += time;
-}
-put(key, hash, value, cache.sizeOf(key, value));
-return value;
-}
-}
-
 synchronized V replace(K key, int hash, V value, int memory) {
 if (containsKey(key, hash)) {
 return put(key, hash, value, memory);
@@ -1042,7 +1023,7 @@
 synchronized boolean remove(Object key, int hash, Object value) {
 V old = get(key, hash);
 if (old != null && old.equals(value)) {
-invalid

[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update

2015-07-30 Thread Thomas Mueller (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647668#comment-14647668
 ] 

Thomas Mueller commented on OAK-3168:
-

> what Guava Cache is doing ...
> the notification itself also contains the reason, of which 'REPLACED' can be 
> a possible value.

Would it be OK for you if I try that?

> SegmentCache flushes Segment on update
> --
>
> Key: OAK-3168
> URL: https://issues.apache.org/jira/browse/OAK-3168
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
> Fix For: 1.3.5
>
>
> The SegmentCache currently uses the cache eviction call to remove the Segment 
> instance from memory to help keep the cache memory requirements under control 
> [0].
> What I've noticed though, is that for a cache update (existing key) there 
> will also be an eviction call happening, which results in a lot of extra IO 
> pressure on the SegmentStore which not only is not able to cache the segment, 
> but is forced to reload it multiple times as the reference gets nullified 
> after each load.
> This comes from the sampling behavior of the SegmentId: it will not hit the 
> cache each time it needs to load a new Segment, but rather load it from IO 
> and (re)place it in the cache, based on a sampling rate [1].
> Now I see 2 options:
>  * change the cache code to _not_ call the eviction callback on updates (or 
> allow disabling this call on updates)
>  * change the SegmentTracker code to add the value to the cache only if it's 
> not there as Segments are immutable, so no harm done.
> Raised this issue offline with [~tmueller], [~mduerig] first and as I 
> understand [~mduerig] is in favor of option one, while [~tmueller] proposed 
> that the Lirs cache impl should be inline with what the guava cache does, and 
> depending on that we could choose the right fix here. 
> Hope this covers everything.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update

2015-07-30 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647592#comment-14647592
 ] 

Alex Parvulescu commented on OAK-3168:
--

Unfortunately using 'putIfAbsent' doesn't look like a good option, as we also 
pass in the size of the segment: 
{code}
put(K key, V value, int memory)
{code}

[~tmueller] what options do we have here?

> SegmentCache flushes Segment on update
> --
>
> Key: OAK-3168
> URL: https://issues.apache.org/jira/browse/OAK-3168
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
> Fix For: 1.3.5
>
>
> The SegmentCache currently uses the cache eviction call to remove the Segment 
> instance from memory to help keep the cache memory requirements under control 
> [0].
> What I've noticed though, is that for a cache update (existing key) there 
> will also be an eviction call happening, which results in a lot of extra IO 
> pressure on the SegmentStore which not only is not able to cache the segment, 
> but is forced to reload it multiple times as the reference gets nullified 
> after each load.
> This comes from the sampling behavior of the SegmentId: it will not hit the 
> cache each time it needs to load a new Segment, but rather load it from IO 
> and (re)place it in the cache, based on a sampling rate [1].
> Now I see 2 options:
>  * change the cache code to _not_ call the eviction callback on updates (or 
> allow disabling this call on updates)
>  * change the SegmentTracker code to add the value to the cache only if it's 
> not there as Segments are immutable, so no harm done.
> Raised this issue offline with [~tmueller], [~mduerig] first and as I 
> understand [~mduerig] is in favor of option one, while [~tmueller] proposed 
> that the Lirs cache impl should be inline with what the guava cache does, and 
> depending on that we could choose the right fix here. 
> Hope this covers everything.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3168) SegmentCache flushes Segment on update

2015-07-30 Thread Alex Parvulescu (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14647589#comment-14647589
 ] 

Alex Parvulescu commented on OAK-3168:
--

To followup with what Guava Cache is doing I see the removal listeners are a 
bit more complex [0], and the notification itself also contains the reason, of 
which 'REPLACED' [2] can be a possible value.
Adding this level of info to the eviction callbacks might be a bit more 
involved, so I'm going to look into fixing the segment code.
Along these lines I forgot to add Thomas suggestion to use 'putIfAbsent'.
{code}
cache.asMap().putIfAbsent(..)
{code}


[0] 
https://code.google.com/p/guava-libraries/wiki/CachesExplained#Removal_Listeners
[1] 
http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/cache/RemovalCause.html
[2] 
http://docs.guava-libraries.googlecode.com/git-history/release/javadoc/com/google/common/cache/RemovalCause.html#REPLACED

> SegmentCache flushes Segment on update
> --
>
> Key: OAK-3168
> URL: https://issues.apache.org/jira/browse/OAK-3168
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
> Fix For: 1.3.5
>
>
> The SegmentCache currently uses the cache eviction call to remove the Segment 
> instance from memory to help keep the cache memory requirements under control 
> [0].
> What I've noticed though, is that for a cache update (existing key) there 
> will also be an eviction call happening, which results in a lot of extra IO 
> pressure on the SegmentStore which not only is not able to cache the segment, 
> but is forced to reload it multiple times as the reference gets nullified 
> after each load.
> This comes from the sampling behavior of the SegmentId: it will not hit the 
> cache each time it needs to load a new Segment, but rather load it from IO 
> and (re)place it in the cache, based on a sampling rate [1].
> Now I see 2 options:
>  * change the cache code to _not_ call the eviction callback on updates (or 
> allow disabling this call on updates)
>  * change the SegmentTracker code to add the value to the cache only if it's 
> not there as Segments are immutable, so no harm done.
> Raised this issue offline with [~tmueller], [~mduerig] first and as I 
> understand [~mduerig] is in favor of option one, while [~tmueller] proposed 
> that the Lirs cache impl should be inline with what the guava cache does, and 
> depending on that we could choose the right fix here. 
> Hope this covers everything.
> [0] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentTracker.java#L133
> [1] 
> https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/segment/SegmentId.java#L135



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)