[jira] [Updated] (OAK-6915) Minimize the amount of uncached segment reads
[ https://issues.apache.org/jira/browse/OAK-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dürig updated OAK-6915: --- Attachment: OAK-6915-05.patch 5 the version of the patch: [^OAK-6915-05.patch]. This is essentially the same as the 4th, just that I moved the call to {{SegmentId.loaded()}} inside the loader closure in {{SegmentCache.getSegment()}}. This doesn't change anything in behaviour but I think it does rely less on the implementation details of the cache and gives us stronger guarantees re. the eviction call actually happening *after* we call loaded. Additionally I moved updating the current cache weight statistics in {{SegmentCache#putSegment()}} before the call to {{Cache.put()}}. This ensures we always increment this number before we decrement it, even when {{Cache.put()}} results in immediate eviction. Finally I left a comment regarding the importance of properly ordering these statements in {{SegmentCache.putSegment()}}. [~frm], [~dulceanu], can you double check and fix the wording if necessary. The goal would be that we would still understand what this is about in two years from now. > Minimize the amount of uncached segment reads > - > > Key: OAK-6915 > URL: https://issues.apache.org/jira/browse/OAK-6915 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: 1.8, 1.7.12 > > Attachments: OAK-6915-01.patch, OAK-6915-02.patch, OAK-6915-03.patch, > OAK-6915-04.patch, OAK-6915-05.patch, OAK-6915-diagnostics-02.patch, > OAK-6915-diagnostics.patch, OAK-6915.patch, Screen Shot 2017-11-09 at > 14.14.28.png, Screen Shot 2017-11-09 at 14.16.59.png > > > The current implementation of {{SegmentCache}} should make better use of the > underlying Guava cache by relying on the cached segments instead of > unconditionally performing an uncached segment read via the > {{Callable}} passed to {{SegmentCache#getSegment}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (OAK-6915) Minimize the amount of uncached segment reads
[ https://issues.apache.org/jira/browse/OAK-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francesco Mari updated OAK-6915: Attachment: OAK-6915-04.patch The fourth version of the patch addresses both the concerns from my previous comment. > Minimize the amount of uncached segment reads > - > > Key: OAK-6915 > URL: https://issues.apache.org/jira/browse/OAK-6915 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: 1.8, 1.7.12 > > Attachments: OAK-6915-01.patch, OAK-6915-02.patch, OAK-6915-03.patch, > OAK-6915-04.patch, OAK-6915-diagnostics-02.patch, OAK-6915-diagnostics.patch, > OAK-6915.patch, Screen Shot 2017-11-09 at 14.14.28.png, Screen Shot > 2017-11-09 at 14.16.59.png > > > The current implementation of {{SegmentCache}} should make better use of the > underlying Guava cache by relying on the cached segments instead of > unconditionally performing an uncached segment read via the > {{Callable}} passed to {{SegmentCache#getSegment}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (OAK-6915) Minimize the amount of uncached segment reads
[ https://issues.apache.org/jira/browse/OAK-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francesco Mari updated OAK-6915: Attachment: OAK-6915-03.patch The third revision of the patch is a compromise between the second version and trunk. The Guava cache is more aggressively used like in the second version of the patch, and the {{SegmentId#unloaded}} method is called for evicted cache entries like in trunk. There is a slight difference between this patch and trunk when it comes to {{SegmentId#unloaded}}. The method affects only segments that are evicted from the cache. If a segment is never stored in the cache, but its insertion is rejected and it is evicted right away, {{SegmentId#unloaded}} will have no effect. Unit and integration tests pass. {{ManyChildNodesIT}} looks good with 2M nodes and 512 MB RAM. {{StandbyTestIT#testSyncLoop}} shows the following statistics when the diagnostics patch is applied. {noformat} TarMK data segment ID allocations: 38 TarMK bulk segment ID allocations: 2 TarMK uncached data segment reads: 33 TarMK uncached bulk segment reads: 2 TarMK written data segments..: 29 TarMK written bulk segments..: 2 {noformat} > Minimize the amount of uncached segment reads > - > > Key: OAK-6915 > URL: https://issues.apache.org/jira/browse/OAK-6915 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: 1.8, 1.7.12 > > Attachments: OAK-6915-01.patch, OAK-6915-02.patch, OAK-6915-03.patch, > OAK-6915-diagnostics-02.patch, OAK-6915-diagnostics.patch, OAK-6915.patch, > Screen Shot 2017-11-09 at 14.14.28.png, Screen Shot 2017-11-09 at 14.16.59.png > > > The current implementation of {{SegmentCache}} should make better use of the > underlying Guava cache by relying on the cached segments instead of > unconditionally performing an uncached segment read via the > {{Callable}} passed to {{SegmentCache#getSegment}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (OAK-6915) Minimize the amount of uncached segment reads
[ https://issues.apache.org/jira/browse/OAK-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dürig updated OAK-6915: --- Attachment: Screen Shot 2017-11-09 at 14.14.28.png Screen Shot 2017-11-09 at 14.16.59.png On OAK-6106 not unloading the segments caused OOMEs with {{ManyChildNodeIT}}. I quickly double checked running this test on trunk commenting the calls to {{Segment.unloaded()}} out and could reproduce the OOME. Attached screenshots show memory and JCM GC with and without the unloaded call when running that test on a 4GB heap and 20M nodes. With unloading disabled there is an OOME at 14:10:30 !Screen Shot 2017-11-09 at 14.14.28.png|width=500! With unloading enabled heap and GC activity is stable: !Screen Shot 2017-11-09 at 14.16.59.png|width=500! I thus think we also need to evaluate our patches for this issue under this aspect. > Minimize the amount of uncached segment reads > - > > Key: OAK-6915 > URL: https://issues.apache.org/jira/browse/OAK-6915 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: 1.8, 1.7.12 > > Attachments: OAK-6915-01.patch, OAK-6915-02.patch, > OAK-6915-diagnostics-02.patch, OAK-6915-diagnostics.patch, OAK-6915.patch, > Screen Shot 2017-11-09 at 14.14.28.png, Screen Shot 2017-11-09 at 14.16.59.png > > > The current implementation of {{SegmentCache}} should make better use of the > underlying Guava cache by relying on the cached segments instead of > unconditionally performing an uncached segment read via the > {{Callable}} passed to {{SegmentCache#getSegment}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (OAK-6915) Minimize the amount of uncached segment reads
[ https://issues.apache.org/jira/browse/OAK-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francesco Mari updated OAK-6915: Attachment: OAK-6915-diagnostics-02.patch I'm not convinced about the third version of the patch and the usefulness of {{SegmentId#unloaded}}. In order to gather some data, I improved the diagnostics patch. The patch now computes how many segments are written, so to have a feeling about the amount of unique segments the {{FileStore}} has to work with. Additionally, every statistics has been split for data and bulk segments. Since {{SegmentCache}} behaves differently for data and bulk segments, it makes sense to gather separate numbers. I run {{StandbyTestIT#testSyncLoop}} with the diagnostic patch in place and some variations. |trunk|{noformat}TarMK data segment ID allocations: 54 TarMK bulk segment ID allocations: 4 TarMK uncached data segment reads: 99854 TarMK uncached bulk segment reads: 2 TarMK written data segments..: 30 TarMK written bulk segments..: 2 {noformat}| |OAK-6915.patch|{noformat}TarMK data segment ID allocations: 56 TarMK bulk segment ID allocations: 4 TarMK uncached data segment reads: 101649 TarMK uncached bulk segment reads: 2 TarMK written data segments..: 32 TarMK written bulk segments..: 2{noformat}| |OAK-6915-02.patch|{noformat}TarMK data segment ID allocations: 38 TarMK bulk segment ID allocations: 2 TarMK uncached data segment reads: 33 TarMK uncached bulk segment reads: 2 TarMK written data segments..: 29 TarMK written bulk segments..: 2{noformat}| Even if we ignore the number of uncached data segment reads, the ratio between data segment ID allocations and number of written data segments shows something important: segment IDs are not as perfectly internalized as we thought. This has two important consequences. First, no matter if we call {{SegmentId#unloaded}}, there will probably be another {{SegmentId}} out there with the same MSB/LSB that contains a reference to the segment. Actually, as speculated in OAK-6919, {{SegmentCache}} itself might be the culprit of this behaviour. Second, {{SegmentCache}} should not depend on {{SegmentId}} to be perfectly internalized. It is very easy to break this design assumption, and the consequences of it are usually disastrous. It is, in my opinion, better to implement a cache that doesn't work with this assumption in mind. > Minimize the amount of uncached segment reads > - > > Key: OAK-6915 > URL: https://issues.apache.org/jira/browse/OAK-6915 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: 1.8, 1.7.12 > > Attachments: OAK-6915-01.patch, OAK-6915-02.patch, > OAK-6915-diagnostics-02.patch, OAK-6915-diagnostics.patch, OAK-6915.patch > > > The current implementation of {{SegmentCache}} should make better use of the > underlying Guava cache by relying on the cached segments instead of > unconditionally performing an uncached segment read via the > {{Callable}} passed to {{SegmentCache#getSegment}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (OAK-6915) Minimize the amount of uncached segment reads
[ https://issues.apache.org/jira/browse/OAK-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dürig updated OAK-6915: --- Attachment: OAK-6915.patch [^OAK-6915-03.patch] first checks the Guava cache before actually loading a segment from disk. With this test I got massive improvements for ConcurrentReadTest Unpatched: {noformat} # ConcurrentReadTest C min 10% 50% 90% max N Oak-Segment-Tar1 71 94 108 127 310 539 Oak-Segment-Tar-Cold 1 103892 103892 103892 103892 103892 1 {noformat} Patched: {noformat} # ConcurrentReadTest C min 10% 50% 90% max N Oak-Segment-Tar1 40 98 112 131 354 527 Oak-Segment-Tar-Cold 1 47 96 110 126 256 540 {noformat} Also checking with JMC confirmed that the unpatched version reads multiple GB from disk while the patched version reads virtually nothing. > Minimize the amount of uncached segment reads > - > > Key: OAK-6915 > URL: https://issues.apache.org/jira/browse/OAK-6915 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: 1.8, 1.7.12 > > Attachments: OAK-6915-01.patch, OAK-6915-02.patch, > OAK-6915-diagnostics.patch, OAK-6915.patch > > > The current implementation of {{SegmentCache}} should make better use of the > underlying Guava cache by relying on the cached segments instead of > unconditionally performing an uncached segment read via the > {{Callable}} passed to {{SegmentCache#getSegment}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (OAK-6915) Minimize the amount of uncached segment reads
[ https://issues.apache.org/jira/browse/OAK-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francesco Mari updated OAK-6915: Attachment: OAK-6915-02.patch The second version of the patch addresses the issues with the integration tests. Moreover, I removed the counters from the diagnostic patch that were made accidentally part of the first version of the patch. > Minimize the amount of uncached segment reads > - > > Key: OAK-6915 > URL: https://issues.apache.org/jira/browse/OAK-6915 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: 1.8, 1.7.12 > > Attachments: OAK-6915-01.patch, OAK-6915-02.patch, > OAK-6915-diagnostics.patch > > > The current implementation of {{SegmentCache}} should make better use of the > underlying Guava cache by relying on the cached segments instead of > unconditionally performing an uncached segment read via the > {{Callable}} passed to {{SegmentCache#getSegment}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (OAK-6915) Minimize the amount of uncached segment reads
[ https://issues.apache.org/jira/browse/OAK-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francesco Mari updated OAK-6915: Attachment: OAK-6915-01.patch The first version of the fix is a partial rewrite of the {{SegmentCache}} to make better use of the underlying Guava cache. The patch also contains a minor fix for a unit test and some additional readjustments of related classes. With this patch and the diagnostics patch in place, I run {{StandbyTestIT}} once more. While the amount of allocations of new {{SegmentId}} remains unchanged, the number of uncached segment reads is down to about 260. There is still some work to do. With the patch in place a couple of integration tests don't pass. Nevertheless, the preliminary data shows that the additional work might be worth the effort. > Minimize the amount of uncached segment reads > - > > Key: OAK-6915 > URL: https://issues.apache.org/jira/browse/OAK-6915 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: 1.8, 1.7.12 > > Attachments: OAK-6915-01.patch, OAK-6915-diagnostics.patch > > > The current implementation of {{SegmentCache}} should make better use of the > underlying Guava cache by relying on the cached segments instead of > unconditionally performing an uncached segment read via the > {{Callable}} passed to {{SegmentCache#getSegment}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (OAK-6915) Minimize the amount of uncached segment reads
[ https://issues.apache.org/jira/browse/OAK-6915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Francesco Mari updated OAK-6915: Attachment: OAK-6915-diagnostics.patch To prove the point, the diagnostic patch modifies the {{FileStore}} to output some statistics when it's closed. In particular, it shows the number of new {{SegmentId}} instances created and the number of uncached segment reads performed during the lifetime of the {{FileStore}}. With this patch in place, I run {{StandbyTestIT}}. The test shows about 390 {{SegmentId}} allocations and 600K uncached segment reads. > Minimize the amount of uncached segment reads > - > > Key: OAK-6915 > URL: https://issues.apache.org/jira/browse/OAK-6915 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: segment-tar >Reporter: Francesco Mari >Assignee: Francesco Mari > Fix For: 1.8, 1.7.12 > > Attachments: OAK-6915-diagnostics.patch > > > The current implementation of {{SegmentCache}} should make better use of the > underlying Guava cache by relying on the cached segments instead of > unconditionally performing an uncached segment read via the > {{Callable}} passed to {{SegmentCache#getSegment}}. -- This message was sent by Atlassian JIRA (v6.4.14#64029)