[jira] [Closed] (OAK-4501) Avoid reading segment when reading strings and the string is in the cache

2016-07-14 Thread Amit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4501?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Jain closed OAK-4501.
--

Bulk Close for 1.2.17

> Avoid reading segment when reading strings and the string is in the cache
> -
>
> Key: OAK-4501
> URL: https://issues.apache.org/jira/browse/OAK-4501
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segmentmk
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
> Fix For: 1.4.0, 1.2.17
>
>
> Partial backport of OAK-3330, focused only on the {{Segment#readString}} 
> optimization. Unfortunately OAK-3330 has 2 other commits which are nearly 
> impossible to port, so I'm going to cherry-pick the change I'm interested in, 
> which is sufficiently decoupled to have its own dedicated issue.
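A minimal sketch of the read path this optimization targets, assuming a cache keyed by record id (the `StringCache` class and loader shape below are illustrative, not Oak's actual API): on a cache hit the string is returned without touching, and possibly loading, the underlying segment.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical sketch: consult a string cache keyed by record id before
// reading the segment. Only on a miss is the segment actually read.
class StringCache {
    private final Map<String, String> strings = new ConcurrentHashMap<>();

    String readString(String recordId, Function<String, String> readFromSegment) {
        // Cache hit returns immediately; the loader (segment read) runs only on a miss.
        return strings.computeIfAbsent(recordId, readFromSegment);
    }
}
```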



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Closed] (OAK-4546) Long running DocumentNodeStoreTest

2016-07-14 Thread Amit Jain (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4546?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amit Jain closed OAK-4546.
--

Bulk Close for 1.2.17

> Long running DocumentNodeStoreTest
> --
>
> Key: OAK-4546
> URL: https://issues.apache.org/jira/browse/OAK-4546
> Project: Jackrabbit Oak
>  Issue Type: Test
>  Components: core, documentmk
>Affects Versions: 1.0, 1.2
>Reporter: Marcel Reutegger
>Assignee: Marcel Reutegger
>Priority: Minor
> Fix For: 1.0.32, 1.2.17
>
>
> The test takes 40 to 50 seconds to execute. This is way too long for a unit 
> test. It only affects 1.0 and 1.2 branches. 1.4 and trunk look fine (6 
> seconds).





[jira] [Closed] (OAK-3309) Segment Tar SegmentCache loader stats

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari closed OAK-3309.
---

Bulk close for Oak Segment Tar 0.0.4.

> Segment Tar SegmentCache loader stats
> -
>
> Key: OAK-3309
> URL: https://issues.apache.org/jira/browse/OAK-3309
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: segment-tar
>Reporter: Alex Parvulescu
>Assignee: Alex Parvulescu
>  Labels: gc
> Fix For: 1.6, Segment Tar 0.0.4
>
> Attachments: OAK-3309.patch
>
>
> The existing Segment Cache has no loading-related stats; I'd like to see how 
> complicated it would be to add some.





[jira] [Closed] (OAK-4525) Unreferenced node records are not marked as root records in the segment

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari closed OAK-4525.
---

Bulk close for Oak Segment Tar 0.0.4.

> Unreferenced node records are not marked as root records in the segment
> ---
>
> Key: OAK-4525
> URL: https://issues.apache.org/jira/browse/OAK-4525
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
> Fix For: Segment Tar 0.0.4
>
>
> When a new node record is written, if that record is not referenced by any 
> other record in the segment, it should be marked as a root record in the 
> segment header.





[jira] [Closed] (OAK-4201) Add an index of binary references in a tar file

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari closed OAK-4201.
---

Bulk close for Oak Segment Tar 0.0.4.

> Add an index of binary references in a tar file
> ---
>
> Key: OAK-4201
> URL: https://issues.apache.org/jira/browse/OAK-4201
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Chetan Mehrotra
>Assignee: Francesco Mari
> Fix For: Segment Tar 0.0.4
>
> Attachments: OAK-4201-01.patch
>
>
> Currently, for blob GC in the segment case, {{SegmentBlobReferenceRetriever}} 
> goes through all tar files and extracts the binary references. This has two 
> issues:
> # The logic has to go through all the segments in all tar files
> # All segments get loaded into memory once, which would affect normal system 
> performance
> This process can be optimized if we also write a file entry in the tar 
> (similar to the gph, i.e. graph, and idx, i.e. index, files) which lists all 
> binary references referred to in any segment present in that tar file. The GC 
> logic would then just have to read this file and avoid scanning all the 
> segments.
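As a rough illustration of the proposed index (the class and on-disk layout are assumptions, not Oak's actual format): each tar file would carry a small map from segment id to the binary references it contains, so GC consumes one entry per tar instead of scanning every segment.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Illustrative in-memory shape of a per-tar binary references index.
class BinaryReferencesIndex {
    private final Map<UUID, Set<String>> refsBySegment = new HashMap<>();

    void addReference(UUID segmentId, String blobReference) {
        refsBySegment.computeIfAbsent(segmentId, id -> new HashSet<>()).add(blobReference);
    }

    // What the GC side would consume: all blob references in this tar file,
    // without loading any segment.
    Set<String> allReferences() {
        Set<String> all = new HashSet<>();
        refsBySegment.values().forEach(all::addAll);
        return all;
    }
}
```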





[jira] [Closed] (OAK-4260) Define and implement migration from oak-segment to oak-segment-tar

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari closed OAK-4260.
---

Bulk close for Oak Segment Tar 0.0.4.

> Define and implement migration from oak-segment to oak-segment-tar
> --
>
> Key: OAK-4260
> URL: https://issues.apache.org/jira/browse/OAK-4260
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar, segmentmk, upgrade
>Reporter: Michael Dürig
>Assignee: Tomek Rękawek
>  Labels: migration
> Fix For: 1.6, Segment Tar 0.0.4
>
>
> We need to come up with a plan, implementation and documentation for how we 
> deal with migrating from {{oak-segment}} to {{oak-segment-next}}. 





[jira] [Closed] (OAK-3468) Replace BackgroundThread with Scheduler

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari closed OAK-3468.
---

Bulk close for Oak Segment Tar 0.0.4.

> Replace BackgroundThread with Scheduler
> ---
>
> Key: OAK-3468
> URL: https://issues.apache.org/jira/browse/OAK-3468
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Francesco Mari
>  Labels: technical_debt
> Fix For: Segment Tar 0.0.4
>
>
> I think we should replace the background thread with some kind of scheduler. 
> The goal would be to decouple threading from scheduling. IMO, threads should 
> not be managed by the application but by the container. 
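A minimal sketch of that decoupling, under the assumption that the scheduler is simply an injected `ScheduledExecutorService` (the `PeriodicTask` name is illustrative, not Oak's API): the component submits its periodic work rather than owning a thread, so thread management moves to whoever supplies the executor.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

// Hypothetical component that used to own a background thread and now
// delegates scheduling to an injected executor.
class PeriodicTask {
    private final ScheduledExecutorService scheduler;

    PeriodicTask(ScheduledExecutorService scheduler) {
        this.scheduler = scheduler;
    }

    ScheduledFuture<?> start(Runnable task, long periodMillis) {
        // Pool size, thread naming, shutdown policy etc. are decided by
        // whoever configured the executor, not by this component.
        return scheduler.scheduleAtFixedRate(task, periodMillis, periodMillis, TimeUnit.MILLISECONDS);
    }
}
```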





[jira] [Updated] (OAK-4467) Upgrade commons-io to 2.5 and remove ReversedLinesFileReader

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4467?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4467:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Upgrade commons-io to 2.5 and remove ReversedLinesFileReader
> 
>
> Key: OAK-4467
> URL: https://issues.apache.org/jira/browse/OAK-4467
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: auth-external, blob, commons, core, examples, parent, 
> pojosr, run, segment-tar, segmentmk, webapp
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: technical_debt
> Fix For: 1.6, Segment Tar 0.0.6
>
> Attachments: OAK_4467.patch
>
>
> For OAK-2605 we copied the source of {{ReversedLinesFileReader}} to Oak to 
> get the fix for IO-471 in. As this is now fixed in {{commons-io}} 2.5, I 
> suggest we upgrade our dependency and remove that duplicated class.





[jira] [Updated] (OAK-4506) CompactionAndCleanupIT.offlineCompaction() fails on Windows

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4506?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4506:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> CompactionAndCleanupIT.offlineCompaction() fails on Windows
> ---
>
> Key: OAK-4506
> URL: https://issues.apache.org/jira/browse/OAK-4506
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
> Fix For: Segment Tar 0.0.6
>
> Attachments: OAK-4506-01.patch
>
>
> The integration test {{CompactionAndCleanupIT.offlineCompaction()}} has been 
> reported to fail on Windows.
> {noformat}
> Running org.apache.jackrabbit.oak.segment.CompactionAndCleanupIT
> Tests run: 12, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 47.951 sec 
> <<< FAILURE!
> offlineCompaction(org.apache.jackrabbit.oak.segment.CompactionAndCleanupIT)  
> Time elapsed: 6.005 sec  <<< FAILURE!
> java.lang.AssertionError: File Store 1st blob added size expected in interval 
> [15415808,15468236] but was: 15571456
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at 
> org.apache.jackrabbit.oak.segment.CompactionAndCleanupIT.assertSize(CompactionAndCleanupIT.java:473)
> at 
> org.apache.jackrabbit.oak.segment.CompactionAndCleanupIT.offlineCompaction(CompactionAndCleanupIT.java:209)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:606)
> at 
> org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:50)
> {noformat}





[jira] [Updated] (OAK-4450) Properly split the FileStore into read-only and r/w variants

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4450?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4450:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Properly split the FileStore into read-only and r/w variants 
> -
>
> Key: OAK-4450
> URL: https://issues.apache.org/jira/browse/OAK-4450
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: technical_debt
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> The {{ReadOnlyFileStore}} class currently simply extends the {{FileStore}} 
> class, overriding all mutator methods with trivial implementations. This 
> approach, however, leaks into its ancestor, as the read-only store needs to 
> pass a flag to the constructor of its super class so that some fields can be 
> instantiated properly for the read-only case. 
> We should clean this up to properly separate the read-only and the r/w store. 
> Most likely we should factor the commonalities into a common abstract base 
> class.





[jira] [Updated] (OAK-3696) Improve SegmentMK resilience

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-3696:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Improve SegmentMK resilience
> 
>
> Key: OAK-3696
> URL: https://issues.apache.org/jira/browse/OAK-3696
> Project: Jackrabbit Oak
>  Issue Type: Epic
>  Components: segment-tar
>Reporter: Michael Marth
>Assignee: Michael Dürig
>  Labels: resilience
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> Epic for collecting SegmentMK resilience improvements





[jira] [Updated] (OAK-4104) Refactor reading records from segments

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4104:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Refactor reading records from segments
> --
>
> Key: OAK-4104
> URL: https://issues.apache.org/jira/browse/OAK-4104
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: technical_debt
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> We should refactor how records (e.g. node states) are read from segments. 
> Currently this logic is scattered and replicated across various places, all of 
> which hard-code certain indexes into a byte buffer (see the calls to 
> {{Record.getOffset}} for how bad this is). 
> The current implementation makes it very hard to maintain the code and evolve 
> the segment format. Ideally we should have one place per segment version 
> defining the format as a single source of truth, which is then reused by the 
> various parts of the SegmentMK, tooling and tests. 





[jira] [Updated] (OAK-4040) Some classes from o.a.j.o.plugins.segment.compaction should be exported

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4040?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4040:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Some classes from o.a.j.o.plugins.segment.compaction should be exported
> ---
>
> Key: OAK-4040
> URL: https://issues.apache.org/jira/browse/OAK-4040
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> Classes 
> {{org.apache.jackrabbit.oak.plugins.segment.compaction.CompactionStrategy}} 
> and 
> {{org.apache.jackrabbit.oak.plugins.segment.compaction.CompactionStrategyMBean}}
>  should be exported. The former is used in the public API of multiple classes 
> from {{org.apache.jackrabbit.oak.plugins.segment.file}} and 
> {{org.apache.jackrabbit.oak.plugins.segment}}, while the latter is used as 
> interface type for a service registered in the whiteboard.





[jira] [Updated] (OAK-4247) Deprecate oak-segment

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4247:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Deprecate oak-segment
> -
>
> Key: OAK-4247
> URL: https://issues.apache.org/jira/browse/OAK-4247
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar, segmentmk
>Reporter: Michael Dürig
>Priority: Critical
>  Labels: technical_debt
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> Before the next major release we need to deprecate {{oak-segment}} and make 
> {{oak-segment-tar}} the new default implementation:
> * Deprecate all classes in {{oak-segment}}
> * Update documentation to reflect this change
> * Update tooling to target {{oak-segment-tar}} (See OAK-4246). 
> * Update dependencies of upstream modules / projects from {{oak-segment}} to 
> {{oak-segment-tar}}. 
> * Ensure {{oak-segment-tar}} gets properly released (See OAK-4258). 





[jira] [Updated] (OAK-4293) Refactor / rework compaction gain estimation

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4293?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4293:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Refactor / rework compaction gain estimation 
> -
>
> Key: OAK-4293
> URL: https://issues.apache.org/jira/browse/OAK-4293
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Alex Parvulescu
>  Labels: gc
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> I think we have to take another look at {{CompactionGainEstimate}} and see 
> whether we can come up with a more efficient way to estimate the compaction 
> gain. The current implementation is expensive wrt. IO, CPU and cache 
> coherence. If we want to keep an estimation step, we need, IMO, to come up 
> with a cheap way (at least 2 orders of magnitude cheaper than compaction). 
> Otherwise I would actually propose to remove the current estimation approach 
> entirely.





[jira] [Updated] (OAK-2833) Refactor TarMK

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-2833:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Refactor TarMK
> --
>
> Key: OAK-2833
> URL: https://issues.apache.org/jira/browse/OAK-2833
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: technical_debt
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> Container issue for refactoring the TarMK to make it more testable, 
> maintainable, extensible and less entangled. 
> For example, the segment format should be readable and writeable through 
> standalone means so that tests, tools and production code can share this code. 
> Currently there is a lot of code duplication involved here. 





[jira] [Updated] (OAK-4106) Reclaimed size reported by FileStore.cleanup is off

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4106?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4106:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Reclaimed size reported by FileStore.cleanup is off
> ---
>
> Key: OAK-4106
> URL: https://issues.apache.org/jira/browse/OAK-4106
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Michael Dürig
>Priority: Minor
>  Labels: cleanup, gc
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> The current implementation simply reports the difference between the 
> repository size before cleanup and the size after cleanup. As cleanup runs 
> concurrently with other commits, the size increase contributed by those is not 
> accounted for. In the extreme case where cleanup cannot reclaim anything, this 
> can even result in negative values being reported. 
> We should either change the wording of the respective log message to speak 
> of before and after sizes, or adjust our calculation of the reclaimed size 
> (preferred). 
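A sketch of the preferred fix, under the assumption that the store can track how many bytes concurrent commits wrote while cleanup ran (names below are illustrative): adding those bytes back makes the reported value reflect what cleanup actually reclaimed instead of going negative when the store grew in the meantime.

```java
// Hypothetical corrected calculation of the reclaimed size.
class ReclaimedSize {
    static long reclaimed(long sizeBefore, long sizeAfter, long writtenDuringCleanup) {
        // Naive "before - after" reports -20 for (100, 120, 30); correcting
        // for concurrent writes yields the 10 bytes cleanup really reclaimed.
        return sizeBefore - sizeAfter + writtenDuringCleanup;
    }
}
```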





[jira] [Updated] (OAK-2896) Putting many elements into a map results in many small segments.

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2896?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-2896:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Putting many elements into a map results in many small segments. 
> -
>
> Key: OAK-2896
> URL: https://issues.apache.org/jira/browse/OAK-2896
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>Priority: Critical
>  Labels: performance
> Fix For: 1.6, Segment Tar 0.0.6
>
> Attachments: OAK-2896.png, OAK-2896.xlsx, size-dist.png
>
>
> There is an issue with how the HAMT implementation 
> ({{SegmentWriter.writeMap()}}) interacts with the 256 segment references limit 
> when putting many entries into the map: this limit gets regularly reached 
> once the map contains about 200k entries. At that point segments get 
> prematurely flushed, resulting in more segments, thus more references and thus 
> even smaller segments. It is common for segments to be as small as 7k, with a 
> tar file containing up to 35k segments. This is problematic, as at this point 
> handling of the segment graph becomes expensive, both memory and CPU wise. I 
> have seen persisted segment graphs as big as 35M where the usual size is a 
> couple of kilobytes. 
> As the HAMT map is used for storing the children of a node, this might have an 
> adverse effect on nodes with many child nodes. 
> The following code can be used to reproduce the issue: 
> {code}
> SegmentWriter writer = new SegmentWriter(segmentStore, getTracker(), V_11);
> MapRecord baseMap = null;
> for (;;) {
> Map map = newHashMap();
> for (int k = 0; k < 1000; k++) {
> RecordId stringId = 
> writer.writeString(String.valueOf(rnd.nextLong()));
> map.put(String.valueOf(rnd.nextLong()), stringId);
> }
> Stopwatch w = Stopwatch.createStarted();
> baseMap = writer.writeMap(baseMap, map);
> System.out.println(baseMap.size() + " " + w.elapsed());
> }
> {code}





[jira] [Updated] (OAK-4435) checkpointDeduplicationTest sometimes fails on Jenkins

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4435:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> checkpointDeduplicationTest sometimes fails on Jenkins
> --
>
> Key: OAK-4435
> URL: https://issues.apache.org/jira/browse/OAK-4435
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>Priority: Critical
>  Labels: compaction, gc, test
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> {{CompactionAndCleanupIT.checkpointDeduplication}} irregularly 
> [fails|https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/938/jdk=latest1.7,label=Ubuntu,nsfixtures=SEGMENT_MK,profile=integrationTesting/console]
>  on Jenkins. 
> This might point to an issue with the de-duplication caches, which are 
> crucial in getting the checkpoints de-duplicated. 
> {code}
> checkpointDeduplicationTest(org.apache.jackrabbit.oak.segment.CompactionAndCleanupIT)
>   Time elapsed: 0.15 sec  <<< FAILURE!
> org.junit.ComparisonFailure: 
> expected:<[7211975a-04ce-45ff-aff5-16795ec2cc72]:261932> but 
> was:<[11083c4b-9b2e-4d17-a8c0-8f6b1f2a3173]:261932>
>   at org.junit.Assert.assertEquals(Assert.java:115)
>   at org.junit.Assert.assertEquals(Assert.java:144)
>   at 
> org.apache.jackrabbit.oak.segment.CompactionAndCleanupIT.checkpointDeduplicationTest(CompactionAndCleanupIT.java:899)
> {code}





[jira] [Updated] (OAK-4274) Memory-mapped files can't be explicitly unmapped

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4274:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Memory-mapped files can't be explicitly unmapped
> 
>
> Key: OAK-4274
> URL: https://issues.apache.org/jira/browse/OAK-4274
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar, segmentmk
>Reporter: Francesco Mari
>  Labels: gc, resilience
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> As described by [this JDK 
> bug|http://bugs.java.com/view_bug.do?bug_id=4724038], there is no way to 
> explicitly unmap memory mapped files. A memory mapped file is unmapped only 
> if the corresponding {{MappedByteBuffer}} is garbage collected by the JVM.
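The limitation is easy to see in the standard API: `FileChannel.map` returns a `MappedByteBuffer` with no unmap or close method, so the mapping outlives the channel and is released only when the buffer is garbage collected. A small self-contained demonstration (the helper class is illustrative, not Oak code):

```java
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Demonstrates the JDK limitation referenced above: MappedByteBuffer exposes
// no unmap()/close(). Even after the channel is closed, the OS-level mapping
// stays alive until the buffer becomes garbage and its cleaner runs, which
// the application cannot trigger portably (e.g. keeping files locked on Windows).
class MappedFileDemo {
    static byte readFirstByte(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, ch.size());
            return buf.get(0); // reads via the mapping; no way to unmap buf here
        }
    }
}
```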





[jira] [Updated] (OAK-4292) Document Oak segment-tar

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4292:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Document Oak segment-tar
> 
>
> Key: OAK-4292
> URL: https://issues.apache.org/jira/browse/OAK-4292
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: doc, segment-tar
>Reporter: Michael Dürig
>  Labels: documentation, gc
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> Document Oak Segment Tar. Specifically:
> * New and changed configuration and monitoring options
> * Changes in gc (OAK-3348 et al.)
> * Changes in segment / tar format (OAK-3348)





[jira] [Updated] (OAK-4122) Replace the commit semaphore in the segment node store with a scheduler

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4122:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Replace the commit semaphore in the segment node store with a scheduler
> ---
>
> Key: OAK-4122
> URL: https://issues.apache.org/jira/browse/OAK-4122
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Francesco Mari
>  Labels: performance, scalability, throughput
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> {{SegmentNodeStore}} currently uses a semaphore to coordinate concurrent 
> commits, thus relying on the scheduling algorithm of that implementation, and 
> ultimately of the JVM, for the order in which commits are processed. 
> I think it would be beneficial to replace that semaphore with an explicit 
> queue of pending commits. This would allow us to implement a proper scheduler 
> optimising for e.g. minimal system load, maximal throughput or minimal 
> latency. A scheduler could e.g. give precedence to big commits and order 
> commits along the order of their base revisions, which would decrease the 
> amount of work to be done in rebasing. 
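The idea above can be sketched minimally (this is an assumption about the shape, not Oak's implementation): pending commits go into an explicit queue drained by a single scheduler thread, so the ordering policy is owned by the application rather than by the JVM's lock scheduling. FIFO here; "biggest commit first" or ordering by base revision would only change how the queue is polled.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical commit scheduler replacing a commit semaphore.
class CommitScheduler {
    private final BlockingQueue<Runnable> pending = new LinkedBlockingQueue<>();
    private final Thread worker = new Thread(this::drain, "commit-scheduler");

    CommitScheduler() {
        worker.setDaemon(true);
        worker.start();
    }

    void schedule(Runnable commit) {
        pending.add(commit);
    }

    private void drain() {
        try {
            while (true) {
                pending.take().run(); // one commit at a time, in queue order
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```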





[jira] [Updated] (OAK-3349) Partial compaction

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3349?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-3349:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Partial compaction
> --
>
> Key: OAK-3349
> URL: https://issues.apache.org/jira/browse/OAK-3349
> Project: Jackrabbit Oak
>  Issue Type: New Feature
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: compaction, gc
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> On big repositories compaction can take quite a while to run as it needs to 
> create a full deep copy of the current root node state. For such cases it 
> could be beneficial if we could partially compact the repository thus 
> splitting full compaction over multiple cycles. 
> Partial compaction would run compaction on a sub-tree just like we now run it 
> on the full tree. Afterwards it would create a new root node state by 
> referencing the previous root node state replacing said sub-tree with the 
> compacted one. 
> Todo: Assess feasibility and impact, implement prototype.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-4452) Consistently use the term segment-tar

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4452?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4452:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Consistently use the term segment-tar
> -
>
> Key: OAK-4452
> URL: https://issues.apache.org/jira/browse/OAK-4452
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: doc, segment-tar
>Reporter: Michael Dürig
>Priority: Minor
>  Labels: documentation, production
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> We should make an effort to consistently use the term "segment-tar" instead 
> of "SegmentMK", "TarMK", etc. in logging, exceptions, labels, descriptions, 
> documentation etc.





[jira] [Updated] (OAK-4445) Collect write statistics

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4445:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Collect write statistics 
> -
>
> Key: OAK-4445
> URL: https://issues.apache.org/jira/browse/OAK-4445
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Francesco Mari
>  Labels: compaction, gc, monitoring
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> We should come up with a good set of write statistics to collect, such as the 
> number of records/nodes/properties/bytes. Additionally, those statistics 
> should be collected for normal operation vs. compaction-related operation. This would 
> allow us to more precisely analyse the effect of compaction on the overall 
> system. 





[jira] [Updated] (OAK-2498) Root record references provide too little context for parsing a segment

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-2498:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Root record references provide too little context for parsing a segment
> ---
>
> Key: OAK-2498
> URL: https://issues.apache.org/jira/browse/OAK-2498
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: tools
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> According to the [documentation | 
> http://jackrabbit.apache.org/oak/docs/nodestore/segmentmk.html] the root 
> record references in a segment header provide enough context for parsing all 
> records within this segment without any external information. 
> It turns out this is not true: if a root record reference points e.g. to a 
> list record, the items in that list are record ids of unknown type. So even 
> though those records might live in the same segment, we can't parse them, as 
> we don't know their type. 





[jira] [Updated] (OAK-4097) Add metric for FileStore journal writes

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4097:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Add metric for FileStore journal writes
> ---
>
> Key: OAK-4097
> URL: https://issues.apache.org/jira/browse/OAK-4097
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>Priority: Minor
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> The TarMK flush thread should run every 5 seconds and flush the current root 
> head to journal.log. It would be good to have a metric that captures the 
> number of runs per minute. 
> This would help in confirming whether flush is working at the expected 
> frequency, or whether delays in acquiring locks are slowing it down.
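An illustrative shape for such a meter (not Oak's Stats API): count flushes and derive runs per minute from elapsed wall-clock time. With the flush thread nominally running every 5 seconds, a healthy reading is about 12 runs per minute; persistently lower values would hint at lock-acquisition delays.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical per-minute flush meter.
class FlushMeter {
    private final AtomicLong flushes = new AtomicLong();
    private final long startMillis;

    FlushMeter(long startMillis) {
        this.startMillis = startMillis;
    }

    void onFlush() {
        flushes.incrementAndGet();
    }

    double runsPerMinute(long nowMillis) {
        double minutes = (nowMillis - startMillis) / 60_000.0;
        return minutes > 0 ? flushes.get() / minutes : 0.0;
    }
}
```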





[jira] [Updated] (OAK-3690) Decouple SegmentBufferWriter from SegmentStore

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-3690:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Decouple SegmentBufferWriter from SegmentStore
> --
>
> Key: OAK-3690
> URL: https://issues.apache.org/jira/browse/OAK-3690
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: technical_debt
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> Currently {{SegmentBufferWriter.flush()}} directly calls 
> {{SegmentStore.writeSegment()}} once the current segment does not have enough 
> space for the next record. We should try to cut this dependency as 
> {{SegmentBufferWriter}} should only be concerned with providing buffers for 
> segments. Actually writing these to the store should be handled by a higher 
> level component. 
> A number of deadlocks we have seen (e.g. OAK-2560, OAK-3179, OAK-3264) are 
> one manifestation of this troublesome dependency. 
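One possible shape of that decoupling, sketched with invented names ({{DecoupledBufferWriter}}, {{SegmentConsumer}}) rather than Oak's real classes: the writer only fills fixed-size buffers and hands full ones to an injected callback; persisting them stays with the higher-level component.

```java
import java.io.ByteArrayOutputStream;

// Hypothetical sketch, not Oak API: the buffer writer knows nothing about
// the store; full buffers are handed to whatever consumer was injected.
class DecoupledBufferWriter {
    interface SegmentConsumer {
        void accept(byte[] segment);
    }

    private final int capacity;
    private final SegmentConsumer consumer;
    private ByteArrayOutputStream current = new ByteArrayOutputStream();

    DecoupledBufferWriter(int capacity, SegmentConsumer consumer) {
        this.capacity = capacity;
        this.consumer = consumer;
    }

    void writeRecord(byte[] record) {
        // Not enough space left: hand off the current buffer first.
        if (current.size() + record.length > capacity) {
            flush();
        }
        current.write(record, 0, record.length);
    }

    void flush() {
        if (current.size() > 0) {
            consumer.accept(current.toByteArray());
            current = new ByteArrayOutputStream();
        }
    }
}
```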





[jira] [Updated] (OAK-3695) Expose ratio between waste and real data

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-3695:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Expose ratio between waste and real data
> 
>
> Key: OAK-3695
> URL: https://issues.apache.org/jira/browse/OAK-3695
> Project: Jackrabbit Oak
>  Issue Type: Story
>  Components: segment-tar
>Reporter: Valentin Olteanu
>  Labels: gc
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> As a user, I want to know the ratio (or precise absolute values) between 
> waste and real data on TarMK, so that I can decide if Revision GC needs to be 
> run. The measurement has to be done on a running repository and without 
> impacting the performance.
> This would also help measure the efficiency of Revision GC and see the effect 
> of improvements. 
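The requested ratio boils down to simple arithmetic once the store can report its total size and an estimate of the reachable data; this sketch assumes those two numbers are obtainable and invents the class and method names:

```java
// Hypothetical sketch: derive a waste ratio from total vs. reachable bytes.
class WasteEstimate {
    // Fraction of the store occupied by garbage, in [0, 1].
    static double wasteRatio(long totalBytes, long reachableBytes) {
        if (totalBytes <= 0) {
            return 0.0;
        }
        long waste = Math.max(0, totalBytes - reachableBytes);
        return (double) waste / totalBytes;
    }

    // Simple policy: recommend revision GC once waste exceeds a threshold.
    static boolean gcRecommended(long totalBytes, long reachableBytes, double threshold) {
        return wasteRatio(totalBytes, reachableBytes) >= threshold;
    }
}
```

The hard part, as the issue notes, is obtaining a cheap reachable-size estimate on a running repository.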





[jira] [Updated] (OAK-4281) Rework memory estimation for compaction

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4281:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Rework memory estimation for compaction
> ---
>
> Key: OAK-4281
> URL: https://issues.apache.org/jira/browse/OAK-4281
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Alex Parvulescu
>  Labels: compaction, gc
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> As a result of OAK-3348 we need to partially rework the memory estimation 
> step done for deciding whether compaction can run or not. In {{oak-segment}} 
> there was a {{delta}} value derived from the compaction map. As the latter is 
> gone in {{oak-segment-next}} we need to decide whether there is another way 
> to derive this delta or whether we want to drop it entirely. 





[jira] [Updated] (OAK-3693) Expose the internal state of the repository through indicators and checks

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-3693:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Expose the internal state of the repository through indicators and checks
> -
>
> Key: OAK-3693
> URL: https://issues.apache.org/jira/browse/OAK-3693
> Project: Jackrabbit Oak
>  Issue Type: Epic
>  Components: query, segment-tar
>Reporter: Valentin Olteanu
>  Labels: production
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> This container groups all the issues related to areas where we could improve 
> the monitoring of an OAK repository. This can be achieved by exposing 
> different indicators of the internal state and adding checks for certain 
> properties.
> Areas to improve:
> * Async Indexing
> * Revision GC





[jira] [Updated] (OAK-4451) Implement a proper template cache

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4451?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4451:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Implement a proper template cache
> -
>
> Key: OAK-4451
> URL: https://issues.apache.org/jira/browse/OAK-4451
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: cache, monitoring, production
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> The template cache is currently just a map per segment. This is problematic 
> in various ways: 
> * A segment needs to be in memory and probably loaded first only to read 
> something from the cache. 
> * No monitoring, instrumentation of the cache
> * No control over memory consumption 
> We should therefore come up with a proper template cache implementation in 
> the same way we have done for strings ({{StringCache}}) in OAK-3007. 
> Analogously, that cache should be owned by the {{CachingSegmentReader}}. 
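A size-bounded, monitored cache along the lines requested could start from an access-ordered {{LinkedHashMap}}; this is only an illustrative sketch with invented names, not the {{StringCache}} implementation:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: LRU-bounded template cache with hit/miss counters
// that a monitoring MBean could expose.
class TemplateCache<K, V> {
    private final int maxEntries;
    private final Map<K, V> lru;
    long hits, misses;

    TemplateCache(int maxEntries) {
        this.maxEntries = maxEntries;
        // accessOrder=true gives LRU eviction semantics.
        this.lru = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > TemplateCache.this.maxEntries;
            }
        };
    }

    synchronized V get(K key) {
        V value = lru.get(key);
        if (value == null) misses++; else hits++;
        return value;
    }

    synchronized void put(K key, V value) {
        lru.put(key, value);
    }

    synchronized int size() {
        return lru.size();
    }
}
```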





[jira] [Updated] (OAK-2849) Improve revision gc on SegmentMK

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-2849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-2849:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Improve revision gc on SegmentMK
> 
>
> Key: OAK-2849
> URL: https://issues.apache.org/jira/browse/OAK-2849
> Project: Jackrabbit Oak
>  Issue Type: Epic
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: compaction, gc
> Fix For: 1.6, Segment Tar 0.0.6
>
> Attachments: SegmentCompactionIT-conflicts.png
>
>
> This is a container issue for the ongoing effort to improve revision gc of 
> the SegmentMK. 
> I'm exploring 
> * ways to make the reference graph as exact as possible and necessary: it 
> should not contain segments that are not referenceable any more but must 
> contain all segments that are referenceable. 
> * ways to segregate the reference graph reducing dependencies between certain 
> set of segments as much as possible. 
> * Reducing the number of in memory references and their impact on gc as much 
> as possible.





[jira] [Updated] (OAK-4309) Align property labels and descriptions in SegmentNodeStoreService

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4309?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4309:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Align property labels and descriptions in SegmentNodeStoreService
> -
>
> Key: OAK-4309
> URL: https://issues.apache.org/jira/browse/OAK-4309
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: production
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> We need to align / improve the labels and descriptions in 
> {{SegmentNodeStoreService}} to match their actual purpose. At the same time I 
> would opt for changing "compaction" to "revision gc" in all places where it 
> is used synonymously for the latter. 





[jira] [Updated] (OAK-4287) Disable / remove SegmentBufferWriter#checkGCGen

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4287?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4287:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Disable / remove SegmentBufferWriter#checkGCGen
> ---
>
> Key: OAK-4287
> URL: https://issues.apache.org/jira/browse/OAK-4287
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: assertion, compaction, gc
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> {{SegmentBufferWriter#checkGCGen}} is an after-the-fact check for back 
> references (see OAK-3348), logging a warning if it detects any. As this check 
> loads the segment it checks the reference for, it is somewhat expensive. We 
> should either come up with a cheaper way to perform this check or remove it 
> (at least disable it by default). 





[jira] [Updated] (OAK-4103) Replace journal.log with an in place journal

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4103?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4103:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Replace journal.log with an in place journal
> 
>
> Key: OAK-4103
> URL: https://issues.apache.org/jira/browse/OAK-4103
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: segment-tar
>Reporter: Michael Dürig
>Priority: Minor
>  Labels: resilience
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> Instead of writing the current head revision to the {{journal.log}} file we 
> could make it an integral part of the node states: as OAK-3804 demonstrates 
> we already have very good heuristics to reconstruct a lost journal. If we add 
> the right annotations to the root node states this could replace the current 
> approach. The latter is problematic as it relies on the flush thread properly 
> and timely updating {{journal.log}}. See e.g. OAK-3303. 





[jira] [Updated] (OAK-4015) Expedite commits from the compactor

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4015:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Expedite commits from the compactor
> ---
>
> Key: OAK-4015
> URL: https://issues.apache.org/jira/browse/OAK-4015
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: compaction, gc, perfomance
> Fix For: 1.6, Segment Tar 0.0.6
>
> Attachments: OAK-4015-histo.png, OAK-4015-wait-time.png
>
>
> Concurrent commits during compaction cause those to be re-compacted. 
> Currently it seems that the compaction thread can end up waiting for some 
> time to acquire the commit lock [1], which in turn causes more commits to 
> pile up to be re-compacted. I think this could be improved by tweaking the 
> lock such that the compactor could jump ahead of the queue. I.e. use a lock 
> which can be acquired in expedited mode. 
> [1] SegmentNodeStore#commitSemaphore
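A lock with an expedited mode could be sketched like this (invented class, not {{SegmentNodeStore}} code): while an expedited request is pending, ordinary committers wait, so the compactor never queues behind piled-up commits.

```java
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch of a lock acquirable in "expedited" mode: pending
// expedited requests make ordinary acquirers stand aside.
class ExpeditedLock {
    private final ReentrantLock lock = new ReentrantLock();
    private final Condition free = lock.newCondition();
    private boolean held;
    private int expeditedWaiting;

    void acquire() {
        lock.lock();
        try {
            // Ordinary callers also yield to pending expedited requests.
            while (held || expeditedWaiting > 0) free.awaitUninterruptibly();
            held = true;
        } finally { lock.unlock(); }
    }

    void acquireExpedited() {
        lock.lock();
        try {
            expeditedWaiting++;
            try {
                while (held) free.awaitUninterruptibly();
                held = true;
            } finally { expeditedWaiting--; }
        } finally { lock.unlock(); }
    }

    void release() {
        lock.lock();
        try { held = false; free.signalAll(); } finally { lock.unlock(); }
    }

    boolean isHeld() {
        lock.lock();
        try { return held; } finally { lock.unlock(); }
    }
}
```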





[jira] [Updated] (OAK-4371) Overly zealous warning about checkpoints on compaction

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4371?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4371:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Overly zealous warning about checkpoints on compaction 
> ---
>
> Key: OAK-4371
> URL: https://issues.apache.org/jira/browse/OAK-4371
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar, segmentmk
>Reporter: Michael Dürig
>  Labels: compaction, gc, logging
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> {{FileStore.compact}} logs a warning {{TarMK GC #{}: compaction found {} 
> checkpoints, you might need to run checkpoint cleanup}} if there is more than 
> a single checkpoint. 
> AFAIK this is now the norm, as async indexing uses 2 checkpoints 
> ([~chetanm], [~edivad] please clarify). 
> In any case we should improve this and not hard-code the number of expected 
> checkpoints. Maybe make the threshold configurable?
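A configurable threshold could be as simple as a system property with a sensible default; the property name below is invented for illustration, and the test assumes the property is unset:

```java
// Hypothetical sketch: make the "too many checkpoints" warning threshold
// configurable instead of hard-coding 1.
class CheckpointWarning {
    static final int DEFAULT_THRESHOLD = 2; // async indexing keeps 2 by default

    static int threshold() {
        // Illustrative property name, not an actual Oak setting.
        return Integer.getInteger("oak.compaction.checkpointWarnThreshold", DEFAULT_THRESHOLD);
    }

    static boolean shouldWarn(int checkpointCount) {
        return checkpointCount > threshold();
    }
}
```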





[jira] [Updated] (OAK-4277) Finalise de-duplication caches

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4277:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Finalise de-duplication caches
> --
>
> Key: OAK-4277
> URL: https://issues.apache.org/jira/browse/OAK-4277
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
>  Labels: caching, compaction, gc, monitoring
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> OAK-3348 "promoted" the record cache to a de-duplication cache, which is 
> heavily relied upon during compaction. Now node states also go through this 
> cache, which can be seen as one concern of the former compaction map (the 
> other being equality). 
> The current implementation of these caches is quite simple and served its 
> purpose for a POC for getting rid of the "back references" (OAK-3348). Before 
> we are ready for a release we need to finalise a couple of things though:
> * Implement cache monitoring and management
> * Make the currently hard-coded cache parameters configurable
> * Implement proper UTs 
> * Add proper Javadoc
> * Fine tune eviction logic and move it into the caches themselves (instead of 
> relying on the client to evict items pro-actively)
> * Fine tune caching strategies: For the node state cache the cost of the item 
> is determined just by its position in the tree. We might want to take further 
> things into account (e.g. number of child nodes). Also we might want to 
> implement pinning so e.g. checkpoints would never be evicted. 
> * Finally we need to decide who should own this cache. It currently lives 
> with the {{SegmentWriter}}. However this is IMO not the correct location, as 
> during compaction there is a dedicated segment writer whose cache needs to be 
> shared with the primary's segment writer upon successful completion. 
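The de-duplication idea itself is compact: key records by a stable hash of their content and reuse the previously written record id on a hit. A minimal sketch with invented names (the real cache additionally needs the monitoring, eviction and ownership work listed above):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

// Hypothetical sketch: before writing a record during compaction, ask the
// cache whether an equal record was already written and reuse its id.
class DeduplicationCache {
    private final Map<String, String> idsByStableHash = new HashMap<>();

    // Returns the previously written record id, or writes and records a new one.
    String dedup(String stableHash, Supplier<String> writeRecord) {
        return idsByStableHash.computeIfAbsent(stableHash, h -> writeRecord.get());
    }

    int size() {
        return idsByStableHash.size();
    }
}
```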





[jira] [Updated] (OAK-4105) Implement FileStore.size through FileStore.approximateSize

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4105:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Implement FileStore.size through FileStore.approximateSize
> --
>
> Key: OAK-4105
> URL: https://issues.apache.org/jira/browse/OAK-4105
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Andrei Dulceanu
>Priority: Minor
>  Labels: resilience
> Fix For: Segment Tar 0.0.6
>
>
> {{FileStore.size()}} is prone to lock contention and should not be called too 
> often. As OAK-2879 already introduced an approach for tracking the current 
> size of the file store without having to lock, we might as well promote this 
> to be "the official" implementation. 
> [~frm] WDYT?
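The lock-free tracking approach can be sketched with a single {{AtomicLong}} that writers and cleanup adjust; names are illustrative, not the OAK-2879 code:

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: maintain an approximate store size as writers append
// and cleanup reclaims, so size() never has to touch the file system.
class ApproximateSize {
    private final AtomicLong bytes = new AtomicLong();

    void onSegmentWritten(long segmentBytes) {
        bytes.addAndGet(segmentBytes);
    }

    void onCleanupReclaimed(long reclaimedBytes) {
        bytes.addAndGet(-reclaimedBytes);
    }

    // Cheap and lock-free; may briefly lag the exact on-disk size.
    long approximateSize() {
        return bytes.get();
    }
}
```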





[jira] [Updated] (OAK-4465) Remove the read-only concern from TarRevisions

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4465:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Remove the read-only concern from TarRevisions
> --
>
> Key: OAK-4465
> URL: https://issues.apache.org/jira/browse/OAK-4465
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> {{TarRevisions}} shouldn't be concerned with read-only (non-writable) state. 
> This should be the concern of the store alone. 





[jira] [Updated] (OAK-4295) Proper versioning of storage format

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4295:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Proper versioning of storage format
> ---
>
> Key: OAK-4295
> URL: https://issues.apache.org/jira/browse/OAK-4295
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Francesco Mari
>  Labels: resilience, technical_debt
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> OAK-3348 introduced changes to the segment format (which has been bumped to 
> 12 with OAK-4232). However it also changes the format of the tar files (the 
> gc generation of the segments is written to the index file) which would also 
> require proper versioning.
> In an offline discussion [~frm] brought up the idea of adding a manifest file 
> to the store that would specify the format versions of the individual 
> components. 





[jira] [Updated] (OAK-4014) The segment store should merge small TAR files into bigger ones

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4014?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4014:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> The segment store should merge small TAR files into bigger ones
> ---
>
> Key: OAK-4014
> URL: https://issues.apache.org/jira/browse/OAK-4014
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> The cleanup process removes unused segments from TAR files and writes new 
> generations of those TAR files without the removed segments.
> In the long run, the size of some TAR files might be smaller than the maximum 
> size allowed for a TAR file. At the time this issue was created the default 
> maximum size of a TAR file was 256 MiB.
> If there are many small TAR files, it should be possible to merge them into 
> bigger files. This way, we can reduce the total number of TAR files in the 
> segment store, and thus the number of open file descriptors that Oak has to 
> maintain.
> A possible implementation for the merge operation is the following:
> # Sort the list of TAR files by size, ascending.
> # Pick TAR files from the sorted list until the sum of their sizes after the 
> merge is less than 256 MiB.
> # Merge the picked-up files into a new TAR file and mark the picked-up 
> files for deletion.
> # Continue picking up TAR files from the sorted list until the list is 
> exhausted or until it's only possible to pick a single TAR file.
> The merge process can run in a background thread but it is important that it 
> doesn't conflict with the cleanup operation, since merge and cleanup both 
> change the representation of TAR files on the file system. Two possible 
> solutions to avoid conflicts are:
> # Use a global lock for the whole set of TAR files.
> # Use a lock per TAR file. The cleanup and merge processes have to agree on 
> the order to use when acquiring the lock.
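The picking step of the proposed merge (steps 1, 2 and 4 above) can be sketched as a greedy grouping over the sorted sizes; this is an illustration with invented names, not an actual implementation:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: group tar file sizes, ascending, so each group's
// merged size stays below the maximum tar size (256 MiB by default).
class TarMergePlanner {
    static List<List<Long>> plan(List<Long> sizes, long maxTarSize) {
        List<Long> sorted = new ArrayList<>(sizes);
        Collections.sort(sorted);
        List<List<Long>> groups = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        long sum = 0;
        for (long size : sorted) {
            // Close the current group once adding would reach the limit.
            if (!current.isEmpty() && sum + size >= maxTarSize) {
                groups.add(current);
                current = new ArrayList<>();
                sum = 0;
            }
            current.add(size);
            sum += size;
        }
        // A trailing single file needs no merge; groups of 2+ are candidates.
        if (current.size() > 1) {
            groups.add(current);
        }
        return groups;
    }
}
```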





[jira] [Updated] (OAK-4474) Finalise SegmentCache

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4474?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4474:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Finalise SegmentCache
> -
>
> Key: OAK-4474
> URL: https://issues.apache.org/jira/browse/OAK-4474
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: cache, monitoring, production
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> {{SegmentCache}} needs documentation, management instrumentation, 
> monitoring, tests and logging. 





[jira] [Updated] (OAK-3036) DocumentRootBuilder: revisit update.limit default

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-3036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-3036:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> DocumentRootBuilder: revisit update.limit default
> -
>
> Key: OAK-3036
> URL: https://issues.apache.org/jira/browse/OAK-3036
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk, rdbmk, segment-tar
>Reporter: Julian Reschke
>  Labels: performance
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> update.limit decides whether a commit is persisted using a branch or not. The 
> default is 1 (and can be overridden using the system property).
> A typical call pattern in JCR is to persist batches of ~1024 nodes. These 
> translate to more than 1 changes (see PackageImportIT), due to JCR 
> properties, and also indexing commit hooks.





[jira] [Updated] (OAK-4243) Oak Segment Tar Module

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4243:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Oak Segment Tar Module
> --
>
> Key: OAK-4243
> URL: https://issues.apache.org/jira/browse/OAK-4243
> Project: Jackrabbit Oak
>  Issue Type: Epic
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Michael Dürig
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> There are a couple of issues requiring us to change the segment format in an 
> incompatible way (OAK-3348, OAK-2896, OAK-2498, OAK-4201). 
> We should introduce a new module here
> * to minimise ripple effect on concurrent development work in other parts of 
> Oak and upstream projects
> * to be able to cleanly migrate existing repositories via a side grading
> * to cleanly separate breaking changes from the existing code base
> The plan is roughly to:
> * Create new module (called {{oak-segment-next}} for now, will discuss names 
> later)
> * Apply patch prepared for OAK-3348
> * Discuss and decide on final name and refactor accordingly 
> * Refactor affected tooling such that the targeted segment store can be 
> specified via an option. Keep default at {{oak-segment}}.
> * Once sufficiently stabilised, deprecate {{oak-segment}}, make this one the 
> default and switch the default target for tooling.
> * Define and implement migration path
> I will create respective issues as needed. 





[jira] [Updated] (OAK-4283) Align GCMonitor API with implementation

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4283?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4283:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Align GCMonitor API with implementation 
> 
>
> Key: OAK-4283
> URL: https://issues.apache.org/jira/browse/OAK-4283
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar
>Reporter: Michael Dürig
>Assignee: Francesco Mari
>  Labels: api-change, compaction, gc
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> The argument taken by {{GCMonitor.compacted}} relates to parameters of the 
> compaction map. The latter went away with OAK-3348. We need to come up with a 
> way to adjust this API accordingly. Also it might make sense to broaden the 
> scope of {{GCMonitor}} from its initial intent (logging) to a more general 
> one as this is how it is already used e.g. by the {{RefreshOnGC}} 
> implementation and for OAK-4096. 





[jira] [Updated] (OAK-4138) Decouple revision cleanup from the flush thread

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4138?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4138:

Fix Version/s: (was: Segment Tar 0.0.4)
   Segment Tar 0.0.6

> Decouple revision cleanup from the flush thread
> ---
>
> Key: OAK-4138
> URL: https://issues.apache.org/jira/browse/OAK-4138
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: segment-tar
>Reporter: Michael Dürig
>  Labels: resilience
> Fix For: 1.6, Segment Tar 0.0.6
>
>
> I suggest we decouple revision cleanup from the flush thread. With large 
> repositories where cleanup can take several minutes to complete it blocks the 
> flush thread from updating the journal and the persisted head thus resulting 
> in larger than necessary data loss in case of a crash. 
> /cc [~alex.parvulescu]
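The decoupling can be sketched with two single-threaded schedulers, one per concern, so a multi-minute cleanup never delays a journal flush; class and thread names are invented for illustration:

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: flush and cleanup each get a dedicated scheduler,
// so a long-running cleanup cannot block journal updates.
class BackgroundThreads {
    private final ScheduledExecutorService flusher =
            Executors.newSingleThreadScheduledExecutor(r -> new Thread(r, "TarMK flush"));
    private final ScheduledExecutorService cleaner =
            Executors.newSingleThreadScheduledExecutor(r -> new Thread(r, "TarMK cleanup"));

    void start(Runnable flush, long flushIntervalMs, Runnable cleanup, long cleanupIntervalMs) {
        flusher.scheduleWithFixedDelay(flush, 0, flushIntervalMs, TimeUnit.MILLISECONDS);
        cleaner.scheduleWithFixedDelay(cleanup, 0, cleanupIntervalMs, TimeUnit.MILLISECONDS);
    }

    void stop() {
        flusher.shutdownNow();
        cleaner.shutdownNow();
    }
}
```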





[jira] [Comment Edited] (OAK-4562) BasicDocumentStore max id test might return misleading results

2016-07-14 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15377101#comment-15377101
 ] 

Julian Reschke edited comment on OAK-4562 at 7/14/16 3:48 PM:
--

trunk: [r1752672|http://svn.apache.org/r1752672]
1.4: [r1752676|http://svn.apache.org/r1752676]
1.2: [r1752678|http://svn.apache.org/r1752678]
1.0: [r1752680|http://svn.apache.org/r1752680]



was (Author: reschke):
trunk: [r1752672|http://svn.apache.org/r1752672]
1.4: [r1752676|http://svn.apache.org/r1752676]
1.2: [r1752678|http://svn.apache.org/r1752678]


> BasicDocumentStore max id test might return misleading results
> --
>
> Key: OAK-4562
> URL: https://issues.apache.org/jira/browse/OAK-4562
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk
>Affects Versions: 1.0.32, 1.2.17, 1.5.5, 1.4.5
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.5.6, 1.0.33, 1.4.6, 1.2.18
>
>
> The test assumes that the {{DocumentStore}} will return false when trying to 
> insert a document with an ID that is too long. However, if the test was 
> aborted before and the persistence wasn't cleared, the insert might fail 
> because the document is already there. Thus, delete it first.
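The fix can be illustrated with a toy in-memory stand-in for the {{DocumentStore}} (an invented class; the real store and its id limit differ): without the upfront delete, a leftover document makes the insert fail for the wrong reason.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical in-memory stand-in, only to illustrate the test fix: an
// insert can fail because the id is too long *or* because a previous
// aborted run left the document behind.
class MockDocumentStore {
    static final int MAX_ID_LENGTH = 512; // illustrative limit

    private final Map<String, String> docs = new HashMap<>();

    boolean create(String id, String data) {
        if (id.length() > MAX_ID_LENGTH) {
            return false;                            // rejected: id too long
        }
        return docs.putIfAbsent(id, data) == null;   // false if it already exists
    }

    void remove(String id) {
        docs.remove(id);
    }
}
```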





[jira] [Updated] (OAK-4562) BasicDocumentStore max id test might return misleading results

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-4562:

Fix Version/s: 1.0.33

> BasicDocumentStore max id test might return misleading results
> --
>
> Key: OAK-4562
> URL: https://issues.apache.org/jira/browse/OAK-4562
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk
>Affects Versions: 1.0.32, 1.2.17, 1.5.5, 1.4.5
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.5.6, 1.0.33, 1.4.6, 1.2.18
>
>
> The test assumes that the {{DocumentStore}} will return false when trying to 
> insert a document with an ID that is too long. However, if the test was 
> aborted before and the persistence wasn't cleared, the insert might fail 
> because the document is already there. Thus, delete it first.





[jira] [Resolved] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved OAK-4477.
-
Resolution: Fixed

> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.5.6, 1.0.33, 1.4.6, 1.2.18
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> its own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.





[jira] [Updated] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-4477:

Labels:   (was: candidate_oak_1_0)

> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.5.6, 1.0.33, 1.4.6, 1.2.18
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> it's own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.





[jira] [Comment Edited] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376889#comment-15376889
 ] 

Julian Reschke edited comment on OAK-4477 at 7/14/16 3:42 PM:
--

trunk: [r1752659|http://svn.apache.org/r1752659] 
[r1748747|http://svn.apache.org/r1748747] 
[r1748714|http://svn.apache.org/r1748714]
1.4: [r1752662|http://svn.apache.org/r1752662] 
[r1748748|http://svn.apache.org/r1748748] 
[r1748737|http://svn.apache.org/r1748737]
1.2: [r1752664|http://svn.apache.org/r1752664]
1.0: [r1752679|http://svn.apache.org/r1752679]



was (Author: reschke):
trunk: [r1752659|http://svn.apache.org/r1752659] 
[r1748747|http://svn.apache.org/r1748747] 
[r1748714|http://svn.apache.org/r1748714]
1.4: [r1752662|http://svn.apache.org/r1752662] 
[r1748748|http://svn.apache.org/r1748748] 
[r1748737|http://svn.apache.org/r1748737]
1.2: [r1752664|http://svn.apache.org/r1752664]


> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.5.6, 1.0.33, 1.4.6, 1.2.18
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> it's own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.





[jira] [Updated] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-4477:

Fix Version/s: 1.0.33

> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0
> Fix For: 1.5.6, 1.0.33, 1.4.6, 1.2.18
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> its own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.





[jira] [Updated] (OAK-4562) BasicDocumentStore max id test might return misleading results

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-4562:

Fix Version/s: 1.2.18

> BasicDocumentStore max id test might return misleading results
> --
>
> Key: OAK-4562
> URL: https://issues.apache.org/jira/browse/OAK-4562
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk
>Affects Versions: 1.0.32, 1.2.17, 1.5.5, 1.4.5
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.5.6, 1.4.6, 1.2.18
>
>
> The test assumes that the {{DocumentStore}} will return false when trying to 
> insert a document with an ID that is too long. However, if the test was 
> aborted before and the persistence wasn't cleared, the insert might fail 
> because the document is already there. Thus, delete it first.





[jira] [Comment Edited] (OAK-4562) BasicDocumentStore max id test might return misleading results

2016-07-14 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15377101#comment-15377101
 ] 

Julian Reschke edited comment on OAK-4562 at 7/14/16 3:33 PM:
--

trunk: [r1752672|http://svn.apache.org/r1752672]
1.4: [r1752676|http://svn.apache.org/r1752676]
1.2: [r1752678|http://svn.apache.org/r1752678]



was (Author: reschke):
trunk: [r1752672|http://svn.apache.org/r1752672]

> BasicDocumentStore max id test might return misleading results
> --
>
> Key: OAK-4562
> URL: https://issues.apache.org/jira/browse/OAK-4562
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk
>Affects Versions: 1.0.32, 1.2.17, 1.5.5, 1.4.5
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.5.6, 1.4.6, 1.2.18
>
>
> The test assumes that the {{DocumentStore}} will return false when trying to 
> insert a document with an ID that is too long. However, if the test was 
> aborted before and the persistence wasn't cleared, the insert might fail 
> because the document is already there. Thus, delete it first.





[jira] [Updated] (OAK-4562) BasicDocumentStore max id test might return misleading results

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-4562:

Fix Version/s: 1.5.6

> BasicDocumentStore max id test might return misleading results
> --
>
> Key: OAK-4562
> URL: https://issues.apache.org/jira/browse/OAK-4562
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk
>Affects Versions: 1.0.32, 1.2.17, 1.5.5, 1.4.5
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.5.6
>
>
> The test assumes that the {{DocumentStore}} will return false when trying to 
> insert a document with an ID that is too long. However, if the test was 
> aborted before and the persistence wasn't cleared, the insert might fail 
> because the document is already there. Thus, delete it first.





[jira] [Updated] (OAK-4562) BasicDocumentStore max id test might return misleading results

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-4562:

Fix Version/s: 1.4.6

> BasicDocumentStore max id test might return misleading results
> --
>
> Key: OAK-4562
> URL: https://issues.apache.org/jira/browse/OAK-4562
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk
>Affects Versions: 1.0.32, 1.2.17, 1.5.5, 1.4.5
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.5.6, 1.4.6
>
>
> The test assumes that the {{DocumentStore}} will return false when trying to 
> insert a document with an ID that is too long. However, if the test was 
> aborted before and the persistence wasn't cleared, the insert might fail 
> because the document is already there. Thus, delete it first.





[jira] [Commented] (OAK-4562) BasicDocumentStore max id test might return misleading results

2016-07-14 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15377101#comment-15377101
 ] 

Julian Reschke commented on OAK-4562:
-

trunk: [r1752672|http://svn.apache.org/r1752672]

> BasicDocumentStore max id test might return misleading results
> --
>
> Key: OAK-4562
> URL: https://issues.apache.org/jira/browse/OAK-4562
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk
>Affects Versions: 1.0.32, 1.2.17, 1.5.5, 1.4.5
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.5.6
>
>
> The test assumes that the {{DocumentStore}} will return false when trying to 
> insert a document with an ID that is too long. However, if the test was 
> aborted before and the persistence wasn't cleared, the insert might fail 
> because the document is already there. Thus, delete it first.





[jira] [Resolved] (OAK-4562) BasicDocumentStore max id test might return misleading results

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4562?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke resolved OAK-4562.
-
Resolution: Fixed

> BasicDocumentStore max id test might return misleading results
> --
>
> Key: OAK-4562
> URL: https://issues.apache.org/jira/browse/OAK-4562
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: documentmk
>Affects Versions: 1.0.32, 1.2.17, 1.5.5, 1.4.5
>Reporter: Julian Reschke
>Assignee: Julian Reschke
> Fix For: 1.5.6
>
>
> The test assumes that the {{DocumentStore}} will return false when trying to 
> insert a document with an ID that is too long. However, if the test was 
> aborted before and the persistence wasn't cleared, the insert might fail 
> because the document is already there. Thus, delete it first.





[jira] [Created] (OAK-4562) BasicDocumentStore max id test might return misleading results

2016-07-14 Thread Julian Reschke (JIRA)
Julian Reschke created OAK-4562:
---

 Summary: BasicDocumentStore max id test might return misleading 
results
 Key: OAK-4562
 URL: https://issues.apache.org/jira/browse/OAK-4562
 Project: Jackrabbit Oak
  Issue Type: Technical task
  Components: documentmk
Affects Versions: 1.4.5, 1.5.5, 1.0.32, 1.2.17
Reporter: Julian Reschke
Assignee: Julian Reschke


The test assumes that the {{DocumentStore}} will return false when trying to 
insert a document with an ID that is too long. However, if the test was aborted 
before and the persistence wasn't cleared, the insert might fail because the 
document is already there. Thus, delete it first.
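The delete-first fix described above can be sketched with a minimal in-memory stand-in. Everything here is illustrative: the class, the 512-character limit, and the `create` signature are assumptions for the sketch, not the real {{DocumentStore}} API.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in store whose create() returns false BOTH for an over-long ID and
// for an ID that already exists, which is exactly why the test can be misled.
public class MaxIdTestSketch {
    static final int MAX_ID_LENGTH = 512; // assumed limit, for illustration only

    static final Map<String, String> store = new HashMap<>();

    static boolean create(String id) {
        if (id.length() > MAX_ID_LENGTH) return false; // ID too long
        if (store.containsKey(id)) return false;       // already present
        store.put(id, "{}");
        return true;
    }

    public static void main(String[] args) {
        String id = "x".repeat(100);     // well within the limit
        create(id);                      // leftover from an aborted earlier run
        // Without cleanup, the test would wrongly conclude the ID is too long:
        boolean misleading = !create(id);
        store.remove(id);                // "Thus, delete it first."
        boolean correct = create(id);
        System.out.println(misleading + " " + correct); // prints "true true"
    }
}
```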





[jira] [Commented] (OAK-4560) The stable record ID is not maintained across two or more generations

2016-07-14 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15377068#comment-15377068
 ] 

Francesco Mari commented on OAK-4560:
-

I made a small modification at r1752668 by removing some duplicated logic. Now 
{{getStableId()}} is implemented on top of {{getStableIdBytes()}}.
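The de-duplication described in the comment, implementing the string accessor as a thin wrapper over the byte accessor so the logic lives in one place, can be sketched as follows. The method bodies and the sample ID are illustrative, not the actual {{SegmentNodeState}} code.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class StableIdSketch {
    // Single source of truth: the byte-level accessor.
    static ByteBuffer getStableIdBytes() {
        return ByteBuffer.wrap("0a1b2c".getBytes(StandardCharsets.UTF_8));
    }

    // String accessor implemented on top of getStableIdBytes(), so any fix
    // to the byte-level logic automatically applies to both.
    static String getStableId() {
        ByteBuffer buf = getStableIdBytes();
        byte[] bytes = new byte[buf.remaining()];
        buf.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println(getStableId()); // prints "0a1b2c"
    }
}
```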

> The stable record ID is not maintained across two or more generations
> -
>
> Key: OAK-4560
> URL: https://issues.apache.org/jira/browse/OAK-4560
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
> Fix For: Segment Tar 0.0.6
>
>
> {{SegmentWriter}} should make sure that, for every node compacted onto a new 
> generation, the stable ID remains the same. This should be guaranteed for 
> every node and for any number of generations.





[jira] [Comment Edited] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376889#comment-15376889
 ] 

Julian Reschke edited comment on OAK-4477 at 7/14/16 2:33 PM:
--

trunk: [r1752659|http://svn.apache.org/r1752659] 
[r1748747|http://svn.apache.org/r1748747] 
[r1748714|http://svn.apache.org/r1748714]
1.4: [r1752662|http://svn.apache.org/r1752662] 
[r1748748|http://svn.apache.org/r1748748] 
[r1748737|http://svn.apache.org/r1748737]
1.2: [r1752664|http://svn.apache.org/r1752664]



was (Author: reschke):
trunk: [r1752659|http://svn.apache.org/r1752659] 
[r1748747|http://svn.apache.org/r1748747] 
[r1748714|http://svn.apache.org/r1748714]
1.4: [r1752662|http://svn.apache.org/r1752662] 
[r1748748|http://svn.apache.org/r1748748] 
[r1748737|http://svn.apache.org/r1748737]

> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0
> Fix For: 1.5.6, 1.4.6, 1.2.18
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> its own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.





[jira] [Updated] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-4477:

Fix Version/s: 1.2.18

> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0
> Fix For: 1.5.6, 1.4.6, 1.2.18
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> its own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.





[jira] [Updated] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-4477:

Labels: candidate_oak_1_0  (was: candidate_oak_1_0 candidate_oak_1_2)

> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0
> Fix For: 1.5.6, 1.4.6, 1.2.18
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> its own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.





[jira] [Updated] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-4477:

Fix Version/s: 1.4.6

> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0, candidate_oak_1_2
> Fix For: 1.5.6, 1.4.6
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> its own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.





[jira] [Comment Edited] (OAK-3777) Multiplexing support in default PermissionStore implementation

2016-07-14 Thread angela (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376902#comment-15376902
 ] 

angela edited comment on OAK-3777 at 7/14/16 2:04 PM:
--

we just had a discussion about the multiplexing authorization models in a 
meeting. it turned out that I was mistaken about the physical location of the 
permission store entries associated with a secondary (private) store, which 
according to the explanation of [~chetanm] is _not_ being written back to the 
shared (global) nodestore. instead the additional private store keeps its own 
'permission store', and it's only upon read and evaluation that a multiplexing 
aware permission-entry-reader needs to be aware of the multiplexing.

so, other authorization models (such as for example {{oak-authorization-cug}}) 
can make a conscious decision on whether to support multiplexed setups by using 
and following the {{MountProvider}} API contract.

with that information at hand my concerns have been addressed and we decided on 
the following next steps:
- [~chetanm] will write the documentation on how to write multiplexing aware 
authorization modules
- I will then use that docu to adjust {{oak-authorization-cug}}, which will 
allow us to verify that the concept works for more implementations as well (or 
even possibly refine the concept in case it was needed).


was (Author: anchela):
we just had a discussion about the multiplexing authorization models in a 
meeting. it turned out that I was mistaken about the physical location of the 
permission store entries associated with a secondary (private) store, which 
according to the explanation of [~chetanm] is _not_ being written back to the 
shared (global) nodestore. instead the additional private store keeps its own 
'permission store', and it's only upon read and evaluation that a multiplexing 
aware permission-entry-reader needs to be aware of the multiplexing.

so, other authorization models (such as for example {{oak-authorization-cug}}) 
can make a conscious decision on whether to support multiplexed setups by using 
and following the {{MountProvider}} API contract.

with that information at hand my concerns have been addressed and we decided on 
the following next steps:
- [~chetanm] will write the documentation on how to write multiplexing aware 
authorization modules
- I will then use that docu to adjust {{oak-authorization-cug}}, which will 
allow us to verify that the concept works for more implementations as well.

> Multiplexing support in default PermissionStore implementation
> --
>
> Key: OAK-3777
> URL: https://issues.apache.org/jira/browse/OAK-3777
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>
> Similar to other parts we need to prototype support for multiplexing in 
> default permission store





[jira] [Comment Edited] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376889#comment-15376889
 ] 

Julian Reschke edited comment on OAK-4477 at 7/14/16 2:04 PM:
--

trunk: [r1752659|http://svn.apache.org/r1752659] 
[r1748747|http://svn.apache.org/r1748747] 
[r1748714|http://svn.apache.org/r1748714]
1.4: [r1752662|http://svn.apache.org/r1752662] 
[r1748748|http://svn.apache.org/r1748748] 
[r1748737|http://svn.apache.org/r1748737]


was (Author: reschke):
trunk: [r1752659|http://svn.apache.org/r1752659] 
[r1748747|http://svn.apache.org/r1748747] 
[r1748714|http://svn.apache.org/r1748714]
1.4: [r1748748|http://svn.apache.org/r1748748] 
[r1748737|http://svn.apache.org/r1748737]


> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0, candidate_oak_1_2
> Fix For: 1.5.6, 1.4.6
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> its own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.





[jira] [Updated] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-4477:

Labels: candidate_oak_1_0 candidate_oak_1_2  (was: candidate_oak_1_0 
candidate_oak_1_2 candidate_oak_1_4)

> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0, candidate_oak_1_2
> Fix For: 1.5.6, 1.4.6
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> its own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.





[jira] [Commented] (OAK-4559) RDB*Store: failures with Tomcat JDBC pool's StatementCache interceptor

2016-07-14 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376909#comment-15376909
 ] 

Julian Reschke commented on OAK-4559:
-

to reproduce:
{noformat}
svn co https://svn.apache.org/repos/asf/jackrabbit/oak/trunk
cd trunk
svn up -r 1752659
mvn clean install -DskipTests
cd oak-core
mvn clean install 
-Dorg.apache.jackrabbit.oak.plugins.document.rdb.RDBDataSourceFactory.jdbcInterceptors="StatementCache"
 -Prdb-derby -Dtest=RDBBlobStoreTest
{noformat}

> RDB*Store: failures with Tomcat JDBC pool's StatementCache interceptor
> --
>
> Key: OAK-4559
> URL: https://issues.apache.org/jira/browse/OAK-4559
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.2.17, 1.5.5, 1.4.5
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>
> See .
> When the StatementCache interceptor is enabled, {{ResultSet}} objects do not 
> get closed automatically anymore. In {{RDBBlobStore}}, this leads to deadlocks 
> (Derby) and hard-to-debug exceptions (DB2).
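The failure mode can be sketched without a database: a minimal AutoCloseable stand-in shows why closing explicitly via try-with-resources is robust against interceptors that disable automatic cleanup. All names below are hypothetical stand-ins, not the actual {{RDBBlobStore}} code.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class CloseSketch {
    // Counts "open" result sets so a leak is observable.
    static final AtomicInteger openResultSets = new AtomicInteger();

    static class FakeResultSet implements AutoCloseable {
        FakeResultSet() { openResultSets.incrementAndGet(); }
        @Override public void close() { openResultSets.decrementAndGet(); }
    }

    // Leaky pattern: relies on the pool/driver to close the ResultSet.
    static void readWithoutClosing() {
        FakeResultSet rs = new FakeResultSet();
        // ... iterate rows, then return without rs.close()
    }

    // Safe pattern: try-with-resources closes regardless of interceptor behaviour.
    static void readWithClosing() {
        try (FakeResultSet rs = new FakeResultSet()) {
            // ... iterate rows
        }
    }

    public static void main(String[] args) {
        readWithoutClosing();
        System.out.println("leaked after unsafe read: " + openResultSets.get()); // 1
        openResultSets.set(0);
        readWithClosing();
        System.out.println("leaked after safe read: " + openResultSets.get());   // 0
    }
}
```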





[jira] [Comment Edited] (OAK-4559) RDB*Store: failures with Tomcat JDBC pool's StatementCache interceptor

2016-07-14 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4559?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376909#comment-15376909
 ] 

Julian Reschke edited comment on OAK-4559 at 7/14/16 1:31 PM:
--

to reproduce:
{noformat}
svn co https://svn.apache.org/repos/asf/jackrabbit/oak/trunk
cd trunk
svn up -r 1752659
mvn clean install -DskipTests
cd oak-core
mvn clean install 
-Dorg.apache.jackrabbit.oak.plugins.document.rdb.RDBDataSourceFactory.jdbcInterceptors="StatementCache"
 -Prdb-derby -Dtest=RDBBlobStoreTest
{noformat}

(this will reproduce the deadlock with Derby)


was (Author: reschke):
to reproduce:
{noformat}
svn co https://svn.apache.org/repos/asf/jackrabbit/oak/trunk
cd trunk
svn up -r 1752659
mvn clean install -DskipTests
cd oak-core
mvn clean install 
-Dorg.apache.jackrabbit.oak.plugins.document.rdb.RDBDataSourceFactory.jdbcInterceptors="StatementCache"
 -Prdb-derby -Dtest=RDBBlobStoreTest
{noformat}

> RDB*Store: failures with Tomcat JDBC pool's StatementCache interceptor
> --
>
> Key: OAK-4559
> URL: https://issues.apache.org/jira/browse/OAK-4559
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.2.17, 1.5.5, 1.4.5
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>
> See .
> When the StatementCache interceptor is enabled, {{ResultSet}} objects do not 
> get closed automatically anymore. In {{RDBBlobStore}}, this leads to deadlocks 
> (Derby) and hard-to-debug exceptions (DB2).





[jira] [Comment Edited] (OAK-3777) Multiplexing support in default PermissionStore implementation

2016-07-14 Thread angela (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376902#comment-15376902
 ] 

angela edited comment on OAK-3777 at 7/14/16 1:27 PM:
--

we just had a discussion about the multiplexing authorization models in a 
meeting. it turned out that I was mistaken about the physical location of the 
permission store entries associated with a secondary (private) store, which 
according to the explanation of [~chetanm] is _not_ being written back to the 
shared (global) nodestore. instead the additional private store keeps its own 
'permission store', and it's only upon read and evaluation that a multiplexing 
aware permission-entry-reader needs to be aware of the multiplexing.

so, other authorization models (such as for example {{oak-authorization-cug}}) 
can make a conscious decision on whether to support multiplexed setups by using 
and following the {{MountProvider}} API contract.

with that information at hand my concerns have been addressed and we decided on 
the following next steps:
- [~chetanm] will write the documentation on how to write multiplexing aware 
authorization modules
- I will then use that docu to adjust {{oak-authorization-cug}}, which will 
allow us to verify that the concept works for more implementations as well.


was (Author: anchela):
we just had a discussion about the multiplexing authorization models in a 
meeting. it turned out that I was mistaken about the physical location of the 
permission store entries associated with a secondary (private) store, which 
according to the explanation of [~chetanm] is _not_ being written back to the 
shared (global) nodestore. instead the additional private store keeps its own 
'permission store', and it's only upon read and evaluation that a multiplexing 
aware permission-entry-reader needs to be aware of the multiplexing.

so, other authorization models (such as for example {{oak-authorization-cug}}) 
can make a conscious decision on whether to support multiplexed setups by using 
and following the {{MountProvider}} API contract.

with that information at hand my concerns have been addressed and we decided on 
the following next steps:
- [~chetanm] will write the documentation on how to write multiplexing aware 
authorization modules
- I will then use that docu to adjust {{oak-authorization-cug}}, which will 
allow us to verify that the concept works for others

> Multiplexing support in default PermissionStore implementation
> --
>
> Key: OAK-3777
> URL: https://issues.apache.org/jira/browse/OAK-3777
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>
> Similar to other parts we need to prototype support for multiplexing in 
> default permission store





[jira] [Commented] (OAK-3777) Multiplexing support in default PermissionStore implementation

2016-07-14 Thread angela (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376902#comment-15376902
 ] 

angela commented on OAK-3777:
-

we just had a discussion about the multiplexing authorization models in a 
meeting. it turned out that I was mistaken about the physical location of the 
permission store entries associated with a secondary (private) store, which 
according to the explanation of [~chetanm] is _not_ being written back to the 
shared (global) nodestore. instead the additional private store keeps its own 
'permission store', and it's only upon read and evaluation that a multiplexing 
aware permission-entry-reader needs to be aware of the multiplexing.

so, other authorization models (such as for example {{oak-authorization-cug}}) 
can make a conscious decision on whether to support multiplexed setups by using 
and following the {{MountProvider}} API contract.

with that information at hand my concerns have been addressed and we decided on 
the following next steps:
- [~chetanm] will write the documentation on how to write multiplexing aware 
authorization modules
- I will then use that docu to adjust {{oak-authorization-cug}}, which will 
allow us to verify that the concept works for others

> Multiplexing support in default PermissionStore implementation
> --
>
> Key: OAK-3777
> URL: https://issues.apache.org/jira/browse/OAK-3777
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: core
>Reporter: Chetan Mehrotra
>Assignee: Chetan Mehrotra
>
> Similar to other parts we need to prototype support for multiplexing in 
> default permission store





[jira] [Commented] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376889#comment-15376889
 ] 

Julian Reschke commented on OAK-4477:
-

trunk: [r1752659|http://svn.apache.org/r1752659] 
[r1748747|http://svn.apache.org/r1748747] 
[r1748714|http://svn.apache.org/r1748714]
1.4: [r1748748|http://svn.apache.org/r1748748] 
[r1748737|http://svn.apache.org/r1748737]


> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4
> Fix For: 1.5.6
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> its own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.





[jira] [Updated] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Reschke updated OAK-4477:

Fix Version/s: (was: 1.6)
   1.5.6

> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4
> Fix For: 1.5.6
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> its own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-4247) Deprecate oak-segment

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4247?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4247:

Priority: Critical  (was: Blocker)

> Deprecate oak-segment
> -
>
> Key: OAK-4247
> URL: https://issues.apache.org/jira/browse/OAK-4247
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar, segmentmk
>Reporter: Michael Dürig
>Priority: Critical
>  Labels: technical_debt
> Fix For: 1.6, Segment Tar 0.0.4
>
>
> Before the next major release we need to deprecate {{oak-segment}} and make 
> {{oak-segment-tar}} the new default implementation:
> * Deprecate all classes in {{oak-segment}}
> * Update documentation to reflect this change
> * Update tooling to target {{oak-segment-tar}} (See OAK-4246). 
> * Update dependencies of upstream modules / projects from {{oak-segment}} to 
> {{oak-segment-tar}}. 
> * Ensure {{oak-segment-tar}} gets properly released (See OAK-4258). 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-4560) The stable record ID is not maintained across two or more generations

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-4560.
-
Resolution: Fixed

Fixed at r1752637.

> The stable record ID is not maintained across two or more generations
> -
>
> Key: OAK-4560
> URL: https://issues.apache.org/jira/browse/OAK-4560
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
> Fix For: Segment Tar 0.0.6
>
>
> {{SegmentWriter}} should make sure that, for every node compacted onto a new 
> generation, the stable ID remains the same. This should be guaranteed for 
> every node across any number of generations.
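The invariant can be sketched with a minimal model (the names below are illustrative, not Oak's actual {{SegmentWriter}} API): a compacted copy must carry its source's stable ID forward through any number of generations.

```java
// Minimal model of stable-ID propagation across compaction generations.
// Names are illustrative; Oak's real SegmentWriter is far more involved.
final class ModelNode {
    final String stableId; // identity that must survive compaction
    final int generation;

    ModelNode(String stableId, int generation) {
        this.stableId = stableId;
        this.generation = generation;
    }

    // Compaction rewrites the node into a new generation but must reuse
    // the source node's stable ID rather than minting a fresh one.
    ModelNode compact() {
        return new ModelNode(this.stableId, this.generation + 1);
    }
}
```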



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4560) The stable record ID is not maintained across two or more generations

2016-07-14 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4560?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376847#comment-15376847
 ] 

Francesco Mari commented on OAK-4560:
-

I added a failing test at r1752636. The test shows that if the same node is 
rewritten across three generations, its stable ID changes.

> The stable record ID is not maintained across two or more generations
> -
>
> Key: OAK-4560
> URL: https://issues.apache.org/jira/browse/OAK-4560
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
> Fix For: Segment Tar 0.0.6
>
>
> {{SegmentWriter}} should make sure that, for every node compacted onto a new 
> generation, the stable ID remains the same. This should be guaranteed for 
> every node across any number of generations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-4561) Avoid embedding Apache Commons Math in Segment Tar

2016-07-14 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-4561:
---

 Summary: Avoid embedding Apache Commons Math in Segment Tar
 Key: OAK-4561
 URL: https://issues.apache.org/jira/browse/OAK-4561
 Project: Jackrabbit Oak
  Issue Type: Improvement
  Components: segment-tar
Reporter: Francesco Mari
Assignee: Francesco Mari
 Fix For: Segment Tar 0.0.6


Apache Commons Math is a relatively large dependency. If possible, embedding it 
should be avoided so as not to considerably increase the size of Segment Tar.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-4260) Define and implement migration from oak-segment to oak-segment-tar

2016-07-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-4260?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek resolved OAK-4260.

Resolution: Fixed

> Define and implement migration from oak-segment to oak-segment-tar
> --
>
> Key: OAK-4260
> URL: https://issues.apache.org/jira/browse/OAK-4260
> Project: Jackrabbit Oak
>  Issue Type: Task
>  Components: segment-tar, segmentmk, upgrade
>Reporter: Michael Dürig
>Assignee: Tomek Rękawek
>  Labels: migration
> Fix For: 1.6, Segment Tar 0.0.4
>
>
> We need to come up with a plan, implementation and documentation for how we 
> deal with migrating from {{oak-segment}} to {{oak-segment-next}}. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-4560) The stable record ID is not maintained across two or more generations

2016-07-14 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-4560:
---

 Summary: The stable record ID is not maintained across two or more 
generations
 Key: OAK-4560
 URL: https://issues.apache.org/jira/browse/OAK-4560
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segment-tar
Reporter: Francesco Mari
Assignee: Francesco Mari
 Fix For: Segment Tar 0.0.6


{{SegmentWriter}} should make sure that, for every node compacted onto a new 
generation, the stable ID remains the same. This should be guaranteed for every 
node across any number of generations.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Comment Edited] (OAK-4477) RDBDatasourceFactory should use pool config similar to sling datasource defaults

2016-07-14 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4477?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15374828#comment-15374828
 ] 

Julian Reschke edited comment on OAK-4477 at 7/14/16 12:20 PM:
---

Seems to be caused by https://bz.apache.org/bugzilla/show_bug.cgi?id=59850 -- 
the StatementCache appears to break the JDBC contract about when ResultSets are 
automatically closed - see OAK-4559.

Resolve *this* bug by defaulting to all interceptors except StatementCache, and 
allowing this to be overridden via a system property for testing.




was (Author: reschke):
Seems to be caused by https://bz.apache.org/bugzilla/show_bug.cgi?id=59850 -- 
the StatementCache appears to break the JDBC contract about when ResultSets are 
automatically closed.

> RDBDatasourceFactory should use pool config similar to sling datasource 
> defaults
> 
>
> Key: OAK-4477
> URL: https://issues.apache.org/jira/browse/OAK-4477
> Project: Jackrabbit Oak
>  Issue Type: Technical task
>  Components: rdbmk
>Affects Versions: 1.0.31, 1.4.3, 1.5.3, 1.2.16
>Reporter: Julian Reschke
>Assignee: Julian Reschke
>  Labels: candidate_oak_1_0, candidate_oak_1_2, candidate_oak_1_4
> Fix For: 1.6
>
> Attachments: OAK-4477.diff
>
>
> {{RDBDataSourceFactory}} is used only for testing, and creates instances of 
> {{org.apache.tomcat.jdbc.pool.DataSource}}.
> These are currently created with default config, while 
> {{org.apache.sling.datasource}} (which is likely used in production) uses 
> its own defaults. In particular, it configures three interceptors -- 
> {{StatementCache;SlowQueryReport(threshold=1);ConnectionState}} -- which 
> we do not, thus they aren't getting unit test coverage.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-4559) RDB*Store: failures with Tomcat JDBC pool's StatementCache interceptor

2016-07-14 Thread Julian Reschke (JIRA)
Julian Reschke created OAK-4559:
---

 Summary: RDB*Store: failures with Tomcat JDBC pool's 
StatementCache interceptor
 Key: OAK-4559
 URL: https://issues.apache.org/jira/browse/OAK-4559
 Project: Jackrabbit Oak
  Issue Type: Technical task
  Components: rdbmk
Affects Versions: 1.4.5, 1.5.5, 1.2.17
Reporter: Julian Reschke
Assignee: Julian Reschke


See .

When the StatementCache interceptor is enabled, {{ResultSet}} objects do not 
get closed automatically anymore. In {{RDBBlobStore}}, this leads to deadlocks 
(Derby) and hard-to-debug exceptions (DB2).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-4169) Make the bulk createOrUpdate retry count configurable in Mongo

2016-07-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-4169:
---
Fix Version/s: 1.4.6

> Make the bulk createOrUpdate retry count configurable in Mongo
> --
>
> Key: OAK-4169
> URL: https://issues.apache.org/jira/browse/OAK-4169
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Tomek Rękawek
>Assignee: Tomek Rękawek
> Fix For: 1.6, 1.5.2, 1.4.6
>
> Attachments: OAK-4169.patch
>
>
> The bulk createOrUpdate() introduced in OAK-2066 retries the bulk request up 
> to 3 times if there are conflicts. However, after performing tests in AEM it 
> seems that if the 1st round fails, the remaining documents cause the 2nd and 
> 3rd rounds to fail as well. Therefore we shouldn't retry the bulk requests by 
> default. Let's add a new parameter {{oak.mongo.bulkRetries}}, set to 0.
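A plausible way to resolve such a parameter (the property name {{oak.mongo.bulkRetries}} comes from the issue; the surrounding class is a sketch, not Oak's actual MongoDocumentStore code):

```java
// Sketch: reading the proposed oak.mongo.bulkRetries system property.
// A default of 0 means conflicting bulk createOrUpdate() calls are not
// retried unless the deployment explicitly opts in.
class BulkRetryConfig {
    static int bulkRetries() {
        // Integer.getInteger returns the default when the property is
        // unset or does not have a valid integer format.
        return Integer.getInteger("oak.mongo.bulkRetries", 0);
    }
}
```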



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4169) Make the bulk createOrUpdate retry count configurable in Mongo

2016-07-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-4169?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376785#comment-15376785
 ] 

Tomek Rękawek commented on OAK-4169:


Backported to 1.4 in [r1752619|https://svn.apache.org/r1752619].

> Make the bulk createOrUpdate retry count configurable in Mongo
> --
>
> Key: OAK-4169
> URL: https://issues.apache.org/jira/browse/OAK-4169
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: mongomk
>Reporter: Tomek Rękawek
>Assignee: Tomek Rękawek
> Fix For: 1.6, 1.5.2, 1.4.6
>
> Attachments: OAK-4169.patch
>
>
> The bulk createOrUpdate() introduced in OAK-2066 retries the bulk request up 
> to 3 times if there are conflicts. However, after performing tests in AEM it 
> seems that if the 1st round fails, the remaining documents cause the 2nd and 
> 3rd rounds to fail as well. Therefore we shouldn't retry the bulk requests by 
> default. Let's add a new parameter {{oak.mongo.bulkRetries}}, set to 0.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-4168) Replace the massive lock acquire with cache tracker in bulk createOrUpdate()

2016-07-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-4168:
---
Fix Version/s: 1.4.6

> Replace the massive lock acquire with cache tracker in bulk createOrUpdate()
> 
>
> Key: OAK-4168
> URL: https://issues.apache.org/jira/browse/OAK-4168
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk, mongomk
>Reporter: Tomek Rękawek
>Assignee: Tomek Rękawek
>Priority: Minor
> Fix For: 1.6, 1.5.2, 1.4.6
>
> Attachments: OAK-4168.patch
>
>
> OAK-4112 introduces an experimental cache tracker mechanism that can be used 
> instead of the bulk Lock.acquire() operation. Investigate whether the same 
> mechanism may be used in the bulk createOrUpdate() method in MongoMK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4168) Replace the massive lock acquire with cache tracker in bulk createOrUpdate()

2016-07-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-4168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376784#comment-15376784
 ] 

Tomek Rękawek commented on OAK-4168:


Backported to 1.4 in [r1752618|https://svn.apache.org/r1752618].

> Replace the massive lock acquire with cache tracker in bulk createOrUpdate()
> 
>
> Key: OAK-4168
> URL: https://issues.apache.org/jira/browse/OAK-4168
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk, mongomk
>Reporter: Tomek Rękawek
>Assignee: Tomek Rękawek
>Priority: Minor
> Fix For: 1.6, 1.5.2, 1.4.6
>
> Attachments: OAK-4168.patch
>
>
> OAK-4112 introduces an experimental cache tracker mechanism that can be used 
> instead of the bulk Lock.acquire() operation. Investigate whether the same 
> mechanism may be used in the bulk createOrUpdate() method in MongoMK.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-3683) BasicDocumentStore.testInterestingStrings failure on MongoDB after OAK-3651 with Java 8

2016-07-14 Thread Julian Reschke (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-3683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376782#comment-15376782
 ] 

Julian Reschke commented on OAK-3683:
-

trunk: [r1752616|http://svn.apache.org/r1752616] 
[r1752447|http://svn.apache.org/r1752447]
1.4: [r1752617|http://svn.apache.org/r1752617]


> BasicDocumentStore.testInterestingStrings failure on MongoDB after OAK-3651 
> with Java 8
> ---
>
> Key: OAK-3683
> URL: https://issues.apache.org/jira/browse/OAK-3683
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: mongomk
>Affects Versions: 1.4
> Environment: MongoDB 2.6.9, MongoDB 3.0.2
> Java 8
>Reporter: Robert Munteanu
>
> On Java 8 only the following test fails:
> {noformat}Failed tests:   testInterestingStrings[MongoFixture: 
> MongoDB](org.apache.jackrabbit.oak.plugins.document.BasicDocumentStoreTest): 
> failure to round-trip brokensurrogate through MongoDB expected:<[?]> but 
> was:<[�]>{noformat}
> According to git bisect the commit which started showing this error was 
> [r1715092|http://svn.apache.org/viewvc?view=revision&revision=r1715092]: 
> OAK-3651 - Remove HierrachialCacheInvalidator
> The command I used to run the tests was {{mvn -am -pl oak-core clean package 
> -Dtest=BasicDocumentStoreTest -DfailIfNoTests=false}}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-4112) Replace the query exclusive lock with a cache tracker

2016-07-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-4112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-4112:
---
Fix Version/s: 1.4.6

> Replace the query exclusive lock with a cache tracker
> -
>
> Key: OAK-4112
> URL: https://issues.apache.org/jira/browse/OAK-4112
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk, mongomk
>Reporter: Tomek Rękawek
>Assignee: Tomek Rękawek
>  Labels: performance
> Fix For: 1.6, 1.5.2, 1.4.6
>
> Attachments: OAK-4112-1.patch, OAK-4112-2.patch, OAK-4112-3.patch, 
> OAK-4112-4.patch, OAK-4112-putifnewer.patch, OAK-4112.patch
>
>
> The {{MongoDocumentStore#query()}} method uses an expensive 
> {{TreeLock#acquireExclusive}} method, introduced in OAK-1897 to avoid caching 
> outdated documents.
> It should be possible to avoid acquiring the exclusive lock, by tracking the 
> cache changes that occurs during the Mongo find() operation. When the find() 
> is done, we can update the cache with the received documents if they haven't 
> been invalidated in the meantime.
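The tracking idea can be sketched as follows (hypothetical names, not Oak's actual implementation): record which keys are invalidated while a find() is in flight, and only cache a fetched document if its key was not invalidated in the meantime.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of a cache tracker replacing an exclusive lock: instead of
// blocking writers for the duration of find(), remember which keys were
// invalidated while the query ran and skip caching those results.
class TrackedCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final Set<String> invalidatedDuringQuery = ConcurrentHashMap.newKeySet();
    private volatile boolean tracking = false;

    void startTracking() {
        invalidatedDuringQuery.clear();
        tracking = true;
    }

    void invalidate(String key) {
        cache.remove(key);
        if (tracking) {
            invalidatedDuringQuery.add(key);
        }
    }

    // Called with the documents returned by find(): cache only those
    // whose keys were not invalidated while the query was running.
    void applyQueryResults(Map<String, String> results) {
        tracking = false;
        for (Map.Entry<String, String> e : results.entrySet()) {
            if (!invalidatedDuringQuery.contains(e.getKey())) {
                cache.put(e.getKey(), e.getValue());
            }
        }
    }

    String cached(String key) {
        return cache.get(key);
    }
}
```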



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4112) Replace the query exclusive lock with a cache tracker

2016-07-14 Thread JIRA

[ 
https://issues.apache.org/jira/browse/OAK-4112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376734#comment-15376734
 ] 

Tomek Rękawek commented on OAK-4112:


Backported to 1.4 in [r1752612|https://svn.apache.org/r1752612].

> Replace the query exclusive lock with a cache tracker
> -
>
> Key: OAK-4112
> URL: https://issues.apache.org/jira/browse/OAK-4112
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: documentmk, mongomk
>Reporter: Tomek Rękawek
>Assignee: Tomek Rękawek
>  Labels: performance
> Fix For: 1.6, 1.5.2, 1.4.6
>
> Attachments: OAK-4112-1.patch, OAK-4112-2.patch, OAK-4112-3.patch, 
> OAK-4112-4.patch, OAK-4112-putifnewer.patch, OAK-4112.patch
>
>
> The {{MongoDocumentStore#query()}} method uses an expensive 
> {{TreeLock#acquireExclusive}} method, introduced in OAK-1897 to avoid caching 
> outdated documents.
> It should be possible to avoid acquiring the exclusive lock by tracking the 
> cache changes that occur during the Mongo find() operation. When the find() 
> is done, we can update the cache with the received documents if they haven't 
> been invalidated in the meantime.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4558) SegmentNodeState.fastEquals() can trigger two I/O operations

2016-07-14 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4558?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376635#comment-15376635
 ] 

Francesco Mari commented on OAK-4558:
-

[~alex.parvulescu], this is the issue we were talking about yesterday.

> SegmentNodeState.fastEquals() can trigger two I/O operations
> 
>
> Key: OAK-4558
> URL: https://issues.apache.org/jira/browse/OAK-4558
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Reporter: Francesco Mari
>Assignee: Francesco Mari
> Fix For: Segment Tar 0.0.6
>
>
> The implementation of {{SegmentNodeState.fastEquals()}} compares the stable 
> IDs of two instances of {{SegmentNodeState}}. In some cases, reading the 
> stable ID would trigger a read of an additional record, the block record 
> containing the serialized version of the segment ID.
> This issue is about evaluating the performance implications of this strategy 
> and, in particular, if it would be better to store the serialized stable ID 
> in the node record itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (OAK-4558) SegmentNodeState.fastEquals() can trigger two I/O operations

2016-07-14 Thread Francesco Mari (JIRA)
Francesco Mari created OAK-4558:
---

 Summary: SegmentNodeState.fastEquals() can trigger two I/O 
operations
 Key: OAK-4558
 URL: https://issues.apache.org/jira/browse/OAK-4558
 Project: Jackrabbit Oak
  Issue Type: Bug
  Components: segment-tar
Reporter: Francesco Mari
Assignee: Francesco Mari
 Fix For: Segment Tar 0.0.6


The implementation of {{SegmentNodeState.fastEquals()}} compares the stable IDs 
of two instances of {{SegmentNodeState}}. In some cases, reading the stable ID 
would trigger a read of an additional record, the block record containing the 
serialized version of the segment ID.

This issue is about evaluating the performance implications of this strategy 
and, in particular, if it would be better to store the serialized stable ID in 
the node record itself.
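The trade-off can be illustrated with a minimal model (not Oak's real API): comparing stable IDs is cheap only if reading the ID does not itself trigger extra record reads, which is exactly what happens when the ID lives in a separate block record.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Illustrative model: fastEquals() compares stable IDs, but each
// readStableId() may itself cost a record read when the ID is stored
// in a separate block record rather than inline in the node record.
class ModelNodeState {
    private final String stableId;
    private final AtomicInteger recordReads; // counts simulated extra reads

    ModelNodeState(String stableId, AtomicInteger recordReads) {
        this.stableId = stableId;
        this.recordReads = recordReads;
    }

    String readStableId() {
        recordReads.incrementAndGet(); // the extra I/O this issue is about
        return stableId;
    }

    boolean fastEquals(ModelNodeState other) {
        return readStableId().equals(other.readStableId());
    }
}
```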



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Resolved] (OAK-4555) Deadlock in CompactionStats.writeNode during online compaction

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari resolved OAK-4555.
-
   Resolution: Fixed
Fix Version/s: Segment Tar 0.0.6

Fixed at r1752601.

> Deadlock in CompactionStats.writeNode during online compaction
> --
>
> Key: OAK-4555
> URL: https://issues.apache.org/jira/browse/OAK-4555
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: Segment Tar 0.0.2
>Reporter: Valentin Olteanu
>Assignee: Francesco Mari
> Fix For: Segment Tar 0.0.6
>
> Attachments: stacktrace.log
>
>
> While running online revision cleanup under high load, the instance reached a 
> deadlock (or possibly an infinite loop).
> Full thread dump is attached, but it seems that:
> 1. Thread-21 tries to commit -> CompactionStats.writeNode -> locked by 
> compaction thread
> 2. TarMK compaction thread tries to compact -> print statistics -> infinite 
> loop



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-4555) Deadlock in CompactionStats.writeNode during online compaction

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4555:

Fix Version/s: Segment Tar 0.0.4

> Deadlock in CompactionStats.writeNode during online compaction
> --
>
> Key: OAK-4555
> URL: https://issues.apache.org/jira/browse/OAK-4555
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: Segment Tar 0.0.2
>Reporter: Valentin Olteanu
>Assignee: Francesco Mari
> Attachments: stacktrace.log
>
>
> While running online revision cleanup under high load, the instance reached a 
> deadlock (or possibly an infinite loop).
> Full thread dump is attached, but it seems that:
> 1. Thread-21 tries to commit -> CompactionStats.writeNode -> locked by 
> compaction thread
> 2. TarMK compaction thread tries to compact -> print statistics -> infinite 
> loop



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-4555) Deadlock in CompactionStats.writeNode during online compaction

2016-07-14 Thread Francesco Mari (JIRA)

 [ 
https://issues.apache.org/jira/browse/OAK-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Francesco Mari updated OAK-4555:

Fix Version/s: (was: Segment Tar 0.0.4)

> Deadlock in CompactionStats.writeNode during online compaction
> --
>
> Key: OAK-4555
> URL: https://issues.apache.org/jira/browse/OAK-4555
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: Segment Tar 0.0.2
>Reporter: Valentin Olteanu
>Assignee: Francesco Mari
> Attachments: stacktrace.log
>
>
> While running online revision cleanup under high load, the instance reached a 
> deadlock (or possibly an infinite loop).
> Full thread dump is attached, but it seems that:
> 1. Thread-21 tries to commit -> CompactionStats.writeNode -> locked by 
> compaction thread
> 2. TarMK compaction thread tries to compact -> print statistics -> infinite 
> loop



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-4412) Lucene hybrid index

2016-07-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-4412:
---
Attachment: OAK-4412.patch

> Lucene hybrid index
> ---
>
> Key: OAK-4412
> URL: https://issues.apache.org/jira/browse/OAK-4412
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Tomek Rękawek
>Assignee: Tomek Rękawek
> Fix For: 1.6
>
> Attachments: OAK-4412.patch
>
>
> When running Oak in a cluster, each write operation is expensive. After 
> performing some stress-tests with a geo-distributed Mongo cluster, we've 
> found out that updating property indexes is a large part of the overall 
> traffic.
> The asynchronous index would be an answer here (as the index update won't be 
> made in the client request thread), but AEM requires the updates to be 
> visible immediately in order to work properly.
> The idea here is to enhance the existing asynchronous Lucene index with a 
> synchronous, locally-stored counterpart that will persist only the data since 
> the last Lucene background reindexing job.
> The new index can be stored in memory or (if necessary) in MMAPed local 
> files. Once the "main" Lucene index is being updated, the local index will be 
> purged.
> Queries will use a union of results from the {{lucene}} and 
> {{lucene-memory}} indexes.
> The {{lucene-memory}} index, as a locally stored entity, will be updated using 
> an observer, so it'll get both local and remote changes.
> The original idea has been suggested by [~chetanm] in the discussion for the 
> OAK-4233.
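The result-merging step can be sketched as follows (illustrative only; the real query engine merges index cursors rather than path lists):

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Sketch: merging paths from the persisted lucene index and the local
// lucene-memory index, de-duplicated, with persisted results first.
class HybridResultUnion {
    static Set<String> union(List<String> luceneHits, List<String> memoryHits) {
        Set<String> merged = new LinkedHashSet<>(luceneHits);
        merged.addAll(memoryHits); // recent, not-yet-indexed changes
        return merged;
    }
}
```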



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (OAK-4412) Lucene hybrid index

2016-07-14 Thread JIRA

 [ 
https://issues.apache.org/jira/browse/OAK-4412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tomek Rękawek updated OAK-4412:
---
Attachment: (was: OAK-4412.patch)

> Lucene hybrid index
> ---
>
> Key: OAK-4412
> URL: https://issues.apache.org/jira/browse/OAK-4412
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: lucene
>Reporter: Tomek Rękawek
>Assignee: Tomek Rękawek
> Fix For: 1.6
>
> Attachments: OAK-4412.patch
>
>
> When running Oak in a cluster, each write operation is expensive. After 
> performing some stress-tests with a geo-distributed Mongo cluster, we've 
> found out that updating property indexes is a large part of the overall 
> traffic.
> The asynchronous index would be an answer here (as the index update won't be 
> made in the client request thread), but AEM requires the updates to be 
> visible immediately in order to work properly.
> The idea here is to enhance the existing asynchronous Lucene index with a 
> synchronous, locally-stored counterpart that will persist only the data since 
> the last Lucene background reindexing job.
> The new index can be stored in memory or (if necessary) in MMAPed local 
> files. Once the "main" Lucene index is being updated, the local index will be 
> purged.
> Queries will use a union of results from the {{lucene}} and 
> {{lucene-memory}} indexes.
> The {{lucene-memory}} index, as a locally stored entity, will be updated using 
> an observer, so it'll get both local and remote changes.
> The original idea has been suggested by [~chetanm] in the discussion for the 
> OAK-4233.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4528) diff calculation in DocumentNodeStore should try to re-use journal info on diff cache miss

2016-07-14 Thread Marcel Reutegger (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376510#comment-15376510
 ] 

Marcel Reutegger commented on OAK-4528:
---

Implemented in trunk: http://svn.apache.org/r1752596

There is a feature flag {{-Doak.disableJournalDiff=true}} to disable diff 
calculation based on the journal. The fallback is the previous implementation.

I will perform some more tests before I resolve this issue.
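The flag check can be sketched like this (the property name comes from the comment above; the class is illustrative, not Oak's actual code):

```java
// Sketch: the oak.disableJournalDiff feature flag, read from a system
// property. Boolean.getBoolean is true only when the property exists
// and is equal, ignoring case, to "true".
class JournalDiffConfig {
    static boolean journalDiffDisabled() {
        return Boolean.getBoolean("oak.disableJournalDiff");
    }
}
```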

> diff calculation in DocumentNodeStore should try to re-use journal info on 
> diff cache miss
> --
>
> Key: OAK-4528
> URL: https://issues.apache.org/jira/browse/OAK-4528
> Project: Jackrabbit Oak
>  Issue Type: Improvement
>  Components: core, documentmk
>Reporter: Vikas Saurabh
>Assignee: Marcel Reutegger
>Priority: Minor
>  Labels: observation, resilience
> Fix For: 1.6
>
>
> Currently, diff information is filled into caches actively (local commits 
> pushed in local_diff, externally read changes pushed into memory_diff). At 
> the time of event processing though, the entries could have already been 
> evicted.
> In that case, we fall back to computing diff by comparing 2 node-states, which 
> becomes more and more expensive (and eventually fairly non-recoverable, 
> leading to OAK-2683).
> To improve the situation somewhat, we can probably try to consult journal 
> entries to read a smaller superset of changed paths before falling back to 
> node-state comparison.
> /cc [~mreutegg], [~chetanm], [~egli]



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (OAK-4555) Deadlock in CompactionStats.writeNode during online compaction

2016-07-14 Thread Francesco Mari (JIRA)

[ 
https://issues.apache.org/jira/browse/OAK-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15376456#comment-15376456
 ] 

Francesco Mari commented on OAK-4555:
-

From the stack trace, and as first noted by [~volteanu] offline, the thread 
that looks blocked is actually in a runnable state. This seems a consequence 
of MATH-578. In Commons Math 2.2 there is a serious performance problem when 
computing the percentile of data that is largely constant. The only solution 
to this problem would be to upgrade to a version of Commons Math that fixes 
this problem.

> Deadlock in CompactionStats.writeNode during online compaction
> --
>
> Key: OAK-4555
> URL: https://issues.apache.org/jira/browse/OAK-4555
> Project: Jackrabbit Oak
>  Issue Type: Bug
>  Components: segment-tar
>Affects Versions: Segment Tar 0.0.2
>Reporter: Valentin Olteanu
>Assignee: Francesco Mari
> Attachments: stacktrace.log
>
>
> While running online revision cleanup under high load, the instance reached a 
> deadlock (or possibly an infinite loop).
> Full thread dump is attached, but it seems that:
> 1. Thread-21 tries to commit -> CompactionStats.writeNode -> locked by 
> compaction thread
> 2. TarMK compaction thread tries to compact -> print statistics -> infinite 
> loop



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)