[jira] [Commented] (OAK-2718) NodeStateSolrServersObserver performs complete diff synchronously causing slowness in dispatch
[ https://issues.apache.org/jira/browse/OAK-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482778#comment-14482778 ] Tommaso Teofili commented on OAK-2718: --
bq. Enable the observer via OSGi config i.e. it should work only if configured
that could be done, but it would make persisted configurations for the Solr index (OAK-2526) usable only once explicitly configured.
bq. Wrap the observer with BackgroundObserver. See fix done for OAK-2570 for similar issue
sure
{quote} For proper fix (which might take some time) i.e. diff only selective part we can take approach similar to one in Lucene. The IndexTracker tracks which all Lucene index are in use and then diff is only performed for those paths. This keeps the diff logic short and precise. {quote}
while working on OAK-2526 I had thought of such an approach, however it would have meant reinventing roughly the same code you have been writing for the Lucene index, and that would have required quite some changes in the way the Solr index works, so I decided not to go down that path. However I think we would really benefit from abstracting the generic part of the Lucene index into oak-core (definitions, aggregates, rules, etc.) so that only the implementation-specific bits need to be written for Lucene and Solr. That would reduce the amount of per-index maintenance and eventually let generic features be implemented once instead of having to code them for each implementation. I am aware that this may require some time, but I think it's 100% worth the effort, at least for full-text indexes.
NodeStateSolrServersObserver performs complete diff synchronously causing slowness in dispatch -- Key: OAK-2718 URL: https://issues.apache.org/jira/browse/OAK-2718 Project: Jackrabbit Oak Issue Type: Bug Components: oak-solr Reporter: Chetan Mehrotra Assignee: Tommaso Teofili Fix For: 1.2 {{NodeStateSolrServersObserver}} is enabled by default and performs the diff synchronously. Further, it performs a complete diff, which might take time and would cause the dispatch thread to slow down. This causes issues at least with {{DocumentNodeStore}}, where the dispatch is done as part of the background read, and that call is time-sensitive. As a fix the diff should be performed asynchronously and should also be selective. A similar fix was done for the Lucene index as part of OAK-2570 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
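The "wrap the observer with BackgroundObserver" fix suggested above can be sketched with a plain decorator that hands each change to an executor, so the time-sensitive dispatch thread never runs the diff itself. This is a minimal sketch, not Oak's actual classes: `Observer` here is a simplified stand-in for `org.apache.jackrabbit.oak.spi.commit.Observer`, and a `String` stands in for the root `NodeState`.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Simplified stand-in for Oak's Observer; the String stands in for a NodeState.
interface Observer {
    void contentChanged(String rootState);
}

// Decorator in the spirit of BackgroundObserver: contentChanged returns
// immediately and the potentially slow diff runs on a pool thread.
class AsyncObserver implements Observer {
    private final Observer delegate;        // e.g. the Solr observer doing the diff
    private final ExecutorService executor;

    AsyncObserver(Observer delegate, ExecutorService executor) {
        this.delegate = delegate;
        this.executor = executor;
    }

    @Override
    public void contentChanged(String rootState) {
        // the dispatch thread only enqueues; the diff happens in the background
        executor.submit(() -> delegate.contentChanged(rootState));
    }

    // demo: dispatch two changes and wait for the background work to finish
    static List<String> demo() {
        List<String> seen = Collections.synchronizedList(new ArrayList<>());
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Observer async = new AsyncObserver(seen::add, pool);
        async.contentChanged("r1");
        async.contentChanged("r2");
        pool.shutdown();
        try {
            pool.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen;
    }
}
```

The real BackgroundObserver additionally bounds its queue and collapses queued changes; this sketch only shows the hand-off that keeps the dispatch thread fast.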
[jira] [Resolved] (OAK-2717) Report maximum observation queue length in ObservationTest benchmark
[ https://issues.apache.org/jira/browse/OAK-2717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger resolved OAK-2717. --- Resolution: Fixed Implemented in trunk: http://svn.apache.org/r1671757 Report maximum observation queue length in ObservationTest benchmark Key: OAK-2717 URL: https://issues.apache.org/jira/browse/OAK-2717 Project: Jackrabbit Oak Issue Type: Sub-task Components: run Reporter: Marcel Reutegger Assignee: Marcel Reutegger Priority: Minor Fix For: 1.2 The stats printed during the ObservationTest benchmark should also include the observation queue maximum length. This makes it possible to see how the queues evolve during the test run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2720) Misleading traversal warning message while performing query
[ https://issues.apache.org/jira/browse/OAK-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482703#comment-14482703 ] Thomas Mueller commented on OAK-2720: - The patch looks good to me. The risk is very low, so I'm OK to include it for Oak 1.2.
bq. of more than 1 (default)
Yes, it's true. I thought the warning is at 1'000, but you are right, it's at 10'000 (see the patch).
bq. even if end result set is small say 100 but indexed paths are deep
Are you sure? If the limit is at 10'000, then with 100 entries the paths would need to be extremely deep (100 on average, per node).
Misleading traversal warning message while performing query --- Key: OAK-2720 URL: https://issues.apache.org/jira/browse/OAK-2720 Project: Jackrabbit Oak Issue Type: Bug Components: query Reporter: Chetan Mehrotra Fix For: 1.2 Attachments: OAK-2720.patch Currently {{ContentMirrorStoreStrategy}} logs a traversal warning if the property index performs node traversal of more than 1 (default). The intention is to warn the end user that traversing so many nodes would cause performance issues. Traversal itself might happen for many reasons, such as:
# The query not using the right index. If the query has two property restrictions, one broad and one selective, and an index is defined only for the first, more traversal is performed. The warning should help the user create a new index for the second property.
# The caller fetching far more results - a query might end with, say, 50k results and the caller reads all of them. The warning should help the user switch to pagination.
So the above are valid cases. However, currently the warning is also seen even if the end result set is small, say 100, but the indexed paths are deep. As {{ContentMirrorStoreStrategy}} mirrors the path structure, the current counting logic also counts the intermediate traversals. This warning is then misleading, as how this internal structure is created is an implementation detail of the index over which the end user has no control. This leaves the following options:
# Use a different storage strategy which is more efficient in storage
# Do not count the intermediate nodes traversed within the index path and instead only count the matching nodes
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
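The second option in the issue above (count only matching nodes, not the intermediate structure) can be illustrated with a toy tree walk. `IndexNode` is a hypothetical stand-in for the mirrored index structure that `ContentMirrorStoreStrategy` maintains, not Oak's real type:

```java
import java.util.List;

// Toy stand-in for a node in the mirrored index tree: intermediate nodes only
// mirror the content path; nodes carrying a "match" marker are actual hits.
class IndexNode {
    final boolean match;           // does this node represent an indexed hit?
    final List<IndexNode> children;

    IndexNode(boolean match, List<IndexNode> children) {
        this.match = match;
        this.children = children;
    }

    // Count only matching nodes toward the traversal warning, so deep paths
    // with few hits stay below the limit even though many nodes are visited.
    static int countMatches(IndexNode node) {
        int n = node.match ? 1 : 0;
        for (IndexNode child : node.children) {
            n += countMatches(child);
        }
        return n;
    }
}
```

With this counting, a deep chain of purely structural nodes ending in a single hit counts as 1, whereas counting every visited node would already be 3 for a two-level chain.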
[jira] [Commented] (OAK-2720) Misleading traversal warning message while performing query
[ https://issues.apache.org/jira/browse/OAK-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14482723#comment-14482723 ] Chetan Mehrotra commented on OAK-2720: --
bq. Are you sure? If the limit is at 10'000, then with 100 entries the paths would need to be extremely deep (100 on average, per node).
Agreed, this is more like a test scenario to illustrate the problem. But in some cases the paths do become really deep! I would apply the patch then.
Misleading traversal warning message while performing query --- Key: OAK-2720 URL: https://issues.apache.org/jira/browse/OAK-2720 Project: Jackrabbit Oak Issue Type: Bug Components: query Reporter: Chetan Mehrotra Fix For: 1.2 Attachments: OAK-2720.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (OAK-2720) Misleading traversal warning message while performing query
[ https://issues.apache.org/jira/browse/OAK-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger reassigned OAK-2720: - Assignee: Chetan Mehrotra [~chetanm], assigning this issue to you because you created the patch and mentioned you want to apply it. Misleading traversal warning message while performing query --- Key: OAK-2720 URL: https://issues.apache.org/jira/browse/OAK-2720 Project: Jackrabbit Oak Issue Type: Bug Components: query Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Fix For: 1.2 Attachments: OAK-2720.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2468) Index binary only if some Tika parser can support the binaries mimeType
[ https://issues.apache.org/jira/browse/OAK-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2468. - Bulk closing for 1.1.8 Index binary only if some Tika parser can support the binaries mimeType --- Key: OAK-2468 URL: https://issues.apache.org/jira/browse/OAK-2468 Project: Jackrabbit Oak Issue Type: Improvement Components: oak-lucene Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.1.8 Currently all binaries are passed to Tika for text extraction. However, Tika can only parse those for which a supporting parser is present. Therefore the extraction logic should parse a binary only if its mimeType is supported by Tika. With this change {{jcr:mimeType}} would become a mandatory property. JR2 had a similar check [1] [1] https://github.com/apache/jackrabbit/blob/trunk/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/query/lucene/NodeIndexer.java#L932 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
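The check proposed in OAK-2468 above boils down to a guard in front of text extraction. A minimal sketch, assuming a hard-coded set of supported types (the real list would come from Tika's parser configuration, which this sketch does not query):

```java
import java.util.Set;

// Sketch of the proposed gate: only hand a binary to text extraction when its
// jcr:mimeType is one a parser is known to support. The SUPPORTED set is
// illustrative only, not Tika's actual parser registry.
class ExtractionGate {
    private static final Set<String> SUPPORTED =
            Set.of("application/pdf", "text/plain", "text/html");

    static boolean shouldExtract(String mimeType) {
        // a missing jcr:mimeType means we cannot pick a parser, so skip
        return mimeType != null && SUPPORTED.contains(mimeType);
    }
}
```

This is also why the issue notes that {{jcr:mimeType}} becomes effectively mandatory: without it the gate has nothing to check and the binary is skipped.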
[jira] [Closed] (OAK-2645) Remove DOCUMENT_MK fixture (and related)
[ https://issues.apache.org/jira/browse/OAK-2645?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2645. - Bulk closing for 1.1.8 Remove DOCUMENT_MK fixture (and related) Key: OAK-2645 URL: https://issues.apache.org/jira/browse/OAK-2645 Project: Jackrabbit Oak Issue Type: Improvement Reporter: Michael Dürig Assignee: Michael Dürig Fix For: 1.1.8 Since OAK-2365 that fixture is obsolete and should be removed. And our CI instances should be configured accordingly (see OAK-2476). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2540) Session operations null check
[ https://issues.apache.org/jira/browse/OAK-2540?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2540. - Bulk closing for 1.1.8 Session operations null check - Key: OAK-2540 URL: https://issues.apache.org/jira/browse/OAK-2540 Project: Jackrabbit Oak Issue Type: Bug Components: jcr Reporter: Alex Parvulescu Assignee: Alex Parvulescu Fix For: 1.1.8 Attachments: OAK-2540.patch Calling _Session.getNode(null)_ throws an ugly NPE. We should add a few null checks and turn those illegal inputs into IAEs. For those wondering, Jackrabbit doesn't fare better here. _Oak_
{code}
java.lang.NullPointerException
    at org.apache.jackrabbit.oak.namepath.NamePathMapperImpl.needsFullMapping(NamePathMapperImpl.java:224)
    at org.apache.jackrabbit.oak.namepath.NamePathMapperImpl.getOakPath(NamePathMapperImpl.java:80)
    at org.apache.jackrabbit.oak.jcr.session.SessionContext.getOakPath(SessionContext.java:306)
    at org.apache.jackrabbit.oak.jcr.session.SessionContext.getOakPathOrThrow(SessionContext.java:325)
    at org.apache.jackrabbit.oak.jcr.session.SessionImpl.getOakPathOrThrow(SessionImpl.java:149)
    at org.apache.jackrabbit.oak.jcr.session.SessionImpl.access$1(SessionImpl.java:148)
    at org.apache.jackrabbit.oak.jcr.session.SessionImpl$1.perform(SessionImpl.java:188)
    at org.apache.jackrabbit.oak.jcr.session.SessionImpl$1.perform(SessionImpl.java:1)
    at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:238)
    at org.apache.jackrabbit.oak.jcr.session.SessionImpl.perform(SessionImpl.java:139)
    at org.apache.jackrabbit.oak.jcr.session.SessionImpl.getNodeOrNull(SessionImpl.java:184)
    at org.apache.jackrabbit.oak.jcr.session.SessionImpl.getNode(SessionImpl.java:315)
{code}
_Jackrabbit_
{code}
java.lang.NullPointerException
    at org.apache.jackrabbit.spi.commons.conversion.CachingPathResolver.getQPath(CachingPathResolver.java:93)
    at org.apache.jackrabbit.spi.commons.conversion.CachingPathResolver.getQPath(CachingPathResolver.java:77)
    at org.apache.jackrabbit.spi.commons.conversion.DefaultNamePathResolver.getQPath(DefaultNamePathResolver.java:82)
    at org.apache.jackrabbit.core.SessionImpl.getQPath(SessionImpl.java:648)
    at org.apache.jackrabbit.core.session.SessionContext.getQPath(SessionContext.java:338)
    at org.apache.jackrabbit.core.session.SessionItemOperation.perform(SessionItemOperation.java:185)
    at org.apache.jackrabbit.core.session.SessionState.perform(SessionState.java:216)
    at org.apache.jackrabbit.core.SessionImpl.perform(SessionImpl.java:361)
    at org.apache.jackrabbit.core.SessionImpl.getNode(SessionImpl.java:)
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
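The "turn illegal inputs into IAEs" fix above amounts to a front-door argument check before the path ever reaches the name/path mapper. A minimal sketch; `checkPath` is a hypothetical helper, not the actual method added by the OAK-2540 patch:

```java
// Hypothetical helper illustrating the fix: validate the path argument up
// front and throw IllegalArgumentException instead of letting a null
// propagate into NamePathMapperImpl, where it surfaces as an NPE.
class PathChecks {
    static String checkPath(String absPath) {
        if (absPath == null) {
            throw new IllegalArgumentException("path must not be null");
        }
        return absPath;
    }
}
```

A caller such as `getNode` would then run `checkPath(absPath)` as its first statement, so the failure names the bad argument rather than an internal mapper frame.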
[jira] [Closed] (OAK-2587) observation processing too eager/unfair under load
[ https://issues.apache.org/jira/browse/OAK-2587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2587. - Bulk closing for 1.1.8 observation processing too eager/unfair under load -- Key: OAK-2587 URL: https://issues.apache.org/jira/browse/OAK-2587 Project: Jackrabbit Oak Issue Type: Improvement Components: core Affects Versions: 1.0.12 Reporter: Stefan Egli Assignee: Michael Dürig Priority: Critical Labels: observation Fix For: 1.1.8 Attachments: OAK-2587.patch The current implementation of Oak's observation event processing is too eager and thus unfair under load. Consider having many (eg 200) EventListeners but only a relatively small thread pool (eg 5, as is the default in Sling) backing them. When processing changes for a particular BackgroundObserver, that one (in BackgroundObserver.completionHandler.call) currently processes *all changes irrespective of how many there are* - ie it is *eager*. Only once that BackgroundObserver has processed all changes will it let go and 'pass the thread' to the next BackgroundObserver. Now if for some reason changes (ie commits) keep coming in while a BackgroundObserver is busy processing an earlier change, this lengthens that while loop. As a result the remaining (eg 195) *EventListeners have to wait for a potentially long time* until it's their turn - thus *unfair*. Now combine the above pattern with a scenario where Mongo is used as the underlying store. In that case, in order to remain highly performant, it is important that the diffs (for compareAgainstBaseState) are served from the MongoDiffCache in as many cases as possible to avoid a round-trip to mongod. The unfairness in the BackgroundObservers can now result in a large delay between the 'first' observers getting the event and the 'last' one (of those 200). When this delay increases due to a burst in the load, there is a risk that the diffs are no longer in the cache - those last observers are basically kicked out of the (diff) cache. Once this happens, *the situation gets even worse*, since now you have new commits coming in and old changes still having to be processed - all of which are processed in 'stripes of 5 listeners' before the next one gets a chance. At some point this results in totally inefficient cache behaviour, or in other words, all diffs have to be read from mongod. To avoid this there are probably a number of options - a few that come to mind:
* increase the thread pool to match or be closer to the number of listeners (but this has other disadvantages, eg the cost of thread switching)
* make BackgroundObservers fairer by limiting the number of changes they process before they give others a chance to be served by the pool.
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
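The second option from OAK-2587 above (bound the work done per turn) can be sketched as a queue that drains at most a fixed number of entries before yielding the pool thread. This is a toy model, not Oak's BackgroundObserver; the `String` entries stand in for queued content changes:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Sketch of a "fair" observer queue: instead of draining everything in one go,
// each turn processes at most `limit` queued changes, so other listeners
// sharing the small thread pool get scheduled in between.
class FairQueue {
    private final Queue<String> changes = new ArrayDeque<>();
    private final int limit;

    FairQueue(int limit) {
        this.limit = limit;
    }

    void add(String change) {
        changes.add(change);
    }

    // process up to `limit` entries, then give the pool thread back
    List<String> drainOneTurn() {
        List<String> processed = new ArrayList<>();
        while (!changes.isEmpty() && processed.size() < limit) {
            processed.add(changes.remove());
        }
        return processed;
    }
}
```

With 200 listeners on a 5-thread pool, bounding each turn trades per-listener throughput for latency fairness, which is exactly the cache-friendliness argument the issue makes.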
[jira] [Closed] (OAK-2083) Add metatype info for Document and Segment services
[ https://issues.apache.org/jira/browse/OAK-2083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2083. - Bulk closing for 1.1.8 Add metatype info for Document and Segment services --- Key: OAK-2083 URL: https://issues.apache.org/jira/browse/OAK-2083 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.1.8 Currently the OSGi services {{SegmentNodeStoreService}} and {{DocumentNodeStoreService}} do not provide metatype information for the possible configuration options, and those config options then need to be documented as part of the Oak docs. To simplify that we should add the required metatype information to the two services. One implication is that users would be able to edit the configuration directly from within the Felix Web Console, and that might cause some stability issues as the repository would be restarted due to such a change; however, the benefit of having all details about the config options in one place is much higher. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2497) Range query with incorrectly formatted date
[ https://issues.apache.org/jira/browse/OAK-2497?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2497. - Bulk closing for 1.1.8 Range query with incorrectly formatted date --- Key: OAK-2497 URL: https://issues.apache.org/jira/browse/OAK-2497 Project: Jackrabbit Oak Issue Type: Bug Components: query Reporter: Thomas Mueller Assignee: Thomas Mueller Fix For: 1.1.8, 1.2 Range queries on Date properties with an incorrectly formatted date return no results (instead of either failing or returning the expected result). Example date: {{2015-01-22T17:10:05.666z}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-1826) Empty directories not cleaned up when gc run on FileDataStore
[ https://issues.apache.org/jira/browse/OAK-1826?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-1826. - Bulk closing for 1.1.8 Empty directories not cleaned up when gc run on FileDataStore - Key: OAK-1826 URL: https://issues.apache.org/jira/browse/OAK-1826 Project: Jackrabbit Oak Issue Type: Bug Components: blob Reporter: Amit Jain Assignee: Amit Jain Priority: Minor Fix For: 1.1.8 The garbage collection only deletes the particular files identified as garbage. Any empty directories remaining after this operation are not cleaned up. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
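The missing cleanup step in OAK-1826 above is essentially a bottom-up sweep that removes directories left empty after the garbage files are gone. A self-contained sketch using `java.nio.file` (the root path and the `demo` tree are illustrative; this is not the FileDataStore code itself):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Sketch: after gc deletes garbage files, walk the data store root bottom-up
// and delete every directory that is now empty.
class EmptyDirCleaner {
    static int deleteEmptyDirs(Path root) throws IOException {
        int deleted = 0;
        try (Stream<Path> paths = Files.walk(root)) {
            // deepest paths first, so parents become empty before we reach them
            for (Path p : paths.sorted(Comparator.reverseOrder()).toList()) {
                if (!p.equals(root) && Files.isDirectory(p) && isEmpty(p)) {
                    Files.delete(p);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    private static boolean isEmpty(Path dir) throws IOException {
        try (Stream<Path> entries = Files.list(dir)) {
            return entries.findFirst().isEmpty();
        }
    }

    // demo on a throwaway temp tree: root/x/y (both empty) and root/z/file.bin
    static int demo() {
        try {
            Path root = Files.createTempDirectory("fds");
            Files.createDirectories(root.resolve("x/y"));
            Files.createDirectories(root.resolve("z"));
            Files.write(root.resolve("z/file.bin"), new byte[] {1});
            return deleteEmptyDirs(root);   // removes y, then x; z keeps its file
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Sorting paths in reverse order works because a child path always compares greater than its parent, so `x/y` is deleted before `x` is examined.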
[jira] [Closed] (OAK-2638) Use message from causing exception in DocumentStoreException.convert()
[ https://issues.apache.org/jira/browse/OAK-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2638. - Bulk closing for 1.1.8 Use message from causing exception in DocumentStoreException.convert() -- Key: OAK-2638 URL: https://issues.apache.org/jira/browse/OAK-2638 Project: Jackrabbit Oak Issue Type: Improvement Reporter: Marcel Reutegger Assignee: Marcel Reutegger Priority: Minor Fix For: 1.1.8 The method with just the causing exception currently uses 'null' as the message for the DocumentStoreException. Instead it should use the message from the given exception. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
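The fix described in OAK-2638 above is a one-line change in spirit: reuse the cause's message instead of passing null. A minimal sketch; this `DocumentStoreException` is a simplified stand-in for the real Oak class, not its actual source:

```java
// Simplified stand-in for Oak's DocumentStoreException, showing the convert()
// behaviour the issue asks for.
class DocumentStoreException extends RuntimeException {
    DocumentStoreException(String message, Throwable cause) {
        super(message, cause);
    }

    static DocumentStoreException convert(Throwable t) {
        if (t instanceof DocumentStoreException) {
            return (DocumentStoreException) t;
        }
        // before the fix this was effectively: new DocumentStoreException(null, t)
        return new DocumentStoreException(t.getMessage(), t);
    }
}
```

The practical difference shows up in logs: `DocumentStoreException: disk full` instead of `DocumentStoreException: null`.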
[jira] [Closed] (OAK-2563) Cleanup and document security related error codes
[ https://issues.apache.org/jira/browse/OAK-2563?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2563. - Bulk closing for 1.1.8 Cleanup and document security related error codes - Key: OAK-2563 URL: https://issues.apache.org/jira/browse/OAK-2563 Project: Jackrabbit Oak Issue Type: Improvement Components: core, doc Reporter: angela Assignee: angela Fix For: 1.1.8 marker issue for clean up and documentation of error codes in the various security related modules -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2664) Move ProgressNotificationEditor from upgrade module to core
[ https://issues.apache.org/jira/browse/OAK-2664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2664. - Bulk closing for 1.1.8 Move ProgressNotificationEditor from upgrade module to core --- Key: OAK-2664 URL: https://issues.apache.org/jira/browse/OAK-2664 Project: Jackrabbit Oak Issue Type: Task Components: upgrade Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.1.8 The Oak upgrade module has a {{ProgressNotificationEditor}} to report progress as the editor traverses the repository. It would be useful to have this in oak-core and use it to wrap the editors performing reindexing. Currently in upgrade this editor emits logs like
{noformat}
01:29:15.404 INFO [main] ProgressNotificationEditor.java:51 Checking node types: /content/kb/home/cq5/Development/SlingBootdelegation/jcr:content/par/text
01:29:15.640 INFO [main] ProgressNotificationEditor.java:51 Checking node types: /content/ddc/en/feeds/cq-forums/2013/05/validationifo_object
01:29:15.845 INFO [main] ProgressNotificationEditor.java:51 Checking node types: /content/ddc/lists/2011/01/_jira_created_jcr1
01:29:15.943 INFO [main] ProgressNotificationEditor.java:51 Checking node types: /content/ddc/lists/2010/08/davidgsoc2010_sugge/jcr:content/par/entry
01:29:16.032 INFO [main] ProgressNotificationEditor.java:51 Checking node types: /content/ddc/lists/2010/06/gsoc_report
01:29:16.165 INFO [main] ProgressNotificationEditor.java:51 Checking node types: /content/ddc/blog/2008/04/opensocialjcr
{noformat}
This provides some clue about progress; however, no such notification is emitted when editors are busy reindexing. Further, I would also like to emit the count. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
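A small sketch of the "also emit the count" idea from OAK-2664 above: a counter that logs the running total (plus the current path) every N visited nodes. `ProgressCounter` and its message format are illustrative, not the actual editor's API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical progress counter: emits a message every `interval` nodes,
// including the running count, in the spirit of ProgressNotificationEditor.
class ProgressCounter {
    private final int interval;
    private long count;
    private final List<String> messages = new ArrayList<>();

    ProgressCounter(int interval) {
        this.interval = interval;
    }

    // called once per visited node by the wrapping editor
    void onNode(String path) {
        count++;
        if (count % interval == 0) {
            messages.add("Checked " + count + " nodes, now at " + path);
        }
    }

    List<String> messages() {
        return messages;
    }
}
```

In the real editor these messages would go to the logger rather than a list; the list here just makes the behaviour easy to inspect.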
[jira] [Closed] (OAK-2649) IndexCopier might create empty files in case of error occurring while copying
[ https://issues.apache.org/jira/browse/OAK-2649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2649. - Bulk closing for 1.1.8 IndexCopier might create empty files in case of error occurring while copying Key: OAK-2649 URL: https://issues.apache.org/jira/browse/OAK-2649 Project: Jackrabbit Oak Issue Type: Bug Components: oak-lucene Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.1.8 On some setups the following logs are seen
{noformat}
error.log:12.03.2015 03:53:59.785 *WARN* [pool-5-thread-90] org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier Found local copy for _2uv.cfs in MMapDirectory@/mnt/installation/crx-quickstart/repository/index/e5a943cdec3000bd8ce54924fd2070ab5d1d35b9ecf530963a3583d43bf28293/1 lockFactory=NativeFSLockFactory@/mnt/installation/crx-quickstart/repository/index/e5a943cdec3000bd8ce54924fd2070ab5d1d35b9ecf530963a3583d43bf28293/1 but size of local 0 differs from remote 1070972. Content would be read from remote file only
error.log:12.03.2015 03:54:02.883 *WARN* [pool-5-thread-125] org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier Found local copy for _2rr.si in MMapDirectory@/mnt/installation/crx-quickstart/repository/index/43b36b107f8ce7e162c15b22508aa457ff6ae0083ed3e12d14a7dab67f886def/1 lockFactory=NativeFSLockFactory@/mnt/installation/crx-quickstart/repository/index/43b36b107f8ce7e162c15b22508aa457ff6ae0083ed3e12d14a7dab67f886def/1 but size of local 0 differs from remote 240. Content would be read from remote file only
error.log:12.03.2015 03:54:03.467 *WARN* [pool-5-thread-132] org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier Found local copy for _2ro_3.del in MMapDirectory@/mnt/installation/crx-quickstart/repository/index/43b36b107f8ce7e162c15b22508aa457ff6ae0083ed3e12d14a7dab67f886def/1 lockFactory=NativeFSLockFactory@/mnt/installation/crx-quickstart/repository/index/43b36b107f8ce7e162c15b22508aa457ff6ae0083ed3e12d14a7dab67f886def/1 but size of local 0 differs from remote 42. Content would be read from remote file only
error.log:12.03.2015 03:54:03.737 *WARN* [pool-5-thread-135] org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier Found local copy for _2rm_2.del in MMapDirectory@/mnt/installation/crx-quickstart/repository/index/43b36b107f8ce7e162c15b22508aa457ff6ae0083ed3e12d14a7dab67f886def/1 lockFactory=NativeFSLockFactory@/mnt/installation/crx-quickstart/repository/index/43b36b107f8ce7e162c15b22508aa457ff6ae0083ed3e12d14a7dab67f886def/1 but size of local 0 differs from remote 35. Content would be read from remote file only
{noformat}
They indicate that the copier has created files of size 0. Looking at the code flow, this can happen if an error occurs while the copy is in progress. {{org.apache.lucene.store.Directory#copy}} does take care of removing the file in case of error, but that is done only for IOException and not for other cases. As a fix the logic should ensure that the local file gets deleted if the copy was not successful -- This message was sent by Atlassian JIRA (v6.3.4#6332)
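The fix proposed in OAK-2649 above is a cleanup-on-any-failure pattern: a `finally` block covers RuntimeException and Error as well as IOException. A minimal sketch; the `Runnable` stands in for the actual `Directory#copy` call and `copyOrCleanUp` is a hypothetical helper, not the IndexCopier code:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Sketch of the fix: if the copy fails for ANY reason, remove the partially
// written local file so a zero-length leftover is never mistaken for a cache hit.
class SafeCopy {
    static void copyOrCleanUp(Path localFile, Runnable copy) throws IOException {
        boolean success = false;
        try {
            copy.run();                       // stands in for Directory#copy
            success = true;
        } finally {
            if (!success) {
                // unlike catching only IOException, this also covers
                // RuntimeException and Error
                Files.deleteIfExists(localFile);
            }
        }
    }

    // demo: a failing copy leaves no empty local file behind
    static boolean demo() {
        try {
            Path local = Files.createTempFile("_2uv", ".cfs");
            try {
                copyOrCleanUp(local, () -> { throw new RuntimeException("copy failed"); });
            } catch (RuntimeException expected) {
                // the simulated copy failure propagates after cleanup
            }
            return Files.notExists(local);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The key point is that the original exception still propagates; the cleanup only removes the 0-byte artifact that would otherwise shadow the remote file.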
[jira] [Closed] (OAK-2590) IndexCopier Error occurred while removing deleted files from Local
[ https://issues.apache.org/jira/browse/OAK-2590?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2590. - Bulk closing for 1.1.8 IndexCopier Error occurred while removing deleted files from Local -- Key: OAK-2590 URL: https://issues.apache.org/jira/browse/OAK-2590 Project: Jackrabbit Oak Issue Type: Improvement Components: oak-lucene Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.1.8 Under Windows, with copy-on-read mode enabled for the Lucene indexes, the following WARN logs were seen:
{code:java}
03.02.2015 17:40:58.952 *WARN* [pool-5-thread-2] org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier Error occurred while removing deleted files from Local MMapDirectory@D:\repository\index\e5a943cdec3000bd8ce54924fd2070ab5d1d35b9ecf530963a3583d43bf28293\1 lockFactory=NativeFSLockFactory@D:\repository\index\e5a943cdec3000bd8ce54924fd2070ab5d1d35b9ecf530963a3583d43bf28293\1, Remote OakDirectory@232fe713 lockFactory=org.apache.lucene.store.NoLockFactory@696097fc
java.io.IOException: Cannot delete D:\repository\index\e5a943cdec3000bd8ce54924fd2070ab5d1d35b9ecf530963a3583d43bf28293\1\_wq4.cfs
    at org.apache.lucene.store.FSDirectory.deleteFile(FSDirectory.java:273)
    at org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier$CopyOnReadDirectory.removeDeletedFiles(IndexCopier.java:274)
    at org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier$CopyOnReadDirectory.access$1000(IndexCopier.java:113)
    at org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopier$CopyOnReadDirectory$2.run(IndexCopier.java:247)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
    at java.lang.Thread.run(Unknown Source)
{code}
As the files are memory-mapped and the handle has not yet been GCed, it is possible that Windows will not allow deleting such a file. In such a case the log message should be improved, and there should possibly be a way to retry deleting such files later -- This message was sent by Atlassian JIRA (v6.3.4#6332)
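The "retry deleting such files later" idea from OAK-2590 above can be sketched as a pending-delete set that is retried on subsequent cleanup passes. `RetryingDeleter` is a hypothetical illustration, not the IndexCopier implementation; the demo uses a non-empty directory as a portable stand-in for a Windows-locked, memory-mapped file:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch: files that cannot be deleted right now are remembered and retried
// on the next cleanup pass instead of only producing a WARN log.
class RetryingDeleter {
    private final Set<Path> pending = ConcurrentHashMap.newKeySet();

    void delete(Path file) {
        if (!tryDelete(file)) {
            pending.add(file);          // keep it for a later attempt
        }
    }

    // called on subsequent cleanup runs; returns how many are still pending
    int retryPending() {
        pending.removeIf(this::tryDelete);
        return pending.size();
    }

    private boolean tryDelete(Path file) {
        try {
            Files.deleteIfExists(file);
            return true;
        } catch (Exception e) {
            return false;               // e.g. file still mapped; try again later
        }
    }

    // demo: a non-empty directory (stand-in for a locked file) cannot be
    // deleted now, but succeeds on a later pass once the blocker is gone
    static boolean demo() {
        try {
            Path dir = Files.createTempDirectory("index");
            Path file = Files.createFile(dir.resolve("_wq4.cfs"));
            RetryingDeleter deleter = new RetryingDeleter();
            deleter.delete(dir);                        // fails: directory not empty
            boolean remembered = deleter.retryPending() == 1;
            Files.delete(file);                         // blocker removed
            return remembered && deleter.retryPending() == 0;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A production variant would also cap retry attempts and log at a lower level while the file is still pending, so the WARN noise in the report above disappears.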
[jira] [Closed] (OAK-2605) Support for additional encodings needed in ReversedLinesFileReader
[ https://issues.apache.org/jira/browse/OAK-2605?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2605. - Bulk closing for 1.1.8 Support for additional encodings needed in ReversedLinesFileReader -- Key: OAK-2605 URL: https://issues.apache.org/jira/browse/OAK-2605 Project: Jackrabbit Oak Issue Type: Bug Components: segmentmk Affects Versions: 1.1.7 Environment: Windows 2012 R2 Japanese Windows 2012 R2 Korean Windows 2012 R2 Simplified Chinese Windows 2012 R2 Traditional Chinese Reporter: Leandro Reis Assignee: Michael Dürig Fix For: 1.1.8 Attachments: OAK-2605.patch I'm working on a product that uses Commons IO via Jackrabbit Oak. In the process of testing the launch of such product on Japanese Windows 2012 Server R2, I came across the following exception: (java.io.UnsupportedEncodingException: Encoding windows-31j is not supported yet (feel free to submit a patch)) windows-31j is the IANA name for Windows code page 932 (Japanese), and is returned by Charset.defaultCharset(), used in org.apache.commons.io.input.ReversedLinesFileReader [0]. A patch for this issue was provided in https://issues.apache.org/jira/browse/IO-471 . It also includes changes needed to support Chinese Simplified, Chinese Traditional and Korean. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2500) checkDeepHistory/fixDeepHistory/prepareDeepHistory for oak-mongo.js
[ https://issues.apache.org/jira/browse/OAK-2500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2500. - Bulk closing for 1.1.8 checkDeepHistory/fixDeepHistory/prepareDeepHistory for oak-mongo.js --- Key: OAK-2500 URL: https://issues.apache.org/jira/browse/OAK-2500 Project: Jackrabbit Oak Issue Type: Improvement Components: run Affects Versions: 1.0.8 Reporter: Stefan Egli Assignee: Marcel Reutegger Fix For: 1.1.8 Attachments: oak-mongo-mod.js The oak-mongo.js currently contains checkHistory/fixHistory which are cleaning up 'dangling revisions/split-documents' on a particular path. Now it would be good to have a command that goes through the entire repository and checks/fixes these dangling revisions. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2580) Metatype info for DocumentNodeStoreService
[ https://issues.apache.org/jira/browse/OAK-2580?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2580. - Bulk closing for 1.1.8 Metatype info for DocumentNodeStoreService -- Key: OAK-2580 URL: https://issues.apache.org/jira/browse/OAK-2580 Project: Jackrabbit Oak Issue Type: Sub-task Components: mongomk Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.1.8 Attachments: OAK-2580.patch, document-nodestore-config.png Sub task for adding metatype info for {{DocumentNodeStoreService}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2672) Possible null pointer dereferences in ExternalLoginModule
[ https://issues.apache.org/jira/browse/OAK-2672?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2672. - Bulk closing for 1.1.8 Possible null pointer dereferences in ExternalLoginModule --- Key: OAK-2672 URL: https://issues.apache.org/jira/browse/OAK-2672 Project: Jackrabbit Oak Issue Type: Bug Components: oak-auth-external Reporter: angela Assignee: angela Fix For: 1.1.8 Sonar reports the following possible null pointer dereferences in {{org.apache.jackrabbit.oak.spi.security.authentication.external.impl.ExternalLoginModule}}:
- line 187: {{sId = syncHandler.findIdentity(getUserManager(), userId);}}
- line 195: {{if (!sId.getExternalIdRef().getProviderName().equals(idp.getName()))}}
- line 197: {{log.debug("ignoring foreign identity: {} (idp={})", sId.getExternalIdRef().getString(), idp.getName());}}
-- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2585) Set pauseCompaction default to false
[ https://issues.apache.org/jira/browse/OAK-2585?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2585. - Bulk closing for 1.1.8 Set pauseCompaction default to false Key: OAK-2585 URL: https://issues.apache.org/jira/browse/OAK-2585 Project: Jackrabbit Oak Issue Type: Improvement Components: segmentmk Reporter: Michael Dürig Assignee: Michael Dürig Labels: compaction, gc Fix For: 1.1.8 As we start seeing good results with the current approach to compaction I'd like to have it running per default. This allows us to gather more information while we are running up towards the 1.2 release. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2420) DocumentNodeStore revision GC may lead to NPE
[ https://issues.apache.org/jira/browse/OAK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2420. - Bulk closing for 1.1.8 DocumentNodeStore revision GC may lead to NPE - Key: OAK-2420 URL: https://issues.apache.org/jira/browse/OAK-2420 Project: Jackrabbit Oak Issue Type: Bug Components: core Affects Versions: 1.0 Reporter: Marcel Reutegger Assignee: Marcel Reutegger Priority: Critical Fix For: 1.1.8 The DocumentNodeStore revision GC may cause a NPE in a reader thread when the GC deletes documents currently accessed by the reader. The {{docChildrenCache}} is invalidated in {{VersionGarbageCollector.collectDeletedDocuments()}} after documents are removed in the DocumentStore. The NPE may occur if removed documents are access in between. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2581) Metatype info for SegmentNodeStoreService
[ https://issues.apache.org/jira/browse/OAK-2581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2581. - Bulk closing for 1.1.8 Metatype info for SegmentNodeStoreService - Key: OAK-2581 URL: https://issues.apache.org/jira/browse/OAK-2581 Project: Jackrabbit Oak Issue Type: Sub-task Components: segmentmk Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.1.8 Attachments: OAK-2581-1.patch, OAK-2581.patch, segment-nodestore-config.png Sub task for adding metatype info for {{SegmentNodeStoreService}} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2399) Custom scorer for modifying score per documents
[ https://issues.apache.org/jira/browse/OAK-2399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2399. - Bulk closing for 1.1.8 Custom scorer for modifying score per documents --- Key: OAK-2399 URL: https://issues.apache.org/jira/browse/OAK-2399 Project: Jackrabbit Oak Issue Type: New Feature Components: oak-lucene Reporter: Rishabh Maurya Assignee: Thomas Mueller Fix For: 1.1.8, 1.2 Attachments: OAK-2399_1.patch, OAK-2399_scorer.patch We have search enhancement requests based on search result relevance, like: 1. boosting the score of recently modified documents. 2. boosting documents which are created/last updated by the current session user (or boosting on the basis of a specific field value). 3. boosting documents with a field value in a certain range. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2691) Blob GC throws NPE
[ https://issues.apache.org/jira/browse/OAK-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2691. - Bulk closing for 1.1.8 Blob GC throws NPE -- Key: OAK-2691 URL: https://issues.apache.org/jira/browse/OAK-2691 Project: Jackrabbit Oak Issue Type: Bug Components: blob Reporter: Amit Jain Assignee: Amit Jain Priority: Blocker Fix For: 1.1.8 Blob GC when registered without a shared data store throws NPE. The {{ClusterRepositoryInfo#getId}} method should check if clusterId is registered or not. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2694) Avoid unneeded unboxing in PropertiesUtil
[ https://issues.apache.org/jira/browse/OAK-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2694. - Bulk closing for 1.1.8 Avoid unneeded unboxing in PropertiesUtil - Key: OAK-2694 URL: https://issues.apache.org/jira/browse/OAK-2694 Project: Jackrabbit Oak Issue Type: Bug Components: commons Affects Versions: 1.1.7 Reporter: Robert Munteanu Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.1.8 Attachments: 0001-OAK-2694-Avoid-unneeded-unboxing-in-PropertiesUtil.patch, 0001-OAK-2694-Avoid-unneeded-unboxing-in-PropertiesUtil.patch PropertiesUtil uses the {{valueOf}} method to convert from String to int/long/boolean. Using the {{parseXXX}} variants means that no object creation + unboxing happens. (Boolean is a special case, but that should be avoided anyway.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
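The difference the patch exploits can be sketched as follows: `Integer.parseInt` returns a primitive, so no `Integer` object is allocated and then unboxed, unlike `Integer.valueOf` (helper name hypothetical):

```java
// Sketch of the parseXXX-over-valueOf pattern from OAK-2694.
public class ParseSketch {
    // Converts a String to an int without boxing; falls back to a default
    // on malformed input, as a conversion utility typically would.
    public static int toInt(String s, int defaultValue) {
        try {
            return Integer.parseInt(s);   // primitive result, no unboxing
        } catch (NumberFormatException e) {
            return defaultValue;
        }
    }
}
```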
[jira] [Closed] (OAK-2234) Support property existence query (for Lucene)
[ https://issues.apache.org/jira/browse/OAK-2234?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2234. - Bulk closing for 1.1.8 Support property existence query (for Lucene) - Key: OAK-2234 URL: https://issues.apache.org/jira/browse/OAK-2234 Project: Jackrabbit Oak Issue Type: Improvement Components: oak-lucene Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.1.8 Add support for property existence query like {noformat} select [jcr:path] from [nt:base] where propa is not null {noformat} The opposite, i.e. {{propa is null}}, cannot be supported though! With OAK-1208 the query creation logic explicitly ignores such queries, but with recent Lucene changes it appears to be possible (see LUCENE-995). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2588) MultiDocumentStoreTest.testInvalidateCache failing for Mongo
[ https://issues.apache.org/jira/browse/OAK-2588?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2588. - Bulk closing for 1.1.8 MultiDocumentStoreTest.testInvalidateCache failing for Mongo Key: OAK-2588 URL: https://issues.apache.org/jira/browse/OAK-2588 Project: Jackrabbit Oak Issue Type: Bug Components: mongomk Reporter: Chetan Mehrotra Assignee: Julian Reschke Priority: Minor Fix For: 1.1.8 {{MultiDocumentStoreTest.testInvalidateCache}} failing for Mongo {noformat} Tests run: 6, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 1.255 sec FAILURE! testInvalidateCache[0](org.apache.jackrabbit.oak.plugins.document.MultiDocumentStoreTest) Time elapsed: 0.343 sec FAILURE! java.lang.AssertionError: modcount should have incremented again expected:<3> but was:<2> at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.failNotEquals(Assert.java:647) at org.junit.Assert.assertEquals(Assert.java:128) at org.junit.Assert.assertEquals(Assert.java:472) at org.apache.jackrabbit.oak.plugins.document.MultiDocumentStoreTest.testInvalidateCache(MultiDocumentStoreTest.java:184) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2692) Add description annotation to RepositoryManagementMBean#startDataStoreGC
[ https://issues.apache.org/jira/browse/OAK-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2692. - Bulk closing for 1.1.8 Add description annotation to RepositoryManagementMBean#startDataStoreGC - Key: OAK-2692 URL: https://issues.apache.org/jira/browse/OAK-2692 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.1.8 Attachments: OAK-2692.patch, gc-mbean-op.png Currently the {{RepositoryManagementMBean#startDataStoreGC}} takes a boolean parameter to indicate a markOnly invocation. However, looking at the JMX operation in a GUI it's not possible to determine what this parameter is to be used for. To avoid confusion it would be better to add {{Description}} and {{Name}} annotations to the operation -- This message was sent by Atlassian JIRA (v6.3.4#6332)
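A minimal sketch of what annotation-driven operation metadata looks like; the annotation types below are hypothetical stand-ins declared locally, since the actual Oak annotation names and packages may differ:

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;

// Sketch of @Description/@Name on an MBean operation (OAK-2692).
// The annotations are declared here only to keep the example self-contained.
public class MBeanDescSketch {
    @Retention(RetentionPolicy.RUNTIME)
    public @interface Description { String value(); }

    @Retention(RetentionPolicy.RUNTIME)
    public @interface Name { String value(); }

    public interface RepositoryManagementMBean {
        // A JMX console can read these at runtime and show the operation's
        // purpose and the parameter's meaning instead of "p1: boolean".
        @Description("Initiates a data store garbage collection")
        String startDataStoreGC(@Name("markOnly") boolean markOnly);
    }
}
```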
[jira] [Closed] (OAK-1641) Mongo: Un-/CheckedExecutionException on replica-primary crash
[ https://issues.apache.org/jira/browse/OAK-1641?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-1641. - Bulk closing for 1.1.8 Mongo: Un-/CheckedExecutionException on replica-primary crash - Key: OAK-1641 URL: https://issues.apache.org/jira/browse/OAK-1641 Project: Jackrabbit Oak Issue Type: Bug Components: mongomk Affects Versions: 0.19 Environment: 0.20-SNAPSHOT as of March 28, 2014 Reporter: Stefan Egli Assignee: Marcel Reutegger Fix For: 1.1.8 Attachments: ReplicaCrashResilienceTest.java, ReplicaCrashResilienceTest.java, mongoUrl_fixture_patch_oak1641.diff Testing with a mongo replicaSet setup: 1 primary, 1 secondary and 1 secondary-arbiter-only. Running a simple test which has 2 threads: a writer thread and a reader thread. The following exception occurs when crashing mongo primary {code} com.google.common.util.concurrent.UncheckedExecutionException: com.google.common.util.concurrent.UncheckedExecutionException: com.mongodb.MongoException$Network: Read operation to server localhost/127.0.0.1:12322 failed on database resilienceTest at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2199) at com.google.common.cache.LocalCache.get(LocalCache.java:3932) at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4721) at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.getNode(DocumentNodeStore.java:593) at org.apache.jackrabbit.oak.plugins.document.DocumentNodeState.hasChildNode(DocumentNodeState.java:164) at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.hasChildNode(MemoryNodeBuilder.java:301) at org.apache.jackrabbit.oak.core.SecureNodeBuilder.hasChildNode(SecureNodeBuilder.java:299) at org.apache.jackrabbit.oak.plugins.tree.AbstractTree.hasChild(AbstractTree.java:267) at org.apache.jackrabbit.oak.core.MutableTree.getChild(MutableTree.java:147) at org.apache.jackrabbit.oak.util.TreeUtil.getTree(TreeUtil.java:171) at 
org.apache.jackrabbit.oak.jcr.delegate.NodeDelegate.getTree(NodeDelegate.java:865) at org.apache.jackrabbit.oak.jcr.delegate.NodeDelegate.getChild(NodeDelegate.java:339) at org.apache.jackrabbit.oak.jcr.session.NodeImpl$5.perform(NodeImpl.java:274) at org.apache.jackrabbit.oak.jcr.session.NodeImpl$5.perform(NodeImpl.java:1) at org.apache.jackrabbit.oak.jcr.delegate.SessionDelegate.perform(SessionDelegate.java:308) at org.apache.jackrabbit.oak.jcr.session.ItemImpl.perform(ItemImpl.java:113) at org.apache.jackrabbit.oak.jcr.session.NodeImpl.addNode(NodeImpl.java:253) at org.apache.jackrabbit.oak.jcr.session.NodeImpl.addNode(NodeImpl.java:238) at org.apache.jackrabbit.oak.run.ReplicaCrashResilienceTest$1.run(ReplicaCrashResilienceTest.java:103) at java.lang.Thread.run(Thread.java:695) Caused by: com.google.common.util.concurrent.UncheckedExecutionException: com.mongodb.MongoException$Network: Read operation to server localhost/127.0.0.1:12322 failed on database resilienceTest at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2199) at com.google.common.cache.LocalCache.get(LocalCache.java:3932) at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:4721) at org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.find(MongoDocumentStore.java:267) at org.apache.jackrabbit.oak.plugins.document.mongo.MongoDocumentStore.find(MongoDocumentStore.java:234) at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore.readNode(DocumentNodeStore.java:802) at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$3.call(DocumentNodeStore.java:596) at org.apache.jackrabbit.oak.plugins.document.DocumentNodeStore$3.call(DocumentNodeStore.java:1) at com.google.common.cache.LocalCache$LocalManualCache$1.load(LocalCache.java:4724) at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3522) at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2315) at 
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2278) at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2193) ... 19 more Caused by: com.mongodb.MongoException$Network: Read operation to server localhost/127.0.0.1:12322 failed on database resilienceTest at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:253) at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:264) at com.mongodb.DBTCPConnector.innerCall(DBTCPConnector.java:264) at
[jira] [Closed] (OAK-2301) QueryEngine should not tokenize fulltext expression by default
[ https://issues.apache.org/jira/browse/OAK-2301?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2301. - Bulk closing for 1.1.8 QueryEngine should not tokenize fulltext expression by default -- Key: OAK-2301 URL: https://issues.apache.org/jira/browse/OAK-2301 Project: Jackrabbit Oak Issue Type: Bug Components: query Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Fix For: 1.1.8 Attachments: OAK-2301-b.patch, OAK-2301.patch QueryEngine currently parses the fulltext expression on its own. This can cause issues with index implementations like Lucene which use different analysis logic. For fulltext search to work properly it should be possible for LuceneIndex to get access to the non-tokenized text. For more details refer to http://markmail.org/thread/syoha44std3fm4j2 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2693) Retire oak-mk-remote
[ https://issues.apache.org/jira/browse/OAK-2693?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2693. - Bulk closing for 1.1.8 Retire oak-mk-remote Key: OAK-2693 URL: https://issues.apache.org/jira/browse/OAK-2693 Project: Jackrabbit Oak Issue Type: Task Reporter: angela Assignee: angela Fix For: 1.1.8, 1.2 see http://markmail.org/message/z536la5eenh3xhve -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2653) Deprecate ordered index
[ https://issues.apache.org/jira/browse/OAK-2653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2653. - Bulk closing for 1.1.8 Deprecate ordered index --- Key: OAK-2653 URL: https://issues.apache.org/jira/browse/OAK-2653 Project: Jackrabbit Oak Issue Type: Task Components: doc Reporter: Davide Giannella Assignee: Davide Giannella Priority: Minor Fix For: 1.1.8 As the Lucene property index has proved to be more scalable than the ordered index, we should deprecate usage of the latter. Update the website providing - information that it is deprecated - advice to use the Lucene property index instead - an example of migration *optional* - change the ordered index implementation to provide a WARN in the logs, once per JVM, about the deprecation. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
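The optional once-per-JVM WARN could be sketched like this (class and field names hypothetical, logger omitted for self-containment):

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Sketch of the "WARN once per JVM" deprecation notice from OAK-2653.
public class DeprecationWarning {
    private static final AtomicBoolean WARNED = new AtomicBoolean();

    // Returns true exactly once per JVM; the first caller would emit
    // the deprecation WARN, all later callers stay silent.
    public static boolean shouldWarn() {
        return WARNED.compareAndSet(false, true);
    }
}
```

The `AtomicBoolean` makes the check thread-safe without synchronization, which matters since index lookups can race.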
[jira] [Closed] (OAK-2625) Copy Jackrabbit 2 S3 related classes
[ https://issues.apache.org/jira/browse/OAK-2625?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2625. - Bulk closing for 1.1.8 Copy Jackrabbit 2 S3 related classes Key: OAK-2625 URL: https://issues.apache.org/jira/browse/OAK-2625 Project: Jackrabbit Oak Issue Type: Task Components: blob Reporter: Amit Jain Assignee: Amit Jain Fix For: 1.1.8 As discussed in http://markmail.org/thread/wuccswpehsybat4v it'll be good to have the S3 related classes in Oak itself to make it easier to make improvements. The classes will be moved to a new module oak-blob-cloud. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-1941) RDB: decide on table layout
[ https://issues.apache.org/jira/browse/OAK-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-1941. - Bulk closing for 1.1.8 RDB: decide on table layout --- Key: OAK-1941 URL: https://issues.apache.org/jira/browse/OAK-1941 Project: Jackrabbit Oak Issue Type: Sub-task Components: rdbmk Reporter: Julian Reschke Assignee: Julian Reschke Fix For: 1.1.8 Attachments: OAK-1941-cmodcount.diff, utf8measure.diff, with-modified-index.diff, with-modified-index.diff The current approach is to serialize the Document using JSON, and then to store either (a) the full JSON in a VARCHAR column, or, if that column isn't wide enough, (b) to store it in a BLOB (optionally gzipped). For debugging purposes, the inline VARCHAR always gets populated with the start of the JSON serialization. However, with Oracle we are limited to 4000 bytes (which may be far fewer characters due to non-ASCII overhead), so many document instances will use what was initially thought to be the exception case. Questions: 1) Do we stick with JSON or do we attempt a different serialization? It might make sense both wrt length and performance. There might also be some code to borrow from the off-heap serialization code. 2) Do we get rid of the dual strategy, and just always use the BLOB? The indirection might make things more expensive, but then the total column width would drop considerably. -- How can we do good benchmarks on this? (This all assumes that we stick with a model where all code is the same between database types, except for the DDL statements; of course it's also conceivable to add more vendor-specific special cases into the Java code) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2494) Shared DataStore GC support for S3DataStore
[ https://issues.apache.org/jira/browse/OAK-2494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2494. - Bulk closing for 1.1.8 Shared DataStore GC support for S3DataStore --- Key: OAK-2494 URL: https://issues.apache.org/jira/browse/OAK-2494 Project: Jackrabbit Oak Issue Type: Sub-task Components: core Reporter: Amit Jain Assignee: Amit Jain Fix For: 1.1.8 Attachments: OAK-2494.patch -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2262) Add metadata about the changed value to a PROPERTY_CHANGED event on a multivalued property
[ https://issues.apache.org/jira/browse/OAK-2262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2262. - Bulk closing for 1.1.8 Add metadata about the changed value to a PROPERTY_CHANGED event on a multivalued property -- Key: OAK-2262 URL: https://issues.apache.org/jira/browse/OAK-2262 Project: Jackrabbit Oak Issue Type: Improvement Components: core, jcr Affects Versions: 1.1.2 Reporter: Tommaso Teofili Assignee: Michael Dürig Labels: observation Fix For: 1.1.8 When getting _PROPERTY_CHANGED_ events on non-multivalued properties, only one value can actually have changed, so handlers of such events do not need any further information to process it and eventually work on the changed value. On the other hand, _PROPERTY_CHANGED_ events on multivalued properties (e.g. String[]) may relate to any of the values, which brings uncertainty to event handlers processing such changes because there is no means of understanding which property value was changed, and therefore no way for them to react accordingly. A workaround is to create Oak-specific _Observers_ which can deal with the diff between the before and after state and create a specific event containing the diff; however this would add a non-trivial load to the repository because of the _Observer_ itself and because of the additional events being generated. It would be great if the 'default' events had metadata, e.g. the index of the changed value or similar information, that helps understand which value has been changed (added, deleted, updated). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-1666) FileDataStore inUse map causes contention in concurrent env
[ https://issues.apache.org/jira/browse/OAK-1666?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-1666. - Bulk closing for 1.1.8 FileDataStore inUse map causes contention in concurrent env --- Key: OAK-1666 URL: https://issues.apache.org/jira/browse/OAK-1666 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Labels: concurrency Fix For: 1.1.8 JR2 FileDataStore#inUseMap [1] is currently a synchronized map and that at times causes contention in a concurrent env. This map is used for supporting the Blob GC logic for JR2. With Oak this map's content is not used. As a fix we can either # Set inUseMap to a Guava cache map which has weak keys and values # Set inUseMap to a no-op map where all put calls are ignored # Modify FDS to disable use of inUseMap or make {{usesIdentifier}} protected #3 would be a proper fix and #2 can be used as a temp workaround until FDS gets fixed [1] https://github.com/apache/jackrabbit/blob/trunk/jackrabbit-data/src/main/java/org/apache/jackrabbit/core/data/FileDataStore.java#L118 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
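Option #2, the no-op map, could be sketched as below, assuming (as the issue states) the map's contents are never read under Oak:

```java
import java.util.AbstractMap;
import java.util.Collections;
import java.util.Set;

// Sketch of a no-op Map that silently drops puts (option #2 in OAK-1666),
// removing the synchronized-map contention on the unused inUse map.
public class NoOpMap<K, V> extends AbstractMap<K, V> {
    @Override
    public V put(K key, V value) {
        return null;   // ignore writes entirely; no lock, no storage
    }

    @Override
    public Set<Entry<K, V>> entrySet() {
        return Collections.<Entry<K, V>>emptySet();   // always empty
    }
}
```

`AbstractMap` derives `get`, `size`, `isEmpty`, etc. from `entrySet()`, so the sketch stays tiny while remaining a valid `Map`.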
[jira] [Closed] (OAK-2680) Report a full observation queue situation to the logfile
[ https://issues.apache.org/jira/browse/OAK-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2680. - Bulk closing for 1.1.8 Report a full observation queue situation to the logfile Key: OAK-2680 URL: https://issues.apache.org/jira/browse/OAK-2680 Project: Jackrabbit Oak Issue Type: Improvement Components: oak-core Affects Versions: 1.1.7 Reporter: Marc Pfaff Assignee: Michael Dürig Priority: Minor Fix For: 1.1.8 Attachments: OAK-2680.patch This is an improvement request for having an explicit warning in the log file when the BackgroundObserver's queue maximum is reached. Currently, in that case, a warning is logged from the ChangeProcessor observer only. But as each observer has its own queue, a warning from a more central place covering all observers would be helpful. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
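A central queue-full signal might look like the sketch below; `offer()` returning false is the condition a shared wrapper could log a WARN on (all names hypothetical, logging replaced by a counter to keep it self-contained):

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of a per-observer queue whose overflow is detected centrally.
public class ObserverQueueSketch<T> {
    private final BlockingQueue<T> queue;
    private final AtomicInteger overflowCount = new AtomicInteger();

    public ObserverQueueSketch(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    // Returns false when the queue has hit its maximum; this is the spot
    // where a real implementation would emit the requested WARN.
    public boolean add(T item) {
        if (queue.offer(item)) {
            return true;
        }
        overflowCount.incrementAndGet();
        return false;
    }

    public int overflows() {
        return overflowCount.get();
    }
}
```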
[jira] [Closed] (OAK-2596) more (jmx) instrumentation for observation queue
[ https://issues.apache.org/jira/browse/OAK-2596?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2596. - Bulk closing for 1.1.8 more (jmx) instrumentation for observation queue Key: OAK-2596 URL: https://issues.apache.org/jira/browse/OAK-2596 Project: Jackrabbit Oak Issue Type: Improvement Components: core Affects Versions: 1.0.12 Reporter: Stefan Egli Assignee: Michael Dürig Priority: Blocker Labels: monitoring, observation Fix For: 1.1.8 While debugging issues with the observation queue it would be handy to have more detailed information available. At the moment you can only see one value wrt length of the queue: that is the maximum of all queues. It is unclear if the queue is that long for only one or many listeners. And it is unclear from that if the listener is slow or the engine that produces the events for the listener. So I'd suggest to add the following details - possibly exposed via JMX? : # add queue length details to each of the observation listeners # have a history of the last, eg 1000 events per listener showing a) how long the event took to be created/generated and b) how long the listener took to process. Sometimes averages are not detailed enough so such in-depth information might become useful. (Not sure about the feasibility of '1000' here - maybe that could be configurable though - just putting the idea out here). # have some information about whether a listener is currently 'reading events from the cache' or whether it has to go to eg mongo # maybe have a 'top 10' of listeners that have the largest queue at the moment to easily allow navigation instead of having to go through all (eg 200) listeners manually each time. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2597) expose mongo's clusterNodes info more prominently
[ https://issues.apache.org/jira/browse/OAK-2597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2597. - Bulk closing for 1.1.8 expose mongo's clusterNodes info more prominently - Key: OAK-2597 URL: https://issues.apache.org/jira/browse/OAK-2597 Project: Jackrabbit Oak Issue Type: Improvement Components: mongomk Affects Versions: 1.0.12 Reporter: Stefan Egli Assignee: Marcel Reutegger Fix For: 1.1.8 Suggestion: {{db.clusterNodes}} contains very useful information wrt how many instances are currently (and have been) active in the oak-mongo-cluster. While this should in theory match the topology reported via sling's discovery api, it might differ. It could be very helpful if this information was exposed very prominently in a UI (assuming this is not yet the case) - eg in a /system/console page -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2633) Log document as debug message on conflict
[ https://issues.apache.org/jira/browse/OAK-2633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2633. - Bulk closing for 1.1.8 Log document as debug message on conflict - Key: OAK-2633 URL: https://issues.apache.org/jira/browse/OAK-2633 Project: Jackrabbit Oak Issue Type: Improvement Components: core, mongomk Reporter: Marcel Reutegger Assignee: Marcel Reutegger Priority: Minor Fix For: 1.1.8 The implementation currently appends the complete document to the exception message when a conflict occurs on commit. It would be better to only log the document details on DEBUG. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
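The suggested fix boils down to deferring the expensive document dump behind a DEBUG check; a simplified sketch follows, with the logger reduced to a boolean flag and all names hypothetical:

```java
import java.util.function.Supplier;

// Sketch of OAK-2633: keep the conflict exception message short and build
// the costly document serialization only when DEBUG logging is on.
public class ConflictLogSketch {
    public static String conflictMessage(boolean debugEnabled, String revision,
                                         Supplier<String> documentDump) {
        String msg = "Conflicting concurrent change on revision " + revision;
        // Serializing the full document is expensive; the Supplier defers
        // that work until we know the details will actually be logged.
        return debugEnabled ? msg + "\n" + documentDump.get() : msg;
    }
}
```

With SLF4J the same effect is usually achieved via `log.isDebugEnabled()` around the detailed `log.debug(...)` call.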
[jira] [Closed] (OAK-2632) Upgrade Jackrabbit dependency to 2.10.0
[ https://issues.apache.org/jira/browse/OAK-2632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2632. - Bulk closing for 1.1.8 Upgrade Jackrabbit dependency to 2.10.0 --- Key: OAK-2632 URL: https://issues.apache.org/jira/browse/OAK-2632 Project: Jackrabbit Oak Issue Type: Task Reporter: Michael Dürig Assignee: Marcel Reutegger Priority: Blocker Fix For: 1.1.8 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2413) Clarify Editor.childNodeChanged()
[ https://issues.apache.org/jira/browse/OAK-2413?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2413. - Bulk closing for 1.1.8 Clarify Editor.childNodeChanged() - Key: OAK-2413 URL: https://issues.apache.org/jira/browse/OAK-2413 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Marcel Reutegger Assignee: angela Priority: Minor Fix For: 1.1.8 The current contract for {{Editor.childNodeChanged()}} does not specify if this method may also be called when the child node did not actually change. The method {{NodeStateDiff.childNodeChanged()}} explicitly states that there may be such calls. Looking at the implementation connecting the two classes, {{EditorDiff.childNodeChange()}} simply calls the editor without checking whether the child node did in fact change. I think we either have to change the {{EditorDiff}} or update the contract for the Editor and adjust implementations. E.g. right now, PrivilegeValidator (implements Editor), assumes a call to {{childNodeChange()}} indeed means the child node changed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-1849) DataStore GC support for heterogeneous deployments using a shared datastore
[ https://issues.apache.org/jira/browse/OAK-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-1849. - Bulk closing for 1.1.8 DataStore GC support for heterogeneous deployments using a shared datastore --- Key: OAK-1849 URL: https://issues.apache.org/jira/browse/OAK-1849 Project: Jackrabbit Oak Issue Type: New Feature Components: blob Reporter: Amit Jain Assignee: Amit Jain Fix For: 1.1.8 Attachments: OAK-1849-PART-MBEAN.patch, OAK-1849-PART-TEST.patch, OAK-1849-PART1.patch, OAK-1849-v2.patch, OAK-1849.patch If the deployment is such that there are 2 or more different instances with a shared datastore, triggering Datastore GC from one instance will result in blobs used by another instance getting deleted, causing data loss. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2612) Findbugs plugin version should depend on JDK version
[ https://issues.apache.org/jira/browse/OAK-2612?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2612. - Bulk closing for 1.1.8 Findbugs plugin version should depend on JDK version Key: OAK-2612 URL: https://issues.apache.org/jira/browse/OAK-2612 Project: Jackrabbit Oak Issue Type: Bug Components: parent Affects Versions: 1.1.7 Reporter: Tommaso Teofili Assignee: Tommaso Teofili Fix For: 1.1.8 Looking at the current CI builds, the JDK 8 one is always failing because _maven-findbugs-plugin-2.5.3_ doesn't work with it; on the other hand, upgrading to 3.0.0 would make the JDK 6 build fail, therefore the version used has to depend on the JDK in use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2640) export org.apache.jackrabbit.oak.plugins.atomic
[ https://issues.apache.org/jira/browse/OAK-2640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2640. - Bulk closing for 1.1.8 export org.apache.jackrabbit.oak.plugins.atomic --- Key: OAK-2640 URL: https://issues.apache.org/jira/browse/OAK-2640 Project: Jackrabbit Oak Issue Type: Bug Components: core Affects Versions: 1.1.6, 1.1.7 Reporter: Davide Giannella Assignee: Davide Giannella Priority: Blocker Fix For: 1.1.8 The package {{org.apache.jackrabbit.oak.plugins.atomic}} is not exported for OSGi environments. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2574) Update mongo-java-driver to 2.13.0
[ https://issues.apache.org/jira/browse/OAK-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2574. - Bulk closing for 1.1.8 Update mongo-java-driver to 2.13.0 -- Key: OAK-2574 URL: https://issues.apache.org/jira/browse/OAK-2574 Project: Jackrabbit Oak Issue Type: Improvement Components: core, mongomk Reporter: Marcel Reutegger Assignee: Marcel Reutegger Fix For: 1.1.8 MongoDB 3.0 was released yesterday and the current java driver (2.12.2) used by Oak is marked as untested with 3.0: http://docs.mongodb.org/ecosystem/drivers/java/#compatibility We should update to 2.13.0, which is compatible with 3.0. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2611) Lucene suggester should only be updated if the index is used for suggestions
[ https://issues.apache.org/jira/browse/OAK-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2611. - Bulk closing for 1.1.8 Lucene suggester should only be updated if the index is used for suggestions Key: OAK-2611 URL: https://issues.apache.org/jira/browse/OAK-2611 Project: Jackrabbit Oak Issue Type: Bug Components: oak-lucene Affects Versions: 1.1.7 Reporter: Tommaso Teofili Assignee: Tommaso Teofili Fix For: 1.1.8 As mentioned on [oak-dev@|http://markmail.org/message/vp6qg7v5j3oxev73] Lucene suggester should be updated only in indexes that are used for suggestions (_useForSuggestions = true_ in index definition). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-2721) LogDumper rule to dump logs as part of system out in case of test failure
Chetan Mehrotra created OAK-2721: Summary: LogDumper rule to dump logs as part of system out in case of test failure Key: OAK-2721 URL: https://issues.apache.org/jira/browse/OAK-2721 Project: Jackrabbit Oak Issue Type: New Feature Components: it Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Fix For: 1.2 I would like to add a JUnit rule which would dump the logs generated during the execution of a given test in case of a failure. That should help in troubleshooting test failures on CI. The impl is modelled on a similar impl, but for remote logs, done in SLING-4280. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-2721) LogDumper rule to dump logs as part of system out in case of test failure
[ https://issues.apache.org/jira/browse/OAK-2721?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Mehrotra resolved OAK-2721. -- Resolution: Fixed Done * trunk - http://svn.apache.org/r1671773 With this the rule can be used as below {code} import org.junit.Rule; import org.apache.jackrabbit.oak.commons.junit.LogDumper; public class PropertyIndexTest { @Rule public final LogDumper dumper = new LogDumper(); } {code} Upon any test failure the output would be like below {noformat} === Logs for [org.apache.jackrabbit.oak.plugins.index.property.PropertyIndexTest#traversalWarning]=== java.lang.AssertionError at org.junit.Assert.fail(Assert.java:92) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertTrue(Assert.java:54) at org.apache.jackrabbit.oak.plugins.index.property.PropertyIndexTest.traversalWarning(PropertyIndexTest.java:545) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at org.junit.runners.model.FrameworkMethod$1.runReflectiveCall(FrameworkMethod.java:45) at org.junit.internal.runners.model.ReflectiveCallable.run(ReflectiveCallable.java:15) at org.junit.runners.model.FrameworkMethod.invokeExplosively(FrameworkMethod.java:42) at org.junit.internal.runners.statements.InvokeMethod.evaluate(InvokeMethod.java:20) at org.junit.rules.TestWatcher$1.evaluate(TestWatcher.java:47) at org.junit.rules.RunRules.evaluate(RunRules.java:18) at org.junit.runners.ParentRunner.runLeaf(ParentRunner.java:263) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:68) at org.junit.runners.BlockJUnit4ClassRunner.runChild(BlockJUnit4ClassRunner.java:47) at org.junit.runners.ParentRunner$3.run(ParentRunner.java:231) at org.junit.runners.ParentRunner$1.schedule(ParentRunner.java:60) at org.junit.runners.ParentRunner.runChildren(ParentRunner.java:229) 
at org.junit.runners.ParentRunner.access$000(ParentRunner.java:50) at org.junit.runners.ParentRunner$2.evaluate(ParentRunner.java:222) at org.junit.runners.ParentRunner.run(ParentRunner.java:300) at org.junit.runner.JUnitCore.run(JUnitCore.java:157) at com.intellij.junit4.JUnit4IdeaTestRunner.startRunnerWithArgs(JUnit4IdeaTestRunner.java:78) at com.intellij.rt.execution.junit.JUnitStarter.prepareStreamsAndStart(JUnitStarter.java:212) at com.intellij.rt.execution.junit.JUnitStarter.main(JUnitStarter.java:68) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at com.intellij.rt.execution.application.AppMain.main(AppMain.java:140) 07.04.2015 12:47:27.972 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing will be performed for following indexes: [/oak:index/nodetype, /oak:index/foo, /oak:index/uuid] 07.04.2015 12:47:28.158 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #1 /n2992/c0/c1 07.04.2015 12:47:28.279 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #2 /n353/c0/c1/c2/c3/c4/c5 07.04.2015 12:47:28.383 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #3 /n4769/c0/c1/c2/c3/c4/c5/c6/c7/c8/c9 07.04.2015 12:47:28.678 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #4 /n2502/c0/c1 07.04.2015 12:47:28.777 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #5 /n2865/c0/c1/c2/c3/c4/c5 07.04.2015 12:47:28.863 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #6 /n2234/c0/c1/c2/c3/c4/c5/c6/c7/c8/c9 07.04.2015 12:47:28.906 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #7 /n7800/c0/c1 07.04.2015 12:47:28.951 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #8 
/n5096/c0/c1/c2/c3/c4/c5 07.04.2015 12:47:28.987 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #9 /n5328/c0/c1/c2/c3/c4/c5/c6/c7/c8/c9 07.04.2015 12:47:29.038 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #10 /n9434/c0/c1/c2/c3/c4/c5/c6/c7 07.04.2015 12:47:29.106 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #11 /n2036 07.04.2015 12:47:29.173 *INFO* [main] org.apache.jackrabbit.oak.plugins.index.IndexUpdate Reindexing Traversed #12 /n6142/c0/c1/c2/c3 07.04.2015
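The rule above buffers log output during a test and prints it only when the test fails. A stripped-down, framework-free sketch of that idea (class and method names here are hypothetical, not the actual oak-commons API) could look like:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical simplified sketch of a LogDumper-style collector:
// buffer log lines during a test and emit them only on failure.
public class SimpleLogDumper {
    private final List<String> lines = new ArrayList<>();

    public void record(String line) {
        lines.add(line);
    }

    // Returns the formatted dump on failure, or null when the test passed
    // (in which case the buffered lines are simply discarded).
    public String dumpIfFailed(boolean testFailed, String testName) {
        if (!testFailed) {
            return null;
        }
        StringBuilder sb = new StringBuilder();
        sb.append("=== Logs for [").append(testName).append("]===\n");
        for (String line : lines) {
            sb.append(line).append('\n');
        }
        return sb.toString();
    }
}
```

The real rule hooks this behaviour into JUnit via a `TestWatcher`-style callback, so the dump lands in system out next to the failure trace.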
[jira] [Created] (OAK-2722) IndexCopier fails to delete older index directory upon reindex
Chetan Mehrotra created OAK-2722: Summary: IndexCopier fails to delete older index directory upon reindex Key: OAK-2722 URL: https://issues.apache.org/jira/browse/OAK-2722 Project: Jackrabbit Oak Issue Type: Bug Components: oak-lucene Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.3.0 {{IndexCopier}} tries to remove the older index directory in case of reindex. This might fail on platforms like Windows if the files are still memory mapped or are locked. For deleting directories we would need to take a similar approach to the one used for deleting old index files, i.e. retry later. Due to this the following test fails on Windows (per [~julian.resc...@gmx.de]) {noformat} Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.07 sec FAILURE! deleteOldPostReindex(org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest) Time elapsed: 0.02 sec FAILURE! java.lang.AssertionError: Old index directory should have been removed at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertFalse(Assert.java:68) at org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest.deleteOldPostReindex(IndexCopierTest.java:160) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
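The "retry later" approach described above can be sketched as follows. This is a minimal illustration, not the actual IndexCopier code, and the class name is made up: deletions that fail (e.g. because Windows still memory-maps the files) are remembered and retried on a later pass.

```java
import java.io.File;
import java.util.Set;
import java.util.concurrent.CopyOnWriteArraySet;

// Minimal sketch: try to delete an index directory now; if the OS still
// holds locks (common on Windows with memory-mapped files), remember the
// directory and retry on a later invocation instead of failing.
public class RetryingDeleter {
    private final Set<File> pending = new CopyOnWriteArraySet<>();

    public void delete(File dir) {
        if (!deleteRecursive(dir)) {
            pending.add(dir);   // could not delete now, retry later
        }
    }

    // Called periodically, or on the next copy/reindex cycle.
    public void retryPending() {
        for (File dir : pending) {
            if (deleteRecursive(dir)) {
                pending.remove(dir);
            }
        }
    }

    public int pendingCount() {
        return pending.size();
    }

    private static boolean deleteRecursive(File f) {
        File[] children = f.listFiles();
        if (children != null) {
            for (File c : children) {
                deleteRecursive(c);
            }
        }
        // success if the path is already gone or the delete works now
        return !f.exists() || f.delete();
    }
}
```

The same pattern is what the issue refers to for old index *files*: failures are tolerated and cleanup is simply attempted again later.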
[jira] [Updated] (OAK-2224) Increase the threshold for warning in PathIterator
[ https://issues.apache.org/jira/browse/OAK-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Mehrotra updated OAK-2224: - Fix Version/s: 1.0.13 Increase the threshold for warning in PathIterator -- Key: OAK-2224 URL: https://issues.apache.org/jira/browse/OAK-2224 Project: Jackrabbit Oak Issue Type: Bug Components: core, query Reporter: Davide Giannella Assignee: Davide Giannella Priority: Trivial Fix For: 1.1.2, 1.0.13 Increase the threshold for tracking the warning on traversing of the PathIterator from 1,000 to 10,000. Discussion in: http://markmail.org/thread/apmrq45m65n6wuwo -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2224) Increase the threshold for warning in PathIterator
[ https://issues.apache.org/jira/browse/OAK-2224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482974#comment-14482974 ] Chetan Mehrotra commented on OAK-2224: -- Merged to 1.0 with http://svn.apache.org/r1671793 Increase the threshold for warning in PathIterator -- Key: OAK-2224 URL: https://issues.apache.org/jira/browse/OAK-2224 Project: Jackrabbit Oak Issue Type: Bug Components: core, query Reporter: Davide Giannella Assignee: Davide Giannella Priority: Trivial Fix For: 1.1.2, 1.0.13 Increase the threshold for tracking the warning on traversing of the PathIterator from 1,000 to 10,000. Discussion in: http://markmail.org/thread/apmrq45m65n6wuwo -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-2720) Misleading traversal warning message while performing query
[ https://issues.apache.org/jira/browse/OAK-2720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Mehrotra resolved OAK-2720. -- Resolution: Fixed Fix Version/s: 1.0.13 Done and also merged to 1.0 branch * trunk - http://svn.apache.org/r1671787 * 1.0 - http://svn.apache.org/r1671793 Also updated the log message to include the traversal count within index nodes bq. Traversed 1 nodes (110001 index entries) using index foo with filter Filter(query=SELECT * FROM [nt:base], path=*) Also merged OAK-2224 to 1.0 to increase the threshold to 1 to reduce the noise Misleading traversal warning message while performing query --- Key: OAK-2720 URL: https://issues.apache.org/jira/browse/OAK-2720 Project: Jackrabbit Oak Issue Type: Bug Components: query Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Fix For: 1.0.13, 1.2 Attachments: OAK-2720.patch Currently {{ContentMirrorStoreStrategy}} logs a traversal warning if the property index performs node traversal of more than 1 (default). The intention here is to warn the end user that traversing so many nodes would cause performance issues. Traversal might happen for many reasons, such as: # The query is not using the right index. If the query has two property restrictions, one broad and the other more selective, and an index is defined only for the first, then more traversal would be performed. The warning should help the user to create a new index for the second property # The caller is fetching far more results - the query might end with, say, 50k results and the caller reads all of them. Such a warning would help the user to consider pagination So the above are valid cases. However, currently the warning is also seen even if the end result set is small, say 100, but the indexed paths are deep. As {{ContentMirrorStoreStrategy}} mirrors the path structure, the current counting logic also counts the intermediate traversals. 
This warning is then misleading, as how this internal structure is created is an implementation detail of the index over which the end user does not have any control. This leaves the following options # Use a different storage strategy which is more storage efficient # Do not count the intermediate nodes traversed within the index path and instead count only the matching nodes -- This message was sent by Atlassian JIRA (v6.3.4#6332)
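The second option, counting only matching nodes rather than intermediate index nodes, can be illustrated with a small sketch. The class and method names here are illustrative, not the actual query-engine code:

```java
// Illustrative sketch: track intermediate index-node hops and matching
// nodes separately, and base the traversal warning on matches only, so
// deep index structures (as with ContentMirrorStoreStrategy) no longer
// trigger misleading warnings.
public class TraversalCounter {
    private long intermediate;
    private long matches;

    public void intermediateNodeTraversed() { intermediate++; }
    public void matchingNodeFound()         { matches++; }

    // Warn on the number of results actually produced for the caller.
    public boolean shouldWarn(long threshold) {
        return matches > threshold;
    }

    // Mirrors the improved log message: matches vs. total index entries.
    public String stats() {
        return "Traversed " + matches + " nodes ("
                + (intermediate + matches) + " index entries)";
    }
}
```

This matches the updated log message above, which reports both the matching-node count and the total index entries visited.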
[jira] [Commented] (OAK-2722) IndexCopier fails to delete older index directory upon reindex
[ https://issues.apache.org/jira/browse/OAK-2722?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14482991#comment-14482991 ] Chetan Mehrotra commented on OAK-2722: -- For now disabled the test with http://svn.apache.org/r1671795. This can be fixed post 1.2, as in the worst case the directory does not get removed and has to be removed manually. Moreover, reindexing is not done very frequently in production systems. IndexCopier fails to delete older index directory upon reindex -- Key: OAK-2722 URL: https://issues.apache.org/jira/browse/OAK-2722 Project: Jackrabbit Oak Issue Type: Bug Components: oak-lucene Reporter: Chetan Mehrotra Assignee: Chetan Mehrotra Priority: Minor Fix For: 1.3.0 {{IndexCopier}} tries to remove the older index directory in case of reindex. This might fail on platforms like Windows if the files are still memory mapped or are locked. For deleting directories we would need to take a similar approach to the one used for deleting old index files, i.e. retry later. Due to this the following test fails on Windows (per [~julian.resc...@gmx.de]) {noformat} Tests run: 9, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 0.07 sec FAILURE! deleteOldPostReindex(org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest) Time elapsed: 0.02 sec FAILURE! java.lang.AssertionError: Old index directory should have been removed at org.junit.Assert.fail(Assert.java:93) at org.junit.Assert.assertTrue(Assert.java:43) at org.junit.Assert.assertFalse(Assert.java:68) at org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest.deleteOldPostReindex(IndexCopierTest.java:160) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (OAK-2593) Release Oak 1.1.8
[ https://issues.apache.org/jira/browse/OAK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella closed OAK-2593. - Release Oak 1.1.8 - Key: OAK-2593 URL: https://issues.apache.org/jira/browse/OAK-2593 Project: Jackrabbit Oak Issue Type: Task Reporter: Davide Giannella Assignee: Davide Giannella Priority: Minor - release oak - update website - update javadoc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-2593) Release Oak 1.1.8
[ https://issues.apache.org/jira/browse/OAK-2593?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Davide Giannella resolved OAK-2593. --- Resolution: Fixed Release Oak 1.1.8 - Key: OAK-2593 URL: https://issues.apache.org/jira/browse/OAK-2593 Project: Jackrabbit Oak Issue Type: Task Reporter: Davide Giannella Assignee: Davide Giannella Priority: Minor - release oak - update website - update javadoc -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2723) FileStore does not scale because of precomputed graph on TarReader
[ https://issues.apache.org/jira/browse/OAK-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483073#comment-14483073 ] Andrei Dulvac commented on OAK-2723: If you need a memory dump, I can provide that offline, it's not possible (or desirable) to attach it in jira. FileStore does not scale because of precomputed graph on TarReader -- Key: OAK-2723 URL: https://issues.apache.org/jira/browse/OAK-2723 Project: Jackrabbit Oak Issue Type: Bug Components: oak-core Affects Versions: 1.1.8 Reporter: Andrei Dulvac Attachments: 0001-TarReader-fix-for-precomputed-graph.patch The {{FileStore}} keeps a reference to all {{TarReader}} object, one per each file. In my test, for an ~350 Gb repository, that was ~1100 tar files, with a {{TarReader}} for each. The problem is {{TarReader}} keeps a reference to a precomputed _graph_ {{ByteBuffer}}, which is not really used that much. The effect is you need more that 6GB of Ram just to instantiate the {{FileStore}} object. The attached patch fixes this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-2626) Optimize binary comparison for merge during upgrade
[ https://issues.apache.org/jira/browse/OAK-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Sedding updated OAK-2626: Attachment: incremental-upgrade-no-changes.png Optimize binary comparison for merge during upgrade Key: OAK-2626 URL: https://issues.apache.org/jira/browse/OAK-2626 Project: Jackrabbit Oak Issue Type: Improvement Components: upgrade Affects Versions: 1.1.7 Reporter: Julian Sedding Priority: Minor Attachments: OAK-2626.patch, incremental-upgrade-no-changes.png In OAK-2619 I propose to support repeated upgrades into the same NodeStore. This issue does not optimize the first run, but any subsequent run benefits from the proposed changes. One use-case for this feature is to import all content several days before the upgrade and then copy only the delta on the day of the upgrade. Assuming that both the source and target repositories use the same FileDataStore, binaries could be efficiently compared by their references. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-2626) Optimize binary comparison for merge during upgrade
[ https://issues.apache.org/jira/browse/OAK-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Sedding updated OAK-2626: Attachment: (was: incremental-upgrade-no-changes.png) Optimize binary comparison for merge during upgrade Key: OAK-2626 URL: https://issues.apache.org/jira/browse/OAK-2626 Project: Jackrabbit Oak Issue Type: Improvement Components: upgrade Affects Versions: 1.1.7 Reporter: Julian Sedding Priority: Minor Attachments: OAK-2626.patch, incremental-upgrade-no-changes.png In OAK-2619 I propose to support repeated upgrades into the same NodeStore. This issue does not optimize the first run, but any subsequent run benefits from the proposed changes. One use-case for this feature is to import all content several days before the upgrade and then copy only the delta on the day of the upgrade. Assuming that both the source and target repositories use the same FileDataStore, binaries could be efficiently compared by their references. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-2626) Optimize binary comparison for merge during upgrade
[ https://issues.apache.org/jira/browse/OAK-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Sedding updated OAK-2626: Description: In OAK-2619 I propose to support repeated upgrades into the same NodeStore. This issue does not optimize the first run, but any subsequent run benefits from the proposed changes. One use-case for this feature is to import all content several days before the upgrade and then copy only the delta on the day of the upgrade. Assuming that both the source and target repositories use the same FileDataStore, binaries could be efficiently compared by their references. was: In OAK-2619 I propose to support multiple upgrades into the same NodeStore. One use-case for this feature is to import all content several days before the upgrade and then copy only the delta on the day of the upgrade. Assuming that both the source and target repositories use the same FileDataStore, binaries could be efficiently compared by their references. Optimize binary comparison for merge during upgrade Key: OAK-2626 URL: https://issues.apache.org/jira/browse/OAK-2626 Project: Jackrabbit Oak Issue Type: Improvement Components: upgrade Affects Versions: 1.1.7 Reporter: Julian Sedding Priority: Minor Attachments: OAK-2626.patch In OAK-2619 I propose to support repeated upgrades into the same NodeStore. This issue does not optimize the first run, but any subsequent run benefits from the proposed changes. One use-case for this feature is to import all content several days before the upgrade and then copy only the delta on the day of the upgrade. Assuming that both the source and target repositories use the same FileDataStore, binaries could be efficiently compared by their references. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-2723) FileStore does not scale because of precomputed graph on TarReader
Andrei Dulvac created OAK-2723: -- Summary: FileStore does not scale because of precomputed graph on TarReader Key: OAK-2723 URL: https://issues.apache.org/jira/browse/OAK-2723 Project: Jackrabbit Oak Issue Type: Bug Components: oak-core Affects Versions: 1.1.8 Reporter: Andrei Dulvac The {{FileStore}} keeps a reference to all {{TarReader}} objects, one per file. In my test, for a ~350 GB repository, that was ~1100 tar files, with a {{TarReader}} for each. The problem is that {{TarReader}} keeps a reference to a precomputed _graph_ {{ByteBuffer}}, which is not really used that much. The effect is that you need more than 6 GB of RAM just to instantiate the {{FileStore}} object. The attached patch fixes this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-2626) Optimize binary comparison for merge during upgrade
[ https://issues.apache.org/jira/browse/OAK-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julian Sedding updated OAK-2626: Attachment: incremental-upgrade-no-changes.png Optimize binary comparison for merge during upgrade Key: OAK-2626 URL: https://issues.apache.org/jira/browse/OAK-2626 Project: Jackrabbit Oak Issue Type: Improvement Components: upgrade Affects Versions: 1.1.7 Reporter: Julian Sedding Priority: Minor Attachments: OAK-2626.patch, incremental-upgrade-no-changes.png In OAK-2619 I propose to support repeated upgrades into the same NodeStore. This issue does not optimize the first run, but any subsequent run benefits from the proposed changes. One use-case for this feature is to import all content several days before the upgrade and then copy only the delta on the day of the upgrade. Assuming that both the source and target repositories use the same FileDataStore, binaries could be efficiently compared by their references. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-2723) FileStore does not scale because of precomputed graph on TarReader
[ https://issues.apache.org/jira/browse/OAK-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrei Dulvac updated OAK-2723: --- Description: The {{FileStore}} keeps a reference to all {{TarReader}} object, one per each file. In my test, for an ~350 Gb repository, that was ~1100 tar files, with a {{TarReader}} for each. The problem is {{TarReader}} keeps a reference to a precomputed _graph_ {{ByteBuffer}}, which is not really used that much. That means that through the {{readers}} field, there's a reference to these _graphs_, which means they can't be GC'ed. The effect is you need more that 6GB of Ram just to instantiate the {{FileStore}} object. The attached patch fixes this issue. was: The {{FileStore}} keeps a reference to all {{TarReader}} object, one per each file. In my test, for an ~350 Gb repository, that was ~1100 tar files, with a {{TarReader}} for each. The problem is {{TarReader}} keeps a reference to a precomputed _graph_ {{ByteBuffer}}, which is not really used that much. The effect is you need more that 6GB of Ram just to instantiate the {{FileStore}} object. The attached patch fixes this issue. FileStore does not scale because of precomputed graph on TarReader -- Key: OAK-2723 URL: https://issues.apache.org/jira/browse/OAK-2723 Project: Jackrabbit Oak Issue Type: Bug Components: oak-core Affects Versions: 1.1.8 Reporter: Andrei Dulvac Attachments: 0001-TarReader-fix-for-precomputed-graph.patch The {{FileStore}} keeps a reference to all {{TarReader}} object, one per each file. In my test, for an ~350 Gb repository, that was ~1100 tar files, with a {{TarReader}} for each. The problem is {{TarReader}} keeps a reference to a precomputed _graph_ {{ByteBuffer}}, which is not really used that much. That means that through the {{readers}} field, there's a reference to these _graphs_, which means they can't be GC'ed. The effect is you need more that 6GB of Ram just to instantiate the {{FileStore}} object. 
The attached patch fixes this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-2723) FileStore does not scale because of precomputed graph on TarReader
[ https://issues.apache.org/jira/browse/OAK-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dürig updated OAK-2723: --- Fix Version/s: 1.3.0 FileStore does not scale because of precomputed graph on TarReader -- Key: OAK-2723 URL: https://issues.apache.org/jira/browse/OAK-2723 Project: Jackrabbit Oak Issue Type: Bug Components: oak-core Affects Versions: 1.1.8 Reporter: Andrei Dulvac Fix For: 1.3.0 Attachments: 0001-TarReader-fix-for-precomputed-graph.patch The {{FileStore}} keeps a reference to all {{TarReader}} objects, one per file. In my test, for a ~350 GB repository, that was ~1100 tar files, with a {{TarReader}} for each. The problem is that {{TarReader}} keeps a reference to a precomputed _graph_ {{ByteBuffer}}, which is not really used that much. That means that through the {{readers}} field, there's a reference to these _graphs_, which means they can't be GC'ed. The construction of {{FileStore}} is from oak-run: bq. FileStore store = new FileStore(directory, 256, TAR_STORAGE_MEMORY_MAPPED); The effect is that you need more than 6 GB of RAM just to instantiate the {{FileStore}} object. The attached patch fixes this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2626) Optimize binary comparison for merge during upgrade
[ https://issues.apache.org/jira/browse/OAK-2626?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483081#comment-14483081 ] Julian Sedding commented on OAK-2626: - The time taken for an upgrade can be split into the time taken to *copy the data* and the time taken to *execute the commit hooks*. This optimization relies on comparing blobs by reference. The optimizations are designed to avoid file-system access where possible, i.e. reference calculation is reduced to a string operation. For the initial step of *copying the data*, the attached patch is sufficient, as the {{JackrabbitNodeState}}'s anonymous {{AbstractBlob}} implements {{equals}} with a reference comparison, before falling back to {{AbstractBlob#equals()}}. In order to benefit from the optimization during the *execution of commit hooks*, the patch from OAK-2627, which adds a reference comparison to {{AbstractBlob}} itself, needs to be applied as well. This is because when the commit hooks are executed, any compared {{NodeState}}s are of the same type (e.g. SegmentNodeState or DocumentNodeState). To activate the optimization, {{ReferenceOptimizedBlobStore}} needs to be used (as a drop-in replacement) instead of {{DataStoreBlobStore}}. !incremental-upgrade-no-changes.png! The graph shows four scenarios run with TarMK + FDS (500k nodes copied from an AEM instance, ~2/3 are digital assets, ~1/3 are websites). Each time the source repository is copied a second time without any changes. # copy: No optimizations. Essentially, the entire repository is copied again (34 sec) and then compared for the commit-hooks. No NodeStates are shared, so a full repository traversal is done for the comparison (63 sec). # copy + binary-optimization: Optimized blob comparison by reference. Again, the entire repository is copied again (34 sec) and then compared for the commit-hooks. A full repository traversal is done for the comparison, but blob comparison is optimized (14 sec). 
# recursive-copy: Content is copied recursively (43 sec). All properties are compared during copy and set only if changed (see OAK-2619). Since there are no changes, no time is required to execute commit-hooks (0 sec). # recursive-copy + blob-optimization: As above, but the recursive copy benefits from optimized binary comparison (15 sec). Again, no changes were made, hence commit hooks require no time (0 sec). For reference, the first run for all four scenarios is very uniform: 33-36 sec for copy and 19 sec for the commit hooks (comparing against EmptyNodeState is fast), i.e. a total of 52-57 sec. Optimize binary comparison for merge during upgrade Key: OAK-2626 URL: https://issues.apache.org/jira/browse/OAK-2626 Project: Jackrabbit Oak Issue Type: Improvement Components: upgrade Affects Versions: 1.1.7 Reporter: Julian Sedding Priority: Minor Attachments: OAK-2626.patch, incremental-upgrade-no-changes.png In OAK-2619 I propose to support repeated upgrades into the same NodeStore. This issue does not optimize the first run, but any subsequent run benefits from the proposed changes. One use-case for this feature is to import all content several days before the upgrade and then copy only the delta on the day of the upgrade. Assuming that both the source and target repositories use the same FileDataStore, binaries could be efficiently compared by their references. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
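The reference-first blob comparison described in the comment can be sketched roughly like this. The class and field names are hypothetical (the real logic lives in the OAK-2626/OAK-2627 patches around {{AbstractBlob}}): compare data-store references first, and only stream the binary content when no reference is available.

```java
import java.util.Arrays;
import java.util.Objects;

// Rough sketch of reference-first blob equality: if both blobs expose a
// content identity (e.g. a FileDataStore reference), a cheap string
// comparison decides equality without touching the binary at all.
public class RefBlob {
    private final String reference;   // may be null if no reference exists
    private final byte[] content;
    int contentReads = 0;             // instrumentation for this sketch only

    public RefBlob(String reference, byte[] content) {
        this.reference = reference;
        this.content = content;
    }

    public byte[] readContent() {
        contentReads++;               // simulates expensive binary access
        return content;
    }

    public boolean contentEquals(RefBlob other) {
        if (reference != null && other.reference != null) {
            // fast path: identical references imply identical binaries
            return Objects.equals(reference, other.reference);
        }
        // slow path: fall back to comparing the actual bytes
        return Arrays.equals(readContent(), other.readContent());
    }
}
```

This is why the measurements above improve so much: in the optimized scenarios the fast path turns most blob comparisons into string operations.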
[jira] [Updated] (OAK-2723) FileStore does not scale because of precomputed graph on TarReader
[ https://issues.apache.org/jira/browse/OAK-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrei Dulvac updated OAK-2723: --- Description: The {{FileStore}} keeps a reference to all {{TarReader}} object, one per each file. In my test, for an ~350 Gb repository, that was ~1100 tar files, with a {{TarReader}} for each. The problem is {{TarReader}} keeps a reference to a precomputed _graph_ {{ByteBuffer}}, which is not really used that much. That means that through the {{readers}} field, there's a reference to these _graphs_, which means they can't be GC'ed. The construction of {{FileStore}} is from oak-run: bq. FileStore store = new FileStore(directory, 256, TAR_STORAGE_MEMORY_MAPPED); The effect is you need more that 6GB of Ram just to instantiate the {{FileStore}} object. The attached patch fixes this issue. was: The {{FileStore}} keeps a reference to all {{TarReader}} object, one per each file. In my test, for an ~350 Gb repository, that was ~1100 tar files, with a {{TarReader}} for each. The problem is {{TarReader}} keeps a reference to a precomputed _graph_ {{ByteBuffer}}, which is not really used that much. That means that through the {{readers}} field, there's a reference to these _graphs_, which means they can't be GC'ed. The construction of {fileStore}} is from oak-run: bq. FileStore store = new FileStore(directory, 256, TAR_STORAGE_MEMORY_MAPPED); The effect is you need more that 6GB of Ram just to instantiate the {{FileStore}} object. The attached patch fixes this issue. FileStore does not scale because of precomputed graph on TarReader -- Key: OAK-2723 URL: https://issues.apache.org/jira/browse/OAK-2723 Project: Jackrabbit Oak Issue Type: Bug Components: oak-core Affects Versions: 1.1.8 Reporter: Andrei Dulvac Fix For: 1.3.0 Attachments: 0001-TarReader-fix-for-precomputed-graph.patch The {{FileStore}} keeps a reference to all {{TarReader}} object, one per each file. 
In my test, for an ~350 Gb repository, that was ~1100 tar files, with a {{TarReader}} for each. The problem is {{TarReader}} keeps a reference to a precomputed _graph_ {{ByteBuffer}}, which is not really used that much. That means that through the {{readers}} field, there's a reference to these _graphs_, which means they can't be GC'ed. The construction of {{FileStore}} is from oak-run: bq. FileStore store = new FileStore(directory, 256, TAR_STORAGE_MEMORY_MAPPED); The effect is you need more that 6GB of Ram just to instantiate the {{FileStore}} object. The attached patch fixes this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
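One way to keep the precomputed graph out of long-lived memory, in the spirit of the problem described above, is to load it lazily on demand instead of pinning it for the lifetime of the {{FileStore}}. This is only an illustrative sketch with made-up names; the attached patch is the authoritative change.

```java
import java.util.function.Supplier;

// Sketch: instead of each TarReader pinning its precomputed graph
// ByteBuffer from construction onward, load it only when first needed.
// (A SoftReference cache would additionally let the JVM reclaim it
// under memory pressure.)
public class LazyGraphHolder {
    private final Supplier<byte[]> loader;  // reads the graph from the tar file
    private byte[] cached;
    int loads = 0;                          // instrumentation for this sketch

    public LazyGraphHolder(Supplier<byte[]> loader) {
        this.loader = loader;
    }

    public synchronized byte[] graph() {
        if (cached == null) {
            loads++;
            cached = loader.get();   // memory is spent only on first access
        }
        return cached;
    }
}
```

With ~1100 tar files, deferring the load means instantiating the store no longer pays the memory cost of ~1100 graphs up front.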
[jira] [Comment Edited] (OAK-2723) FileStore does not scale because of precomputed graph on TarReader
[ https://issues.apache.org/jira/browse/OAK-2723?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483072#comment-14483072 ] Andrei Dulvac edited comment on OAK-2723 at 4/7/15 12:21 PM: - Attached patch created with {{git format-patch HEAD~1}} Apply with {{patch -p1 -i 0001-TarReader-fix-for-precomputed-graph.patch}} was (Author: andrei.dulvac): Attached patch created with {{git format-patch HEAD~1}} Apply with {{patch -p1 -i 0001-fixes-FACILITIES-97.patch}} FileStore does not scale because of precomputed graph on TarReader -- Key: OAK-2723 URL: https://issues.apache.org/jira/browse/OAK-2723 Project: Jackrabbit Oak Issue Type: Bug Components: oak-core Affects Versions: 1.1.8 Reporter: Andrei Dulvac Fix For: 1.3.0 Attachments: 0001-TarReader-fix-for-precomputed-graph.patch The {{FileStore}} keeps a reference to all {{TarReader}} object, one per each file. In my test, for an ~350 Gb repository, that was ~1100 tar files, with a {{TarReader}} for each. The problem is {{TarReader}} keeps a reference to a precomputed _graph_ {{ByteBuffer}}, which is not really used that much. That means that through the {{readers}} field, there's a reference to these _graphs_, which means they can't be GC'ed. The construction of {fileStore}} is from oak-run: bq. FileStore store = new FileStore(directory, 256, TAR_STORAGE_MEMORY_MAPPED); The effect is you need more that 6GB of Ram just to instantiate the {{FileStore}} object. The attached patch fixes this issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2682) Introduce time difference detection for mongoMk
[ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483105#comment-14483105 ] Robert Munteanu commented on OAK-2682: -- [~egli] - I've looked into this issue briefly, as I'm interested in contributing a patch. You mention in the issue description 'all nodes of the cluster'. I assume that you mean an Oak cluster, not a MongoDB cluster. When talking about clock skew in MongoDB, we actually have two situations: - replica sets - sharded clusters For replica sets, the different MongoDB instances are actually visible to the DocumentNodeStore as cluster members. For sharded clusters, Oak would connect only to a {{mongos}} instance. We can of course find out the shards from the config database, and connect separately to those {{mongod}} instances to run the {{serverStatus}} command, but I find it unnecessarily cumbersome. Furthermore, I see that MongoDB has its own clock skew detection for both replica sets ( each replica set member does this check ) and for clustered shards ( the {{mongos}} instances perform the check ). MongoDB is also tolerant of some clock skew, but not too much ( [Mongos throwing clock skew error?|https://groups.google.com/forum/#!topic/mongodb-user/SPi4Kqox16I]) . TBH I see this more as an operations issue rather than something that can/should be done in Oak and would rather suggest dropping this. Thoughts? /CC [~chetanm], [~mreutegg] Introduce time difference detection for mongoMk --- Key: OAK-2682 URL: https://issues.apache.org/jira/browse/OAK-2682 Project: Jackrabbit Oak Issue Type: Improvement Components: core, mongomk Reporter: Stefan Egli Fix For: 1.3.0 Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the assumption that the clocks are in perfect sync between all nodes of the cluster. The lease is valid for 60sec with a timeout of 30sec. 
If clocks are off by too much, and background operations happen to take a couple of seconds, you run the risk of timing out a lease. So introducing a check which WARNs if the clocks in a cluster are off by too much (1st threshold, e.g. 5sec?) would help increase awareness. A further, more drastic measure could be to prevent Oak from starting up at all if the difference is, for example, higher than a 2nd threshold (optional I guess, but could be 20sec?). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
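The two-threshold policy from the description (WARN above roughly 5 sec, refuse startup above roughly 20 sec) could be sketched as follows; the names and values are illustrative, not Oak API:

```java
// Minimal sketch of the two-threshold clock-skew policy from the issue
// description; class name, method name and thresholds are illustrative.
class ClockSkewCheck {
    enum Level { OK, WARN, FAIL }

    static final long WARN_MILLIS = 5_000;   // 1st threshold: log a WARN
    static final long FAIL_MILLIS = 20_000;  // 2nd threshold: refuse startup

    // Compare the local clock against a reference clock (e.g. another
    // cluster node's time, however it is obtained).
    static Level check(long localMillis, long referenceMillis) {
        long skew = Math.abs(localMillis - referenceMillis);
        if (skew > FAIL_MILLIS) return Level.FAIL;
        if (skew > WARN_MILLIS) return Level.WARN;
        return Level.OK;
    }
}
```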
[jira] [Commented] (OAK-2619) Repeated upgrades
[ https://issues.apache.org/jira/browse/OAK-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483111#comment-14483111 ] Julian Sedding commented on OAK-2619: - I have done some more measurements using both TarNS and MongoNS. My data for both backends shows that the initial upgrade (i.e. into an empty NodeStore) is equally fast with and without my changes. See initial-upgrade-tar.png and initial-upgrade-mongo.png respectively. My measurements also show that an incremental upgrade can help save time on the critical path of an upgrade, provided that the optimizations from OAK-2626 (and ideally OAK-2627) are applied. In my tests, an initial upgrade of 500k nodes (~2/3 digital assets, ~1/3 websites) to TarNS took 52-55 sec. The incremental upgrade (with no changes in the source repo, i.e. best-case) with optimizations only took 15 sec. The workload is equivalent to a single repository traversal with no writes and hence no additional traversal for the commit hooks. On Mongo the results are similar, albeit more dramatic: the initial upgrade took 605-623 sec, while the incremental upgrade with optimizations took only 80 sec. Note that in the not fully optimized cases, incremental upgrades are slower (or not much faster) than initial upgrades. Repeated upgrades - Key: OAK-2619 URL: https://issues.apache.org/jira/browse/OAK-2619 Project: Jackrabbit Oak Issue Type: New Feature Components: upgrade Affects Versions: 1.1.7 Reporter: Julian Sedding Priority: Minor Attachments: OAK-2619.patch, incremental-upgrade-no-changes-mongo.png, incremental-upgrade-no-changes-tar.png, initial-upgrade-mongo.png, initial-upgrade-tar.png When upgrading from Jackrabbit 2 to Oak there are several scenarios that could benefit from the ability to upgrade repeatedly into one target repository. E.g. 
a migration process might look as follows:
# upgrade a backup of a large repository a week before go-live
# run the upgrade again every night (commit-hooks only handle delta)
# run the upgrade one final time before go-live (commit-hooks only handle delta)
In this scenario each upgrade would require a full traversal of the source repository. However, if done right, only the delta needs to be written and the commit-hooks also only need to process the delta. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2682) Introduce time difference detection for mongoMk
[ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483117#comment-14483117 ] Marcel Reutegger commented on OAK-2682: --- This issue is not about clock skew on the machines where MongoDB is running, but rather where Oak instances are running. These may be running on the same machine as a mongos process, but this is not a requirement. I still think it would be useful to have a detection built into Oak or more specifically the DocumentNodeStore implementation. The cluster lease functionality in DocumentNodeStore does depend on machines with somewhat synchronized clocks. See also [~egli]'s description of this issue. Introduce time difference detection for mongoMk --- Key: OAK-2682 URL: https://issues.apache.org/jira/browse/OAK-2682 Project: Jackrabbit Oak Issue Type: Improvement Components: core, mongomk Reporter: Stefan Egli Fix For: 1.3.0 Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the assumption that the clocks are in perfect sync between all nodes of the cluster. The lease is valid for 60sec with a timeout of 30sec. If clocks are off by too much, and background operations happen to take a couple of seconds, you run the risk of timing out a lease. So introducing a check which WARNs if the clocks in a cluster are off by too much (1st threshold, eg 5sec?) would help increase awareness. Further drastic measure could be to prevent a startup of Oak at all if the difference is for example higher than a 2nd threshold (optional I guess, but could be 20sec?). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-2724) Export SessionImpl#getItemOrNull in JackrabbitSession
Joel Richard created OAK-2724: - Summary: Export SessionImpl#getItemOrNull in JackrabbitSession Key: OAK-2724 URL: https://issues.apache.org/jira/browse/OAK-2724 Project: Jackrabbit Oak Issue Type: Improvement Components: oak-jcr Affects Versions: 1.1.8 Reporter: Joel Richard Priority: Critical getItemOrNull should be exported in JackrabbitSession. This would make it possible to combine itemExists and getItem in Sling, which would reduce the rendering time by 8%. See the following mail thread for more information: http://mail-archives.apache.org/mod_mbox/jackrabbit-oak-dev/201504.mbox/browser -- This message was sent by Atlassian JIRA (v6.3.4#6332)
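The saving comes from replacing two repository lookups with one. A toy model (a map stands in for the repository; this is not the JCR API) contrasts the two call patterns:

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of the itemExists/getItem vs getItemOrNull call patterns;
// a HashMap stands in for the repository. This is not the JCR API.
class SessionModel {
    private final Map<String, Object> items = new HashMap<>();

    void put(String path, Object item) { items.put(path, item); }

    // JCR-style pair: callers make two lookups to read an optional item
    boolean itemExists(String path) { return items.containsKey(path); }
    Object getItem(String path) {
        Object item = items.get(path);
        if (item == null) throw new IllegalStateException("no item at " + path);
        return item;
    }

    // Jackrabbit-style single lookup: returns null instead of throwing
    Object getItemOrNull(String path) { return items.get(path); }
}
```

With itemExists + getItem, every optional read resolves the path twice; getItemOrNull resolves it once, which is where the reported 8% rendering-time reduction comes from.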
[jira] [Updated] (OAK-2714) Test failures on Jenkins
[ https://issues.apache.org/jira/browse/OAK-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dürig updated OAK-2714: --- Description: This issue is for tracking test failures seen at our Jenkins instance that might yet be transient. Once a failure happens too often we should remove it here and create a dedicated issue for it.
|| Test || Builds || Fixture || JVM ||
| org.apache.jackrabbit.oak.plugins.index.solr.configuration.DefaultAnalyzersConfigurationTest | 61, 63 | ? | ? |
| org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest.reuseLocalDir | 81 | DOCUMENT_RDB | 1.7 |
| org.apache.jackrabbit.oak.jcr.OrderableNodesTest.orderableFolder | 81, 87 | DOCUMENT_NS, DOCUMENT_RDB | 1.6, 1.7 |
| org.apache.jackrabbit.oak.jcr.OrderedIndexIT.oak2035 | 76 | SEGMENT_MK | 1.6 |
| org.apache.jackrabbit.oak.jcr.OrderableNodesTest.setPrimaryType | 69, 83 | DOCUMENT_RDB | 1.6 |
| org.apache.jackrabbit.oak.plugins.segment.standby.StandbyTestIT.testSyncLoop | 64 | ? | ? |
| org.apache.jackrabbit.oak.jcr.observation.ObservationRefreshTest.observation | 48, 55 | ? | ? |
| org.apache.jackrabbit.oak.run.osgi.DocumentNodeStoreConfigTest.testRDBDocumentStore_CustomBlobStore | 52 | ? | ? |
| org.apache.jackrabbit.oak.run.osgi.JsonConfigRepFactoryTest.testRepositoryTar | 41 | ? | ? |
| org.apache.jackrabbit.oak.jcr.AutoCreatedItemsTest.autoCreatedItems | 41 | ? | ? |
| org.apache.jackrabbit.test.api.observation.PropertyAddedTest.testMultiPropertyAdded | 29 | ? | ? |
| org.apache.jackrabbit.oak.plugins.segment.HeavyWriteIT.heavyWrite | 35 | SEGMENT_MK | ? |
was: This issue is for tracking test failures seen at our Jenkins instance that might yet be transient. Once a failure happens too often we should remove it here and create a dedicated issue for it.
|| Test || Builds || Fixture || JVM ||
| org.apache.jackrabbit.oak.plugins.index.solr.configuration.DefaultAnalyzersConfigurationTest | 61, 63 | ? | ? |
| org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest.reuseLocalDir | 81 | DOCUMENT_RDB | 1.7 |
| org.apache.jackrabbit.oak.jcr.OrderableNodesTest.orderableFolder | 81 | DOCUMENT_NS | 1.6 |
| org.apache.jackrabbit.oak.jcr.OrderedIndexIT.oak2035 | 76 | SEGMENT_MK | 1.6 |
| org.apache.jackrabbit.oak.jcr.OrderableNodesTest.setPrimaryType | 69, 83 | DOCUMENT_RDB | 1.6 |
| org.apache.jackrabbit.oak.plugins.segment.standby.StandbyTestIT.testSyncLoop | 64 | ? | ? |
| org.apache.jackrabbit.oak.jcr.observation.ObservationRefreshTest.observation | 48, 55 | ? | ? |
| org.apache.jackrabbit.oak.run.osgi.DocumentNodeStoreConfigTest.testRDBDocumentStore_CustomBlobStore | 52 | ? | ? |
| org.apache.jackrabbit.oak.run.osgi.JsonConfigRepFactoryTest.testRepositoryTar | 41 | ? | ? |
| org.apache.jackrabbit.oak.jcr.AutoCreatedItemsTest.autoCreatedItems | 41 | ? | ? |
| org.apache.jackrabbit.test.api.observation.PropertyAddedTest.testMultiPropertyAdded | 29 | ? | ? |
| org.apache.jackrabbit.oak.plugins.segment.HeavyWriteIT.heavyWrite | 35 | SEGMENT_MK | ? |
Test failures on Jenkins Key: OAK-2714 URL: https://issues.apache.org/jira/browse/OAK-2714 Project: Jackrabbit Oak Issue Type: Bug Environment: Jenkins, Ubuntu: https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/ Reporter: Michael Dürig Labels: CI, Jenkins Fix For: 1.3.0 This issue is for tracking test failures seen at our Jenkins instance that might yet be transient. Once a failure happens too often we should remove it here and create a dedicated issue for it. || Test || Builds || Fixture || JVM || | org.apache.jackrabbit.oak.plugins.index.solr.configuration.DefaultAnalyzersConfigurationTest | 61, 63 | ?| ? | |
[jira] [Commented] (OAK-2718) NodeStateSolrServersObserver performs complete diff synchronously causing slowness in dispatch
[ https://issues.apache.org/jira/browse/OAK-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483228#comment-14483228 ] Tommaso Teofili commented on OAK-2718: -- in r1671853 I've wrapped the {{NodeStateSolrServersObserver}} with a {{BackgroundObserver}} in {{NodeStateSolrServersObserverService}} so that the observer works asynchronously. NodeStateSolrServersObserver performs complete diff synchronously causing slowness in dispatch -- Key: OAK-2718 URL: https://issues.apache.org/jira/browse/OAK-2718 Project: Jackrabbit Oak Issue Type: Bug Components: oak-solr Reporter: Chetan Mehrotra Assignee: Tommaso Teofili Fix For: 1.2 {{NodeStateSolrServersObserver}} is enabled by default and performs the diff synchronously. Further it performs a complete diff which might take time and would cause the dispatch thread to slow down. This would cause issues at least with {{DocumentNodeStore}} as there the dispatch is done as part of the background read and that call is time sensitive. As a fix the diff should be performed asynchronously and also be selective. A similar fix was done for the Lucene index as part of OAK-2570 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
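The {{BackgroundObserver}} idea is to decouple the time-sensitive dispatch thread from the diff work: {{contentChanged}} only enqueues the change, and a single worker thread drains the queue. A simplified, self-contained sketch of the pattern (not Oak's actual {{BackgroundObserver}} class):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

// Simplified sketch of the BackgroundObserver pattern (not Oak's class):
// the dispatch thread only enqueues; a worker thread does the slow diff.
class BackgroundObserverSketch {
    private static final String STOP = "STOP"; // shutdown sentinel
    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();
    private final List<String> processed = new ArrayList<>();
    private final Thread worker;

    BackgroundObserverSketch() {
        worker = new Thread(() -> {
            try {
                while (true) {
                    String change = queue.take();
                    if (change.equals(STOP)) return;
                    // the expensive diff would happen here
                    synchronized (processed) { processed.add(change); }
                }
            } catch (InterruptedException ignored) { }
        });
        worker.start();
    }

    // called from the time-sensitive dispatch thread: O(1), never blocks on the diff
    void contentChanged(String change) { queue.add(change); }

    List<String> stopAndDrain() {
        queue.add(STOP);
        try { worker.join(5000); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        synchronized (processed) { return new ArrayList<>(processed); }
    }
}
```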
[jira] [Updated] (OAK-2725) Wrong indexed query estimates exceed more than double the actual index entries
[ https://issues.apache.org/jira/browse/OAK-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Iordache updated OAK-2725: - Attachment: OAK-2725-test.patch Wrong indexed query estimates exceed more than double the actual index entries -- Key: OAK-2725 URL: https://issues.apache.org/jira/browse/OAK-2725 Project: Jackrabbit Oak Issue Type: Bug Components: query Affects Versions: 1.1.8 Reporter: Florin Iordache Priority: Critical Fix For: 1.2 Attachments: OAK-2725-test.patch The {{ApproximateCounter.adjustCountSync}} public method that is used by the indexing engine will sometimes produce very unrealistic cost estimates. The problem is that it can produce an estimated cost that exceeds the estimated cost of the full traversal query, thus causing the index to be bypassed altogether, resulting in a full traversal rather than the use of the existing index. The problem resides in the way the property counts are updated: * The count property update goes through if two randoms are not zero: random(100) and random({1, 2, 4, 8, 16, ...}). * Same static pseudo random generator for all invocations. Even if #1 might seem improbable, it is statistically possible to reach a very high count with only a handful of invocations. In practice I've found that running 100 tests with 1000 invocations of the adjustCountSync method will yield costs exceeding value 2000 in 4-10% of the tests. Attaching a patch for {{ApproximateCounterTest}} with this test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2725) Wrong indexed query estimates exceed more than double the actual index entries
[ https://issues.apache.org/jira/browse/OAK-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483334#comment-14483334 ] Florin Iordache commented on OAK-2725: -- cc: [~tmueller] Wrong indexed query estimates exceed more than double the actual index entries -- Key: OAK-2725 URL: https://issues.apache.org/jira/browse/OAK-2725 Project: Jackrabbit Oak Issue Type: Bug Components: query Affects Versions: 1.1.8 Reporter: Florin Iordache Priority: Critical Fix For: 1.2 Attachments: OAK-2725-test.patch The {{ApproximateCounter.adjustCountSync}} public method that is used by the indexing engine will sometimes produce very unrealistic cost estimates. The problem is that it can produce an estimated cost that exceeds the estimated cost of the full traversal query, thus causing the index to be bypassed altogether, resulting in a full traversal rather than the use of the existing index. The problem resides in the way the property counts are updated: * The count property update goes through if two randoms are not zero: random(100) and random({1, 2, 4, 8, 16, ...}). * Same static pseudo random generator for all invocations. Even if #1 might seem improbable, it is statistically possible to reach a very high count with only a handful of invocations. In practice I've found that running 100 tests with 1000 invocations of the adjustCountSync method will yield costs exceeding value 2000 in 4-10% of the tests. Attaching a patch for {{ApproximateCounterTest}} with this test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-2726) Avoid repository traversal for trivial node type changes
Marcel Reutegger created OAK-2726: - Summary: Avoid repository traversal for trivial node type changes Key: OAK-2726 URL: https://issues.apache.org/jira/browse/OAK-2726 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Marcel Reutegger The {{TypeEditor}} in oak-core checks the repository content when a node type changes to make sure the content still conforms to the updated node type. For a repository with a lot of nodes, this check can take quite a bit of time. Jackrabbit has an optimization in place for trivial node type changes. E.g. it will not check the content if a non-mandatory item is added to an existing node type definition. This optimization could be implemented in Oak as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
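The Jackrabbit optimization boils down to a predicate over the old and new type definitions: if a change only adds non-mandatory items, existing content cannot have become invalid, so the repository scan can be skipped. A self-contained sketch (names are illustrative, not the {{TypeEditor}} API):

```java
import java.util.Map;

// Sketch of the "trivial node type change" test (names illustrative,
// not the TypeEditor API): each map goes from item name to its
// mandatory flag in the node type definition.
class NodeTypeDiff {
    // Returns true iff every old item is still present unchanged and
    // every newly added item is non-mandatory; in that case existing
    // content cannot have become invalid and the scan can be skipped.
    static boolean isTrivialChange(Map<String, Boolean> oldItems,
                                   Map<String, Boolean> newItems) {
        for (Map.Entry<String, Boolean> e : oldItems.entrySet()) {
            if (!e.getValue().equals(newItems.get(e.getKey()))) {
                return false; // item removed or mandatory flag changed
            }
        }
        for (Map.Entry<String, Boolean> e : newItems.entrySet()) {
            if (!oldItems.containsKey(e.getKey()) && e.getValue()) {
                return false; // new mandatory item: existing nodes may violate it
            }
        }
        return true;
    }
}
```

A real implementation would also have to consider constraints such as value types, multi-valued flags and residual definitions, but the shape of the check is the same.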
[jira] [Created] (OAK-2727) NodeStateSolrServersObserver should be filtering path selectively
Tommaso Teofili created OAK-2727: Summary: NodeStateSolrServersObserver should be filtering path selectively Key: OAK-2727 URL: https://issues.apache.org/jira/browse/OAK-2727 Project: Jackrabbit Oak Issue Type: Improvement Affects Versions: 1.1.8 Reporter: Tommaso Teofili Assignee: Tommaso Teofili Fix For: 1.3.0 As discussed in OAK-2718 it'd be good to be able to selectively find Solr indexes by path, as done in the Lucene index, see also OAK-2570. This would avoid having to do full diffs. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-2726) Avoid repository traversal for trivial node type changes
[ https://issues.apache.org/jira/browse/OAK-2726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger updated OAK-2726: -- Fix Version/s: 1.4 Avoid repository traversal for trivial node type changes Key: OAK-2726 URL: https://issues.apache.org/jira/browse/OAK-2726 Project: Jackrabbit Oak Issue Type: Improvement Components: core Reporter: Marcel Reutegger Fix For: 1.4 The {{TypeEditor}} in oak-core checks the repository content when a node type changes to make sure the content still conforms to the updated node type. For a repository with a lot of nodes, this check can take quite a bit of time. Jackrabbit has an optimization in place for trivial node type changes. E.g. it will not check the content if a non-mandatory item is added to an existing node type definition. This optimization could be implemented in Oak as well. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2682) Introduce time difference detection for mongoMk
[ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483217#comment-14483217 ] Robert Munteanu commented on OAK-2682: -- Right, now it makes more sense to me, thanks for clarifying Introduce time difference detection for mongoMk --- Key: OAK-2682 URL: https://issues.apache.org/jira/browse/OAK-2682 Project: Jackrabbit Oak Issue Type: Improvement Components: core, mongomk Reporter: Stefan Egli Fix For: 1.3.0 Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the assumption that the clocks are in perfect sync between all nodes of the cluster. The lease is valid for 60sec with a timeout of 30sec. If clocks are off by too much, and background operations happen to take couple seconds, you run the risk of timing out a lease. So introducing a check which WARNs if the clocks in a cluster are off by too much (1st threshold, eg 5sec?) would help increase awareness. Further drastic measure could be to prevent a startup of Oak at all if the difference is for example higher than a 2nd threshold (optional I guess, but could be 20sec?). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-2725) Wrong indexed query estimates exceed more than double the actual index entries
Florin Iordache created OAK-2725: Summary: Wrong indexed query estimates exceed more than double the actual index entries Key: OAK-2725 URL: https://issues.apache.org/jira/browse/OAK-2725 Project: Jackrabbit Oak Issue Type: Bug Components: query Affects Versions: 1.1.8 Reporter: Florin Iordache Priority: Critical Fix For: 1.2 The {{ApproximateCounter.adjustCountSync}} public method that is used by the indexing engine will sometimes produce very unrealistic cost estimates. The problem is that it can produce an estimated cost that exceeds the estimated cost of the full traversal query, thus causing the index to be bypassed altogether, resulting in a full traversal rather than the use of the existing index. The problem resides in the way the property counts are updated: * The count property update goes through if two randoms are not zero: random(100) and random({1, 2, 4, 8, 16, ...}). * Same static pseudo random generator for all invocations. Even if #1 might seem improbable, it is statistically possible to reach a very high count with only a handful of invocations. In practice I've found that running 100 tests with 1000 invocations of the adjustCountSync method will yield costs exceeding value 2000 in 4-10% of the tests. Attaching a patch for {{ApproximateCounterTest}} with this test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (OAK-2036) getPlan() output for NodeTypeIndex doesn't indicate the index type used
[ https://issues.apache.org/jira/browse/OAK-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chetan Mehrotra reassigned OAK-2036: Assignee: Chetan Mehrotra getPlan() output for NodeTypeIndex doesn't indicate the index type used --- Key: OAK-2036 URL: https://issues.apache.org/jira/browse/OAK-2036 Project: Jackrabbit Oak Issue Type: Bug Components: query Reporter: Justin Edelson Assignee: Chetan Mehrotra Fix For: 1.3.0 NodeTypeIndex's getPlan() method simply does this: {code} return filter.toString(); {code} whereas all the other index implementations output their names. This should be changed to, at minimum: {code} return nodetype + filter.toString(); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2718) NodeStateSolrServersObserver performs complete diff synchronously causing slowness in dispatch
[ https://issues.apache.org/jira/browse/OAK-2718?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483345#comment-14483345 ] Chetan Mehrotra commented on OAK-2718: -- [~teofili] Can we also disable it by default and let it be registered only if some OSGi config is provided? That can be simply achieved by marking the component with ConfigurationPolicy.REQUIRE. As most of the deployments would not have Solr configured this should be fine and not cause much inconvenience. An observer doing the complete diff would still add to the cost. Later once we make it precise we can let it be enabled by default. NodeStateSolrServersObserver performs complete diff synchronously causing slowness in dispatch -- Key: OAK-2718 URL: https://issues.apache.org/jira/browse/OAK-2718 Project: Jackrabbit Oak Issue Type: Bug Components: oak-solr Reporter: Chetan Mehrotra Assignee: Tommaso Teofili Fix For: 1.2 {{NodeStateSolrServersObserver}} is enabled by default and performs the diff synchronously. Further it performs a complete diff which might take time and would cause the dispatch thread to slow down. This would cause issues at least with {{DocumentNodeStore}} as there the dispatch is done as part of the background read and that call is time sensitive. As a fix the diff should be performed asynchronously and also be selective. A similar fix was done for the Lucene index as part of OAK-2570 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
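With the standard OSGi Declarative Services annotations, the suggested change amounts to a configuration-policy attribute on the component. This is a sketch only; Oak's actual service may use the Felix SCR annotations instead, where the equivalent attribute is {{policy = ConfigurationPolicy.REQUIRE}}:

```java
import org.osgi.service.component.annotations.Component;
import org.osgi.service.component.annotations.ConfigurationPolicy;

// Sketch: with REQUIRE, the component is only activated when a matching
// OSGi configuration exists, so the observer stays off by default.
@Component(configurationPolicy = ConfigurationPolicy.REQUIRE)
public class NodeStateSolrServersObserverService {
    // observer registration happens in activate(), i.e. only when configured
}
```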
[jira] [Commented] (OAK-2682) Introduce time difference detection for mongoMk
[ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483336#comment-14483336 ] Robert Munteanu commented on OAK-2682: -- As far as I can tell there is no method of executing commands between Oak cluster nodes, so we need to compare the current time values to another source. For MongoDB the best bet would be the database itself. I see two ways:
# Run a command on MongoDB to get the current time, e.g. [hostInfo|http://docs.mongodb.org/manual/reference/command/hostInfo/], and compare it to the local time.
# Use the {{clusterNodes}} collection to store the timestamps. Each cluster node would include a timestamp it generated using System.currentTimeMillis and the mongo server time ([$currentDate|http://docs.mongodb.org/manual/reference/operator/update/currentDate/]).
I'm not sure what the trade-offs are here; storing the data in the clusterNodes collection makes it much simpler to inspect outside of Oak, but we probably have to write-then-read to generate warnings from Oak. At any rate, would it make sense to return this information as part of the ClusterNodeInfo? Introduce time difference detection for mongoMk --- Key: OAK-2682 URL: https://issues.apache.org/jira/browse/OAK-2682 Project: Jackrabbit Oak Issue Type: Improvement Components: core, mongomk Reporter: Stefan Egli Fix For: 1.3.0 Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the assumption that the clocks are in perfect sync between all nodes of the cluster. The lease is valid for 60sec with a timeout of 30sec. If clocks are off by too much, and background operations happen to take a couple of seconds, you run the risk of timing out a lease. So introducing a check which WARNs if the clocks in a cluster are off by too much (1st threshold, eg 5sec?) would help increase awareness. 
Further drastic measure could be to prevent a startup of Oak at all if the difference is for example higher than a 2nd threshold (optional I guess, but could be 20sec?). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2171) oak-upgrade should support RDB persistence
[ https://issues.apache.org/jira/browse/OAK-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483422#comment-14483422 ] Manfred Baedke commented on OAK-2171: - oak-upgrade doesn't care about persistence implementations. The only missing piece is command-line support for the relevant parameters in oak-run. We might simply move this to 1.3. oak-upgrade should support RDB persistence -- Key: OAK-2171 URL: https://issues.apache.org/jira/browse/OAK-2171 Project: Jackrabbit Oak Issue Type: Sub-task Components: mongomk, rdbmk, upgrade Reporter: Julian Reschke Assignee: Manfred Baedke Fix For: 1.2 The upgrade (oak-upgrade/oak-run) should support a DocumentMK/RDB target. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-2725) Wrong indexed query estimates exceed more than double the actual index entries
[ https://issues.apache.org/jira/browse/OAK-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Florin Iordache updated OAK-2725: - Description: The {{ApproximateCounter.adjustCountSync}} public method that is used by the indexing engine will sometimes produce very unrealistic cost estimates. The problem is that it can produce an estimated cost that exceeds the estimated cost of the full traversal query, thus causing the index to be bypassed altogether, resulting in a full traversal rather than the use of the existing index. The problem resides in the way the property counts are updated: * The count property update goes through if two randoms are equal to zero: random(100) and random({1, 2, 4, 8, 16, ...}). * Same static pseudo random generator for all invocations. Even if #1 might seem improbable, it is statistically possible to reach a very high count with only a handful of invocations. In practice I've found that running 100 tests with 1000 invocations of the adjustCountSync method will yield costs exceeding value 2000 in 4-10% of the tests. Attaching a patch for {{ApproximateCounterTest}} with this test case. was: The {{ApproximateCounter.adjustCountSync}} public method that is used by the indexing engine will sometimes produce very unrealistic cost estimates. The problem is that it can produce an estimated cost that exceeds the estimated cost of the full traversal query, thus causing the index to be bypassed altogether, resulting in a full traversal rather than the use of the existing index. The problem resides in the way the property counts are updated: * The count property update goes through if two randoms are not zero: random(100) and random({1, 2, 4, 8, 16, ...}). * Same static pseudo random generator for all invocations. Even if #1 might seem improbable, it is statistically possible to reach a very high count with only a handful of invocations. 
In practice I've found that running 100 tests with 1000 invocations of the adjustCountSync method will yield costs exceeding value 2000 in 4-10% of the tests. Attaching a patch for {{ApproximateCounterTest}} with this test case. Wrong indexed query estimates exceed more than double the actual index entries -- Key: OAK-2725 URL: https://issues.apache.org/jira/browse/OAK-2725 Project: Jackrabbit Oak Issue Type: Bug Components: query Affects Versions: 1.1.8 Reporter: Florin Iordache Priority: Critical Fix For: 1.2 Attachments: OAK-2725-test.patch The {{ApproximateCounter.adjustCountSync}} public method that is used by the indexing engine will sometimes produce very unrealistic cost estimates. The problem is that it can produce an estimated cost that exceeds the estimated cost of the full traversal query, thus causing the index to be bypassed altogether, resulting in a full traversal rather than the use of the existing index. The problem resides in the way the property counts are updated: * The count property update goes through if two randoms are equal to zero: random(100) and random({1, 2, 4, 8, 16, ...}). * Same static pseudo random generator for all invocations. Even if #1 might seem improbable, it is statistically possible to reach a very high count with only a handful of invocations. In practice I've found that running 100 tests with 1000 invocations of the adjustCountSync method will yield costs exceeding value 2000 in 4-10% of the tests. Attaching a patch for {{ApproximateCounterTest}} with this test case. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
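The failure mode is easiest to see in a stripped-down model of such a two-gate counter (illustrative only, not Oak's actual {{ApproximateCounter}}): an increment survives both random gates with probability 1/(100 * resolution), but a surviving increment then adds 100 * resolution so that the expectation matches, which makes the variance enormous.

```java
import java.util.Random;

// Stripped-down model of a two-gate approximate counter (illustrative,
// not Oak's ApproximateCounter). Each call increments with probability
// 1/(100 * resolution); a surviving increment adds 100 * resolution so
// the expectation matches, at the price of very high variance.
class ApproxCounterModel {
    private final Random random;
    private long count = 0;

    ApproxCounterModel(long seed) { this.random = new Random(seed); }

    void adjustCount() {
        // resolution grows (as a power of two) with the current count
        long resolution = Long.highestOneBit(count / 100 + 1);
        if (random.nextInt(100) == 0 && random.nextInt((int) resolution) == 0) {
            count += 100 * resolution; // one lucky draw adds a big chunk
        }
    }

    long estimate() { return count; }
}
```

Because every surviving increment adds at least 100, a handful of early hits within 1000 calls can already push the estimate past 2000, which is the kind of over-estimate described above.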
[jira] [Updated] (OAK-2714) Test failures on Jenkins
[ https://issues.apache.org/jira/browse/OAK-2714?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Dürig updated OAK-2714: --- Description: This issue is for tracking test failures seen at our Jenkins instance that might yet be transient. Once a failure happens too often we should remove it here and create a dedicated issue for it.
|| Test || Builds || Fixture || JVM ||
| org.apache.jackrabbit.oak.plugins.index.solr.configuration.DefaultAnalyzersConfigurationTest | 61, 63 | ? | ? |
| org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest.reuseLocalDir | 81 | DOCUMENT_RDB | 1.7 |
| org.apache.jackrabbit.oak.jcr.OrderableNodesTest.orderableFolder | 81, 87 | DOCUMENT_NS, DOCUMENT_RDB | 1.6, 1.7 |
| org.apache.jackrabbit.oak.jcr.OrderedIndexIT.oak2035 | 76 | SEGMENT_MK | 1.6 |
| org.apache.jackrabbit.oak.jcr.OrderableNodesTest.setPrimaryType | 69, 83 | DOCUMENT_RDB | 1.6 |
| org.apache.jackrabbit.oak.plugins.segment.standby.StandbyTestIT.testSyncLoop | 64 | ? | ? |
| org.apache.jackrabbit.oak.jcr.observation.ObservationRefreshTest.observation | 48, 55 | ? | ? |
| org.apache.jackrabbit.oak.run.osgi.DocumentNodeStoreConfigTest.testRDBDocumentStore_CustomBlobStore | 52 | ? | ? |
| org.apache.jackrabbit.oak.run.osgi.JsonConfigRepFactoryTest.testRepositoryTar | 41 | ? | ? |
| org.apache.jackrabbit.oak.jcr.AutoCreatedItemsTest.autoCreatedItems | 41, 88 | DOCUMENT_RDB | 1.7 |
| org.apache.jackrabbit.test.api.observation.PropertyAddedTest.testMultiPropertyAdded | 29 | ? | ? |
| org.apache.jackrabbit.oak.plugins.segment.HeavyWriteIT.heavyWrite | 35 | SEGMENT_MK | ? |
was: This issue is for tracking test failures seen at our Jenkins instance that might yet be transient. Once a failure happens too often we should remove it here and create a dedicated issue for it.
|| Test || Builds || Fixture || JVM ||
| org.apache.jackrabbit.oak.plugins.index.solr.configuration.DefaultAnalyzersConfigurationTest | 61, 63 | ? | ? |
| org.apache.jackrabbit.oak.plugins.index.lucene.IndexCopierTest.reuseLocalDir | 81 | DOCUMENT_RDB | 1.7 |
| org.apache.jackrabbit.oak.jcr.OrderableNodesTest.orderableFolder | 81, 87 | DOCUMENT_NS, DOCUMENT_RDB | 1.6, 1.7 |
| org.apache.jackrabbit.oak.jcr.OrderedIndexIT.oak2035 | 76 | SEGMENT_MK | 1.6 |
| org.apache.jackrabbit.oak.jcr.OrderableNodesTest.setPrimaryType | 69, 83 | DOCUMENT_RDB | 1.6 |
| org.apache.jackrabbit.oak.plugins.segment.standby.StandbyTestIT.testSyncLoop | 64 | ? | ? |
| org.apache.jackrabbit.oak.jcr.observation.ObservationRefreshTest.observation | 48, 55 | ? | ? |
| org.apache.jackrabbit.oak.run.osgi.DocumentNodeStoreConfigTest.testRDBDocumentStore_CustomBlobStore | 52 | ? | ? |
| org.apache.jackrabbit.oak.run.osgi.JsonConfigRepFactoryTest.testRepositoryTar | 41 | ? | ? |
| org.apache.jackrabbit.oak.jcr.AutoCreatedItemsTest.autoCreatedItems | 41 | ? | ? |
| org.apache.jackrabbit.test.api.observation.PropertyAddedTest.testMultiPropertyAdded | 29 | ? | ? |
| org.apache.jackrabbit.oak.plugins.segment.HeavyWriteIT.heavyWrite | 35 | SEGMENT_MK | ? |
Test failures on Jenkins Key: OAK-2714 URL: https://issues.apache.org/jira/browse/OAK-2714 Project: Jackrabbit Oak Issue Type: Bug Environment: Jenkins, Ubuntu: https://builds.apache.org/job/Apache%20Jackrabbit%20Oak%20matrix/ Reporter: Michael Dürig Labels: CI, Jenkins Fix For: 1.3.0 This issue is for tracking test failures seen at our Jenkins instance that might yet be transient. Once a failure happens too often we should remove it here and create a dedicated issue for it. || Test || Builds || Fixture || JVM || | org.apache.jackrabbit.oak.plugins.index.solr.configuration.DefaultAnalyzersConfigurationTest | 61, 63 | ?| ? | |
[jira] [Commented] (OAK-2682) Introduce time difference detection for mongoMk
[ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14483380#comment-14483380 ] Stefan Egli commented on OAK-2682: -- Not sure how the mechanism for the second option would look in detail - depending on that, it seems more robust, but I fear it would need more coordination and be more complex to implement. The first option however sounds quite simple. It would only have one potential flaw: if the Oak clients were connected to different mongo servers (if that's a feasible deployment option), then they would rely on those mongo servers to have their clocks in sync. Overall I think the mechanism should be as KISS as possible. Introduce time difference detection for mongoMk --- Key: OAK-2682 URL: https://issues.apache.org/jira/browse/OAK-2682 Project: Jackrabbit Oak Issue Type: Improvement Components: core, mongomk Reporter: Stefan Egli Fix For: 1.3.0 Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the assumption that the clocks are in perfect sync between all nodes of the cluster. The lease is valid for 60sec with a timeout of 30sec. If clocks are off by too much, and background operations happen to take a couple of seconds, you run the risk of timing out a lease. So introducing a check which WARNs if the clocks in a cluster are off by too much (1st threshold, eg 5sec?) would help increase awareness. Further drastic measure could be to prevent a startup of Oak at all if the difference is for example higher than a 2nd threshold (optional I guess, but could be 20sec?). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2036) getPlan() output for NodeTypeIndex doesn't indicate the index type used
[ https://issues.apache.org/jira/browse/OAK-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483470#comment-14483470 ] Chetan Mehrotra commented on OAK-2036: -- [~justinedelson] Looks like this was fixed in trunk some time back by Thomas [1]. [~tmueller] Just to confirm, would it be fine to include this in the branch? [1] https://github.com/apache/jackrabbit-oak/blob/trunk/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/nodetype/NodeTypeIndex.java#L72-75 getPlan() output for NodeTypeIndex doesn't indicate the index type used --- Key: OAK-2036 URL: https://issues.apache.org/jira/browse/OAK-2036 Project: Jackrabbit Oak Issue Type: Bug Components: query Reporter: Justin Edelson Assignee: Chetan Mehrotra Fix For: 1.3.0 NodeTypeIndex's getPlan() method simply does this: {code} return filter.toString(); {code} whereas all the other index implementations output their name. This should be changed to, at minimum: {code} return nodetype + filter.toString(); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
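The suggested change can be sketched in isolation. The class, method shapes, and the exact "nodetype " prefix below are illustrative stand-ins, not the actual NodeTypeIndex source:

```java
// Hypothetical sketch of the suggested getPlan() fix: prefix the plan
// string with the index name so query debugging shows which index was
// chosen. Names here are illustrative, not the real NodeTypeIndex code.
public class NodeTypeIndexPlan {

    // before: the plan was just the filter, indistinguishable from
    // the plans of other index implementations
    static String planBefore(String filter) {
        return filter;
    }

    // after: prepend the index name, as the other implementations do
    static String planAfter(String filter) {
        return "nodetype " + filter;
    }

    public static void main(String[] args) {
        String filter = "Filter(query=SELECT ...)";
        System.out.println(planBefore(filter));
        System.out.println(planAfter(filter));
    }
}
```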
[jira] [Commented] (OAK-2705) DefaultSyncHandler should use the principalName as a fallback when no externalId is available
[ https://issues.apache.org/jira/browse/OAK-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483487#comment-14483487 ] Tobias Bocanegra commented on OAK-2705: --- you wrote in the description: user nodes lack the property rep:externalId ... using the principalName instead would work fine. which is not 100% correct, as the externalId also contains the name of the IDP. DefaultSyncHandler should use the principalName as a fallback when no externalId is available - Key: OAK-2705 URL: https://issues.apache.org/jira/browse/OAK-2705 Project: Jackrabbit Oak Issue Type: Improvement Components: oak-auth-external, upgrade Reporter: Manfred Baedke After a crx2oak repository migration, user nodes lack the property rep:externalId, which is needed for the DefaultSyncHandler to work properly. In the majority of cases (when there is only one ExternalIdentityProvider) using the principalName instead would work fine, so we should implement this as a fallback when rep:externalId is missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
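Tobias' point can be made concrete with a small sketch. The `id;idpName` layout of rep:externalId, the method names, and the single-IDP restriction below are assumptions made for illustration, not the actual DefaultSyncHandler code:

```java
public class ExternalIdFallback {

    // Assumed layout of rep:externalId: "<user id>;<IDP name>". That
    // second component is why the principal name alone is not a full
    // substitute: it carries no IDP name.
    static String resolveExternalId(String repExternalId,
                                    String principalName,
                                    String soleIdpName) {
        if (repExternalId != null) {
            return repExternalId;
        }
        // proposed fallback: only unambiguous when exactly one
        // ExternalIdentityProvider is configured
        return principalName + ";" + soleIdpName;
    }

    public static void main(String[] args) {
        // migrated user without rep:externalId, single IDP "ldap"
        System.out.println(resolveExternalId(null, "jdoe", "ldap"));
    }
}
```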
[jira] [Resolved] (OAK-2705) DefaultSyncHandler should use the principalName as a fallback when no externalId is available
[ https://issues.apache.org/jira/browse/OAK-2705?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Manfred Baedke resolved OAK-2705. - Resolution: Invalid Closing as invalid because it only applies to specific scenarios that are not generally applicable to Oak. DefaultSyncHandler should use the principalName as a fallback when no externalId is available - Key: OAK-2705 URL: https://issues.apache.org/jira/browse/OAK-2705 Project: Jackrabbit Oak Issue Type: Improvement Components: oak-auth-external, upgrade Reporter: Manfred Baedke After a crx2oak repository migration, user nodes lack the property rep:externalId, which is needed for the DefaultSyncHandler to work properly. In the majority of cases (when there is only one ExternalIdentityProvider) using the principalName instead would work fine, so we should implement this as a fallback when rep:externalId is missing. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2682) Introduce time difference detection for DocumentNodeStore
[ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483794#comment-14483794 ] Marcel Reutegger commented on OAK-2682: --- Hmm, thinking more about where the detection should be implemented, we could also introduce it in the lower DocumentStore layer. That way the detection would be implementation-dependent but hidden from the DocumentNodeStore. This means the MongoDocumentStore could use MongoDB-specific features and the RDB implementation SQL-specific ones. Introduce time difference detection for DocumentNodeStore - Key: OAK-2682 URL: https://issues.apache.org/jira/browse/OAK-2682 Project: Jackrabbit Oak Issue Type: Improvement Components: core, mongomk Reporter: Stefan Egli Fix For: 1.3.0 Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the assumption that the clocks are in perfect sync between all nodes of the cluster. The lease is valid for 60sec with a timeout of 30sec. If clocks are off by too much and background operations happen to take a couple of seconds, you run the risk of timing out a lease. So introducing a check which WARNs if the clocks in a cluster are off by too much (1st threshold, e.g. 5sec?) would help increase awareness. A further, more drastic measure could be to prevent a startup of Oak altogether if the difference is, for example, higher than a 2nd threshold (optional I guess, but could be 20sec?). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
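Whatever layer the detection ends up in, the check itself can be sketched backend-agnostically. Everything below is an illustration, not Oak code: the NTP-style offset estimate, the class and method names, and a generic supplier standing in for the backend-specific "read the server clock" call. The two thresholds are the ones floated in the issue description:

```java
import java.util.function.Supplier;

public class ClockSkewCheck {

    static final long WARN_MS = 5_000;   // 1st threshold from the issue
    static final long FAIL_MS = 20_000;  // 2nd threshold from the issue

    // Estimate the offset between the local clock and a server clock,
    // NTP-style: ask the server for its time and assume it answered
    // halfway through the round trip.
    static long estimateOffsetMillis(Supplier<Long> serverTime) {
        long t0 = System.currentTimeMillis();
        long remote = serverTime.get();
        long t1 = System.currentTimeMillis();
        return remote - (t0 + t1) / 2;
    }

    // WARN when past the first threshold, FAIL past the second
    static String classify(long offsetMillis) {
        long abs = Math.abs(offsetMillis);
        if (abs >= FAIL_MS) return "FAIL";
        if (abs >= WARN_MS) return "WARN";
        return "OK";
    }

    public static void main(String[] args) {
        // simulate a server whose clock is 7 seconds ahead
        long offset = estimateOffsetMillis(() -> System.currentTimeMillis() + 7_000);
        System.out.println(classify(offset));
    }
}
```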
[jira] [Updated] (OAK-2171) oak-upgrade should support RDB persistence
[ https://issues.apache.org/jira/browse/OAK-2171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger updated OAK-2171: -- Fix Version/s: (was: 1.2) 1.3.0 Sounds good to me. Moving to 1.3.0. oak-upgrade should support RDB persistence -- Key: OAK-2171 URL: https://issues.apache.org/jira/browse/OAK-2171 Project: Jackrabbit Oak Issue Type: Sub-task Components: mongomk, rdbmk, upgrade Reporter: Julian Reschke Assignee: Manfred Baedke Fix For: 1.3.0 The upgrade (oak-upgrade/oak-run) should support a DocumentMK/RDB target. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (OAK-2036) getPlan() output for NodeTypeIndex doesn't indicate the index type used
[ https://issues.apache.org/jira/browse/OAK-2036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger resolved OAK-2036. --- Resolution: Fixed Fix Version/s: (was: 1.3.0) 1.2 Resolving as fixed in 1.2 based on the above comment. The getPlan() method was changed as suggested by Justin as part of OAK-1907 (http://svn.apache.org/r1643807). getPlan() output for NodeTypeIndex doesn't indicate the index type used --- Key: OAK-2036 URL: https://issues.apache.org/jira/browse/OAK-2036 Project: Jackrabbit Oak Issue Type: Bug Components: query Reporter: Justin Edelson Assignee: Chetan Mehrotra Fix For: 1.2 NodeTypeIndex's getPlan() method simply does this: {code} return filter.toString(); {code} whereas all the other index implementations output their name. This should be changed to, at minimum: {code} return nodetype + filter.toString(); {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-2682) Introduce time difference detection for DocumentNodeStore
[ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger updated OAK-2682: -- Summary: Introduce time difference detection for DocumentNodeStore (was: Introduce time difference detection for mongoMk) Introduce time difference detection for DocumentNodeStore - Key: OAK-2682 URL: https://issues.apache.org/jira/browse/OAK-2682 Project: Jackrabbit Oak Issue Type: Improvement Components: core, mongomk Reporter: Stefan Egli Fix For: 1.3.0 Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the assumption that the clocks are in perfect sync between all nodes of the cluster. The lease is valid for 60sec with a timeout of 30sec. If clocks are off by too much and background operations happen to take a couple of seconds, you run the risk of timing out a lease. So introducing a check which WARNs if the clocks in a cluster are off by too much (1st threshold, e.g. 5sec?) would help increase awareness. A further, more drastic measure could be to prevent a startup of Oak altogether if the difference is, for example, higher than a 2nd threshold (optional I guess, but could be 20sec?). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-2682) Introduce time difference detection for mongoMk
[ https://issues.apache.org/jira/browse/OAK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14483778#comment-14483778 ] Marcel Reutegger commented on OAK-2682: --- Keep in mind that the low-level API is the DocumentStore interface and we shouldn't depend on backend-specific features. MongoDB is just one of the backend implementations. There is also the RDB implementation and the in-memory one for testing. Ideally, the solution would work for all of them. I'll update the title of this issue; the term MongoMK is a bit misleading in there. Introduce time difference detection for mongoMk --- Key: OAK-2682 URL: https://issues.apache.org/jira/browse/OAK-2682 Project: Jackrabbit Oak Issue Type: Improvement Components: core, mongomk Reporter: Stefan Egli Fix For: 1.3.0 Currently the lease mechanism in DocumentNodeStore/mongoMk is based on the assumption that the clocks are in perfect sync between all nodes of the cluster. The lease is valid for 60sec with a timeout of 30sec. If clocks are off by too much and background operations happen to take a couple of seconds, you run the risk of timing out a lease. So introducing a check which WARNs if the clocks in a cluster are off by too much (1st threshold, e.g. 5sec?) would help increase awareness. A further, more drastic measure could be to prevent a startup of Oak altogether if the difference is, for example, higher than a 2nd threshold (optional I guess, but could be 20sec?). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (OAK-2418) int overflow with orderby causing huge slowdown
[ https://issues.apache.org/jira/browse/OAK-2418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger updated OAK-2418: -- Fix Version/s: (was: 1.2) int overflow with orderby causing huge slowdown --- Key: OAK-2418 URL: https://issues.apache.org/jira/browse/OAK-2418 Project: Jackrabbit Oak Issue Type: Bug Components: core Affects Versions: 1.0.9 Reporter: Stefan Egli Assignee: Thomas Mueller Priority: Critical Fix For: 1.0.10, 1.1.6 Attachments: oak-2418.patch Consider the following query:
{code}
//element(*,slingevent:Job) order by @slingevent:created ascending
{code}
This query - when run with a large number of slingevent:Job nodes around - will take a very long time because FilterIterators.SortIterator.init(), in the following loop:
{code}
if (list.size() > max * 2) {
    // remove tail entries right now, to save memory
    Collections.sort(list, orderBy);
    keepFirst(list, max);
}
{code}
multiplies by 'max', which by default is set to Integer.MAX_VALUE (see FilterIterators.newCombinedFilter). As a result, max * 2 overflows (the result is -2), so that init loop sorts the list for every additional entry, which is definitely not the intention. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
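The overflow is easy to reproduce in isolation. The guard names below are illustrative, and the division-based variant is just one overflow-safe formulation, not necessarily the fix that was applied in Oak:

```java
public class OverflowDemo {

    static final int MAX = Integer.MAX_VALUE;

    // broken guard: MAX * 2 wraps around to -2, so the condition is
    // true for any non-empty list and the sort runs on every entry
    static boolean brokenGuard(int size) {
        return size > MAX * 2;
    }

    // overflow-safe variant: divide the left side instead of
    // multiplying the right side
    static boolean safeGuard(int size) {
        return size / 2 > MAX;
    }

    public static void main(String[] args) {
        System.out.println(MAX * 2);         // wraps to -2, not 4294967294
        System.out.println(brokenGuard(1));  // true: triggers a sort per entry
        System.out.println(safeGuard(1));    // false
    }
}
```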
[jira] [Updated] (OAK-2354) Support comments anywhere in a SQL-2 statement
[ https://issues.apache.org/jira/browse/OAK-2354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marcel Reutegger updated OAK-2354: -- Fix Version/s: (was: 1.2) Support comments anywhere in a SQL-2 statement -- Key: OAK-2354 URL: https://issues.apache.org/jira/browse/OAK-2354 Project: Jackrabbit Oak Issue Type: Bug Components: query Reporter: Thomas Mueller Assignee: Thomas Mueller Priority: Minor Fix For: 1.1.4 Currently, /* C-style comments */ are supported at the end of SQL-2 statements. They should be supported anywhere in the query. XPath queries can't support such comments, as /*/ is valid syntax. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
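The XPath caveat can be illustrated with a naive stripper. The regex-based approach below is purely illustrative (it is not how Oak's parser works); it shows why treating /* ... */ as a comment is safe for SQL-2 but would mangle an XPath query containing wildcard steps:

```java
public class CommentStripping {

    // Naive C-style comment removal, illustration only
    static String strip(String q) {
        return q.replaceAll("(?s)/\\*.*?\\*/", " ");
    }

    public static void main(String[] args) {
        // SQL-2: /* ... */ is unambiguous, removal leaves a valid query
        System.out.println(strip("SELECT * FROM [nt:base] /* all nodes */ WHERE x = 1"));
        // XPath: "/*/" is a legal wildcard step, so anything that scans
        // for comments swallows the span between two wildcard steps
        System.out.println(strip("/jcr:root/content/*/section/*/title"));
        // prints "/jcr:root/content title" - the path is destroyed
    }
}
```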