[jira] [Resolved] (OAK-8950) DataStore: FileCache should use one cache segment
[ https://issues.apache.org/jira/browse/OAK-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-8950. - Resolution: Fixed > DataStore: FileCache should use one cache segment > - > > Key: OAK-8950 > URL: https://issues.apache.org/jira/browse/OAK-8950 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: blob >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > > The FileCache in the caching data store (Azure, S3) uses the default segment > count of 16. The effect of that is: > * if the maximum cache size is e.g. 16 GB > * and there are e.g. 15 files of 1 GB each (total 15 GB), > * it can happen that some files are evicted, > * because internally the cache is using 16 segments of 1 GB each, > * and by chance 2 files could be in the same segment, > * so that one of those files is evicted > The workaround is to use a really large cache size (e.g. 100 GB if you only > want 15 GB of cache size), but the drawback is that, if most files are very > small, the cache could actually grow to 100 GB. > The best solution is probably to use only 1 segment. There is a tiny > concurrency issue: right now, deleting files is synchronized on the segment. > But I think that's not a big problem (to be tested). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8950) DataStore: FileCache should use one cache segment
[ https://issues.apache.org/jira/browse/OAK-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17058695#comment-17058695 ] Thomas Mueller commented on OAK-8950: - http://svn.apache.org/r1875151 (trunk) > DataStore: FileCache should use one cache segment > - > > Key: OAK-8950 > URL: https://issues.apache.org/jira/browse/OAK-8950 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: blob >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > > The FileCache in the caching data store (Azure, S3) uses the default segment > count of 16. The effect of that is: > * if the maximum cache size is e.g. 16 GB > * and there are e.g. 15 files of 1 GB each (total 15 GB), > * it can happen that some files are evicted, > * because internally the cache is using 16 segments of 1 GB each, > * and by chance 2 files could be in the same segment, > * so that one of those files is evicted > The workaround is to use a really large cache size (e.g. 100 GB if you only > want 15 GB of cache size), but the drawback is that, if most files are very > small, the cache could actually grow to 100 GB. > The best solution is probably to use only 1 segment. There is a tiny > concurrency issue: right now, deleting files is synchronized on the segment. > But I think that's not a big problem (to be tested). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8950) DataStore: FileCache should use one cache segment
[ https://issues.apache.org/jira/browse/OAK-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8950: Fix Version/s: 1.26.0 > DataStore: FileCache should use one cache segment > - > > Key: OAK-8950 > URL: https://issues.apache.org/jira/browse/OAK-8950 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: blob >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.26.0 > > > The FileCache in the caching data store (Azure, S3) uses the default segment > count of 16. The effect of that is: > * if the maximum cache size is e.g. 16 GB > * and there are e.g. 15 files of 1 GB each (total 15 GB), > * it can happen that some files are evicted, > * because internally the cache is using 16 segments of 1 GB each, > * and by chance 2 files could be in the same segment, > * so that one of those files is evicted > The workaround is to use a really large cache size (e.g. 100 GB if you only > want 15 GB of cache size), but the drawback is that, if most files are very > small, the cache could actually grow to 100 GB. > The best solution is probably to use only 1 segment. There is a tiny > concurrency issue: right now, deleting files is synchronized on the segment. > But I think that's not a big problem (to be tested). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8950) DataStore: FileCache should use one cache segment
[ https://issues.apache.org/jira/browse/OAK-8950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17057929#comment-17057929 ] Thomas Mueller commented on OAK-8950: - Patch for review: [https://github.com/oak-indexing/jackrabbit-oak/pull/63] > DataStore: FileCache should use one cache segment > - > > Key: OAK-8950 > URL: https://issues.apache.org/jira/browse/OAK-8950 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: blob >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > > The FileCache in the caching data store (Azure, S3) uses the default segment > count of 16. The effect of that is: > * if the maximum cache size is e.g. 16 GB > * and there are e.g. 15 files of 1 GB each (total 15 GB), > * it can happen that some files are evicted, > * because internally the cache is using 16 segments of 1 GB each, > * and by chance 2 files could be in the same segment, > * so that one of those files is evicted > The workaround is to use a really large cache size (e.g. 100 GB if you only > want 15 GB of cache size), but the drawback is that, if most files are very > small, the cache could actually grow to 100 GB. > The best solution is probably to use only 1 segment. There is a tiny > concurrency issue: right now, deleting files is synchronized on the segment. > But I think that's not a big problem (to be tested). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (OAK-8950) DataStore: FileCache should use one cache segment
Thomas Mueller created OAK-8950:
---
Summary: DataStore: FileCache should use one cache segment
Key: OAK-8950
URL: https://issues.apache.org/jira/browse/OAK-8950
Project: Jackrabbit Oak
Issue Type: Improvement
Components: blob
Reporter: Thomas Mueller
Assignee: Thomas Mueller

The FileCache in the caching data store (Azure, S3) uses the default segment count of 16. The effect of that is:
* if the maximum cache size is e.g. 16 GB
* and there are e.g. 15 files of 1 GB each (total 15 GB),
* it can happen that some files are evicted,
* because internally the cache is using 16 segments of 1 GB each,
* and by chance 2 files could be in the same segment,
* so that one of those files is evicted

The workaround is to use a really large cache size (e.g. 100 GB if you only want 15 GB of cache size), but the drawback is that, if most files are very small, the cache could actually grow to 100 GB.

The best solution is probably to use only 1 segment. There is a tiny concurrency issue: right now, deleting files is synchronized on the segment. But I think that's not a big problem (to be tested).

--
This message was sent by Atlassian Jira (v8.3.4#803005)
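The eviction effect described in OAK-8950 can be reproduced with a toy model. The sketch below is not Oak's FileCache. It assumes a cache that splits its maximum size evenly across segments, picks a segment from the key's hash, and evicts least-recently-used entries per segment. The class `SegmentedCacheSketch` and all of its members are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a size-bounded cache with independent LRU segments (hypothetical, not Oak code). */
final class SegmentedCacheSketch {
    static final long GB = 1L << 30;

    private final List<LinkedHashMap<Integer, Long>> segments = new ArrayList<>();
    private final long[] segmentSize;
    private final long maxPerSegment;
    private int evictions;

    SegmentedCacheSketch(long maxSize, int segmentCount) {
        // Assumption: the total budget is divided evenly between the segments.
        this.maxPerSegment = maxSize / segmentCount;
        this.segmentSize = new long[segmentCount];
        for (int i = 0; i < segmentCount; i++) {
            // accessOrder = true: iteration starts at the least recently used entry
            segments.add(new LinkedHashMap<>(16, 0.75f, true));
        }
    }

    void put(int fileId, long size) {
        int index = Math.floorMod(Integer.hashCode(fileId), segments.size());
        LinkedHashMap<Integer, Long> segment = segments.get(index);
        Long old = segment.put(fileId, size);
        segmentSize[index] += size - (old == null ? 0 : old);
        // Evict from this segment only, even if other segments are empty.
        Iterator<Map.Entry<Integer, Long>> it = segment.entrySet().iterator();
        while (segmentSize[index] > maxPerSegment && it.hasNext()) {
            segmentSize[index] -= it.next().getValue();
            it.remove();
            evictions++;
        }
    }

    /** Puts one 1 GB file per id into a 16 GB cache and returns the number of evictions. */
    static int evictionsFor(int segmentCount, int[] fileIds) {
        SegmentedCacheSketch cache = new SegmentedCacheSketch(16 * GB, segmentCount);
        for (int id : fileIds) {
            cache.put(id, GB);
        }
        return cache.evictions;
    }

    public static void main(String[] args) {
        // 15 files; ids 0 and 16 land in the same segment when there are 16 segments
        int[] ids = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 16};
        System.out.println(evictionsFor(16, ids)); // prints 1
        System.out.println(evictionsFor(1, ids));  // prints 0
    }
}
```

With 16 segments, one of the two colliding files is evicted although only 15 GB of the 16 GB budget is in use; with a single segment nothing is evicted.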
[jira] [Commented] (OAK-8898) On querying, IndexReader failed with AlreadyClosedException
[ https://issues.apache.org/jira/browse/OAK-8898?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17056052#comment-17056052 ]

Thomas Mueller commented on OAK-8898:
-

[~mkataria] I created a branch here: [https://github.com/oak-indexing/jackrabbit-oak/tree/OAK-8898]

This makes it possible to reproduce the issue (it is based on your test case). I also found the root cause and a possible solution (see LucenePropertyIndex.OLD_FACET_PROVIDER). The problem seems to be that the reader is used after it is closed, because the reference to the searcher is leaked to the LuceneFacetProvider in loadDocs(). I created a DelayedLuceneFacetProvider that acquires and releases the searcher when needed (acquireIndexNode, release in finally).

It would be good if the test could reproduce the issue even without the delays; we can discuss this.

> On querying, IndexReader failed with AlreadyClosedException
> ---
>
> Key: OAK-8898
> URL: https://issues.apache.org/jira/browse/OAK-8898
> Project: Jackrabbit Oak
> Issue Type: Bug
> Reporter: Mohit Kataria
> Priority: Major
>
> This is an intermittent issue, where on querying the code throws AlreadyClosedException.
>
> {code:java}
> Caused by: org.apache.lucene.store.AlreadyClosedException: this IndexReader is closed
>     at org.apache.lucene.index.IndexReader.ensureOpen(IndexReader.java:262) [org.apache.jackrabbit.oak-lucene:1.10.2]
>     at org.apache.lucene.index.BaseCompositeReader.document(BaseCompositeReader.java:108) [org.apache.jackrabbit.oak-lucene:1.10.2]
>     at org.apache.lucene.index.IndexReader.document(IndexReader.java:446) [org.apache.jackrabbit.oak-lucene:1.10.2]
>     at org.apache.jackrabbit.oak.plugins.index.lucene.util.StatisticalSortedSetDocValuesFacetCounts.getAccessibleSampleCount(StatisticalSortedSetDocValuesFacetCounts.java:169) [org.apache.jackrabbit.oak-lucene:1.10.2]
>     at org.apache.jackrabbit.oak.plugins.index.lucene.util.StatisticalSortedSetDocValuesFacetCounts.getTopChildren0(StatisticalSortedSetDocValuesFacetCounts.java:104) [org.apache.jackrabbit.oak-lucene:1.10.2]
>     at org.apache.jackrabbit.oak.plugins.index.lucene.util.StatisticalSortedSetDocValuesFacetCounts.getTopChildren(StatisticalSortedSetDocValuesFacetCounts.java:70) [org.apache.jackrabbit.oak-lucene:1.10.2]
>     at org.apache.lucene.facet.MultiFacets.getTopChildren(MultiFacets.java:52) [org.apache.jackrabbit.oak-lucene:1.10.2]
>     at org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$LuceneFacetProvider.getFacets(LucenePropertyIndex.java:1547) [org.apache.jackrabbit.oak-lucene:1.10.2]
>     at org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextResultRow.getFacets(FulltextIndex.java:353) [org.apache.jackrabbit.oak-lucene:1.10.2]
>     at org.apache.jackrabbit.oak.plugins.index.search.spi.query.FulltextIndex$FulltextPathCursor$2.getValue(FulltextIndex.java:472) [org.apache.jackrabbit.oak-lucene:1.10.2]
>     ... 237 common frames omitted
> {code}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
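The acquire-on-demand idea from the OAK-8898 comment can be sketched without Lucene. This is not the code from the branch. `IndexNodeStub`, `EagerProvider` and `DelayedProvider` are hypothetical stand-ins, and the stub simply throws once it has been released, to mimic a closed reader.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Illustration only: all types here are hypothetical stand-ins, not Oak or Lucene classes. */
final class DelayedFacetSketch {

    /** Stand-in for an index node whose searcher is only valid until release() is called. */
    static final class IndexNodeStub {
        private boolean open = true;

        List<String> facets(String dimension) {
            if (!open) {
                throw new IllegalStateException("this IndexReader is closed");
            }
            return Arrays.asList(dimension + ":a", dimension + ":b");
        }

        void release() {
            open = false;
        }
    }

    /** Leaky variant: keeps the node it was given and may use it after release. */
    static final class EagerProvider {
        private final IndexNodeStub node;

        EagerProvider(IndexNodeStub node) {
            this.node = node;
        }

        List<String> getFacets(String dimension) {
            return node.facets(dimension);
        }
    }

    /** Delayed variant: acquires a node per call and releases it in finally. */
    static final class DelayedProvider {
        private final Supplier<IndexNodeStub> acquireIndexNode;

        DelayedProvider(Supplier<IndexNodeStub> acquireIndexNode) {
            this.acquireIndexNode = acquireIndexNode;
        }

        List<String> getFacets(String dimension) {
            IndexNodeStub node = acquireIndexNode.get();
            try {
                return node.facets(dimension);
            } finally {
                node.release();
            }
        }
    }
}
```

The eager variant fails as soon as the node it captured has been released by someone else, while the delayed variant always works on a node it acquired itself.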
[jira] [Comment Edited] (OAK-8934) Indexing: filter entries with a regular expression
[ https://issues.apache.org/jira/browse/OAK-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051214#comment-17051214 ] Thomas Mueller edited comment on OAK-8934 at 3/4/20, 1:11 PM: -- [http://svn.apache.org/r1874786|http://svn.apache.org/r1874786] was (Author: tmueller): [http://svn.apache.org/r1874786|http://svn/] > Indexing: filter entries with a regular expression > -- > > Key: OAK-8934 > URL: https://issues.apache.org/jira/browse/OAK-8934 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Labels: amrit > > We should provide a way to filter the index using a regular expression. For > example, only index nodes that contain a reference to another node. (Not a > JCR reference, but a reference within the value itself). For example, index a > node if one of the properties contains: > * /content/abc > * > * and so on > This will make it possible to run a query to find out whether /content/abc is referenced. The > index and the query will probably need to use a tag, and the cost of the > index needs to be high. Otherwise the query engine can't know when this index > should be used. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (OAK-8934) Indexing: filter entries with a regular expression
[ https://issues.apache.org/jira/browse/OAK-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-8934. - Resolution: Fixed > Indexing: filter entries with a regular expression > -- > > Key: OAK-8934 > URL: https://issues.apache.org/jira/browse/OAK-8934 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Labels: amrit > Fix For: 1.26.0 > > > We should provide a way to filter the index using a regular expression. For > example, only index nodes that contain a reference to another node. (Not a > JCR reference, but a reference within the value itself). For example, index a > node if one of the properties contains: > * /content/abc > * > * and so on > This will make it possible to run a query to find out whether /content/abc is referenced. The > index and the query will probably need to use a tag, and the cost of the > index needs to be high. Otherwise the query engine can't know when this index > should be used. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8934) Indexing: filter entries with a regular expression
[ https://issues.apache.org/jira/browse/OAK-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8934: Fix Version/s: 1.26.0 > Indexing: filter entries with a regular expression > -- > > Key: OAK-8934 > URL: https://issues.apache.org/jira/browse/OAK-8934 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Labels: amrit > Fix For: 1.26.0 > > > We should provide a way to filter the index using a regular expression. For > example, only index nodes that contain a reference to another node. (Not a > JCR reference, but a reference within the value itself). For example, index a > node if one of the properties contains: > * /content/abc > * > * and so on > This will make it possible to run a query to find out whether /content/abc is referenced. The > index and the query will probably need to use a tag, and the cost of the > index needs to be high. Otherwise the query engine can't know when this index > should be used. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8934) Indexing: filter entries with a regular expression
[ https://issues.apache.org/jira/browse/OAK-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17051214#comment-17051214 ] Thomas Mueller commented on OAK-8934: - [http://svn.apache.org/r1874786|http://svn/] > Indexing: filter entries with a regular expression > -- > > Key: OAK-8934 > URL: https://issues.apache.org/jira/browse/OAK-8934 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Labels: amrit > > We should provide a way to filter the index using a regular expression. For > example, only index nodes that contain a reference to another node. (Not a > JCR reference, but a reference within the value itself). For example, index a > node if one of the properties contains: > * /content/abc > * > * and so on > This will make it possible to run a query to find out whether /content/abc is referenced. The > index and the query will probably need to use a tag, and the cost of the > index needs to be high. Otherwise the query engine can't know when this index > should be used. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (OAK-8934) Indexing: filter entries with a regular expression
Thomas Mueller created OAK-8934:
---
Summary: Indexing: filter entries with a regular expression
Key: OAK-8934
URL: https://issues.apache.org/jira/browse/OAK-8934
Project: Jackrabbit Oak
Issue Type: Improvement
Components: indexing
Reporter: Thomas Mueller
Assignee: Thomas Mueller

We should provide a way to filter the index using a regular expression. For example, only index nodes that contain a reference to another node (not a JCR reference, but a reference within the value itself). For example, index a node if one of the properties contains:
* /content/abc
*
* and so on

This will make it possible to run a query to find out whether /content/abc is referenced. The index and the query will probably need to use a tag, and the cost of the index needs to be high. Otherwise the query engine can't know when this index should be used.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8934) Indexing: filter entries with a regular expression
[ https://issues.apache.org/jira/browse/OAK-8934?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8934: Labels: amrit (was: ) > Indexing: filter entries with a regular expression > -- > > Key: OAK-8934 > URL: https://issues.apache.org/jira/browse/OAK-8934 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Labels: amrit > > We should provide a way to filter the index using a regular expression. For > example, only index nodes that contain a reference to another node. (Not a > JCR reference, but a reference within the value itself). For example, index a > node if one of the properties contains: > * /content/abc > * > * and so on > This will make it possible to run a query to find out whether /content/abc is referenced. The > index and the query will probably need to use a tag, and the cost of the > index needs to be high. Otherwise the query engine can't know when this index > should be used. -- This message was sent by Atlassian Jira (v8.3.4#803005)
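The filtering rule proposed in OAK-8934 (index a node only if one of its property values contains a match) can be sketched with `java.util.regex`. This is a minimal illustration, not the committed implementation. The class `RegexIndexFilterSketch`, the method `shouldIndex`, the flat property map and the example pattern are all assumptions made for this sketch.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Illustration only: hypothetical helper, not the Oak index definition API. */
final class RegexIndexFilterSketch {

    /** Returns true if at least one property value contains a match for the filter. */
    static boolean shouldIndex(Map<String, String> properties, Pattern filter) {
        for (String value : properties.values()) {
            // find() looks for a match anywhere in the value ("contains" semantics)
            if (value != null && filter.matcher(value).find()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical filter: any reference to a path below /content
        Pattern filter = Pattern.compile("/content/[A-Za-z0-9_/-]+");

        Map<String, String> referencing = new LinkedHashMap<>();
        referencing.put("title", "Teaser");
        referencing.put("text", "see /content/abc for details");

        Map<String, String> plain = new LinkedHashMap<>();
        plain.put("title", "Hello");

        System.out.println(shouldIndex(referencing, filter)); // prints true
        System.out.println(shouldIndex(plain, filter));       // prints false
    }
}
```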
[jira] [Resolved] (OAK-8910) Improve OAK Lucene Index Documentation
[ https://issues.apache.org/jira/browse/OAK-8910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-8910. - Fix Version/s: 1.26.0 Resolution: Fixed > Improve OAK Lucene Index Documentation > -- > > Key: OAK-8910 > URL: https://issues.apache.org/jira/browse/OAK-8910 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Assignee: Thomas Mueller >Priority: Minor > Labels: amrit > Fix For: 1.26.0 > > Attachments: OAK-8910.patch > > > Improve [http://jackrabbit.apache.org/oak/docs/query/lucene.html] with the > following: > * Extend the *analyzers* section including a reference on how to support > *stemming* ([http://jackrabbit.apache.org/oak/docs/query/lucene.html]) > * *supersedes* - does not seem to be documented > * *functionName (string)* & *useIfExists (string)* are not listed in the > canonical *Index Definition* structure. > * *function (string)* is not listed in the canonical *Property Definitions* > structure > * *weight* - in the canonical structure the default value is -1, but the > actual default is 5 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8910) Improve OAK Lucene Index Documentation
[ https://issues.apache.org/jira/browse/OAK-8910?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046769#comment-17046769 ] Thomas Mueller commented on OAK-8910: - http://svn.apache.org/r1874582 (trunk) > Improve OAK Lucene Index Documentation > -- > > Key: OAK-8910 > URL: https://issues.apache.org/jira/browse/OAK-8910 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Assignee: Thomas Mueller >Priority: Minor > Labels: amrit > Attachments: OAK-8910.patch > > > Improve [http://jackrabbit.apache.org/oak/docs/query/lucene.html] with the > following: > * Extend the *analyzers* section including a reference on how to support > *stemming* ([http://jackrabbit.apache.org/oak/docs/query/lucene.html]) > * *supersedes* - does not seem to be documented > * *functionName (string)* & *useIfExists (string)* are not listed in the > canonical *Index Definition* structure. > * *function (string)* is not listed in the canonical *Property Definitions* > structure > * *weight* - in the canonical structure the default value is -1, but the > actual default is 5 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (OAK-8910) Improve OAK Lucene Index Documentation
[ https://issues.apache.org/jira/browse/OAK-8910?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller reassigned OAK-8910: --- Assignee: Thomas Mueller > Improve OAK Lucene Index Documentation > -- > > Key: OAK-8910 > URL: https://issues.apache.org/jira/browse/OAK-8910 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Assignee: Thomas Mueller >Priority: Minor > Labels: amrit > Attachments: OAK-8910.patch > > > Improve [http://jackrabbit.apache.org/oak/docs/query/lucene.html] with the > following: > * Extend the *analyzers* section including a reference on how to support > *stemming* ([http://jackrabbit.apache.org/oak/docs/query/lucene.html]) > * *supersedes* - does not seem to be documented > * *functionName (string)* & *useIfExists (string)* are not listed in the > canonical *Index Definition* structure. > * *function (string)* is not listed in the canonical *Property Definitions* > structure > * *weight* - in the canonical structure the default value is -1, but the > actual default is 5 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-7671) [oak-run] Deprecate the datastorecheck command in favor of datastore
[ https://issues.apache.org/jira/browse/OAK-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17046750#comment-17046750 ]

Thomas Mueller commented on OAK-7671:
-

Github has some issues currently according to https://www.githubstatus.com/

For me the patch looks good. For the method "encodeId", it would be good to add some comments on what it is doing, with some example input and output. It's very hard to understand right now. But this was the case before, and is not related to the patch. If you already know some details (maybe by debugging), it would be good to add the info. It doesn't need to be a Javadoc:

{noformat}
/**
 * Encode the ... and extract the ...
 * Example:
 * => ...
 * => ...
 */
static String encodeId(String line, BlobStoreOptions.Type dsType) {

    // 0102030405... => 01/02/03/0102030405...
    blobId = (blobId.substring(0, 2) + FILE_SEPARATOR.value()
            + blobId.substring(2, 4) + FILE_SEPARATOR.value()
            + blobId.substring(4, 6) + FILE_SEPARATOR.value() + blobId);

    // 0102030405... => 0102-030405...
    blobId = (blobId.substring(0, 4) + DASH + blobId.substring(4));

    if (list.size() > 1) {
        // ( this part I don't understand... why list.get(1)? what does it do?)
        return delimJoiner.join(blobId, EscapeUtils.unescapeLineBreaks(list.get(1)));
{noformat}

> [oak-run] Deprecate the datastorecheck command in favor of datastore
>
> Key: OAK-7671
> URL: https://issues.apache.org/jira/browse/OAK-7671
> Project: Jackrabbit Oak
> Issue Type: Task
> Components: run
> Reporter: Amit Jain
> Assignee: Nitin Gupta
> Priority: Major
> Fix For: 1.26.0
>
> With the introduction of the {{datastore}} command, which supports both garbage collection and consistency checks, the {{datastorecheck}} command should be deprecated and should delegate internally to that implementation.
> In addition, some options that are currently not supported by the new command should also be implemented, e.g. --ids, --refs

--
This message was sent by Atlassian Jira (v8.3.4#803005)
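The two string transformations named in the comments of the quoted snippet can be tried out in isolation. The sketch below is not the oak-run `encodeId` method. It assumes "/" in place of `FILE_SEPARATOR.value()` and "-" for `DASH`, and it leaves out when each form applies as well as the `list.get(1)` part, since the message does not explain them. The class and method names are hypothetical.

```java
/** Illustration only: hypothetical helper showing the two blob id forms from the review comment. */
final class BlobIdEncodingSketch {
    // Assumption: "/" stands in for FILE_SEPARATOR.value() and "-" for DASH.
    private static final String SEPARATOR = "/";
    private static final String DASH = "-";

    /** 0102030405... => 01/02/03/0102030405... */
    static String toShardedPath(String blobId) {
        requireLength(blobId, 6);
        return blobId.substring(0, 2) + SEPARATOR
                + blobId.substring(2, 4) + SEPARATOR
                + blobId.substring(4, 6) + SEPARATOR
                + blobId;
    }

    /** 0102030405... => 0102-030405... */
    static String toDashed(String blobId) {
        requireLength(blobId, 4);
        return blobId.substring(0, 4) + DASH + blobId.substring(4);
    }

    private static void requireLength(String blobId, int min) {
        if (blobId == null || blobId.length() < min) {
            throw new IllegalArgumentException("blob id needs at least " + min + " characters");
        }
    }

    public static void main(String[] args) {
        System.out.println(toShardedPath("0102030405")); // prints 01/02/03/0102030405
        System.out.println(toDashed("0102030405"));      // prints 0102-030405
    }
}
```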
[jira] [Commented] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040129#comment-17040129 ] Thomas Mueller commented on OAK-8783: - http://svn.apache.org/r1874198 (trunk) > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch, > OAK-8783-v2.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
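The naming example in OAK-8783 (asset-2-custom-2 and asset-3 merge to asset-3-custom-1) can be reproduced with a small name parser. This sketch only derives the merged name and does not merge any index definition content. It assumes names follow the pattern `<base>-<productVersion>[-custom-<customVersion>]` and that the customization counter restarts at 1 on the newer product version; the issue text only shows the one example, so both points are assumptions, and `IndexNameMergeSketch` is a hypothetical name.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustration only: hypothetical helper, not the oak-run merge implementation. */
final class IndexNameMergeSketch {
    // Assumed naming pattern: <base>-<productVersion>[-custom-<customVersion>]
    private static final Pattern NAME = Pattern.compile("(.+?)-(\\d+)(?:-custom-(\\d+))?");

    /** Derives the name of the merged index from a customized index and a newer product index. */
    static String mergedName(String customized, String product) {
        Matcher c = NAME.matcher(customized);
        Matcher p = NAME.matcher(product);
        if (!c.matches() || !p.matches()) {
            throw new IllegalArgumentException("unexpected index name");
        }
        if (!c.group(1).equals(p.group(1))) {
            throw new IllegalArgumentException("index names have different bases");
        }
        // Assumption: the customization counter restarts at 1 for the newer product version.
        return p.group(1) + "-" + p.group(2) + "-custom-1";
    }

    public static void main(String[] args) {
        System.out.println(mergedName("asset-2-custom-2", "asset-3")); // prints asset-3-custom-1
    }
}
```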
[jira] [Updated] (OAK-8892) Add javadoc to package-info files
[ https://issues.apache.org/jira/browse/OAK-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8892: Labels: amrit (was: ) > Add javadoc to package-info files > - > > Key: OAK-8892 > URL: https://issues.apache.org/jira/browse/OAK-8892 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Assignee: Thomas Mueller >Priority: Minor > Labels: amrit > Fix For: 1.26.0 > > Attachments: OAK-8892.patch > > > Add javadoc to package-info files in all packages of {{oak-lucene}} , > {{oak-query-spi}} and {{oak-search}} . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (OAK-8892) Add javadoc to package-info files
[ https://issues.apache.org/jira/browse/OAK-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller reassigned OAK-8892: --- Assignee: Thomas Mueller > Add javadoc to package-info files > - > > Key: OAK-8892 > URL: https://issues.apache.org/jira/browse/OAK-8892 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Assignee: Thomas Mueller >Priority: Minor > Fix For: 1.26.0 > > Attachments: OAK-8892.patch > > > Add javadoc to package-info files in all packages of {{oak-lucene}} , > {{oak-query-spi}} and {{oak-search}} . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (OAK-8892) Add javadoc to package-info files
[ https://issues.apache.org/jira/browse/OAK-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-8892. - Resolution: Fixed > Add javadoc to package-info files > - > > Key: OAK-8892 > URL: https://issues.apache.org/jira/browse/OAK-8892 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Assignee: Thomas Mueller >Priority: Minor > Labels: amrit > Fix For: 1.26.0 > > Attachments: OAK-8892.patch > > > Add javadoc to package-info files in all packages of {{oak-lucene}} , > {{oak-query-spi}} and {{oak-search}} . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8892) Add javadoc to package-info files
[ https://issues.apache.org/jira/browse/OAK-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17040100#comment-17040100 ] Thomas Mueller commented on OAK-8892: - Thanks [~reschke]! "svn patch" didn't work as expected... Now hopefully it's better: http://svn.apache.org/r1874197 (trunk) > Add javadoc to package-info files > - > > Key: OAK-8892 > URL: https://issues.apache.org/jira/browse/OAK-8892 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Priority: Minor > Fix For: 1.26.0 > > Attachments: OAK-8892.patch > > > Add javadoc to package-info files in all packages of {{oak-lucene}} , > {{oak-query-spi}} and {{oak-search}} . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8902) Add support in oak-run to list down blob ids for lucene indexes
[ https://issues.apache.org/jira/browse/OAK-8902?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17039035#comment-17039035 ] Thomas Mueller commented on OAK-8902: - See my comments. > Add support in oak-run to list down blob ids for lucene indexes > --- > > Key: OAK-8902 > URL: https://issues.apache.org/jira/browse/OAK-8902 > Project: Jackrabbit Oak > Issue Type: Bug >Reporter: Nitin Gupta >Assignee: Nitin Gupta >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17038820#comment-17038820 ] Thomas Mueller commented on OAK-8783: - Thanks [~amitjain]! I didn't think about this... > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch, > OAK-8783-v2.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8892) Add javadoc to package-info files
[ https://issues.apache.org/jira/browse/OAK-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17038182#comment-17038182 ] Thomas Mueller commented on OAK-8892: - http://svn.apache.org/r1874108 > Add javadoc to package-info files > - > > Key: OAK-8892 > URL: https://issues.apache.org/jira/browse/OAK-8892 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Priority: Minor > Labels: amrit > Attachments: OAK-8892.patch > > > Add javadoc to package-info files in all packages of {{oak-lucene}} , > {{oak-query-spi}} and {{oak-search}} . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (OAK-8892) Add javadoc to package-info files
[ https://issues.apache.org/jira/browse/OAK-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-8892. - Resolution: Fixed > Add javadoc to package-info files > - > > Key: OAK-8892 > URL: https://issues.apache.org/jira/browse/OAK-8892 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Priority: Minor > Labels: amrit > Attachments: OAK-8892.patch > > > Add javadoc to package-info files in all packages of {{oak-lucene}} , > {{oak-query-spi}} and {{oak-search}} . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17038181#comment-17038181 ] Thomas Mueller commented on OAK-8783: - http://svn.apache.org/r1874107 (trunk). Review is still welcome. > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch, > OAK-8783-v2.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8892) Add javadoc to package-info files
[ https://issues.apache.org/jira/browse/OAK-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17038128#comment-17038128 ] Thomas Mueller commented on OAK-8892: - [~reschke] no that was a mistake, I'm sorry... I will remove the export versions and try again. /cc [~amrverma] > Add javadoc to package-info files > - > > Key: OAK-8892 > URL: https://issues.apache.org/jira/browse/OAK-8892 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Priority: Minor > Labels: amrit > Attachments: OAK-8892.patch > > > Add javadoc to package-info files in all packages of {{oak-lucene}} , > {{oak-query-spi}} and {{oak-search}} . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036858#comment-17036858 ] Thomas Mueller commented on OAK-8783: - [~ngupta] [~tihom88] [~fabrizio.fort...@gmail.com] could you please review OAK-8783-v2.patch ? > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch, > OAK-8783-v2.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8783: Attachment: OAK-8783-v2.patch > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch, > OAK-8783-v2.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (OAK-8892) Add javadoc to package-info files
[ https://issues.apache.org/jira/browse/OAK-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-8892. - Resolution: Fixed > Add javadoc to package-info files > - > > Key: OAK-8892 > URL: https://issues.apache.org/jira/browse/OAK-8892 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Priority: Minor > Labels: amrit > Attachments: OAK-8892.patch > > > Add javadoc to package-info files in all packages of {{oak-lucene}} , > {{oak-query-spi}} and {{oak-search}} . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8892) Add javadoc to package-info files
[ https://issues.apache.org/jira/browse/OAK-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17036188#comment-17036188 ] Thomas Mueller commented on OAK-8892: - Thanks! http://svn.apache.org/r1873977 > Add javadoc to package-info files > - > > Key: OAK-8892 > URL: https://issues.apache.org/jira/browse/OAK-8892 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Priority: Minor > Labels: amrit > Attachments: OAK-8892.patch > > > Add javadoc to package-info files in all packages of {{oak-lucene}} , > {{oak-query-spi}} and {{oak-search}} . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8711) Queries with facets should not use traversal
[ https://issues.apache.org/jira/browse/OAK-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17030682#comment-17030682 ] Thomas Mueller commented on OAK-8711: - The attached patch looks good to me. One nitpick: in the test case, you could in theory check if the right index is used, by executing "explain select ..." and then check the query plan. But I think it's not strictly needed to have such a test case, I'm fine with what you have right now. > Queries with facets should not use traversal > > > Key: OAK-8711 > URL: https://issues.apache.org/jira/browse/OAK-8711 > Project: Jackrabbit Oak > Issue Type: Bug >Reporter: Nitin Gupta >Assignee: Nitin Gupta >Priority: Major > Labels: amrit > Attachments: OAK-8711.patch > > > Consider a scenario where a query is there with facets and the traversal cost > is less than the index cost that serves the facet query . This would be > problematic. > > In this case we should maybe set the traversal cost to infinity so that > traversal is not an option for queries with facets. > > In case there is no index available to serve this faceted query we can > probably throw an exception with a meaningful message . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8892) Add javadoc to package-info files
[ https://issues.apache.org/jira/browse/OAK-8892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029829#comment-17029829 ] Thomas Mueller commented on OAK-8892: - See pull request https://github.com/apache/jackrabbit-oak/pull/175 > Add javadoc to package-info files > - > > Key: OAK-8892 > URL: https://issues.apache.org/jira/browse/OAK-8892 > Project: Jackrabbit Oak > Issue Type: Task >Reporter: Amrit Verma >Priority: Minor > Labels: amrit > > Add javadoc to package-info files in all packages of {{oak-lucene}} , > {{oak-query-spi}} and {{oak-search}} . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8711) Queries with facets should not use traversal
[ https://issues.apache.org/jira/browse/OAK-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17029830#comment-17029830 ] Thomas Mueller commented on OAK-8711: - See pull request https://github.com/apache/jackrabbit-oak/pull/174 > Queries with facets should not use traversal > > > Key: OAK-8711 > URL: https://issues.apache.org/jira/browse/OAK-8711 > Project: Jackrabbit Oak > Issue Type: Bug >Reporter: Nitin Gupta >Assignee: Nitin Gupta >Priority: Major > Labels: amrit > > Consider a scenario where a query is there with facets and the traversal cost > is less than the index cost that serves the facet query . This would be > problematic. > > In this case we should maybe set the traversal cost to infinity so that > traversal is not an option for queries with facets. > > In case there is no index available to serve this faceted query we can > probably throw an exception with a meaningful message . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8711) Queries with facets should not use traversal
[ https://issues.apache.org/jira/browse/OAK-8711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8711: Labels: amrit (was: ) > Queries with facets should not use traversal > > > Key: OAK-8711 > URL: https://issues.apache.org/jira/browse/OAK-8711 > Project: Jackrabbit Oak > Issue Type: Bug >Reporter: Nitin Gupta >Assignee: Nitin Gupta >Priority: Major > Labels: amrit > > Consider a scenario where a query is there with facets and the traversal cost > is less than the index cost that serves the facet query . This would be > problematic. > > In this case we should maybe set the traversal cost to infinity so that > traversal is not an option for queries with facets. > > In case there is no index available to serve this faceted query we can > probably throw an exception with a meaningful message . -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (OAK-8854) Improved log message when failed to index an node due to IOException
[ https://issues.apache.org/jira/browse/OAK-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-8854. - Resolution: Fixed > Improved log message when failed to index an node due to IOException > > > Key: OAK-8854 > URL: https://issues.apache.org/jira/browse/OAK-8854 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.22.0 > > > When there is an IOException trying to index the node, there are cases where > the root cause (IOException message) is not logged. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8854) Improved log message when failed to index an node due to IOException
[ https://issues.apache.org/jira/browse/OAK-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8854: Fix Version/s: 1.22.0 > Improved log message when failed to index an node due to IOException > > > Key: OAK-8854 > URL: https://issues.apache.org/jira/browse/OAK-8854 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.22.0 > > > When there is an IOException trying to index the node, there are cases where > the root cause (IOException message) is not logged. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8854) Improved log message when failed to index an node due to IOException
[ https://issues.apache.org/jira/browse/OAK-8854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012951#comment-17012951 ] Thomas Mueller commented on OAK-8854: - http://svn.apache.org/r1872603 http://svn.apache.org/r1872604 > Improved log message when failed to index an node due to IOException > > > Key: OAK-8854 > URL: https://issues.apache.org/jira/browse/OAK-8854 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > > When there is an IOException trying to index the node, there are cases where > the root cause (IOException message) is not logged. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (OAK-8854) Improved log message when failed to index an node due to IOException
Thomas Mueller created OAK-8854: --- Summary: Improved log message when failed to index an node due to IOException Key: OAK-8854 URL: https://issues.apache.org/jira/browse/OAK-8854 Project: Jackrabbit Oak Issue Type: Improvement Components: indexing Reporter: Thomas Mueller Assignee: Thomas Mueller When there is an IOException trying to index the node, there are cases where the root cause (IOException message) is not logged. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-6254) DataStore: API to retrieve approximate storage size
[ https://issues.apache.org/jira/browse/OAK-6254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-6254: Priority: Minor (was: Major) > DataStore: API to retrieve approximate storage size > --- > > Key: OAK-6254 > URL: https://issues.apache.org/jira/browse/OAK-6254 > Project: Jackrabbit Oak > Issue Type: Bug > Components: blob >Reporter: Thomas Mueller >Priority: Minor > > The estimated size of the datastore (on disk) is needed to: > * monitor growth over time, or growth of certain operations > * monitor if garbage collection is effective > * avoid out of disk space > * estimate backup size > * statistical purposes (for example, if there are many repositories, to group > them by size) > Datastore size: we could use the following heuristic: We could read the file > sizes in ./datastore/00/00 (if it exists) and multiply by 65536; or > ./datastore/00 and multiply by 256. That would give a rough estimation > (within about 20% for repositories with datastore size > 50 GB). > I think this is mainly important for the FileDataStore. The S3 datastore, if > there is a simple and fast S3 API to read the size, then that would be good > as well, but if there is none, then returning "unknown" is fine for me. > As for the API, I would use something like this: {{long > getEstimatedStorageSize(int accuracyLevel)}} with accuracyLevel 1 for > inaccurate (fastest), 2 more accurate (slower),..., 9 precise (possibly very > slow). Similar to > [java.util.zip.Deflater.setLevel|https://docs.oracle.com/javase/7/docs/api/java/util/zip/Deflater.html#setLevel(int)]. > I would expect it takes up to 1 second for accuracyLevel 0, up to 5 seconds > for accuracyLevel 1, and possibly hours for level 9. With level 1, I would > read files in 00/00, with level 2 - 8 I would read files in 00, and with > level 9 I would read all the files. 
For level 1, I wouldn't stop; for level > 2, if it takes more than 5 seconds, I would stop and return the current best > estimate. -- This message was sent by Atlassian Jira (v8.3.4#803005)
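A minimal sketch of the sampling heuristic described in OAK-6254, assuming the FileDataStore spreads files evenly over 256 subdirectories per directory level, so that one sampled directory stands for 1/256 (one level) or 1/65536 (two levels) of the total. The class and method names are hypothetical and not part of any Oak API; the accuracyLevel parameter and the time limits from the proposal are left out.

```java
import java.io.File;

/** Hypothetical sketch of the OAK-6254 heuristic; not Oak code. */
class StorageSizeEstimator {

    /** Sums the sizes of all regular files below dir, recursively. */
    static long sumFileSizes(File dir) {
        File[] children = dir.listFiles();
        if (children == null) {
            return 0; // not a directory, or not readable
        }
        long total = 0;
        for (File f : children) {
            total += f.isDirectory() ? sumFileSizes(f) : f.length();
        }
        return total;
    }

    /** Scales a sample taken at the given directory depth: 256 per level. */
    static long extrapolate(long sampledBytes, int directoryDepth) {
        long factor = 1;
        for (int i = 0; i < directoryDepth; i++) {
            factor *= 256;
        }
        return sampledBytes * factor;
    }

    /** Returns the estimated size in bytes, or -1 for "unknown". */
    static long estimate(File datastoreRoot) {
        File twoLevels = new File(datastoreRoot, "00/00");
        if (twoLevels.isDirectory()) {
            return extrapolate(sumFileSizes(twoLevels), 2); // times 65536
        }
        File oneLevel = new File(datastoreRoot, "00");
        if (oneLevel.isDirectory()) {
            return extrapolate(sumFileSizes(oneLevel), 1); // times 256
        }
        return -1;
    }
}
```

Sampling 00/00 touches far fewer files than sampling 00, which is why the proposal maps it to the fastest, least accurate level.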
[jira] [Updated] (OAK-6254) DataStore: API to retrieve approximate storage size
[ https://issues.apache.org/jira/browse/OAK-6254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-6254: Fix Version/s: (was: 1.22.0) > DataStore: API to retrieve approximate storage size > --- > > Key: OAK-6254 > URL: https://issues.apache.org/jira/browse/OAK-6254 > Project: Jackrabbit Oak > Issue Type: Bug > Components: blob >Reporter: Thomas Mueller >Priority: Major > > The estimated size of the datastore (on disk) is needed to: > * monitor growth over time, or growth of certain operations > * monitor if garbage collection is effective > * avoid out of disk space > * estimate backup size > * statistical purposes (for example, if there are many repositories, to group > them by size) > Datastore size: we could use the following heuristic: We could read the file > sizes in ./datastore/00/00 (if it exists) and multiply by 65536; or > ./datastore/00 and multiply by 256. That would give a rough estimation > (within about 20% for repositories with datastore size > 50 GB). > I think this is mainly important for the FileDataStore. The S3 datastore, if > there is a simple and fast S3 API to read the size, then that would be good > as well, but if there is none, then returning "unknown" is fine for me. > As for the API, I would use something like this: {{long > getEstimatedStorageSize(int accuracyLevel)}} with accuracyLevel 1 for > inaccurate (fastest), 2 more accurate (slower),..., 9 precise (possibly very > slow). Similar to > [java.util.zip.Deflater.setLevel|https://docs.oracle.com/javase/7/docs/api/java/util/zip/Deflater.html#setLevel(int)]. > I would expect it takes up to 1 second for accuracyLevel 0, up to 5 seconds > for accuracyLevel 1, and possibly hours for level 9. With level 1, I would > read files in 00/00, with level 2 - 8 I would read files in 00, and with > level 9 I would read all the files. 
For level 1, I wouldn't stop; for level > 2, if it takes more than 5 seconds, I would stop and return the current best > estimate. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-5787) BlobStore should be AutoCloseable
[ https://issues.apache.org/jira/browse/OAK-5787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16994394#comment-16994394 ] Thomas Mueller commented on OAK-5787: - +1 > BlobStore should be AutoCloseable > - > > Key: OAK-5787 > URL: https://issues.apache.org/jira/browse/OAK-5787 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: blob >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Minor > Fix For: 1.22.0 > > Attachments: OAK-5787.diff > > > {{DocumentNodeStore}} currently calls {{close()}} if the blob store instance > implements {{Closeable}}. > This has led to problems where wrapper implementations did not implement it, > and thus the actual blob store instance wasn't properly shut down. > Proposal: make {{BlobStore}} extend {{Closeable}} and get rid of all > {{instanceof}} checks. > [~thomasm] [~amitjain] - feedback appreciated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984861#comment-16984861 ] Thomas Mueller commented on OAK-8783: - http://svn.apache.org/r1870584 (trunk) - reviews are still welcome. I also had to change the version (from 1.0.1 to 1.1.0). > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8783: Component/s: indexing > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8783: Fix Version/s: (was: 1.22.0) > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8783: Fix Version/s: 1.22.0 > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.22.0 > > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984859#comment-16984859 ] Thomas Mueller commented on OAK-8783: - Good point! I will change the newObjectNotRespectingOrder test, so that it doesn't expect any specific order. > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984815#comment-16984815 ] Thomas Mueller commented on OAK-8783: - [~ngupta] [~tihom88] [~fabrizio.fort...@gmail.com] could you please review OAK-8783-json-1.patch ? > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8783: Attachment: OAK-8783-json-1.patch > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-json-1.patch, OAK-8783-v1.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16984812#comment-16984812 ] Thomas Mueller commented on OAK-8783: - One problem is that the Gson library doesn't support the child order: https://stackoverflow.com/questions/6365851/how-to-keep-fields-sequence-in-gson-serialization This is a problem because indexes in Oak do need to respect the order of child nodes for some features: http://jackrabbit.apache.org/oak/docs/query/lucene.html "The rules are looked up in the order of there entry under indexRules node (indexRule node itself is of type nt:unstructured which has orderable child nodes)" - "Order of property definition node is important as some properties are based on regular expressions" Instead of Gson, we need to use a different serialization library, e.g. the Oak JsonObject. I will add the needed features and tests there first. > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-v1.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
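To illustrate why child order matters for regular-expression based property definitions, here is a small sketch in plain Java. It is not Oak code: the class name, the rule names and the first-match lookup are assumptions made for illustration only. The rules are kept in an insertion-ordered map, and the first rule whose pattern matches wins, so swapping two entries changes the result.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Pattern;

/** Hypothetical illustration of order-dependent rule lookup; not Oak code. */
class OrderedRulesSketch {

    /**
     * Returns the name of the first rule whose regular expression matches
     * the property name, or null if none matches. The iteration order of
     * the map decides which rule wins.
     */
    static String firstMatch(Map<String, String> rules, String propertyName) {
        for (Map.Entry<String, String> rule : rules.entrySet()) {
            if (Pattern.matches(rule.getValue(), propertyName)) {
                return rule.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // LinkedHashMap keeps insertion order, like orderable child nodes.
        Map<String, String> rules = new LinkedHashMap<>();
        rules.put("title", "jcr:title"); // specific rule first
        rules.put("all", ".*");          // catch-all rule last
        System.out.println(firstMatch(rules, "jcr:title"));
        System.out.println(firstMatch(rules, "jcr:description"));
    }
}
```

If a serialization step reorders the entries so that the catch-all rule comes first, every property is handled by that rule, which is the kind of silent change a merge tool has to avoid.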
[jira] [Commented] (OAK-8794) oak-solr-osgi does not build for Java 8 if Jackson libraries upgraded to 2.10.0
[ https://issues.apache.org/jira/browse/OAK-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16982268#comment-16982268 ] Thomas Mueller commented on OAK-8794: - Un-assigning from me right now. > Would it be possible to update oak-parent/pom.xml to Jackson version 2.10.0 > and then specify 2.9.10 in oak-solr-osgi? [~teofili], do you know if this might work? > oak-solr-osgi does not build for Java 8 if Jackson libraries upgraded to > 2.10.0 > --- > > Key: OAK-8794 > URL: https://issues.apache.org/jira/browse/OAK-8794 > Project: Jackrabbit Oak > Issue Type: Bug > Components: solr >Affects Versions: 1.20.0 >Reporter: Matt Ryan >Priority: Major > > If the Jackson version in {{oak-parent/pom.xml}} is updated from 2.9.10 to > 2.10.0, we get a build failure in {{oak-solr-osgi}} if we try to build with > Java 8. > This is blocking OAK-8105 which in turn is blocking OAK-8607 and OAK-8104. > OAK-8105 is about updating {{AzureDataStore}} to the Azure version 12 SDK > which requires Jackson 2.10.0. > Would it be possible to update {{oak-parent/pom.xml}} to Jackson version > 2.10.0 and then specify 2.9.10 in {{oak-solr-osgi}}? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (OAK-8794) oak-solr-osgi does not build for Java 8 if Jackson libraries upgraded to 2.10.0
[ https://issues.apache.org/jira/browse/OAK-8794?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller reassigned OAK-8794: --- Assignee: (was: Thomas Mueller) > oak-solr-osgi does not build for Java 8 if Jackson libraries upgraded to > 2.10.0 > --- > > Key: OAK-8794 > URL: https://issues.apache.org/jira/browse/OAK-8794 > Project: Jackrabbit Oak > Issue Type: Bug > Components: solr >Affects Versions: 1.20.0 >Reporter: Matt Ryan >Priority: Major > > If the Jackson version in {{oak-parent/pom.xml}} is updated from 2.9.10 to > 2.10.0, we get a build failure in {{oak-solr-osgi}} if we try to build with > Java 8. > This is blocking OAK-8105 which in turn is blocking OAK-8607 and OAK-8104. > OAK-8105 is about updating {{AzureDataStore}} to the Azure version 12 SDK > which requires Jackson 2.10.0. > Would it be possible to update {{oak-parent/pom.xml}} to Jackson version > 2.10.0 and then specify 2.9.10 in {{oak-solr-osgi}}? -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16980259#comment-16980259 ] Thomas Mueller commented on OAK-8783: - Attached a first patch (work in progress). > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-v1.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8783) Merge index definitions
[ https://issues.apache.org/jira/browse/OAK-8783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8783: Attachment: OAK-8783-v1.patch > Merge index definitions > --- > > Key: OAK-8783 > URL: https://issues.apache.org/jira/browse/OAK-8783 > Project: Jackrabbit Oak > Issue Type: Improvement >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Attachments: OAK-8783-v1.patch > > > If there are multiple versions of an index, e.g. asset-2-custom-2 and > asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (OAK-8783) Merge index definitions
Thomas Mueller created OAK-8783: --- Summary: Merge index definitions Key: OAK-8783 URL: https://issues.apache.org/jira/browse/OAK-8783 Project: Jackrabbit Oak Issue Type: Improvement Reporter: Thomas Mueller Assignee: Thomas Mueller If there are multiple versions of an index, e.g. asset-2-custom-2 and asset-3, then oak-run should be able to merge them to asset-3-custom-1. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8779) QueryImpl: indexPlan used for logging always is null
[ https://issues.apache.org/jira/browse/OAK-8779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16979313#comment-16979313 ] Thomas Mueller commented on OAK-8779: - You are right. I saw this as well some time ago, but so far didn't log an issue. I will add that to the technical debt list. > QueryImpl: indexPlan used for logging always is null > > > Key: OAK-8779 > URL: https://issues.apache.org/jira/browse/OAK-8779 > Project: Jackrabbit Oak > Issue Type: Bug > Components: core >Reporter: Julian Reschke >Priority: Minor > > > {noformat} > if (indexPlan != null && indexPlan.getPlanName() != null) { > indexName += "[" + indexPlan.getPlanName() + "]"; > } {noformat} > > (indexPlan always is null, maybe caused by code being moved around) > > cc: [~chetanm] [~thomasm] -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-6261) Log queries that sort by un-indexed properties
[ https://issues.apache.org/jira/browse/OAK-6261?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-6261: Fix Version/s: (was: 1.22.0) > Log queries that sort by un-indexed properties > -- > > Key: OAK-6261 > URL: https://issues.apache.org/jira/browse/OAK-6261 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Minor > > Queries that can read many nodes, and sort by properties that are not > indexed, can be very slow. This includes for example fulltext queries. > As a start, it might make sense to log an "info" level message (but avoid > logging the same message each time a query is run). Per configuration, this > could be turned to "warning". -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-7300) Lucene Index: per-column selectivity to improve cost estimation
[ https://issues.apache.org/jira/browse/OAK-7300?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-7300: Fix Version/s: (was: 1.22.0) > Lucene Index: per-column selectivity to improve cost estimation > --- > > Key: OAK-7300 > URL: https://issues.apache.org/jira/browse/OAK-7300 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene, query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > > In OAK-6735 we have improved cost estimation for Lucene indexes, however the > following case is still not working as expected: a very common property is > indexed (many nodes have that property), and each value of that property is > more or less unique. In this case, currently the cost estimation is the total > number of documents that contain that property. Assuming the condition > "property is not null" this is correct, however for the common case "property > = x" the estimated cost is far too high. > A known workaround is to set the "costPerEntry" for the given index to a low > value, for example 0.2. However this isn't a good solution, as it affects all > properties and queries. > It would be good to be able to set the selectivity per property, for example > by specifying the number of distinct values, or (better yet) the average > number of entries for a given key (1 for unique values, 2 meaning for each > distinct value there are two documents on average). > That value can be set manually (cost override), and it can be set > automatically, e.g. when building the index, or updated from time to time > during the index update, using a cardinality > estimation algorithm. That doesn't have to be accurate; we could use a rough > approximation such as hyperbitbit. -- This message was sent by Atlassian Jira (v8.3.4#803005)
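The arithmetic behind the OAK-7300 proposal can be sketched as follows. This is only an illustration under stated assumptions: the class and method names are hypothetical, and the real Oak index planner is not modelled. For "property is not null" the estimate stays at the number of documents containing the property; for "property = x" it is that number divided by the number of distinct values, which is the average number of entries per key mentioned in the issue.

```java
/** Hypothetical sketch of per-property selectivity; not Oak code. */
class SelectivityCostSketch {

    /** Estimated entries for "property is not null". */
    static double notNullEstimate(long docsWithProperty) {
        return docsWithProperty;
    }

    /**
     * Estimated entries for "property = x": the average number of
     * documents per distinct value. Without statistics (distinctValues
     * is zero or negative) this falls back to the not-null estimate.
     */
    static double equalsEstimate(long docsWithProperty, long distinctValues) {
        if (distinctValues <= 0) {
            return notNullEstimate(docsWithProperty);
        }
        return (double) docsWithProperty / distinctValues;
    }
}
```

For a property with one million documents and one million distinct values, the equality estimate drops from 1,000,000 to 1, without lowering costPerEntry for the whole index.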
[jira] [Updated] (OAK-7374) Investigate changing the UUID generation algorithm / format to reduce index size, improve speed
[ https://issues.apache.org/jira/browse/OAK-7374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-7374: Fix Version/s: (was: 1.22.0) > Investigate changing the UUID generation algorithm / format to reduce index > size, improve speed > --- > > Key: OAK-7374 > URL: https://issues.apache.org/jira/browse/OAK-7374 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: core >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > > UUIDs are currently randomly generated, which is bad for indexing, especially > for read and write access, due to low locality. > If we could add a time component, I think the index churn (amount of writes) > would shrink, and lookup would be faster. > It should be fairly easy to verify if that's really true (create a > proof-of-concept, and measure). -- This message was sent by Atlassian Jira (v8.3.4#803005)
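One possible shape for a proof-of-concept of the time component from OAK-7374 is sketched below. It is an assumption, not what Oak does or plans to do: the layout (millisecond timestamp in the upper 48 bits, random bits elsewhere) and the class name are hypothetical, and the sketch does not set the RFC 4122 version and variant bits.

```java
import java.security.SecureRandom;
import java.util.UUID;

/** Hypothetical time-prefixed identifier; not the Oak UUID algorithm. */
class TimePrefixedUuidSketch {

    private static final SecureRandom RANDOM = new SecureRandom();

    /**
     * Builds a UUID whose upper 48 bits hold the timestamp in milliseconds
     * and whose remaining 80 bits are random. Identifiers created at later
     * timestamps sort after earlier ones, which keeps index inserts close
     * together.
     */
    static UUID create(long timestampMillis) {
        long msb = (timestampMillis << 16) | (RANDOM.nextLong() & 0xFFFFL);
        long lsb = RANDOM.nextLong();
        return new UUID(msb, lsb);
    }

    public static void main(String[] args) {
        System.out.println(create(System.currentTimeMillis()));
    }
}
```

Measuring index churn with such identifiers against purely random ones (for example `UUID.randomUUID()`) would be the verification step the issue asks for.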
[jira] [Updated] (OAK-3219) Lucene IndexPlanner should also account for number of property constraints evaluated while giving cost estimation
[ https://issues.apache.org/jira/browse/OAK-3219?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-3219: Fix Version/s: (was: 1.22.0) > Lucene IndexPlanner should also account for number of property constraints > evaluated while giving cost estimation > - > > Key: OAK-3219 > URL: https://issues.apache.org/jira/browse/OAK-3219 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Reporter: Chetan Mehrotra >Assignee: Thomas Mueller >Priority: Minor > Labels: performance > > Currently the cost returned by Lucene index is a function of number of > indexed documents present in the index. If the number of indexed entries are > high then it might reduce chances of this index getting selected if some > property index also support of the property constraint. > {noformat} > /jcr:root/content/freestyle-cms/customers//element(*, cq:Page) > [(jcr:content/@title = 'm' or jcr:like(jcr:content/@title, 'm%')) > and jcr:content/@sling:resourceType = '/components/page/customer’] > {noformat} > Consider above query with following index definition > * A property index on resourceType > * A Lucene index for cq:Page with properties {{jcr:content/title}}, > {{jcr:content/sling:resourceType}} indexed and also path restriction > evaluation enabled > Now what the two indexes can help in > # Property index > ## Path restriction > ## Property restriction on {{sling:resourceType}} > # Lucene index > ## NodeType restriction > ## Property restriction on {{sling:resourceType}} > ## Property restriction on {{title}} > ## Path restriction > Now cost estimate currently works like this > * Property index - {{f(indexedValueEstimate, estimateOfNodesUnderGivenPath)}} > ** indexedValueEstimate - For 'sling:resourceType=foo' its the approximate > count for nodes having that as 'foo' > ** estimateOfNodesUnderGivenPath - Its derived from an approximate estimation > of nodes present under given path > * Lucene Index - {{f(totalIndexedEntries)}} 
> As the Lucene cost function is too simple, it does not reflect reality. The following 2 > changes can be made to improve it > * Given that the Lucene index can handle more constraints (4) than the > property index (2), the cost estimate returned by it should also reflect this. > This can be done by setting costPerEntry to 1/(number of property > restrictions evaluated) > * Get the count for the queried property value - This is similar to what > PropertyIndex does and assumes that Lucene can provide that information at > O(1) cost. In case of multiple supported property restrictions this can be the > minimum of all -- This message was sent by Atlassian Jira (v8.3.4#803005)
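The two proposed changes can be illustrated with a minimal Java sketch. It is not Oak's IndexPlanner code: the class name, method names, and the way per-restriction counts are passed in are assumptions made purely for illustration. It only shows costPerEntry = 1/(number of property restrictions evaluated) combined with the minimum of the per-restriction counts.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of the proposed cost idea; not Oak's actual IndexPlanner. */
final class LuceneCostSketch {

    /** costPerEntry = 1 / (number of property restrictions evaluated); 1.0 if there are none. */
    static double costPerEntry(int evaluatedPropertyRestrictions) {
        return evaluatedPropertyRestrictions <= 0 ? 1.0 : 1.0 / evaluatedPropertyRestrictions;
    }

    /** Entry estimate: the minimum of the per-restriction counts, capped by the index size. */
    static long estimatedEntries(long totalIndexedEntries, List<Long> countsPerRestriction) {
        long min = totalIndexedEntries;
        for (long count : countsPerRestriction) {
            min = Math.min(min, count);
        }
        return min;
    }

    /** Combined estimate: cheaper per entry the more restrictions the index evaluates. */
    static double cost(long totalIndexedEntries, List<Long> countsPerRestriction) {
        return costPerEntry(countsPerRestriction.size())
                * estimatedEntries(totalIndexedEntries, countsPerRestriction);
    }

    public static void main(String[] args) {
        // Assumed numbers: 1,000,000 indexed entries, two property restrictions
        // (e.g. sling:resourceType and title) with made-up value counts.
        System.out.println(cost(1_000_000L, Arrays.asList(5_000L, 20_000L)));
    }
}
```

With these assumed inputs the estimate depends on the rarest queried value rather than on the total index size, which is the effect the description asks for.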
[jira] [Updated] (OAK-6844) Consistency checker Directory value is always ":data"
[ https://issues.apache.org/jira/browse/OAK-6844?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-6844: Fix Version/s: (was: 1.22.0) > Consistency checker Directory value is always ":data" > - > > Key: OAK-6844 > URL: https://issues.apache.org/jira/browse/OAK-6844 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Affects Versions: 1.7.9 >Reporter: Paul Chibulcuteanu >Assignee: Thomas Mueller >Priority: Minor > > When running a _fullCheck_ consistency check from the Lucene Index statistics > MBean, the _Directory_ result is always _:data_ > See below: > {code} > /oak:index/lucene => VALID > Size : 42.3 MB > Directory : :data > Size : 42.3 MB > Num docs : 159132 > CheckIndex status : true > Time taken : 3.544 s > {code} > I'm not really sure what information should be put here, but the _:data_ > value is confusing. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-6897) XPath query: option to _not_ convert "or" to "union"
[ https://issues.apache.org/jira/browse/OAK-6897?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-6897: Fix Version/s: (was: 1.22.0) > XPath query: option to _not_ convert "or" to "union" > > > Key: OAK-6897 > URL: https://issues.apache.org/jira/browse/OAK-6897 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Trivial > > Right now, all XPath queries that contain "or" of the form "@a=1 or @b=2" are > converted to SQL-2 "union". In some cases, this is a problem, especially in > combination with "order by @jcr:score desc". > Now that SQL-2 "or" conditions can be converted to union (depending on whether the > union has a lower cost), it is no longer strictly needed to do the union conversion > in the XPath conversion. Or at least emit different SQL-2 queries and take > the one with the lowest cost. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-5787) BlobStore should be AutoCloseable
[ https://issues.apache.org/jira/browse/OAK-5787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16975071#comment-16975071 ] Thomas Mueller commented on OAK-5787: - For DefaultSplitBlobStore, if both close() calls throw an exception, the first one is lost. I think a solution would be to use addSuppressed (available since Java 1.7):
{noformat}
+
+@Override
+public void close() throws Exception {
+    // Remember the first failure so that the second store is still closed.
+    Exception thrown = null;
+    try {
+        oldBlobStore.close();
+    } catch (Exception ex) {
+        thrown = ex;
+    }
+    try {
+        newBlobStore.close();
+    } catch (Exception ex) {
+        if (thrown != null) {
+            // Keep the first exception and attach the second one to it.
+            thrown.addSuppressed(ex);
+        } else {
+            thrown = ex;
+        }
+    }
+    if (thrown != null) {
+        throw thrown;
+    }
+}
{noformat}
> BlobStore should be AutoCloseable > - > > Key: OAK-5787 > URL: https://issues.apache.org/jira/browse/OAK-5787 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: blob >Reporter: Julian Reschke >Assignee: Julian Reschke >Priority: Minor > Fix For: 1.22.0 > > Attachments: OAK-5787.diff > > > {{DocumentNodeStore}} currently calls {{close()}} if the blob store instance > implements {{Closeable}}. > This has led to problems where wrapper implementations did not implement it, > and thus the actual blob store instance wasn't properly shut down. > Proposal: make {{BlobStore}} extend {{Closeable}} and get rid of all > {{instanceof}} checks. > [~thomasm] [~amitjain] - feedback appreciated. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8673) Determine and possibly adjust size of eagerCacheSize
[ https://issues.apache.org/jira/browse/OAK-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973400#comment-16973400 ] Thomas Mueller commented on OAK-8673: - [~angela] I'm sorry, I don't fully understand this... Is there some documentation where this is explained? It might help to have it, for cases where the cache sizes need to be adjusted (to avoid out of memory). As far as I know (maybe wrong), there is: * eager cache (per session? in number of entries and not memory usage. configurable as you configured it, but how?) * lazy-evaluation cache (per session? how large? I assume in number of entries and not memory usage. configurable?) * defaultpermissioncache (what is that exactly? is it lazy-evaluation cache or eager cache or something else?) When opening a session, the eager cache is filled if the cache size is large enough(?) If too large, then not. But there is a lazy-evaluation. What I still don't get - if the eager cache is disabled, why are the benchmark results so slow? Is it just that for this test case, the hit rate on the lazy-evaluation cache is so bad? > Determine and possibly adjust size of eagerCacheSize > > > Key: OAK-8673 > URL: https://issues.apache.org/jira/browse/OAK-8673 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: core, security >Reporter: Angela Schreiber >Assignee: Angela Schreiber >Priority: Major > > The initial results of the {{EagerCacheSizeTest}} seem to indicate that we > almost never benefit from the lazy permission evaluation (compared to reading > all permission entries right away). From my understanding of the results the > only exception are those cases where only very few items are being accessed > (e.g. reading 100 items). > However, I am not totally sure if this is not an artifact of the random-read. 
> I therefore started extending the benchmark with an option to re-read a > randomly picked item more than once, which according to some analysis done > quite some time ago is a common scenario, especially when using Oak in > combination with Apache Sling. > Benchmarks with 10-times re-reading the same random item: > As I would have expected it seems that the negative impact of lazy-loading is > somewhat reduced, as the re-reading will hit the cache populated while > reading. > Results are attached to OAK-8662 (possibly more to come). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8673) Determine and possibly adjust size of eagerCacheSize
[ https://issues.apache.org/jira/browse/OAK-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973185#comment-16973185 ] Thomas Mueller commented on OAK-8673: - > beyond the task at hand to re-evaluate if the current value of > eager-cache-size is sufficient Well, you don't want to expand the cache size if there is a risk of running out of memory... But given the next statement I'm not sure if there really is such a risk... > even for the lazy-evaluation a cache is populated (in fact there are even 2 > maps in that case), so depending on the distribution of permission entries > and the access pattern (read/writing), the lazy cache might even consume more > memory than the eager-cache... But why are benchmark results so bad when the eager cache is disabled (size set to 0)? > Determine and possibly adjust size of eagerCacheSize > > > Key: OAK-8673 > URL: https://issues.apache.org/jira/browse/OAK-8673 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: core, security >Reporter: Angela Schreiber >Assignee: Angela Schreiber >Priority: Major > > The initial results of the {{EagerCacheSizeTest}} seem to indicate that we > almost never benefit from the lazy permission evaluation (compared to reading > all permission entries right away). From my understanding of the results the > only exception are those cases where only very few items are being accessed > (e.g. reading 100 items). > However, I am not totally sure if this is not an artifact of the random-read. > I therefore started extending the benchmark with an option to re-read a > randomly picked item more than once, which according to some analysis done > quite some time ago is a common scenario, especially when using Oak in > combination with Apache Sling. 
> Benchmarks with 10-times re-reading the same random item: > As I would have expected it seems that the negative impact of lazy-loading is > somewhat reduced, as the re-reading will hit the cache populated while > reading. > Results are attached to OAK-8662 (possibly more to come). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8673) Determine and possibly adjust size of eagerCacheSize
[ https://issues.apache.org/jira/browse/OAK-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16973150#comment-16973150 ] Thomas Mueller commented on OAK-8673: - So with cache size 0 (no cache), the system is very slow (basically unusable). So a cache is needed. I see two problems: * A: Having one cache per session is problematic if there is no limit on the number of sessions: there is no way to guarantee the system will not run out of memory. Is there no way to use just one cache (for all sessions)? * B: Having a cache size in number of entries is problematic if the memory usage of entries is very different: there is no way to guarantee the system will not run out of memory. To solve this, in various places in Oak we use "weighted" caches, and estimate memory usage of entries (e.g. for strings, 24 + number of characters). I can help with this. I think both A and B need to be addressed. > Determine and possibly adjust size of eagerCacheSize > > > Key: OAK-8673 > URL: https://issues.apache.org/jira/browse/OAK-8673 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: core, security >Reporter: Angela Schreiber >Assignee: Angela Schreiber >Priority: Major > > The initial results of the {{EagerCacheSizeTest}} seem to indicate that we > almost never benefit from the lazy permission evaluation (compared to reading > all permission entries right away). From my understanding of the results the > only exception are those cases where only very few items are being accessed > (e.g. reading 100 items). > However, I am not totally sure if this is not an artifact of the random-read. > I therefore started extending the benchmark with an option to re-read a > randomly picked item more than once, which according to some analysis done > quite some time ago is a common scenario, especially when using Oak in > combination with Apache Sling. 
> Benchmarks with 10-times re-reading the same random item: > As I would have expected it seems that the negative impact of lazy-loading is > somewhat reduced, as the re-reading will hit the cache populated while > reading. > Results are attached to OAK-8662 (possibly more to come). -- This message was sent by Atlassian Jira (v8.3.4#803005)
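Point B in the comment above (bounding a cache by estimated memory rather than by entry count) can be sketched in plain Java. This is a rough illustration only, not the cache implementation Oak uses: the class name is hypothetical, it is built on the JDK's `LinkedHashMap`, and it uses only the "24 + number of characters" estimate for strings mentioned in the comment.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a memory-weighted LRU cache; not Oak's actual cache. */
final class WeightedLruCacheSketch {
    private final long maxWeight;
    private long currentWeight;
    // accessOrder = true: iteration runs from least to most recently used entry
    private final LinkedHashMap<String, String> map = new LinkedHashMap<>(16, 0.75f, true);

    WeightedLruCacheSketch(long maxWeight) {
        this.maxWeight = maxWeight;
    }

    /** Rough estimate taken from the comment: 24 plus the number of characters. */
    static long weigh(String s) {
        return 24L + s.length();
    }

    static long weigh(String key, String value) {
        return weigh(key) + weigh(value);
    }

    synchronized String get(String key) {
        return map.get(key); // also marks the entry as recently used
    }

    synchronized void put(String key, String value) {
        String old = map.put(key, value);
        if (old != null) {
            currentWeight -= weigh(key, old);
        }
        currentWeight += weigh(key, value);
        // Evict least recently used entries until the estimated weight fits.
        Iterator<Map.Entry<String, String>> it = map.entrySet().iterator();
        while (currentWeight > maxWeight && it.hasNext()) {
            Map.Entry<String, String> eldest = it.next();
            currentWeight -= weigh(eldest.getKey(), eldest.getValue());
            it.remove();
        }
    }

    synchronized long weight() {
        return currentWeight;
    }

    synchronized int size() {
        return map.size();
    }

    public static void main(String[] args) {
        WeightedLruCacheSketch cache = new WeightedLruCacheSketch(100);
        cache.put("a", "xxxxxxxxxx");
        cache.put("b", "yyyyyyyyyy");
        System.out.println(cache.size() + " entries, estimated weight " + cache.weight());
    }
}
```

A single instance shared by all sessions would also address point A, since the total estimated memory is then bounded by one `maxWeight` regardless of how many sessions are open. Whether that is practical for permission entries is not something this sketch can answer.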
[jira] [Resolved] (OAK-8729) Lucene Directory concurrency issue
[ https://issues.apache.org/jira/browse/OAK-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-8729. - Resolution: Fixed > Lucene Directory concurrency issue > -- > > Key: OAK-8729 > URL: https://issues.apache.org/jira/browse/OAK-8729 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Affects Versions: 1.12.0, 1.14.0, 1.16.0, 1.18.0 >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.20.0 > > Attachments: OAK-8729.patch > > > There is a concurrency issue in the DefaultDirectoryFactory. It is > reproducible sometimes using CopyOnWriteDirectoryTest.copyOnWrite(), if run > in a loop (1000 times). The problem is that the MemoryNodeBuilder is used > concurrently: > * thread 1 is closing the directory (after writing to it) > * thread 2 is trying to create a new file > {noformat} > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:525) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.close(OakDirectory.java:264) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.close(BufferedOakDirectory.java:217) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:305) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:362) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:356) > at > 
org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.child(MemoryNodeBuilder.java:342) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.createOutput(OakDirectory.java:214) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.createOutput(BufferedOakDirectory.java:178) > at org.apache.lucene.store.Directory.copy(Directory.java:184) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:322) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:1) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:105) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:1) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8729) Lucene Directory concurrency issue
[ https://issues.apache.org/jira/browse/OAK-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969230#comment-16969230 ] Thomas Mueller commented on OAK-8729: - http://svn.apache.org/r1869505 (trunk) > Lucene Directory concurrency issue > -- > > Key: OAK-8729 > URL: https://issues.apache.org/jira/browse/OAK-8729 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Affects Versions: 1.12.0, 1.14.0, 1.16.0, 1.18.0 >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.20.0 > > Attachments: OAK-8729.patch > > > There is a concurrency issue in the DefaultDirectoryFactory. It is > reproducible sometimes using CopyOnWriteDirectoryTest.copyOnWrite(), if run > in a loop (1000 times). The problem is that the MemoryNodeBuilder is used > concurrently: > * thread 1 is closing the directory (after writing to it) > * thread 2 is trying to create a new file > {noformat} > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:525) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.close(OakDirectory.java:264) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.close(BufferedOakDirectory.java:217) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:305) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:362) > at > 
org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:356) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.child(MemoryNodeBuilder.java:342) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.createOutput(OakDirectory.java:214) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.createOutput(BufferedOakDirectory.java:178) > at org.apache.lucene.store.Directory.copy(Directory.java:184) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:322) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:1) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:105) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:1) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8729) Lucene Directory concurrency issue
[ https://issues.apache.org/jira/browse/OAK-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969229#comment-16969229 ] Thomas Mueller commented on OAK-8729: - I'm afraid I don't know currently how we could make this part more stable... verifying the directory is still open would be a good idea, but I'm afraid I don't know currently how to do that without changing a lot of code (basically, not use the Lucene interfaces). > Lucene Directory concurrency issue > -- > > Key: OAK-8729 > URL: https://issues.apache.org/jira/browse/OAK-8729 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Affects Versions: 1.12.0, 1.14.0, 1.16.0, 1.18.0 >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.20.0 > > Attachments: OAK-8729.patch > > > There is a concurrency issue in the DefaultDirectoryFactory. It is > reproducible sometimes using CopyOnWriteDirectoryTest.copyOnWrite(), if run > in a loop (1000 times). 
The problem is that the MemoryNodeBuilder is used > concurrently: > * thread 1 is closing the directory (after writing to it) > * thread 2 is trying to create a new file > {noformat} > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:525) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.close(OakDirectory.java:264) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.close(BufferedOakDirectory.java:217) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:305) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:362) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:356) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.child(MemoryNodeBuilder.java:342) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.createOutput(OakDirectory.java:214) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.createOutput(BufferedOakDirectory.java:178) > at org.apache.lucene.store.Directory.copy(Directory.java:184) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:322) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:1) > at > 
org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:105) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:1) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8729) Lucene Directory concurrency issue
[ https://issues.apache.org/jira/browse/OAK-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16969227#comment-16969227 ] Thomas Mueller commented on OAK-8729: - > The close method for wrapForRead [1] calls remote.close and local.close [2] > [2]and same instance is being used by wrapForWrite[3]. Yes, that's true. I verified the remote is closed, but the tests don't fail due to that. Unfortunately, it is hard to verify the directory is not closed: there is a verify method in the Directory interface, but it is not public (only protected). > Can we perform operations even if close had been called on Directory instance? It looks like none of the tests failed due to this. It seems like the operations we perform don't cause problems. > Lucene Directory concurrency issue > -- > > Key: OAK-8729 > URL: https://issues.apache.org/jira/browse/OAK-8729 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Affects Versions: 1.12.0, 1.14.0, 1.16.0, 1.18.0 >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.20.0 > > Attachments: OAK-8729.patch > > > There is a concurrency issue in the DefaultDirectoryFactory. It is > reproducible sometimes using CopyOnWriteDirectoryTest.copyOnWrite(), if run > in a loop (1000 times). 
The problem is that the MemoryNodeBuilder is used > concurrently: > * thread 1 is closing the directory (after writing to it) > * thread 2 is trying to create a new file > {noformat} > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:525) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.close(OakDirectory.java:264) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.close(BufferedOakDirectory.java:217) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:305) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:362) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:356) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.child(MemoryNodeBuilder.java:342) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.createOutput(OakDirectory.java:214) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.createOutput(BufferedOakDirectory.java:178) > at org.apache.lucene.store.Directory.copy(Directory.java:184) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:322) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:1) > at > 
org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:105) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:1) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-5858) Lucene index may return the wrong result if path is excluded
[ https://issues.apache.org/jira/browse/OAK-5858?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-5858: Fix Version/s: (was: 1.20.0) > Lucene index may return the wrong result if path is excluded > > > Key: OAK-5858 > URL: https://issues.apache.org/jira/browse/OAK-5858 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Reporter: Thomas Mueller >Priority: Major > > If a query uses a Lucene index that has "excludedPaths", the query result may > be wrong (not contain all matching nodes). This is the case even if there is a > property index available for the queried property. Example: > {noformat} > Indexes: > /oak:index/resourceType/type = "property" > /oak:index/lucene/type = "lucene" > /oak:index/lucene/excludedPaths = ["/etc"] > /oak:index/lucene/indexRules/nt:base/properties/resourceType > Query: > /jcr:root/etc//*[jcr:like(@resourceType, "x%y")] > Index cost: > cost for /oak:index/resourceType is 1602.0 > cost for /oak:index/lucene is 1001.0 > Result: > (empty) > Expected result: > /etc/a > /etc/b > {noformat} > Here, the lucene index is picked, even though the query explicitly queries > for /etc, and the lucene index has this path excluded. > I think the lucene index should not be picked in case the index does not > match the query path. -- This message was sent by Atlassian Jira (v8.3.4#803005)
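The suggested behaviour (do not pick the index when the query path falls under an excluded path) could look roughly like the following Java sketch. The class and method names are hypothetical and this is not how Oak's index planner is written; it only illustrates the path check for the example above.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch of an excluded-path check; not Oak's actual planner code. */
final class ExcludedPathSketch {

    /** True if path equals ancestor or lies below it in the tree. */
    static boolean isSameOrDescendant(String ancestor, String path) {
        if (ancestor.equals("/")) {
            return true;
        }
        return path.equals(ancestor) || path.startsWith(ancestor + "/");
    }

    /** The index can only serve the query if the query path is outside every excluded path. */
    static boolean canServe(String queryPath, List<String> excludedPaths) {
        for (String excluded : excludedPaths) {
            if (isSameOrDescendant(excluded, queryPath)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> excludedPaths = Arrays.asList("/etc");
        // Query path restriction from the example: /jcr:root/etc//*
        System.out.println(canServe("/etc", excludedPaths));
        System.out.println(canServe("/content", excludedPaths));
    }
}
```

The check appends "/" before comparing prefixes so that a sibling such as /etcetera is not mistaken for a descendant of /etc.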
[jira] [Updated] (OAK-5980) Bad Join Query Plan Used
[ https://issues.apache.org/jira/browse/OAK-5980?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-5980: Fix Version/s: (was: 1.20.0) > Bad Join Query Plan Used > > > Key: OAK-5980 > URL: https://issues.apache.org/jira/browse/OAK-5980 > Project: Jackrabbit Oak > Issue Type: Bug > Components: query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > > For a join query, where selectors are joined over ischildnode but also can > use an index, > the selectors sometimes use the index instead of the much less > expensive parent join. Example: > {noformat} > select [a].* from [nt:unstructured] as [a] > inner join [nt:unstructured] as [b] on ischildnode([b], [a]) > inner join [nt:unstructured] as [c] on ischildnode([c], [b]) > inner join [nt:unstructured] as [d] on ischildnode([d], [c]) > inner join [nt:unstructured] as [e] on ischildnode([e], [d]) > where [a].[classname] = 'letter' > and isdescendantnode([a], '/content') > and [c].[classname] = 'chapter' > and localname([b]) = 'chapters' > and [e].[classname] = 'list' > and localname([d]) = 'lists' > and [e].[path] = cast('/content/abc' as path) > {noformat} > The order of selectors is sometimes wrong (not e, d, c, b, a), but > more importantly, selectors c and a use the index on className. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-5706) Function based indexes with "like" conditions
[ https://issues.apache.org/jira/browse/OAK-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-5706: Fix Version/s: (was: 1.20.0) > Function based indexes with "like" conditions > - > > Key: OAK-5706 > URL: https://issues.apache.org/jira/browse/OAK-5706 > Project: Jackrabbit Oak > Issue Type: Bug > Components: indexing >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > > Currently, a function-based index is not used when using "like" conditions, > as follows: > {noformat} > /jcr:root//*[jcr:like(fn:lower-case(fn:name()), 'abc%')] > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-5739) Misleading traversal warning for spellcheck queries without index
[ https://issues.apache.org/jira/browse/OAK-5739?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-5739: Fix Version/s: (was: 1.20.0) > Misleading traversal warning for spellcheck queries without index > - > > Key: OAK-5739 > URL: https://issues.apache.org/jira/browse/OAK-5739 > Project: Jackrabbit Oak > Issue Type: Bug > Components: query >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > > In OAK-4313 we avoid traversal for native queries, but we see in some cases > traversal warnings as follows: > {noformat} > org.apache.jackrabbit.oak.query.QueryImpl query plan > [nt:base] as [a] /* traverse "" where (spellcheck([a], 'NothingToFind')) > and (issamenode([a], [/])) */ > org.apache.jackrabbit.oak.query.QueryImpl Traversal query (query without > index): > select [jcr:path], [jcr:score], [rep:spellcheck()] from [nt:base] as a where > spellcheck('NothingToFind') > and issamenode(a, '/') > /* xpath: /jcr:root > [rep:spellcheck('NothingToFind')]/(rep:spellcheck()) */; > consider creating an index > {noformat} > This warning is misleading. If no index is available, then either the query > should fail, or the warning should say that the query result is not correct > because traversal is used. -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Resolved] (OAK-5369) Lucene Property Index: Syntax Error, cannot parse
[ https://issues.apache.org/jira/browse/OAK-5369?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-5369. - Resolution: Won't Fix > Lucene Property Index: Syntax Error, cannot parse > - > > Key: OAK-5369 > URL: https://issues.apache.org/jira/browse/OAK-5369 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.20.0 > > > The following query throws an exception in Apache Lucene: > {noformat} > /jcr:root//*[jcr:contains(., 'hello -- world')] > 22.12.2016 16:42:54.511 *WARN* [qtp1944702753-3846] > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex query via > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex@1c0006db > failed. > java.lang.RuntimeException: INVALID_SYNTAX_CANNOT_PARSE: Syntax Error, cannot > parse hello -- world: > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.tokenToQuery(LucenePropertyIndex.java:1450) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.tokenToQuery(LucenePropertyIndex.java:1418) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.access$900(LucenePropertyIndex.java:180) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$3.visitTerm(LucenePropertyIndex.java:1353) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$3.visit(LucenePropertyIndex.java:1307) > at > org.apache.jackrabbit.oak.query.fulltext.FullTextContains.accept(FullTextContains.java:63) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.getFullTextQuery(LucenePropertyIndex.java:1303) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.getLuceneRequest(LucenePropertyIndex.java:791) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex.access$300(LucenePropertyIndex.java:180) > at > 
org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.loadDocs(LucenePropertyIndex.java:375) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:317) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$1.computeNext(LucenePropertyIndex.java:306) > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > at > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$LucenePathCursor$1.hasNext(LucenePropertyIndex.java:1571) > at com.google.common.collect.Iterators$7.computeNext(Iterators.java:645) > at > com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) > at > com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) > at > org.apache.jackrabbit.oak.spi.query.Cursors$PathCursor.hasNext(Cursors.java:205) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LucenePropertyIndex$LucenePathCursor.hasNext(LucenePropertyIndex.java:1595) > at > org.apache.jackrabbit.oak.query.ast.SelectorImpl.next(SelectorImpl.java:420) > at > org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.fetchNext(QueryImpl.java:828) > at > org.apache.jackrabbit.oak.query.QueryImpl$RowIterator.hasNext(QueryImpl.java:853) > at > org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.fetch(QueryResultImpl.java:98) > at > org.apache.jackrabbit.oak.jcr.query.QueryResultImpl$1.(QueryResultImpl.java:94) > at > org.apache.jackrabbit.oak.jcr.query.QueryResultImpl.getRows(QueryResultImpl.java:78) > Caused by: > org.apache.lucene.queryparser.flexible.standard.parser.ParseException: Syntax > Error, cannot parse hello -- world: > at > org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.generateParseException(StandardSyntaxParser.java:1054) > at > 
org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.jj_consume_token(StandardSyntaxParser.java:936) > at > org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.Clause(StandardSyntaxParser.java:486) > at > org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.ModClause(StandardSyntaxParser.java:303) > at > org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.ConjQuery(StandardSyntaxParser.java:234) > at > org.apache.lucene.queryparser.flexible.standard.parser.StandardSyntaxParser.DisjQuery(StandardSyntaxParser.java:204) > at > org.apache.lucene.queryparser.flexible.st
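Since this issue was resolved as Won't Fix, callers that pass free-form user input to jcr:contains may want to clean it up themselves. The following is only a minimal client-side sketch under that assumption: the class and method names are hypothetical and not part of Oak or Lucene, and it merely drops tokens that consist solely of minus signs (such as the "--" in the failing query) before the text is placed into the query.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not an Oak or Lucene API.
final class FulltextInputSanitizer {

    // Removes whitespace-separated tokens made up only of '-' characters,
    // e.g. the "--" in "hello -- world". Other tokens are kept unchanged,
    // including ones that merely start with '-' (such as "-foo").
    static String dropBareMinusTokens(String text) {
        List<String> kept = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty() && !token.matches("-+")) {
                kept.add(token);
            }
        }
        return String.join(" ", kept);
    }

    public static void main(String[] args) {
        // The input from the issue description.
        System.out.println(dropBareMinusTokens("hello -- world"));
    }
}
```

This does not address other input that the Lucene parser may reject; it only covers the bare "--" case reported here.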
[jira] [Updated] (OAK-3866) Sorting on relative properties doesn't work in Solr
[ https://issues.apache.org/jira/browse/OAK-3866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-3866:
    Fix Version/s: (was: 1.20.0)

> Sorting on relative properties doesn't work in Solr
> ---
>
> Key: OAK-3866
> URL: https://issues.apache.org/jira/browse/OAK-3866
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: solr
> Affects Versions: 1.0.22, 1.2.9, 1.3.13
> Reporter: Tommaso Teofili
> Assignee: Tommaso Teofili
> Priority: Major
>
> Executing a query like
> {noformat}
> /jcr:root/content/foo//*[(@sling:resourceType = 'x' or @sling:resourceType = 'y') and jcr:contains(., 'bar*~')] order by jcr:content/@jcr:primaryType descending
> {noformat}
> would assume sorting on the _jcr:primaryType_ property of resulting nodes' _jcr:content_ children.
> That is currently not supported in Solr, while it is in Lucene as the latter supports index time aggregation.
> We should inspect if it's possible to extend support for Solr too, most probably via index time aggregation.
> The query should not fail but at least log a warning about that limitation for the time being.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-3437) Regression in org.apache.jackrabbit.core.query.JoinTest#testJoinWithOR5 when enabling OAK-1617
[ https://issues.apache.org/jira/browse/OAK-3437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-3437:
    Fix Version/s: (was: 1.20.0)

> Regression in org.apache.jackrabbit.core.query.JoinTest#testJoinWithOR5 when enabling OAK-1617
> --
>
> Key: OAK-3437
> URL: https://issues.apache.org/jira/browse/OAK-3437
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: solr
> Reporter: Davide Giannella
> Assignee: Tommaso Teofili
> Priority: Major
>
> When enabling OAK-1617 (still to be committed) there's a regression in the {{oak-solr-core}} unit tests
> - {{org.apache.jackrabbit.core.query.JoinTest#testJoinWithOR3}}
> - {{org.apache.jackrabbit.core.query.JoinTest#testJoinWithOR4}}
> - {{org.apache.jackrabbit.core.query.JoinTest#testJoinWithOR5}}
> The WIP of the feature can be found in https://github.com/davidegiannella/jackrabbit-oak/tree/OAK-1617 and a full patch will be attached shortly for review in OAK-1617 itself.
> The feature is currently disabled, in order to enable it for unit testing an approach like this can be taken
> https://github.com/davidegiannella/jackrabbit-oak/blob/177df1a8073b1237857267e23d12a433e3d890a4/oak-core/src/test/java/org/apache/jackrabbit/oak/query/SQL2OptimiseQueryTest.java#L142
> or setting the system property {{-Doak.query.sql2optimisation}}.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-6387) Building an index (new index + reindex): temporarily store blob references
[ https://issues.apache.org/jira/browse/OAK-6387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-6387:
    Fix Version/s: (was: 1.20.0)

> Building an index (new index + reindex): temporarily store blob references
> --
>
> Key: OAK-6387
> URL: https://issues.apache.org/jira/browse/OAK-6387
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: lucene, query
> Reporter: Thomas Mueller
> Priority: Major
>
> If reindexing a Lucene index takes multiple days, and if datastore garbage collection (DSGC) is run during that time, then DSGC may remove binaries of that index because they are not referenced.
> It would be good if all binaries that are needed, and that are older than (for example) one hour, are referenced during reindexing (for example in a temporary location), so that DSGC will not remove them.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-6597) rep:excerpt not working for content indexed by aggregation in lucene
[ https://issues.apache.org/jira/browse/OAK-6597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-6597:
    Fix Version/s: (was: 1.20.0)

> rep:excerpt not working for content indexed by aggregation in lucene
>
> Key: OAK-6597
> URL: https://issues.apache.org/jira/browse/OAK-6597
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: lucene
> Affects Versions: 1.6.1, 1.7.6, 1.8.0
> Reporter: Dirk Rudolph
> Assignee: Chetan Mehrotra
> Priority: Major
> Labels: excerpt
> Attachments: excerpt-with-aggregation-test.patch
>
> I mentioned that properties that got indexed due to an aggregation are not considered for excerpts (highlighting) as they are not indexed as stored fields.
> See the attached patch that implements a test for excerpts in {{LuceneIndexAggregationTest2}}.
> It creates the following structure:
> {code}
> /content/foo [test:Page]
>   + bar (String)
>   - jcr:content [test:PageContent]
>     + bar (String)
> {code}
> where both strings (the _bar_ property at _foo_ and the _bar_ property at _jcr:content_) contain different text.
> Afterwards it queries for 2 terms ("tinc*" and "aliq*") that either exist in _/content/foo/bar_ or _/content/foo/jcr:content/bar_ but not in both. For the former one the excerpt is properly provided; for the latter one it isn't.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-7166) Union with different selector names
[ https://issues.apache.org/jira/browse/OAK-7166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-7166:
    Fix Version/s: (was: 1.20.0)

> Union with different selector names
> ---
>
> Key: OAK-7166
> URL: https://issues.apache.org/jira/browse/OAK-7166
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: query
> Reporter: Thomas Mueller
> Assignee: Thomas Mueller
> Priority: Major
>
> The following query returns the wrong nodes:
> {noformat}
> /jcr:root/libs/(* | */* | */*/* | */*/*/* | */*/*/*/*)/install
> select b.[jcr:path] as [jcr:path], b.[jcr:score] as [jcr:score], b.* from [nt:base] as a
> inner join [nt:base] as b on ischildnode(b, a)
> where ischildnode(a, '/libs') and name(b) = 'install'
> union select c.[jcr:path] as [jcr:path], c.[jcr:score] as [jcr:score], c.* from [nt:base] as a
> inner join [nt:base] as b on ischildnode(b, a)
> inner join [nt:base] as c on ischildnode(c, b)
> where ischildnode(a, '/libs') and name(c) = 'install'
> union select d.[jcr:path] as [jcr:path], d.[jcr:score] as [jcr:score], d.* from [nt:base] as a
> inner join [nt:base] as b on ischildnode(b, a)
> inner join [nt:base] as c on ischildnode(c, b)
> inner join [nt:base] as d on ischildnode(d, c)
> where ischildnode(a, '/libs') and name(d) = 'install'
> {noformat}
> If I change the selector name to "x" in each subquery, then it works. There is no XPath version of this workaround:
> {noformat}
> select x.[jcr:path] as [jcr:path], x.[jcr:score] as [jcr:score], x.* from [nt:base] as a
> inner join [nt:base] as x on ischildnode(x, a)
> where ischildnode(a, '/libs') and name(x) = 'install'
> union select x.[jcr:path] as [jcr:path], x.[jcr:score] as [jcr:score], x.* from [nt:base] as a
> inner join [nt:base] as b on ischildnode(b, a)
> inner join [nt:base] as x on ischildnode(x, b)
> where ischildnode(a, '/libs') and name(x) = 'install'
> union select x.[jcr:path] as [jcr:path], x.[jcr:score] as [jcr:score], x.* from [nt:base] as a
> inner join [nt:base] as b on ischildnode(b, a)
> inner join [nt:base] as c on ischildnode(c, b)
> inner join [nt:base] as x on ischildnode(x, c)
> where ischildnode(a, '/libs') and name(x) = 'install'
> {noformat}
> Need to check if this is an Oak bug, or a bug in the query tool I use.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-7263) oak-lucene should not depend on oak-store-document
[ https://issues.apache.org/jira/browse/OAK-7263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-7263: Fix Version/s: (was: 1.20.0) > oak-lucene should not depend on oak-store-document > -- > > Key: OAK-7263 > URL: https://issues.apache.org/jira/browse/OAK-7263 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Reporter: Robert Munteanu >Priority: Major > > {{oak-lucene}} has a hard dependency on {{oak-store-document}} and that looks > wrong to me. > {noformat}[ERROR] Failed to execute goal > org.apache.maven.plugins:maven-compiler-plugin:3.7.0:compile > (default-compile) on project oak-lucene: Compilation failure: Compilation > failure: > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneDocumentHolder.java:[31,54] > package org.apache.jackrabbit.oak.plugins.document.spi does not exist > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneDocumentHolder.java:[37,46] > cannot find symbol > [ERROR] symbol: class JournalProperty > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilder.java:[33,54] > package org.apache.jackrabbit.oak.plugins.document.spi does not exist > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilder.java:[34,54] > package org.apache.jackrabbit.oak.plugins.document.spi does not exist > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilder.java:[38,47] > cannot find symbol > [ERROR] symbol: class JournalPropertyBuilder > [ERROR] > 
/home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilder.java:[106,12] > cannot find symbol > [ERROR] symbol: class JournalProperty > [ERROR] location: class > org.apache.jackrabbit.oak.plugins.index.lucene.hybrid.LuceneJournalPropertyBuilder > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/LuceneIndexProviderService.java:[55,54] > package org.apache.jackrabbit.oak.plugins.document.spi does not exist > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/IndexedPaths.java:[29,54] > package org.apache.jackrabbit.oak.plugins.document.spi does not exist > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/IndexedPaths.java:[33,31] > cannot find symbol > [ERROR] symbol: class JournalProperty > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyService.java:[22,54] > package org.apache.jackrabbit.oak.plugins.document.spi does not exist > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyService.java:[23,54] > package org.apache.jackrabbit.oak.plugins.document.spi does not exist > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyService.java:[25,54] > cannot find symbol > [ERROR] symbol: class JournalPropertyService > [ERROR] > 
/home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyService.java:[33,12] > cannot find symbol > [ERROR] symbol: class JournalPropertyBuilder > [ERROR] location: class > org.apache.jackrabbit.oak.plugins.index.lucene.hybrid.LuceneJournalPropertyService > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilder.java:[50,5] > method does not override or implement a method from a supertype > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilder.java:[61,5] > method does not override or implement a method from a supertype > [ERROR] > /home/robert/Documents/sources/apache/jackrabbit-oak/oak-
[jira] [Commented] (OAK-7370) order by jcr:score desc doesn't work across union query created by optimizing OR clauses
[ https://issues.apache.org/jira/browse/OAK-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968356#comment-16968356 ]

Thomas Mueller commented on OAK-7370:

Thanks [~catholicon]! I removed the fix version.

> order by jcr:score desc doesn't work across union query created by optimizing OR clauses
>
> Key: OAK-7370
> URL: https://issues.apache.org/jira/browse/OAK-7370
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: core
> Reporter: Vikas Saurabh
> Assignee: Thomas Mueller
> Priority: Major
>
> Merging of sub-queries created due to optimizing OR clauses doesn't work for sorting on {{jcr:score}}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-7370) order by jcr:score desc doesn't work across union query created by optimizing OR clauses
[ https://issues.apache.org/jira/browse/OAK-7370?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Thomas Mueller updated OAK-7370:
    Fix Version/s: (was: 1.20.0)

> order by jcr:score desc doesn't work across union query created by optimizing OR clauses
>
> Key: OAK-7370
> URL: https://issues.apache.org/jira/browse/OAK-7370
> Project: Jackrabbit Oak
> Issue Type: Bug
> Components: core
> Reporter: Vikas Saurabh
> Assignee: Thomas Mueller
> Priority: Major
>
> Merging of sub-queries created due to optimizing OR clauses doesn't work for sorting on {{jcr:score}}

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8673) Determine and possibly adjust size of eagerCacheSize
[ https://issues.apache.org/jira/browse/OAK-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16968163#comment-16968163 ]

Thomas Mueller commented on OAK-8673:

[~angela] Thanks! One more question: In the issue description, you write "we almost never benefit from the lazy permission evaluation (compared to reading all permission entries right away)". I assume you mean lazy permission evaluation isn't _faster_ than reading all permission entries right away, right? If so, is it a lot _slower_?

There are two points I want to make:
* We should understand why it does / does not impact performance - this is important to be able to have a somewhat accurate mental model
* Maybe it has an impact on memory usage? So we could say let's keep lazy evaluation to save memory? How much?

If the answer is: lazy evaluation doesn't save any memory and doesn't have any memory impact, then we can probably simplify the code (to never or always do lazy evaluation, whatever is simpler).

> Determine and possibly adjust size of eagerCacheSize
>
> Key: OAK-8673
> URL: https://issues.apache.org/jira/browse/OAK-8673
> Project: Jackrabbit Oak
> Issue Type: Technical task
> Components: core, security
> Reporter: Angela Schreiber
> Assignee: Angela Schreiber
> Priority: Major
>
> The initial results of the {{EagerCacheSizeTest}} seem to indicate that we almost never benefit from the lazy permission evaluation (compared to reading all permission entries right away). From my understanding of the results the only exception are those cases where only very few items are being accessed (e.g. reading 100 items).
> However, I am not totally sure if this is not an artifact of the random-read.
> I therefore started extending the benchmark with an option to re-read a randomly picked item more than once, which according to some analysis done quite some time ago is a common scenario, especially when using Oak in combination with Apache Sling.
> Benchmarks with 10-times re-reading the same random item:
> As I would have expected it seems that the negative impact of lazy-loading is somewhat reduced, as the re-reading will hit the cache populated while reading.
> Results are attached to OAK-8662 (possibly more to come).

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8162) When query with OR is divided into union of queries, options (like index tag) are not passed into subqueries.
[ https://issues.apache.org/jira/browse/OAK-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964823#comment-16964823 ] Thomas Mueller commented on OAK-8162: - [~reschke] you are right, it would be good to backport this to Oak 1.10 and 1.8. I don't think Oak 1.6 is needed, as it doesn't support index tags. Do you want me to do this? > When query with OR is divided into union of queries, options (like index tag) > are not passed into subqueries. > -- > > Key: OAK-8162 > URL: https://issues.apache.org/jira/browse/OAK-8162 > Project: Jackrabbit Oak > Issue Type: Bug > Components: core >Affects Versions: 1.10.2, 1.8.17 >Reporter: Piotr Tajduś >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.14.0 > > > When query with OR is divided into union of queries, options (like index tag) > are not passed into subqueries - in effect alternative query sometimes f.e. > uses indexes it shouldn't use. > {noformat} > org.apache.jackrabbit.oak.query.QueryImpl.buildAlternativeQuery() > org.apache.jackrabbit.oak.query.QueryImpl.copyOf() > > 2019-03-21 16:32:25,600 DEBUG > [org.apache.jackrabbit.oak.query.QueryEngineImpl] (default task-1) Parsing > JCR-SQL2 statement: select distinct d.* from [crkid:document] as d where > ([d].[metadane/inneMetadane/*/wartosc] = 'AX' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') or > ([d].[metadane/inneMetadane/*/wartosc] = 'AB' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') option(index tag > crkid_dokument_month_2019_3) > 2019-03-21 16:32:25,607 DEBUG [org.apache.jackrabbit.oak.query.QueryImpl] > (default task-1) cost using filter Filter(query=select distinct d.* from > [crkid:document] as d where ([d].[metadane/inneMetadane/*/wartosc] = 'AB') > and ([d].[metadane/inneMetadane/*/klucz] = 'InnyKod'), path=*, > property=[metadane/inneMetadane/*/klucz=[InnyKod], > metadane/inneMetadane/*/wartosc=[AB]]) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
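The failure mode described in this issue (options lost when an OR query is split into a union of subqueries) can be illustrated with a toy model. This is only a sketch under stated assumptions: ToyQuery, copyWithCondition and splitOnOr are hypothetical names and do not reflect the real QueryImpl.copyOf() or buildAlternativeQuery() code, and the string split on "or" is deliberately naive.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; not Oak's QueryImpl.
final class ToyQuery {
    final String condition;
    final String indexTag; // query option; null if none was given

    ToyQuery(String condition, String indexTag) {
        this.condition = condition;
        this.indexTag = indexTag;
    }

    // Copies the query with a new condition and carries the option along.
    // Forgetting to pass indexTag here is the kind of omission the issue describes.
    ToyQuery copyWithCondition(String newCondition) {
        return new ToyQuery(newCondition, indexTag);
    }

    // Naively splits "a or b" into one subquery per branch (illustration only;
    // a real parser would work on the query syntax tree, not on the text).
    List<ToyQuery> splitOnOr() {
        List<ToyQuery> result = new ArrayList<>();
        for (String branch : condition.split("\\s+or\\s+")) {
            result.add(copyWithCondition(branch.trim()));
        }
        return result;
    }

    public static void main(String[] args) {
        ToyQuery q = new ToyQuery("k = 'AX' or k = 'AB'", "crkid_dokument_month_2019_3");
        for (ToyQuery sub : q.splitOnOr()) {
            System.out.println(sub.condition + " option(index tag " + sub.indexTag + ")");
        }
    }
}
```

With the option carried in the copy, each branch of the union is restricted to the same index tag as the original query.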
[jira] [Updated] (OAK-8162) When query with OR is divided into union of queries, options (like index tag) are not passed into subqueries.
[ https://issues.apache.org/jira/browse/OAK-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8162: Labels: candidate_oak_1_10 candidate_oak_1_8 (was: ) > When query with OR is divided into union of queries, options (like index tag) > are not passed into subqueries. > -- > > Key: OAK-8162 > URL: https://issues.apache.org/jira/browse/OAK-8162 > Project: Jackrabbit Oak > Issue Type: Bug > Components: core >Affects Versions: 1.10.2, 1.8.17 >Reporter: Piotr Tajduś >Assignee: Thomas Mueller >Priority: Major > Labels: candidate_oak_1_10, candidate_oak_1_8 > Fix For: 1.14.0 > > > When query with OR is divided into union of queries, options (like index tag) > are not passed into subqueries - in effect alternative query sometimes f.e. > uses indexes it shouldn't use. > {noformat} > org.apache.jackrabbit.oak.query.QueryImpl.buildAlternativeQuery() > org.apache.jackrabbit.oak.query.QueryImpl.copyOf() > > 2019-03-21 16:32:25,600 DEBUG > [org.apache.jackrabbit.oak.query.QueryEngineImpl] (default task-1) Parsing > JCR-SQL2 statement: select distinct d.* from [crkid:document] as d where > ([d].[metadane/inneMetadane/*/wartosc] = 'AX' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') or > ([d].[metadane/inneMetadane/*/wartosc] = 'AB' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') option(index tag > crkid_dokument_month_2019_3) > 2019-03-21 16:32:25,607 DEBUG [org.apache.jackrabbit.oak.query.QueryImpl] > (default task-1) cost using filter Filter(query=select distinct d.* from > [crkid:document] as d where ([d].[metadane/inneMetadane/*/wartosc] = 'AB') > and ([d].[metadane/inneMetadane/*/klucz] = 'InnyKod'), path=*, > property=[metadane/inneMetadane/*/klucz=[InnyKod], > metadane/inneMetadane/*/wartosc=[AB]]) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8162) When query with OR is divided into union of queries, options (like index tag) are not passed into subqueries.
[ https://issues.apache.org/jira/browse/OAK-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8162: Affects Version/s: (was: 1.6.18) > When query with OR is divided into union of queries, options (like index tag) > are not passed into subqueries. > -- > > Key: OAK-8162 > URL: https://issues.apache.org/jira/browse/OAK-8162 > Project: Jackrabbit Oak > Issue Type: Bug > Components: core >Affects Versions: 1.10.2, 1.8.17 >Reporter: Piotr Tajduś >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.14.0 > > > When query with OR is divided into union of queries, options (like index tag) > are not passed into subqueries - in effect alternative query sometimes f.e. > uses indexes it shouldn't use. > {noformat} > org.apache.jackrabbit.oak.query.QueryImpl.buildAlternativeQuery() > org.apache.jackrabbit.oak.query.QueryImpl.copyOf() > > 2019-03-21 16:32:25,600 DEBUG > [org.apache.jackrabbit.oak.query.QueryEngineImpl] (default task-1) Parsing > JCR-SQL2 statement: select distinct d.* from [crkid:document] as d where > ([d].[metadane/inneMetadane/*/wartosc] = 'AX' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') or > ([d].[metadane/inneMetadane/*/wartosc] = 'AB' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') option(index tag > crkid_dokument_month_2019_3) > 2019-03-21 16:32:25,607 DEBUG [org.apache.jackrabbit.oak.query.QueryImpl] > (default task-1) cost using filter Filter(query=select distinct d.* from > [crkid:document] as d where ([d].[metadane/inneMetadane/*/wartosc] = 'AB') > and ([d].[metadane/inneMetadane/*/klucz] = 'InnyKod'), path=*, > property=[metadane/inneMetadane/*/klucz=[InnyKod], > metadane/inneMetadane/*/wartosc=[AB]]) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8162) When query with OR is divided into union of queries, options (like index tag) are not passed into subqueries.
[ https://issues.apache.org/jira/browse/OAK-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8162: Affects Version/s: 1.6.18 1.8.17 > When query with OR is divided into union of queries, options (like index tag) > are not passed into subqueries. > -- > > Key: OAK-8162 > URL: https://issues.apache.org/jira/browse/OAK-8162 > Project: Jackrabbit Oak > Issue Type: Bug > Components: core >Affects Versions: 1.10.2, 1.6.18, 1.8.17 >Reporter: Piotr Tajduś >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.14.0 > > > When query with OR is divided into union of queries, options (like index tag) > are not passed into subqueries - in effect alternative query sometimes f.e. > uses indexes it shouldn't use. > {noformat} > org.apache.jackrabbit.oak.query.QueryImpl.buildAlternativeQuery() > org.apache.jackrabbit.oak.query.QueryImpl.copyOf() > > 2019-03-21 16:32:25,600 DEBUG > [org.apache.jackrabbit.oak.query.QueryEngineImpl] (default task-1) Parsing > JCR-SQL2 statement: select distinct d.* from [crkid:document] as d where > ([d].[metadane/inneMetadane/*/wartosc] = 'AX' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') or > ([d].[metadane/inneMetadane/*/wartosc] = 'AB' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') option(index tag > crkid_dokument_month_2019_3) > 2019-03-21 16:32:25,607 DEBUG [org.apache.jackrabbit.oak.query.QueryImpl] > (default task-1) cost using filter Filter(query=select distinct d.* from > [crkid:document] as d where ([d].[metadane/inneMetadane/*/wartosc] = 'AB') > and ([d].[metadane/inneMetadane/*/klucz] = 'InnyKod'), path=*, > property=[metadane/inneMetadane/*/klucz=[InnyKod], > metadane/inneMetadane/*/wartosc=[AB]]) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8673) Determine and possibly adjust size of eagerCacheSize
[ https://issues.apache.org/jira/browse/OAK-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964764#comment-16964764 ]

Thomas Mueller commented on OAK-8673:

> 0 should be possible can run those in addition

I would probably do that, and check if it really works as expected (the cache is really empty). Or maybe hardcode some logic that means if 0, then don't use the cache (might be a bit hard).

> the lazy-loading doesn't seem to have a beneficial effect (except for reading really few items, which in AEM is rarely the case)

Do you assume that with a small EagerCacheSize, lazy loading isn't used at all? I don't know the code, but it sounds like it's better to somehow disable the lazy loading logic, in order to be sure it's not used by some unexpected code path.

> Determine and possibly adjust size of eagerCacheSize
>
> Key: OAK-8673
> URL: https://issues.apache.org/jira/browse/OAK-8673
> Project: Jackrabbit Oak
> Issue Type: Technical task
> Components: core, security
> Reporter: Angela Schreiber
> Assignee: Angela Schreiber
> Priority: Major
>
> The initial results of the {{EagerCacheSizeTest}} seem to indicate that we almost never benefit from the lazy permission evaluation (compared to reading all permission entries right away). From my understanding of the results the only exception are those cases where only very few items are being accessed (e.g. reading 100 items).
> However, I am not totally sure if this is not an artifact of the random-read.
> I therefore started extending the benchmark with an option to re-read a randomly picked item more than once, which according to some analysis done quite some time ago is a common scenario, especially when using Oak in combination with Apache Sling.
> Results are attached to OAK-8662 (possibly more to come).

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8729) Lucene Directory concurrency issue
[ https://issues.apache.org/jira/browse/OAK-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964741#comment-16964741 ] Thomas Mueller commented on OAK-8729: - I tried writing a special test case, but it is not easy... I could sometimes reproduce the issue, but only if the existing test is run many times, and only when instrumenting the MemoryNodeBuilder. > Lucene Directory concurrency issue > -- > > Key: OAK-8729 > URL: https://issues.apache.org/jira/browse/OAK-8729 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Affects Versions: 1.12.0, 1.14.0, 1.16.0, 1.18.0 >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.20.0 > > Attachments: OAK-8729.patch > > > There is a concurrency issue in the DefaultDirectoryFactory. It is > reproducible sometimes using CopyOnWriteDirectoryTest.copyOnWrite(), if run > in a loop (1000 times). The problem is that the MemoryNodeBuilder is used > concurrently: > * thread 1 is closing the directory (after writing to it) > * thread 2 is trying to create a new file > {noformat} > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:525) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.close(OakDirectory.java:264) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.close(BufferedOakDirectory.java:217) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:305) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > 
org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:362) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:356) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.child(MemoryNodeBuilder.java:342) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.createOutput(OakDirectory.java:214) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.createOutput(BufferedOakDirectory.java:178) > at org.apache.lucene.store.Directory.copy(Directory.java:184) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:322) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:1) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:105) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:1) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
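The two stack traces above show one thread closing the directory while another creates a file, both mutating the same builder. The attached patch is not shown in this thread, so the following is not that fix; it is only a minimal sketch of one common remedy, serializing both operations on a single lock. GuardedDirectory and its methods are hypothetical names, and a plain HashMap stands in for the non-thread-safe MemoryNodeBuilder.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a directory backed by a non-thread-safe builder.
final class GuardedDirectory {
    // HashMap plays the role of the shared, non-thread-safe state.
    private final Map<String, String> state = new HashMap<>();
    private boolean closed;

    // Creates a file entry; refused once the directory is closed.
    synchronized boolean createOutput(String name) {
        if (closed) {
            return false;
        }
        state.put(name, "file");
        return true;
    }

    // Writes a final marker (mirrors close() calling setProperty in the trace)
    // and blocks further writes. Runs under the same lock as createOutput.
    synchronized void close() {
        if (!closed) {
            state.put(":closed", "true");
            closed = true;
        }
    }

    synchronized int entryCount() {
        return state.size();
    }

    public static void main(String[] args) throws InterruptedException {
        GuardedDirectory dir = new GuardedDirectory();
        Thread writer = new Thread(() -> {
            for (int i = 0; i < 1000; i++) {
                dir.createOutput("file-" + i);
            }
        });
        writer.start();
        dir.close(); // may interleave with the writer, but never mid-update
        writer.join();
        System.out.println("entries: " + dir.entryCount());
    }
}
```

The trade-off is the usual one for coarse locking: correctness is easy to reason about, but close() and createOutput() can no longer overlap at all.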
[jira] [Commented] (OAK-8729) Lucene Directory concurrency issue
[ https://issues.apache.org/jira/browse/OAK-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964739#comment-16964739 ] Thomas Mueller commented on OAK-8729: - Attached a patch for review, [~catholicon] [~nitigupt][~tihom88]. > Lucene Directory concurrency issue > -- > > Key: OAK-8729 > URL: https://issues.apache.org/jira/browse/OAK-8729 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Affects Versions: 1.12.0, 1.14.0, 1.16.0, 1.18.0 >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.20.0 > > Attachments: OAK-8729.patch > > > There is a concurrency issue in the DefaultDirectoryFactory. It is > reproducible sometimes using CopyOnWriteDirectoryTest.copyOnWrite(), if run > in a loop (1000 times). The problem is that the MemoryNodeBuilder is used > concurrently: > * thread 1 is closing the directory (after writing to it) > * thread 2 is trying to create a new file > {noformat} > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:525) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.close(OakDirectory.java:264) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.close(BufferedOakDirectory.java:217) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:305) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:362) > at > 
org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:356) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.child(MemoryNodeBuilder.java:342) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.createOutput(OakDirectory.java:214) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.createOutput(BufferedOakDirectory.java:178) > at org.apache.lucene.store.Directory.copy(Directory.java:184) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:322) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:1) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:105) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:1) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8729) Lucene Directory concurrency issue
[ https://issues.apache.org/jira/browse/OAK-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8729: Attachment: OAK-8729.patch > Lucene Directory concurrency issue > -- > > Key: OAK-8729 > URL: https://issues.apache.org/jira/browse/OAK-8729 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Affects Versions: 1.12.0, 1.14.0, 1.16.0, 1.18.0 >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.20.0 > > Attachments: OAK-8729.patch > > > There is a concurrency issue in the DefaultDirectoryFactory. It is > reproducible sometimes using CopyOnWriteDirectoryTest.copyOnWrite(), if run > in a loop (1000 times). The problem is that the MemoryNodeBuilder is used > concurrently: > * thread 1 is closing the directory (after writing to it) > * thread 2 is trying to create a new file > {noformat} > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:525) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.close(OakDirectory.java:264) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.close(BufferedOakDirectory.java:217) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:305) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:362) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:356) > at > 
org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.child(MemoryNodeBuilder.java:342) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.createOutput(OakDirectory.java:214) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.createOutput(BufferedOakDirectory.java:178) > at org.apache.lucene.store.Directory.copy(Directory.java:184) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:322) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:1) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:105) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:1) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8729) Lucene Directory concurrency issue
[ https://issues.apache.org/jira/browse/OAK-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8729: Affects Version/s: 1.12.0 1.14.0 1.16.0 1.18.0 > Lucene Directory concurrency issue > -- > > Key: OAK-8729 > URL: https://issues.apache.org/jira/browse/OAK-8729 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Affects Versions: 1.12.0, 1.14.0, 1.16.0, 1.18.0 >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > > There is a concurrency issue in the DefaultDirectoryFactory. It is > reproducible sometimes using CopyOnWriteDirectoryTest.copyOnWrite(), if run > in a loop (1000 times). The problem is that the MemoryNodeBuilder is used > concurrently: > * thread 1 is closing the directory (after writing to it) > * thread 2 is trying to create a new file > {noformat} > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:525) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.close(OakDirectory.java:264) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.close(BufferedOakDirectory.java:217) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:305) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:362) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:356) > at > 
org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.child(MemoryNodeBuilder.java:342) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.createOutput(OakDirectory.java:214) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.createOutput(BufferedOakDirectory.java:178) > at org.apache.lucene.store.Directory.copy(Directory.java:184) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:322) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:1) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:105) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:1) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8729) Lucene Directory concurrency issue
[ https://issues.apache.org/jira/browse/OAK-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8729: Fix Version/s: 1.20.0 > Lucene Directory concurrency issue > -- > > Key: OAK-8729 > URL: https://issues.apache.org/jira/browse/OAK-8729 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Affects Versions: 1.12.0, 1.14.0, 1.16.0, 1.18.0 >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.20.0 > > > There is a concurrency issue in the DefaultDirectoryFactory. It is > reproducible sometimes using CopyOnWriteDirectoryTest.copyOnWrite(), if run > in a loop (1000 times). The problem is that the MemoryNodeBuilder is used > concurrently: > * thread 1 is closing the directory (after writing to it) > * thread 2 is trying to create a new file > {noformat} > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:525) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.close(OakDirectory.java:264) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.close(BufferedOakDirectory.java:217) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:305) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:362) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:356) > at > 
org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.child(MemoryNodeBuilder.java:342) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.createOutput(OakDirectory.java:214) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.createOutput(BufferedOakDirectory.java:178) > at org.apache.lucene.store.Directory.copy(Directory.java:184) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:322) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:1) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:105) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:1) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Assigned] (OAK-8729) Lucene Directory concurrency issue
[ https://issues.apache.org/jira/browse/OAK-8729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller reassigned OAK-8729: --- Assignee: Thomas Mueller > Lucene Directory concurrency issue > -- > > Key: OAK-8729 > URL: https://issues.apache.org/jira/browse/OAK-8729 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene >Reporter: Thomas Mueller >Assignee: Thomas Mueller >Priority: Major > > There is a concurrency issue in the DefaultDirectoryFactory. It is > reproducible sometimes using CopyOnWriteDirectoryTest.copyOnWrite(), if run > in a loop (1000 times). The problem is that the MemoryNodeBuilder is used > concurrently: > * thread 1 is closing the directory (after writing to it) > * thread 2 is trying to create a new file > {noformat} > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:525) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.close(OakDirectory.java:264) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.close(BufferedOakDirectory.java:217) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:305) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:362) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:356) > at > org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.child(MemoryNodeBuilder.java:342) > at > 
org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.createOutput(OakDirectory.java:214) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.createOutput(BufferedOakDirectory.java:178) > at org.apache.lucene.store.Directory.copy(Directory.java:184) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:322) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:1) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:105) > at > org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:1) > at java.util.concurrent.FutureTask.run(FutureTask.java:266) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Commented] (OAK-8673) Determine and possibly adjust size of eagerCacheSize
[ https://issues.apache.org/jira/browse/OAK-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964729#comment-16964729 ] Thomas Mueller commented on OAK-8673: - > the threshold to move from eagerly-loading all permission entries to lazy > loading is defined by the EagerCacheSize. So, maybe test with EagerCacheSize = 0, or (if that's not possible) 1? > Determine and possibly adjust size of eagerCacheSize > > > Key: OAK-8673 > URL: https://issues.apache.org/jira/browse/OAK-8673 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: core, security >Reporter: Angela Schreiber >Assignee: Angela Schreiber >Priority: Major > > The initial results of the {{EagerCacheSizeTest}} seem to indicate that we > almost never benefit from the lazy permission evaluation (compared to reading > all permission entries right away). From my understanding of the results the > only exceptions are those cases where only very few items are being accessed > (e.g. reading 100 items). > However, I am not totally sure if this is not an artifact of the random-read. > I therefore started extending the benchmark with an option to re-read a > randomly picked item more than once, which according to some analysis done > quite some time ago is a common scenario especially when using Oak in > combination with Apache Sling. > Results are attached to OAK-8662 (possibly more to come). -- This message was sent by Atlassian Jira (v8.3.4#803005)
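The threshold idea in the comment above can be illustrated with a minimal sketch. This is not Oak's permission code: the class name, the enum, and the exact comparison (eager while the entry count is within the threshold) are assumptions made only for illustration. It shows why a threshold of 0 would push every non-empty set of permission entries to lazy loading, which is what the suggested test setting relies on.

```java
// Hypothetical illustration only; not Oak's permission provider.
final class LoadingStrategySketch {

    enum Strategy { EAGER, LAZY }

    // Assumption: entries are loaded eagerly while their number does not
    // exceed the threshold, and lazily once it does.
    static Strategy choose(long entryCount, long eagerCacheSize) {
        return entryCount <= eagerCacheSize ? Strategy.EAGER : Strategy.LAZY;
    }

    public static void main(String[] args) {
        // With a threshold of 0, a single entry is already enough to go lazy.
        System.out.println(choose(1, 0));
        // With a larger threshold, the same entry count is loaded eagerly.
        System.out.println(choose(1, 250));
    }
}
```

Under these assumptions, benchmarking with a threshold of 0 (or 1) would approximate "lazy evaluation always on", and a very large threshold would approximate "always read everything right away".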
[jira] [Commented] (OAK-8673) Determine and possibly adjust size of eagerCacheSize
[ https://issues.apache.org/jira/browse/OAK-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16964711#comment-16964711 ] Thomas Mueller commented on OAK-8673: - > we almost never benefit from the lazy permission evaluation (compared to > reading all permission entries right away). Just to make sure: It sounds like "lazy permission evaluation disabled" means "reading all permission entries right away"... right? And then it sounds like you consider disabling lazy permission evaluation? Which benchmark results show data for "lazy permission evaluation disabled", and which show data for "lazy permission evaluation enabled"? I only see different settings for * Items to Read * Repeat Read * Number of ACEs * Number of Principals * EagerCacheSize > Determine and possibly adjust size of eagerCacheSize > > > Key: OAK-8673 > URL: https://issues.apache.org/jira/browse/OAK-8673 > Project: Jackrabbit Oak > Issue Type: Technical task > Components: core, security >Reporter: Angela Schreiber >Assignee: Angela Schreiber >Priority: Major > > The initial results of the {{EagerCacheSizeTest}} seem to indicate that we > almost never benefit from the lazy permission evaluation (compared to reading > all permission entries right away). From my understanding of the results the > only exceptions are those cases where only very few items are being accessed > (e.g. reading 100 items). > However, I am not totally sure if this is not an artifact of the random-read. > I therefore started extending the benchmark with an option to re-read a > randomly picked item more than once, which according to some analysis done > quite some time ago is a common scenario especially when using Oak in > combination with Apache Sling. > Results are attached to OAK-8662 (possibly more to come). -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (OAK-8729) Lucene Directory concurrency issue
Thomas Mueller created OAK-8729: --- Summary: Lucene Directory concurrency issue Key: OAK-8729 URL: https://issues.apache.org/jira/browse/OAK-8729 Project: Jackrabbit Oak Issue Type: Improvement Components: lucene Reporter: Thomas Mueller There is a concurrency issue in the DefaultDirectoryFactory. It is reproducible sometimes using CopyOnWriteDirectoryTest.copyOnWrite(), if run in a loop (1000 times). The problem is that the MemoryNodeBuilder is used concurrently: * thread 1 is closing the directory (after writing to it) * thread 2 is trying to create a new file {noformat} at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setProperty(MemoryNodeBuilder.java:525) at org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.close(OakDirectory.java:264) at org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.close(BufferedOakDirectory.java:217) at org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnReadDirectory$2.run(CopyOnReadDirectory.java:305) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.exists(MemoryNodeBuilder.java:284) at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:362) at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.setChildNode(MemoryNodeBuilder.java:356) at org.apache.jackrabbit.oak.plugins.memory.MemoryNodeBuilder.child(MemoryNodeBuilder.java:342) at org.apache.jackrabbit.oak.plugins.index.lucene.directory.OakDirectory.createOutput(OakDirectory.java:214) at org.apache.jackrabbit.oak.plugins.index.lucene.directory.BufferedOakDirectory.createOutput(BufferedOakDirectory.java:178) at 
org.apache.lucene.store.Directory.copy(Directory.java:184) at org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:322) at org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$3.call(CopyOnWriteDirectory.java:1) at org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:105) at org.apache.jackrabbit.oak.plugins.index.lucene.directory.CopyOnWriteDirectory$2$1.call(CopyOnWriteDirectory.java:1) at java.util.concurrent.FutureTask.run(FutureTask.java:266) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
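The race described above (one thread closing the directory while another creates a file, both touching a builder that is not thread-safe) can be sketched as follows. This is not the content of OAK-8729.patch, which is not shown here; the class, the lock, and the HashMap standing in for the MemoryNodeBuilder are all hypothetical. The sketch only illustrates one common remedy: serializing every access to the shared state behind a single lock and refusing writes after close.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a directory backed by a non-thread-safe builder.
final class GuardedDirectorySketch {

    // Stand-in for the shared, non-thread-safe builder (assumption).
    private final Map<String, String> builder = new HashMap<>();
    private final Object lock = new Object();
    private boolean closed;

    // Creates a file entry unless the directory was already closed.
    boolean createOutput(String name) {
        synchronized (lock) {
            if (closed) {
                return false;
            }
            builder.put(name, "file");
            return true;
        }
    }

    // Writes a final property and marks the directory closed, under the same lock.
    void close() {
        synchronized (lock) {
            if (!closed) {
                builder.put(":status", "closed");
                closed = true;
            }
        }
    }

    int entryCount() {
        synchronized (lock) {
            return builder.size();
        }
    }

    // Runs a writer and a closer concurrently and returns the final entry count.
    static int runDemo(int files) {
        GuardedDirectorySketch dir = new GuardedDirectorySketch();
        ExecutorService pool = Executors.newFixedThreadPool(2);
        pool.execute(() -> {
            for (int i = 0; i < files; i++) {
                dir.createOutput("f" + i);
            }
        });
        pool.execute(dir::close);
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return dir.entryCount();
    }

    public static void main(String[] args) {
        // The count varies with thread timing, but the map is never mutated concurrently.
        System.out.println(runDemo(1000));
    }
}
```

A single lock keeps the sketch simple; whether the real fix serializes access, copies state, or changes which thread owns the builder is not something this example claims to know.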
[jira] [Resolved] (OAK-8162) When query with OR is divided into union of queries, options (like index tag) are not passed into subqueries.
[ https://issues.apache.org/jira/browse/OAK-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller resolved OAK-8162. - Resolution: Fixed Yes, this is fixed. I also changed the fix version to 1.14. > When query with OR is divided into union of queries, options (like index tag) > are not passed into subqueries. > -- > > Key: OAK-8162 > URL: https://issues.apache.org/jira/browse/OAK-8162 > Project: Jackrabbit Oak > Issue Type: Bug > Components: core >Affects Versions: 1.10.2 >Reporter: Piotr Tajduś >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.14.0 > > > When a query with OR is divided into a union of queries, options (like index tag) > are not passed into the subqueries - as a result, the alternative query sometimes, > for example, uses indexes it shouldn't use. > {noformat} > org.apache.jackrabbit.oak.query.QueryImpl.buildAlternativeQuery() > org.apache.jackrabbit.oak.query.QueryImpl.copyOf() > > 2019-03-21 16:32:25,600 DEBUG > [org.apache.jackrabbit.oak.query.QueryEngineImpl] (default task-1) Parsing > JCR-SQL2 statement: select distinct d.* from [crkid:document] as d where > ([d].[metadane/inneMetadane/*/wartosc] = 'AX' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') or > ([d].[metadane/inneMetadane/*/wartosc] = 'AB' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') option(index tag > crkid_dokument_month_2019_3) > 2019-03-21 16:32:25,607 DEBUG [org.apache.jackrabbit.oak.query.QueryImpl] > (default task-1) cost using filter Filter(query=select distinct d.* from > [crkid:document] as d where ([d].[metadane/inneMetadane/*/wartosc] = 'AB') > and ([d].[metadane/inneMetadane/*/klucz] = 'InnyKod'), path=*, > property=[metadane/inneMetadane/*/klucz=[InnyKod], > metadane/inneMetadane/*/wartosc=[AB]]) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Updated] (OAK-8162) When query with OR is divided into union of queries, options (like index tag) are not passed into subqueries.
[ https://issues.apache.org/jira/browse/OAK-8162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Mueller updated OAK-8162: Fix Version/s: (was: 1.20.0) 1.14.0 > When query with OR is divided into union of queries, options (like index tag) > are not passed into subqueries. > -- > > Key: OAK-8162 > URL: https://issues.apache.org/jira/browse/OAK-8162 > Project: Jackrabbit Oak > Issue Type: Bug > Components: core >Affects Versions: 1.10.2 >Reporter: Piotr Tajduś >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.14.0 > > > When a query with OR is divided into a union of queries, options (like index tag) > are not passed into the subqueries - as a result, the alternative query sometimes, > for example, uses indexes it shouldn't use. > {noformat} > org.apache.jackrabbit.oak.query.QueryImpl.buildAlternativeQuery() > org.apache.jackrabbit.oak.query.QueryImpl.copyOf() > > 2019-03-21 16:32:25,600 DEBUG > [org.apache.jackrabbit.oak.query.QueryEngineImpl] (default task-1) Parsing > JCR-SQL2 statement: select distinct d.* from [crkid:document] as d where > ([d].[metadane/inneMetadane/*/wartosc] = 'AX' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') or > ([d].[metadane/inneMetadane/*/wartosc] = 'AB' and > [d].[metadane/inneMetadane/*/klucz] = 'InnyKod') option(index tag > crkid_dokument_month_2019_3) > 2019-03-21 16:32:25,607 DEBUG [org.apache.jackrabbit.oak.query.QueryImpl] > (default task-1) cost using filter Filter(query=select distinct d.* from > [crkid:document] as d where ([d].[metadane/inneMetadane/*/wartosc] = 'AB') > and ([d].[metadane/inneMetadane/*/klucz] = 'InnyKod'), path=*, > property=[metadane/inneMetadane/*/klucz=[InnyKod], > metadane/inneMetadane/*/wartosc=[AB]]) > {noformat} -- This message was sent by Atlassian Jira (v8.3.4#803005)
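The bug reported in OAK-8162 can be illustrated with a small, self-contained sketch. None of the types below are Oak classes: `Query`, `toUnion`, and the `indexTag` field are hypothetical names chosen for this example, and the real fix in QueryImpl may look quite different. The point is only that, when an OR query is rewritten into one subquery per branch, each subquery has to carry over the options of the original query, otherwise a subquery may be planned without the index tag.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of an OR-to-union rewrite; not Oak's QueryImpl.
final class UnionRewriteSketch {

    static final class Query {
        final List<String> orBranches; // one condition per OR branch
        final String indexTag;         // query option; null if not set

        Query(List<String> orBranches, String indexTag) {
            this.orBranches = orBranches;
            this.indexTag = indexTag;
        }

        // Splits the query into one subquery per OR branch and copies the
        // options (here: the index tag) into every subquery.
        List<Query> toUnion() {
            List<Query> subqueries = new ArrayList<>();
            for (String branch : orBranches) {
                subqueries.add(new Query(List.of(branch), indexTag));
            }
            return subqueries;
        }
    }

    public static void main(String[] args) {
        Query q = new Query(
                List.of("wartosc = 'AX' and klucz = 'InnyKod'",
                        "wartosc = 'AB' and klucz = 'InnyKod'"),
                "crkid_dokument_month_2019_3");
        for (Query sub : q.toUnion()) {
            // Each subquery keeps the tag of the original query.
            System.out.println(sub.orBranches.get(0) + " option(index tag " + sub.indexTag + ")");
        }
    }
}
```

Keeping the options on an immutable query object and copying them in the one place where subqueries are built makes it harder to forget an option when new ones are added later.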