[jira] [Created] (OAK-9816) Add 'maxRowsRead' to Query Stats JMX bean output
Tom Blackford created OAK-9816: -- Summary: Add 'maxRowsRead' to Query Stats JMX bean output Key: OAK-9816 URL: https://issues.apache.org/jira/browse/OAK-9816 Project: Jackrabbit Oak Issue Type: Improvement Components: query Reporter: Tom Blackford At present, although the QueryStatsData type records the max rows read for a particular query, it does not expose this data via the JMX bean. It would be useful to have this data available both in the JMX table and also in other features which rely on this bean's output. -- This message was sent by Atlassian Jira (v8.20.7#820007)
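A minimal sketch of the idea, assuming a per-query stats record that already tracks the maximum and only needs to publish it. `QueryStatsSketch`, `recordExecution` and `toRow` are hypothetical names for illustration, not Oak's QueryStatsData or its JMX bean API.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a per-query statistics record (not Oak's QueryStatsData).
final class QueryStatsSketch {
    private final String query;
    private long executeCount;
    private long totalRowsRead;
    private long maxRowsRead;

    QueryStatsSketch(String query) {
        this.query = query;
    }

    // Record one execution and remember the largest row count seen so far.
    void recordExecution(long rowsRead) {
        executeCount++;
        totalRowsRead += rowsRead;
        maxRowsRead = Math.max(maxRowsRead, rowsRead);
    }

    // Build the row a management bean could publish; "maxRowsRead" is the extra column.
    Map<String, Object> toRow() {
        Map<String, Object> row = new LinkedHashMap<>();
        row.put("query", query);
        row.put("executeCount", executeCount);
        row.put("totalRowsRead", totalRowsRead);
        row.put("maxRowsRead", maxRowsRead);
        return row;
    }
}
```

Any consumer of the row (the JMX table or a feature that reads the bean's output) would then see the maximum alongside the existing totals.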
[jira] [Comment Edited] (OAK-9708) Invalid logging of 'improper' regex WARN
[ https://issues.apache.org/jira/browse/OAK-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498621#comment-17498621 ] Tom Blackford edited comment on OAK-9708 at 2/27/22, 4:35 PM: -- PR merged; any property restriction whose name begins with ':' (e.g. :indexName, :indexTag) is now ignored by the regex check. was (Author: rma61...@adobe.com): PR merged. > Invalid logging of 'improper' regex WARN > > > Key: OAK-9708 > URL: https://issues.apache.org/jira/browse/OAK-9708 > Project: Jackrabbit Oak > Issue Type: Bug > Components: indexing >Reporter: Tom Blackford >Priority: Minor > > A query like this: > {code} > //*[jcr:contains(., '"/my/path"')] option(index tag testTag) > {code} > ...which uses an index with a path regex, will log a WARN like : > {code} > Potentially improper use of index /oak:index/regexIndex with queryFilterRegex > (["']|^)/ to search for value testTag > {code} > because of the presence of the property restriction ":indexTag" (from the > index tag option) being incorrectly considered to be a property restriction > which must match regex. > This should be ignored in the logic here - > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 -- This message was sent by Atlassian Jira (v8.20.1#820001)
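A rough sketch of the behaviour after the fix, under stated assumptions: restrictions are modelled as a plain name-to-value map, and a value is considered acceptable when the regex is found anywhere in it. `RegexWarnSketch` and `valuesToWarnAbout` are hypothetical names; this is not the code in FulltextIndexPlanner.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

// Hypothetical helper, not Oak's planner code.
final class RegexWarnSketch {
    // Returns the restriction values that would still trigger the WARN:
    // values of ordinary restrictions in which the filter regex is not found.
    static List<String> valuesToWarnAbout(Map<String, String> restrictions,
                                          Pattern queryFilterRegex) {
        List<String> offending = new ArrayList<>();
        for (Map.Entry<String, String> entry : restrictions.entrySet()) {
            // Names starting with ':' (e.g. :indexTag, :indexName) are query
            // options rather than content restrictions, so they are skipped.
            if (entry.getKey().startsWith(":")) {
                continue;
            }
            // Assumption: "matches" means the regex is found somewhere in the value.
            if (!queryFilterRegex.matcher(entry.getValue()).find()) {
                offending.add(entry.getValue());
            }
        }
        return offending;
    }
}
```

With the regex from the report, `(["']|^)/`, the value `"/my/path"` passes, and `testTag` is never examined because it arrives under `:indexTag`.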
[jira] [Commented] (OAK-9708) Invalid logging of 'improper' regex WARN
[ https://issues.apache.org/jira/browse/OAK-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498621#comment-17498621 ] Tom Blackford commented on OAK-9708: PR merged. > Invalid logging of 'improper' regex WARN > > > Key: OAK-9708 > URL: https://issues.apache.org/jira/browse/OAK-9708 > Project: Jackrabbit Oak > Issue Type: Bug > Components: indexing >Reporter: Tom Blackford >Priority: Minor > > A query like this: > {code} > //*[jcr:contains(., '"/my/path"')] option(index tag testTag) > {code} > ...which uses an index with a path regex, will log a WARN like : > {code} > Potentially improper use of index /oak:index/regexIndex with queryFilterRegex > (["']|^)/ to search for value testTag > {code} > because of the presence of the property restriction ":indexTag" (from the > index tag option) being incorrectly considered to be a property restriction > which must match regex. > This should be ignored in the logic here - > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Resolved] (OAK-9708) Invalid logging of 'improper' regex WARN
[ https://issues.apache.org/jira/browse/OAK-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford resolved OAK-9708. Resolution: Fixed > Invalid logging of 'improper' regex WARN > > > Key: OAK-9708 > URL: https://issues.apache.org/jira/browse/OAK-9708 > Project: Jackrabbit Oak > Issue Type: Bug > Components: indexing >Reporter: Tom Blackford >Priority: Minor > > A query like this: > {code} > //*[jcr:contains(., '"/my/path"')] option(index tag testTag) > {code} > ...which uses an index with a path regex, will log a WARN like : > {code} > Potentially improper use of index /oak:index/regexIndex with queryFilterRegex > (["']|^)/ to search for value testTag > {code} > because of the presence of the property restriction ":indexTag" (from the > index tag option) being incorrectly considered to be a property restriction > which must match regex. > This should be ignored in the logic here - > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (OAK-9708) Invalid logging of 'improper' regex WARN
[ https://issues.apache.org/jira/browse/OAK-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-9708: --- Description: A query like this: {code} //*[jcr:contains(., '"/my/path"')] option(index tag testTag) {code} ...which uses an index with a path regex, will log a WARN like : {code} Potentially improper use of index /oak:index/regexIndex with queryFilterRegex (["']|^)/ to search for value testTag {code} because of the presence of the property restriction ":indexTag" (from the index tag option) being incorrectly considered to be a property restriction which must match regex. This should be ignored in the logic here - https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 was: A query like this: {code} //*[jcr:contains(., '"/my/path"')] option(index tag testTag) {code} ...which uses an index with a path regex, will still log a WARN like : {code} Potentially improper use of index /oak:index/regexIndex with queryFilterRegex (["']|^)/ to search for value /my/path {code} because of the presence of the property restriction ":indexTag" (from the index tag option). 
This should be ignored in the logic here - https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 > Invalid logging of 'improper' regex WARN > > > Key: OAK-9708 > URL: https://issues.apache.org/jira/browse/OAK-9708 > Project: Jackrabbit Oak > Issue Type: Bug > Components: indexing >Reporter: Tom Blackford >Priority: Minor > > A query like this: > {code} > //*[jcr:contains(., '"/my/path"')] option(index tag testTag) > {code} > ...which uses an index with a path regex, will log a WARN like : > {code} > Potentially improper use of index /oak:index/regexIndex with queryFilterRegex > (["']|^)/ to search for value testTag > {code} > because of the presence of the property restriction ":indexTag" (from the > index tag option) being incorrectly considered to be a property restriction > which must match regex. > This should be ignored in the logic here - > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (OAK-9708) Invalid logging of 'improper' regex WARN
[ https://issues.apache.org/jira/browse/OAK-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-9708: --- Description: A query like this: {code} //*[jcr:contains(., '"/my/path"')] option(index tag testTag) {code} ...which uses an index with a path regex, will still log a WARN like : {code} Potentially improper use of index /oak:index/regexIndex with queryFilterRegex (["']|^)/ to search for value /my/path {code} because of the presence of the property restriction ":indexTag" (from the index tag option). This should be ignored in the logic here - https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 was: A query like this: {code} //*[jcr:contains(., '"/my/path"')] option(index tag testTag) {code} ...which uses an index with a path regex, will still log a WARN like : {code} Potentially improper use of index /oak:index/regexIndex with queryFilterRegex (["']|^)/ to search for value /my/path {code} because of the presense of the property restriction ":indexTag" (from the index tag option). 
This should be ignored in the logic here - https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 > Invalid logging of 'improper' regex WARN > > > Key: OAK-9708 > URL: https://issues.apache.org/jira/browse/OAK-9708 > Project: Jackrabbit Oak > Issue Type: Bug > Components: indexing >Reporter: Tom Blackford >Priority: Minor > > A query like this: > {code} > //*[jcr:contains(., '"/my/path"')] option(index tag testTag) > {code} > ...which uses an index with a path regex, will still log a WARN like : > {code} > Potentially improper use of index /oak:index/regexIndex with queryFilterRegex > (["']|^)/ to search for value /my/path > {code} > because of the presence of the property restriction ":indexTag" (from the > index tag option). > This should be ignored in the logic here - > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (OAK-9708) Invalid logging of 'improper' regex WARN
[ https://issues.apache.org/jira/browse/OAK-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17498124#comment-17498124 ] Tom Blackford commented on OAK-9708: PR open now - https://github.com/apache/jackrabbit-oak/pull/504 > Invalid logging of 'improper' regex WARN > > > Key: OAK-9708 > URL: https://issues.apache.org/jira/browse/OAK-9708 > Project: Jackrabbit Oak > Issue Type: Bug > Components: indexing >Reporter: Tom Blackford >Priority: Minor > > A query like this: > {code} > //*[jcr:contains(., '"/my/path"')] option(index tag testTag) > {code} > ...which uses an index with a path regex, will still log a WARN like : > {code} > Potentially improper use of index /oak:index/regexIndex with queryFilterRegex > (["']|^)/ to search for value /my/path > {code} > because of the presense of the property restriction ":indexTag" (from the > index tag option). > This should be ignored in the logic here - > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (OAK-9708) Invalid logging of 'improper' regex WARN
[ https://issues.apache.org/jira/browse/OAK-9708?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-9708: --- Priority: Minor (was: Major) > Invalid logging of 'improper' regex WARN > > > Key: OAK-9708 > URL: https://issues.apache.org/jira/browse/OAK-9708 > Project: Jackrabbit Oak > Issue Type: Bug > Components: indexing >Reporter: Tom Blackford >Priority: Minor > > A query like this: > {code} > //*[jcr:contains(., '"/my/path"')] option(index tag testTag) > {code} > ...which uses an index with a path regex, will still log a WARN like : > {code} > Potentially improper use of index /oak:index/regexIndex with queryFilterRegex > (["']|^)/ to search for value /my/path > {code} > because of the presense of the property restriction ":indexTag" (from the > index tag option). > This should be ignored in the logic here - > https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (OAK-9708) Invalid logging of 'improper' regex WARN
Tom Blackford created OAK-9708: -- Summary: Invalid logging of 'improper' regex WARN Key: OAK-9708 URL: https://issues.apache.org/jira/browse/OAK-9708 Project: Jackrabbit Oak Issue Type: Bug Components: indexing Reporter: Tom Blackford A query like this: {code} //*[jcr:contains(., '"/my/path"')] option(index tag testTag) {code} ...which uses an index with a path regex, will still log a WARN like : {code} Potentially improper use of index /oak:index/regexIndex with queryFilterRegex (["']|^)/ to search for value /my/path {code} because of the presense of the property restriction ":indexTag" (from the index tag option). This should be ignored in the logic here - https://github.com/apache/jackrabbit-oak/blob/trunk/oak-search/src/main/java/org/apache/jackrabbit/oak/plugins/index/search/spi/query/FulltextIndexPlanner.java#L308-L319 -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (OAK-9670) Log an WARN when a fulltext query cannot find an appropriate index
[ https://issues.apache.org/jira/browse/OAK-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-9670: --- Summary: Log an WARN when a fulltext query cannot find an appropriate index (was: Log an ERROR when a fulltext query cannot find an appropriate index) > Log an WARN when a fulltext query cannot find an appropriate index > -- > > Key: OAK-9670 > URL: https://issues.apache.org/jira/browse/OAK-9670 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Tom Blackford >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.44.0 > > > Fulltext queries cannot be handled by traversal, so we need to highlight > prominently when a fulltext query has been issued but no appropriate index > can be found. -- This message was sent by Atlassian Jira (v8.20.1#820001)
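A minimal sketch of the requested behaviour, assuming the caller already knows whether the query has a fulltext condition and whether the planner found a suitable index. `FulltextIndexWarnSketch` and its parameters are hypothetical, and java.util.logging stands in for whatever logging API the real code uses.

```java
import java.util.logging.Logger;

// Hypothetical helper, not part of Oak's query engine.
final class FulltextIndexWarnSketch {
    private static final Logger LOG =
            Logger.getLogger(FulltextIndexWarnSketch.class.getName());

    // Logs at WARN level and returns true when a fulltext query has no usable index.
    static boolean warnIfNoFulltextIndex(String statement,
                                         boolean hasFulltextCondition,
                                         boolean indexFound) {
        if (hasFulltextCondition && !indexFound) {
            // Traversal cannot answer a fulltext condition, so this is worth flagging.
            LOG.warning("Fulltext query without a suitable index: " + statement);
            return true;
        }
        return false;
    }
}
```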
[jira] [Commented] (OAK-9670) Log an ERROR when a fulltext query cannot find an appropriate index
[ https://issues.apache.org/jira/browse/OAK-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17477944#comment-17477944 ] Tom Blackford commented on OAK-9670: PR raised https://github.com/apache/jackrabbit-oak/pull/467 cc [~thomasm] > Log an ERROR when a fulltext query cannot find an appropriate index > --- > > Key: OAK-9670 > URL: https://issues.apache.org/jira/browse/OAK-9670 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Tom Blackford >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.44.0 > > > Fulltext queries cannot be handled by traversal, so we need to highlight > prominently when a fulltext query has been issued but no appropriate index > can be found. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (OAK-9670) Log an ERROR when a fulltext query cannot find an appropriate index
[ https://issues.apache.org/jira/browse/OAK-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-9670: --- Description: Fulltext queries cannot be handled by traversal, so we need to highlight prominently when a fulltext query has been issued but no appropriate index can be found. (was: Fulltext queries cannot be handled by traversal, so we need to highlight prominently ) > Log an ERROR when a fulltext query cannot find an appropriate index > --- > > Key: OAK-9670 > URL: https://issues.apache.org/jira/browse/OAK-9670 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Tom Blackford >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.44.0 > > > Fulltext queries cannot be handled by traversal, so we need to highlight > prominently when a fulltext query has been issued but no appropriate index > can be found. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (OAK-9670) Log an ERROR when a fulltext query cannot find an appropriate index
[ https://issues.apache.org/jira/browse/OAK-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-9670: --- Summary: Log an ERROR when a fulltext query cannot find an appropriate index (was: Log a Warning when a fulltext query cannot find an appropriate index) > Log an ERROR when a fulltext query cannot find an appropriate index > --- > > Key: OAK-9670 > URL: https://issues.apache.org/jira/browse/OAK-9670 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Tom Blackford >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.44.0 > > > Fulltext queries cannot be handled by traversal, so we need to highlight > prominently -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Updated] (OAK-9670) Log a Warning when a fulltext query cannot find an appropriate index
[ https://issues.apache.org/jira/browse/OAK-9670?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-9670: --- Description: Fulltext queries cannot be handled by traversal, so we need to highlight prominently (was: Oak only considers a query as slow if it scans over 100'000 nodes. We can change the limit to 5000.) > Log a Warning when a fulltext query cannot find an appropriate index > > > Key: OAK-9670 > URL: https://issues.apache.org/jira/browse/OAK-9670 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Tom Blackford >Assignee: Thomas Mueller >Priority: Major > Fix For: 1.44.0 > > > Fulltext queries cannot be handled by traversal, so we need to highlight > prominently -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Created] (OAK-9670) Log a Warning when a fulltext query cannot find an appropriate index
Tom Blackford created OAK-9670: -- Summary: Log a Warning when a fulltext query cannot find an appropriate index Key: OAK-9670 URL: https://issues.apache.org/jira/browse/OAK-9670 Project: Jackrabbit Oak Issue Type: Improvement Components: query Reporter: Tom Blackford Assignee: Thomas Mueller Fix For: 1.44.0 Oak only considers a query as slow if it scans over 100'000 nodes. We can change the limit to 5000. -- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (OAK-6632) [upgrade] oak-upgrade should support azure blobstorage
[ https://issues.apache.org/jira/browse/OAK-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17015108#comment-17015108 ] Tom Blackford commented on OAK-6632: Hi [~tomek.rekawek] Attaching a patch [^oak-upgrade-azureblob-tb.patch] that adds this support, along with test cases for the various scenarios (as with the S3-related tests, these are skipped unless Azure config is provided). I took the advice above and made sure that 'src-azuredatastore' is not required; if no path is provided, either explicitly (via 'src-azuredatastore') or via the 'path' property in the Azure config, the tests check that the value of 'java.io.tmpdir' is used instead. Hope this is ok - let me know if you'd like any further changes / tests. > [upgrade] oak-upgrade should support azure blobstorage > -- > > Key: OAK-6632 > URL: https://issues.apache.org/jira/browse/OAK-6632 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: upgrade >Reporter: Raul Hudea >Priority: Major > Labels: azureblob > Attachments: oak-upgrade-azureblob-tb.patch > > > oak-upgrade should support azuredatastore in addition to s3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
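The fallback described in the comment can be sketched roughly as follows. `AzureSourcePathSketch` and `resolvePath` are hypothetical names, and the precedence of the command-line value over the config 'path' property is an assumption; the comment only states that 'java.io.tmpdir' is used when neither is set.

```java
// Hypothetical helper, not the code in the attached patch.
final class AzureSourcePathSketch {
    // cliPath: value of 'src-azuredatastore', or null if not given.
    // configPath: value of 'path' from the Azure config, or null if not given.
    static String resolvePath(String cliPath, String configPath) {
        if (cliPath != null && !cliPath.isEmpty()) {
            return cliPath;      // assumed to take precedence
        }
        if (configPath != null && !configPath.isEmpty()) {
            return configPath;
        }
        // Neither source supplied a path: fall back to the JVM temp directory.
        return System.getProperty("java.io.tmpdir");
    }
}
```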
[jira] [Updated] (OAK-6632) [upgrade] oak-upgrade should support azure blobstorage
[ https://issues.apache.org/jira/browse/OAK-6632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-6632: --- Attachment: oak-upgrade-azureblob-tb.patch > [upgrade] oak-upgrade should support azure blobstorage > -- > > Key: OAK-6632 > URL: https://issues.apache.org/jira/browse/OAK-6632 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: upgrade >Reporter: Raul Hudea >Priority: Major > Labels: azureblob > Attachments: oak-upgrade-azureblob-tb.patch > > > oak-upgrade should support azuredatastore in addition to s3 -- This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (OAK-8267) Limit number of values in 'nestedCugs' hidden property in NestedCugHook
Tom Blackford created OAK-8267: -- Summary: Limit number of values in 'nestedCugs' hidden property in NestedCugHook Key: OAK-8267 URL: https://issues.apache.org/jira/browse/OAK-8267 Project: Jackrabbit Oak Issue Type: Bug Components: authorization-cug Reporter: Tom Blackford The logic in NestedCugHook.addNestedCugPath maintains a hidden multivalue string property at /:nestedCugs (see [1]). If a customer had many thousands of CUGs, this would result in many thousands of values on this string property which is unlikely to scale. From [~anchela]: {quote} the reason for storing it is performance optimization i.e. minimizing reading from nodes to see if they hold a cug if the intended usages is that there are few and most nodes don't have a cug. i wouldn't not want to remove the hidden property for that default use case. but we could for sure take a look to see if we could introduce a threshold similar to the one at the root node i.e. using a counter instead of maintaining the complete list and in addition drop the list altogether in that case {quote} [1] https://github.com/apache/jackrabbit-oak/blob/073f2b5378cd198a9cb30eb1f57958fb805ce508/oak-authorization-cug/src/main/java/org/apache/jackrabbit/oak/spi/security/authorization/cug/impl/NestedCugHook.java#L79 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
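One way to picture the threshold idea from the quote, as a sketch only: keep the complete list of paths while it is small, and switch to a bare counter once it grows past a limit. `NestedCugTrackerSketch` is a hypothetical in-memory class, not NestedCugHook, and the threshold value is left to the caller.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical in-memory model of the "list below threshold, counter above" idea.
final class NestedCugTrackerSketch {
    private final int threshold;
    private Set<String> paths = new LinkedHashSet<>(); // null once the list is dropped
    private long count;

    NestedCugTrackerSketch(int threshold) {
        this.threshold = threshold;
    }

    void add(String path) {
        if (paths != null) {
            if (!paths.add(path)) {
                return; // already tracked
            }
            count = paths.size();
            if (count > threshold) {
                paths = null; // drop the list altogether, keep only the counter
            }
        } else {
            // Without the list, duplicates can no longer be detected (a known trade-off).
            count++;
        }
    }

    boolean hasCompleteList() {
        return paths != null;
    }

    long count() {
        return count;
    }

    Set<String> paths() {
        return paths == null ? Collections.<String>emptySet()
                             : Collections.unmodifiableSet(paths);
    }
}
```

Readers that rely on the list for the "few CUGs" fast path would need to fall back to reading nodes once `hasCompleteList()` returns false.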
[jira] [Updated] (OAK-8166) Index definition with orderable property definitions with and without functions breaks index
[ https://issues.apache.org/jira/browse/OAK-8166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-8166: --- Description: If an index definition contains the same orderable property with and without functions, it will fail to index any node which contains that property. The failure will be logged as [1]. Steps to reproduce: * Configure index with the two property definitions shown at [2]. * Refresh the index definition * Modify a node that falls under the definition - it will fail with the exception shown at [1] * Modify the 'non-function' index definition to not be orderable (orderable=false) * Refresh the index definition * Modify the same node - note there is no exception. Thanks to [~catholicon] for assistance identifying root cause. [1] {code} 25.03.2019 15:39:04.135 *WARN* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor Failed to index the node [/content/dam/Unknown-2.png] java.lang.IllegalArgumentException: DocValuesField ":dvjcr:content/metadata/dc:title" appears more than once in this document (only one value is allowed per field) at org.apache.lucene.index.SortedDocValuesWriter.addValue(SortedDocValuesWriter.java:62) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.DocValuesProcessor.addSortedField(DocValuesProcessor.java:125) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.DocValuesProcessor.addField(DocValuesProcessor.java:59) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.TwoStoredFieldsConsumers.addField(TwoStoredFieldsConsumers.java:36) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:236) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253) [org.apache.jackrabbit.oak-lucene:1.8.9] at 
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:455) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1534) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1507) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.jackrabbit.oak.plugins.index.lucene.writer.DefaultIndexWriter.updateDocument(DefaultIndexWriter.java:86) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.addOrUpdate(LuceneIndexEditor.java:258) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:140) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.jackrabbit.oak.spi.commit.CompositeEditor.leave(CompositeEditor.java:74) [org.apache.jackrabbit.oak-store-spi:1.8.9] {code} [2] {code} "dcTitle": { "jcr:primaryType": "nt:unstructured", "nodeScopeIndex": "true", "useInSuggest": "true", "ordered": "true", "propertyIndex": "true", "useInSpellcheck": "true", "name": "jcr:content/metadata/dc:title", "boost": "2.0" }, "dcTitleLowercase": { "jcr:primaryType": "nt:unstructured", "ordered": "true", "propertyIndex": "true", "function": "fn:lower-case(jcr:content/metadata/@dc:title)" } {code} was: If an index definition contains the same orderable property with and without functions, it will fail to index any node which contains that property. The failure will be logged as [1]. Steps to reproduce: * Configure index with the two property definitions shown at [2]. * Refresh the index definition * Modify a node that falls under the definition - it will fail with the exception shown at [1] * Modify the 'non-function' index definition to not be orderable (orderable=false) * Refresh the index definition * Modify the same node - note there is no exception. Thanks to [~catholicon] for assistance identifying root cause. 
[1] {code} 25.03.2019 15:39:04.135 *WARN* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor Failed to index the node [/content/dam/Unknown-2.png] java.lang.IllegalArgumentException: DocValuesField ":dvjcr:content/metadata/dc:title" appears more than once in this document (only one value is allowed per field) at org.apache.lucene.index.SortedDocValuesWriter.addValue(SortedDocValuesWriter.java:62) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.DocValuesProcessor.addSortedField(DocValuesProcessor.java:125) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.DocValuesProcessor.addField(DocValuesProcessor.java:59) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.
[jira] [Created] (OAK-8166) Index definition with orderable property definitions with and without functions breaks index
Tom Blackford created OAK-8166: -- Summary: Index definition with orderable property definitions with and without functions breaks index Key: OAK-8166 URL: https://issues.apache.org/jira/browse/OAK-8166 Project: Jackrabbit Oak Issue Type: Bug Components: indexing Affects Versions: 1.8.12 Reporter: Tom Blackford If an index definition contains the same orderable property with and without functions, it will fail to index any node which contains that property. The failure will be logged as [1]. Steps to reproduce: * Configure index with the two property definitions shown at [2]. * Refresh the index definition * Modify a node that falls under the definition - it will fail with the exception shown at [1] * Modify the 'non-function' index definition to not be orderable (orderable=false) * Refresh the index definition * Modify the same node - note there is no exception. Thanks to [~catholicon] for assistance identifying root cause. [1] {code} 25.03.2019 15:39:04.135 *WARN* [async-index-update-async] org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor Failed to index the node [/content/dam/Unknown-2.png] java.lang.IllegalArgumentException: DocValuesField ":dvjcr:content/metadata/dc:title" appears more than once in this document (only one value is allowed per field) at org.apache.lucene.index.SortedDocValuesWriter.addValue(SortedDocValuesWriter.java:62) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.DocValuesProcessor.addSortedField(DocValuesProcessor.java:125) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.DocValuesProcessor.addField(DocValuesProcessor.java:59) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.TwoStoredFieldsConsumers.addField(TwoStoredFieldsConsumers.java:36) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:236) [org.apache.jackrabbit.oak-lucene:1.8.9] at 
org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:253) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:455) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1534) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1507) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.jackrabbit.oak.plugins.index.lucene.writer.DefaultIndexWriter.updateDocument(DefaultIndexWriter.java:86) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.addOrUpdate(LuceneIndexEditor.java:258) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:140) [org.apache.jackrabbit.oak-lucene:1.8.9] at org.apache.jackrabbit.oak.spi.commit.CompositeEditor.leave(CompositeEditor.java:74) [org.apache.jackrabbit.oak-store-spi:1.8.9] {code} [2] {code} "dcTitle": { "jcr:primaryType": "nt:unstructured", "nodeScopeIndex": "true", "useInSuggest": "true", "ordered": "false", "propertyIndex": "true", "useInSpellcheck": "true", "name": "jcr:content/metadata/dc:title", "boost": "2.0" }, "dcTitleLowercase": { "jcr:primaryType": "nt:unstructured", "ordered": "true", "propertyIndex": "true", "function": "fn:lower-case(jcr:content/metadata/@dc:title)" } {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
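A sketch of a pre-flight check for the situation described above, assuming that a function-based definition and a plain definition clash when both are ordered and both refer to the same property path. `OrderedPropertyCheckSketch`, `PropDef` and the way the path is extracted from the function string are hypothetical simplifications, not Oak's index definition parsing.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical validation helper, not part of oak-lucene.
final class OrderedPropertyCheckSketch {
    static final class PropDef {
        final String name;      // e.g. "jcr:content/metadata/dc:title", or null
        final String function;  // e.g. "fn:lower-case(jcr:content/metadata/@dc:title)", or null
        final boolean ordered;

        PropDef(String name, String function, boolean ordered) {
            this.name = name;
            this.function = function;
            this.ordered = ordered;
        }
    }

    // Simplification: take the text inside the outermost parentheses and drop '@'.
    static String underlyingPath(PropDef def) {
        if (def.function == null) {
            return def.name;
        }
        int open = def.function.indexOf('(');
        int close = def.function.lastIndexOf(')');
        String arg = (open >= 0 && close > open)
                ? def.function.substring(open + 1, close)
                : def.function;
        return arg.replace("@", "").trim();
    }

    // Returns property paths that are ordered in more than one definition.
    static List<String> conflictingPaths(List<PropDef> defs) {
        Map<String, Integer> orderedCount = new LinkedHashMap<>();
        for (PropDef def : defs) {
            if (def.ordered) {
                orderedCount.merge(underlyingPath(def), 1, Integer::sum);
            }
        }
        List<String> conflicts = new ArrayList<>();
        for (Map.Entry<String, Integer> entry : orderedCount.entrySet()) {
            if (entry.getValue() > 1) {
                conflicts.add(entry.getKey());
            }
        }
        return conflicts;
    }
}
```

Setting the plain definition to not ordered, as in the workaround from the steps to reproduce, makes the check come back empty.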
[jira] [Commented] (OAK-7914) Cleanup updates the gc.log after a failed compaction
[ https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16717510#comment-16717510 ] Tom Blackford commented on OAK-7914: {quote}I'm not able to reproduce this issue. There is a guard in the code handling the {{gc.log}} that prevents it from being updated if the compaction phase fails and doesn't install a new head revision{quote} Hey [~frm] - thanks for checking. I think the real problem was the ownership of the gc.log file, which prevented the GC journal from being written: we were indeed seeing the error [1] until I fixed that, and once I corrected it the original issue seemed to go away. Apologies for the false alarm. [1] {code:java} 16.11.2018 02:45:38.645 *ERROR* [TarMK revision gc [/mnt/crx/author/crx-quickstart/repository/segmentstore]] org.apache.jackrabbit.oak.segment.file.GCJournal Error writing gc journal java.nio.file.AccessDeniedException: /mnt/crx/author/crx-quickstart/repository/segmentstore/gc.log at sun.nio.fs.UnixException.translateToIOException(UnixException.java:84) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102) at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107){code} > Cleanup updates the gc.log after a failed compaction > > > Key: OAK-7914 > URL: https://issues.apache.org/jira/browse/OAK-7914 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segment-tar >Affects Versions: 1.6.15 >Reporter: Francesco Mari >Assignee: Francesco Mari >Priority: Critical > Fix For: 1.6.16 > > Attachments: compaction.log > > > The {{gc.log}} is always updated during the cleanup phase, regardless of the > result of the compaction phase. This might cause a scenario similar to the > following. > - A repository of 100GB, of which 40GB is garbage, is compacted. > - The estimation phase decides it's OK to compact. > - Compaction produces a new head state, adding another 60GB. > - Compaction fails, maybe because of too many concurrent commits. 
> - Cleanup removes the 60GB generated during compaction. > - Cleanup adds an entry to the {{gc.log}} recording the current size of the > repository, 100GB. > Now, let's imagine that compaction is run shortly after that. The amount of > content added to the repository is negligible. For the sake of simplicity, > let's say that the size of the repository hasn't changed. The following > happens. > - The repository is 100GB, of which 40GB is the same garbage that wasn't > removed above. > - The estimation phase decides it's not OK to compact, because the {{gc.log}} > reports that the latest known size of the repository is 100GB, and there is > not enough content to remove. > This is in fact a bug, because there are 40GB worth of garbage in the > repository, but estimation is not able to see that anymore. The solution > seems to be not to update the {{gc.log}} if compaction fails. In other words, > {{gc.log}} should contain the size of the *compacted* repository over time, > and no more. > Thanks to [~rma61...@adobe.com] for reporting it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
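The scenario in the issue description can be sketched in a few lines of Java. This is only an illustration under stated assumptions: `GcJournalSketch` and its methods are hypothetical names, not Oak's `GCJournal` API, and sizes are plain numbers in GB. It shows the proposed rule, namely that cleanup records a size only when compaction succeeded, so a later estimation still sees the growth.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for the gc.log journal; not Oak's GCJournal API. */
class GcJournalSketch {
    private final List<Long> recordedSizes = new ArrayList<>();

    /** Size written by the last recorded entry, or -1 if the journal is empty. */
    long lastRecordedSize() {
        return recordedSizes.isEmpty() ? -1 : recordedSizes.get(recordedSizes.size() - 1);
    }

    /** Cleanup step: record the size only if compaction installed a new head. */
    void afterCleanup(boolean compactionSucceeded, long repositorySize) {
        if (compactionSucceeded) {
            recordedSizes.add(repositorySize);
        }
    }

    /** Estimation: compact when growth since the last entry exceeds the delta. */
    boolean shouldCompact(long currentSize, long sizeDelta) {
        long last = lastRecordedSize();
        return last < 0 || currentSize - last > sizeDelta;
    }

    public static void main(String[] args) {
        GcJournalSketch journal = new GcJournalSketch();
        journal.afterCleanup(true, 60);   // an earlier, successful compaction
        journal.afterCleanup(false, 100); // failed compaction: nothing recorded
        // Estimation still compares 100 against 60 and decides to compact.
        System.out.println(journal.shouldCompact(100, 1));
    }
}
```

If `afterCleanup` recorded the size unconditionally, the last entry would be 100 and the final estimation would skip compaction, which is the behaviour the issue describes.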
[jira] [Commented] (OAK-7914) Cleanup updates the gc.log after a failed compaction
[ https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16704950#comment-16704950 ] Tom Blackford commented on OAK-7914: Hi [~mduerig] and [~frm] Apologies for the delay. After fixing the permissions on gc.log, I'm no longer certain that this is actually an issue... See log at [^compaction.log], specifically the lines [1] and [2]. * At [1] we see that the first compaction sees the tar files at the time of last GC as 103.8GB (with the current size being 105.2GB) ** This compaction fails * Due to some unique config, the environment retries the compaction. ** The second execution still sees the size at last compaction as 103.8GB, so the compaction runs again... ...as such, I think the issue might have been the permissions on gc.log all along... WDYT? [1] {code:java} 27.11.2018 02:00:00.102 *INFO* [TarMK revision gc [/mnt/crx/author/crx-quickstart/repository/segmentstore]] org.apache.jackrabbit.oak.segment.file.FileStore TarMK GC #4: estimation completed in 90.87 μs (0 ms). Segmentstore size has increased since the last garbage collection from 103.8 GB (103771792896 bytes) to 105.2 GB (105240221184 bytes), an increase of 1.5 GB (1468428288 bytes) or 1%. This is greater than sizeDeltaEstimation=1.1 GB (1073741824 bytes), so running garbage collection {code} {code:java} 27.11.2018 03:24:28.064 *INFO* [TarMK revision gc [/mnt/crx/author/crx-quickstart/repository/segmentstore]] org.apache.jackrabbit.oak.segment.file.FileStore TarMK GC #5: compaction failed after 42.78 min (2566555 ms), and 6 cycles {code} [2] {code:java} 27.11.2018 02:41:41.508 *INFO* [TarMK revision gc [/mnt/crx/author/crx-quickstart/repository/segmentstore]] org.apache.jackrabbit.oak.segment.file.FileStore TarMK GC #5: estimation completed in 64.08 μs (0 ms). 
Segmentstore size has increased since the last garbage collection from 103.8 GB (103771792896 bytes) to 105.3 GB (105253026304 bytes), an increase of 1.5 GB (1481233408 bytes) or 1%. This is greater than sizeDeltaEstimation=1.1 GB (1073741824 bytes), so running garbage collection {code} > Cleanup updates the gc.log after a failed compaction > > > Key: OAK-7914 > URL: https://issues.apache.org/jira/browse/OAK-7914 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segment-tar >Reporter: Francesco Mari >Priority: Major > Fix For: 1.10 > > Attachments: compaction.log > > > The {{gc.log}} is always updated during the cleanup phase, regardless of the > result of the compaction phase. This might cause a scenario similar to the > following. > - A repository of 100GB, of which 40GB is garbage, is compacted. > - The estimation phase decides it's OK to compact. > - Compaction produces a new head state, adding another 60GB. > - Compaction fails, maybe because of too many concurrent commits. > - Cleanup removes the 60GB generated during compaction. > - Cleanup adds an entry to the {{gc.log}} recording the current size of the > repository, 100GB. > Now, let's imagine that compaction is run shortly after that. The amount of > content added to the repository is negligible. For the sake of simplicity, > let's say that the size of the repository hasn't changed. The following > happens. > - The repository is 100GB, of which 40GB is the same garbage that wasn't > removed above. > - The estimation phase decides it's not OK to compact, because the {{gc.log}} > reports that the latest known size of the repository is 100GB, and there is > not enough content to remove. > This is in fact a bug, because there are 40GB worth of garbage in the > repository, but estimation is not able to see that anymore. The solution > seems to be not to update the {{gc.log}} if compaction fails. 
In other words, > {{gc.log}} should contain the size of the *compacted* repository over time, > and no more. > Thanks to [~rma61...@adobe.com] for reporting it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
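The byte counts in the two estimation log lines of this comment can be checked directly. The sketch below only redoes the arithmetic from the logs; the class and method names are made up for illustration and the comparison is an assumption about how the threshold is applied, based on the wording "This is greater than sizeDeltaEstimation".

```java
/** Re-computes the deltas reported in the two estimation log lines. */
class EstimationCheck {
    /** True when growth since the last recorded size exceeds the delta. */
    static boolean exceedsDelta(long previousBytes, long currentBytes, long deltaBytes) {
        return currentBytes - previousBytes > deltaBytes;
    }

    public static void main(String[] args) {
        long previous = 103_771_792_896L; // size at last GC, same in both log lines
        long delta = 1_073_741_824L;      // sizeDeltaEstimation from the log
        long gc4 = 105_240_221_184L;      // current size in the GC #4 estimation
        long gc5 = 105_253_026_304L;      // current size in the GC #5 estimation

        System.out.println(gc4 - previous); // 1468428288
        System.out.println(gc5 - previous); // 1481233408
        System.out.println(exceedsDelta(previous, gc4, delta)); // true
        System.out.println(exceedsDelta(previous, gc5, delta)); // true
    }
}
```

Both runs compare against the same 103771792896 bytes, which is consistent with the observation that the failed compaction did not change the recorded size.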
[jira] [Updated] (OAK-7914) Cleanup updates the gc.log after a failed compaction
[ https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-7914: --- Attachment: compaction.log > Cleanup updates the gc.log after a failed compaction > > > Key: OAK-7914 > URL: https://issues.apache.org/jira/browse/OAK-7914 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segment-tar >Reporter: Francesco Mari >Priority: Major > Fix For: 1.10 > > Attachments: compaction.log > > > The {{gc.log}} is always updated during the cleanup phase, regardless of the > result of the compaction phase. This might cause a scenario similar to the > following. > - A repository of 100GB, of which 40GB is garbage, is compacted. > - The estimation phase decides it's OK to compact. > - Compaction produces a new head state, adding another 60GB. > - Compaction fails, maybe because of too many concurrent commits. > - Cleanup removes the 60GB generated during compaction. > - Cleanup adds an entry to the {{gc.log}} recording the current size of the > repository, 100GB. > Now, let's imagine that compaction is run shortly after that. The amount of > content added to the repository is negligible. For the sake of simplicity, > let's say that the size of the repository hasn't changed. The following > happens. > - The repository is 100GB, of which 40GB is the same garbage that wasn't > removed above. > - The estimation phase decides it's not OK to compact, because the {{gc.log}} > reports that the latest known size of the repository is 100GB, and there is > not enough content to remove. > This is in fact a bug, because there are 40GB worth of garbage in the > repository, but estimation is not able to see that anymore. The solution > seems to be not to update the {{gc.log}} if compaction fails. In other words, > {{gc.log}} should contain the size of the *compacted* repository over time, > and no more. > Thanks to [~rma61...@adobe.com] for reporting it. 
[jira] [Commented] (OAK-7914) Cleanup updates the gc.log after a failed compaction
[ https://issues.apache.org/jira/browse/OAK-7914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16697524#comment-16697524 ] Tom Blackford commented on OAK-7914: Thanks [~mduerig] and [~frm]. I went to collect logs and then noticed that the GC log was not being written correctly (it was owned by the root user for some reason). I have corrected that and I will see how it works over the weekend (it is Oak 1.6 so there is no tail/full distinction here IIUC). > Cleanup updates the gc.log after a failed compaction > > > Key: OAK-7914 > URL: https://issues.apache.org/jira/browse/OAK-7914 > Project: Jackrabbit Oak > Issue Type: Bug > Components: segment-tar >Reporter: Francesco Mari >Priority: Major > Fix For: 1.10 > > > The {{gc.log}} is always updated during the cleanup phase, regardless of the > result of the compaction phase. This might cause a scenario similar to the > following. > - A repository of 100GB, of which 40GB is garbage, is compacted. > - The estimation phase decides it's OK to compact. > - Compaction produces a new head state, adding another 60GB. > - Compaction fails, maybe because of too many concurrent commits. > - Cleanup removes the 60GB generated during compaction. > - Cleanup adds an entry to the {{gc.log}} recording the current size of the > repository, 100GB. > Now, let's imagine that compaction is run shortly after that. The amount of > content added to the repository is negligible. For the sake of simplicity, > let's say that the size of the repository hasn't changed. The following > happens. > - The repository is 100GB, of which 40GB is the same garbage that wasn't > removed above. > - The estimation phase decides it's not OK to compact, because the {{gc.log}} > reports that the latest known size of the repository is 100GB, and there is > not enough content to remove. > This is in fact a bug, because there are 40GB worth of garbage in the > repository, but estimation is not able to see that anymore. 
The solution > seems to be not to update the {{gc.log}} if compaction fails. In other words, > {{gc.log}} should contain the size of the *compacted* repository over time, > and no more. > Thanks to [~rma61...@adobe.com] for reporting it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
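A quick way to spot the kind of ownership problem mentioned in this comment is to ask the JVM who owns the file and whether the current process can write to it. The following is a minimal sketch using only `java.nio.file`; the class name and the default `gc.log` path are placeholders, and the real file lives in the segment store directory of the repository.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Prints owner and writability of a file such as gc.log. */
class GcLogPermissionCheck {
    static String describe(Path file) throws IOException {
        // Files.getOwner and Files.isWritable are standard java.nio.file calls.
        return file + " owner=" + Files.getOwner(file).getName()
                + " writable=" + Files.isWritable(file);
    }

    public static void main(String[] args) throws IOException {
        // Placeholder path: pass the real gc.log location as the first argument.
        Path gcLog = Paths.get(args.length > 0 ? args[0] : "gc.log");
        if (Files.exists(gcLog)) {
            System.out.println(describe(gcLog));
        } else {
            System.out.println(gcLog + " does not exist");
        }
    }
}
```

Run as the same OS user as the repository process; a file owned by root and reported as not writable would match the `AccessDeniedException` seen when the GC journal is written.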
[jira] [Created] (OAK-7318) Add documentation for function indexes
Tom Blackford created OAK-7318: -- Summary: Add documentation for function indexes Key: OAK-7318 URL: https://issues.apache.org/jira/browse/OAK-7318 Project: Jackrabbit Oak Issue Type: Documentation Components: core, lucene, query Reporter: Tom Blackford It looks like we might need to add documentation for the function index added in OAK-3574 at https://jackrabbit.apache.org/oak/docs/query/lucene.html. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (OAK-6076) Ensure tracking of slow queries includes the full execution time including getting result nodes
[ https://issues.apache.org/jira/browse/OAK-6076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-6076: --- Summary: Ensure tracking of slow queries includes the full execution time including getting result nodes (was: Ensure tracking of slow queries includes the full execution time) > Ensure tracking of slow queries includes the full execution time including > getting result nodes > --- > > Key: OAK-6076 > URL: https://issues.apache.org/jira/browse/OAK-6076 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: query >Reporter: Tom Blackford > > At present the query durations shown in the Oak Query Statistics do not > include the full execution time to provide the results of the query (IIUC they > include only the initial Query Execution time and not the time to get the > resulting nodes). Often the latter time is far higher and as such, many slow > and troublesome queries are not really shown in the QueryStats Slow Queries. > cc [~chetanm] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (OAK-6076) Ensure tracking of slow queries includes the full execution time
Tom Blackford created OAK-6076: -- Summary: Ensure tracking of slow queries includes the full execution time Key: OAK-6076 URL: https://issues.apache.org/jira/browse/OAK-6076 Project: Jackrabbit Oak Issue Type: Improvement Components: query Reporter: Tom Blackford At present the query durations shown in the Oak Query Statistics do not include the full execution time to provide the results of the query (IIUC they include only the initial Query Execution time and not the time to get the resulting nodes). Often the latter time is far higher and as such, many slow and troublesome queries are not really shown in the QueryStats Slow Queries. cc [~chetanm] -- This message was sent by Atlassian JIRA (v6.3.15#6346)
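The difference between the two durations can be illustrated without any Oak or JCR dependency. In the sketch below a query is modelled as a `Supplier` that returns a lazy `Iterator`; `QueryTimingSketch` and `Timing` are hypothetical names, and this is not how Oak's query statistics are implemented. It only shows that timing the initial call alone misses the cost of walking the results.

```java
import java.util.Iterator;
import java.util.function.Supplier;

/** Measures "execute" and "iterate results" as two separate durations. */
class QueryTimingSketch {
    static final class Timing {
        final long executeNanos;
        final long iterateNanos;
        final int rows;

        Timing(long executeNanos, long iterateNanos, int rows) {
            this.executeNanos = executeNanos;
            this.iterateNanos = iterateNanos;
            this.rows = rows;
        }

        /** The duration a slow-query report would need to include. */
        long totalNanos() {
            return executeNanos + iterateNanos;
        }
    }

    static <T> Timing measure(Supplier<Iterator<T>> query) {
        long start = System.nanoTime();
        Iterator<T> results = query.get();  // stands in for executing the query
        long executed = System.nanoTime();
        int count = 0;
        while (results.hasNext()) {         // stands in for fetching result nodes
            results.next();
            count++;
        }
        long done = System.nanoTime();
        return new Timing(executed - start, done - executed, count);
    }
}
```

With a lazily evaluated result set, `executeNanos` can stay small while `iterateNanos` dominates, which is the case the issue says is missing from the Slow Queries list.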
[jira] [Updated] (OAK-5931) Inconsistent behaviour when removing nodes with rep:policy subnodes for users without modify ACL permissions
[ https://issues.apache.org/jira/browse/OAK-5931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tom Blackford updated OAK-5931: --- Attachment: ACLTest.java Adding test case showing the different behaviours. > Inconsistent behaviour when removing nodes with rep:policy subnodes for > users without modify ACL permissions > - > > Key: OAK-5931 > URL: https://issues.apache.org/jira/browse/OAK-5931 > Project: Jackrabbit Oak > Issue Type: Bug > Components: security >Affects Versions: 1.4.14, 1.6.1 >Reporter: Tom Blackford > Attachments: ACLTest.java > > > If a session (without rep:modifyAccessControl) removes a node with a > rep:policy subnode and then recreates it within the same save (without the > rep:policy subnode) the commit diff will mistake the action for the removal > of the ACL, which this session is not authorised to do. > If the session is saved prior to recreating the node, both saves (after > remove and after recreate) will succeed. > From discussion with angela: > {quote} > the diff mechanism used within Root.commit cannot distinguish between the > removal of a policy or the replace of the access controlled node with one > that doesn't have the policy set. within that diff it looks like the removal > of the policy node > {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Created] (OAK-5931) Inconsistent behaviour when removing nodes with rep:policy subnodes for users without modify ACL permissions
Tom Blackford created OAK-5931: -- Summary: Inconsistent behaviour when removing nodes with rep:policy subnodes for users without modify ACL permissions Key: OAK-5931 URL: https://issues.apache.org/jira/browse/OAK-5931 Project: Jackrabbit Oak Issue Type: Bug Components: security Affects Versions: 1.6.1, 1.4.14 Reporter: Tom Blackford If a session (without rep:modifyAccessControl) removes a node with a rep:policy subnode and then recreates it within the same save (without the rep:policy subnode) the commit diff will mistake the action for the removal of the ACL, which this session is not authorised to do. If the session is saved prior to recreating the node, both saves (after remove and after recreate) will succeed. From discussion with angela: {quote} the diff mechanism used within Root.commit cannot distinguish between the removal of a policy or the replace of the access controlled node with one that doesn't have the policy set. within that diff it looks like the removal of the policy node {quote} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
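Why the two actions look the same to a before/after diff can be shown with a toy model. Here a repository state is just a set of paths, and the diff is the set of paths present before but not after; `CommitDiffSketch` and the `/content/a` path are invented for the illustration and do not reflect how `Root.commit` is implemented.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

/** Toy before/after diff over sets of paths. */
class CommitDiffSketch {
    /** Paths that exist in the base state but not in the state being saved. */
    static Set<String> removedPaths(Set<String> before, Set<String> after) {
        Set<String> removed = new TreeSet<>(before);
        removed.removeAll(after);
        return removed;
    }

    public static void main(String[] args) {
        Set<String> before = new TreeSet<>(Arrays.asList("/content/a", "/content/a/rep:policy"));

        // Case 1: the policy node alone is removed.
        Set<String> policyRemoved = new TreeSet<>(Arrays.asList("/content/a"));
        // Case 2: /content/a is removed and recreated without a policy in one save.
        Set<String> nodeReplaced = new TreeSet<>(Arrays.asList("/content/a"));

        // Both print [/content/a/rep:policy]: the diff cannot tell them apart.
        System.out.println(removedPaths(before, policyRemoved));
        System.out.println(removedPaths(before, nodeReplaced));
    }
}
```

Saving between the remove and the recreate yields two separate diffs (whole subtree removed, then a plain node added), which matches the report that both saves succeed in that case.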
[jira] [Commented] (OAK-4400) Correlate index with the index definition used to build it
[ https://issues.apache.org/jira/browse/OAK-4400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15701623#comment-15701623 ] Tom Blackford commented on OAK-4400: I think this is quite an important improvement for larger deployments. In environments where re-indexing is slow (lots of binaries, 'remote datastore' i.e. S3) the current behaviour can be quite a challenge, as until a modified index has been re-indexed, queries won't necessarily include all results. Even with pre-extraction configured, re-indexing time might be several hours, which is a long time to leave an environment in an inconsistent state. > Correlate index with the index definition used to build it > -- > > Key: OAK-4400 > URL: https://issues.apache.org/jira/browse/OAK-4400 > Project: Jackrabbit Oak > Issue Type: Improvement > Components: lucene, query >Affects Versions: 1.4 >Reporter: Valentin Olteanu > Fix For: 1.8 > > > Currently, if the definition of an index is changed without reindexing, it > will get in an "inconsistent" state. > Of course, the reindexing is usually necessary, but it would be useful to > know with which definition the index was built. This could increase the > visibility of the indexing state and help debugging issues related to it. > Some questions this improvement should respond to: > # What is the definition of the index when the (re)indexing was triggered? > # Are there any changes in the definition since the trigger? Which? > I can imagine a solution built by "versioning" the definition nodes > (oak:QueryIndexDefinition). When the reindex is triggered, a new version of > the node is created and the indexer stores a reference to it. > This would also allow the indexer to keep using the same definition until a > new reindex, even if changes are made meanwhile (i.e. use a fixed version > instead of the latest definition). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (OAK-3813) Exception in datastore leads to async index stop indexing new content
[ https://issues.apache.org/jira/browse/OAK-3813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15112788#comment-15112788 ] Tom Blackford commented on OAK-3813: That is possibly the case, but I can confirm that, as Alex states, the fragility in the Indexing still exists in later Oak versions (in this case 1.2.9). > Exception in datastore leads to async index stop indexing new content > - > > Key: OAK-3813 > URL: https://issues.apache.org/jira/browse/OAK-3813 > Project: Jackrabbit Oak > Issue Type: Bug > Components: lucene >Affects Versions: 1.2.2 >Reporter: Alexander Klimetschek >Priority: Critical > > We are using an S3 based datastore and that (for some other reasons) > sometimes starts to miss certain blobs and throws an exception, see below. > Unfortunately, it seems that this blocks the indexing of any new content - as > the index will try again and again to index that missing binary and fail at > the same point. > It would be great if the indexing process could be more resilient against > error like this. (I think the datastore implementation should probably not > propagate that exception to the outside but just log it, but that's a > separate issue). > This is seen with oak 1.2.2. I had a look at the [latest version on > trunk|https://github.com/apache/jackrabbit-oak/blob/d5da738aa6b43424f84063322987b765aead7813/oak-core/src/main/java/org/apache/jackrabbit/oak/plugins/index/AsyncIndexUpdate.java#L427-L431] > but it seems the behavior has not changed since then. 
> {noformat} > 17.12.2015 20:50:26.418 -0500 *ERROR* [pool-7-thread-5] > org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job > execution of > org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@5cc5e2f6 : Error > occurred while obtaining InputStream for blobId > [2832539c16b1a2e5745370ee89e41ab562436c5f#109419] > java.lang.RuntimeException: Error occurred while obtaining InputStream for > blobId [2832539c16b1a2e5745370ee89e41ab562436c5f#109419] > at > org.apache.jackrabbit.oak.plugins.blob.BlobStoreBlob.getNewStream(BlobStoreBlob.java:49) > at > org.apache.jackrabbit.oak.plugins.segment.SegmentBlob.getNewStream(SegmentBlob.java:84) > at > org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.loadBlob(OakDirectory.java:216) > at > org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexFile.readBytes(OakDirectory.java:264) > at > org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readBytes(OakDirectory.java:350) > at > org.apache.jackrabbit.oak.plugins.index.lucene.OakDirectory$OakIndexInput.readByte(OakDirectory.java:356) > at org.apache.lucene.store.DataInput.readInt(DataInput.java:84) > at org.apache.lucene.codecs.CodecUtil.checkHeader(CodecUtil.java:126) > at > org.apache.lucene.codecs.lucene41.Lucene41PostingsReader.<init>(Lucene41PostingsReader.java:75) > at > org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat.fieldsProducer(Lucene41PostingsFormat.java:430) > at > org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.<init>(PerFieldPostingsFormat.java:195) > at > org.apache.lucene.codecs.perfield.PerFieldPostingsFormat.fieldsProducer(PerFieldPostingsFormat.java:244) > at > org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:116) > at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:96) > at > org.apache.lucene.index.ReadersAndUpdates.getReader(ReadersAndUpdates.java:141) > at > 
org.apache.lucene.index.BufferedUpdatesStream.applyDeletesAndUpdates(BufferedUpdatesStream.java:279) > at > org.apache.lucene.index.IndexWriter.applyAllDeletesAndUpdates(IndexWriter.java:3191) > at > org.apache.lucene.index.IndexWriter.maybeApplyDeletes(IndexWriter.java:3182) > at org.apache.lucene.index.IndexWriter.doFlush(IndexWriter.java:3155) > at org.apache.lucene.index.IndexWriter.flush(IndexWriter.java:3123) > at > org.apache.lucene.index.IndexWriter.closeInternal(IndexWriter.java:988) > at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:932) > at org.apache.lucene.index.IndexWriter.close(IndexWriter.java:894) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.closeWriter(LuceneIndexEditorContext.java:169) > at > org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.leave(LuceneIndexEditor.java:190) > at > org.apache.jackrabbit.oak.plugins.index.IndexUpdate.leave(IndexUpdate.java:221) > at > org.apache.jackrabbit.oak.spi.commit.VisibleEditor.leave(VisibleEditor.java:63) > at > org.apache.jackrabbit.oak.spi.commit.EditorDiff.process(EditorDiff.java:56
[jira] [Created] (OAK-3870) Support for XPATH queries using wildcards in property names in indexes
Tom Blackford created OAK-3870: -- Summary: Support for XPATH queries using wildcards in property names in indexes Key: OAK-3870 URL: https://issues.apache.org/jira/browse/OAK-3870 Project: Jackrabbit Oak Issue Type: Improvement Components: query Reporter: Tom Blackford At present, it's not possible to optimise the XPATH query below using an index: {code} /jcr:root/content//* [jcr:content/relationship/*/@relationshipName="exampleRelationshipName"] {code} ...due to the wildcard in the relative property path. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (OAK-3595) Repository fails to start after definition of lucene property index with nullCheckEnabled on nt:base
Tom Blackford created OAK-3595: -- Summary: Repository fails to start after definition of lucene property index with nullCheckEnabled on nt:base Key: OAK-3595 URL: https://issues.apache.org/jira/browse/OAK-3595 Project: Jackrabbit Oak Issue Type: Bug Components: lucene Affects Versions: 1.2.4 Reporter: Tom Blackford * Set up a Lucene property index beneath /oak:index node on the ‘nt:base’ node type. * Add a property index for 'testProp' * On the property index definition, set the following properties: ** analyzed (Boolean) : true ** isRegexp (Boolean) : true ** name (String) : testProp ** nodeScopeIndex (Boolean) : true ** nullCheckEnabled (Boolean) : true ** propertyIndex (Boolean) : true ** useInExcerpt (Boolean) : true * The following exception will start appearing in the logs: {code} 05.11.2015 16:52:23.524 *ERROR* [pool-7-thread-1] org.apache.sling.commons.scheduler.impl.QuartzScheduler Exception during job execution of org.apache.jackrabbit.oak.plugins.index.AsyncIndexUpdate@6540caf4 : nullCheckEnabled can be set to true for property definition using regular expression java.lang.IllegalStateException: nullCheckEnabled can be set to true for property definition using regular expression at org.apache.jackrabbit.oak.plugins.index.lucene.PropertyDefinition.validate(PropertyDefinition.java:200) at org.apache.jackrabbit.oak.plugins.index.lucene.PropertyDefinition.<init>(PropertyDefinition.java:125) at org.apache.jackrabbit.oak.plugins.index.lucene.IndexDefinition$IndexingRule.collectPropConfigs(IndexDefinition.java:830) at org.apache.jackrabbit.oak.plugins.index.lucene.IndexDefinition$IndexingRule.<init>(IndexDefinition.java:650) at org.apache.jackrabbit.oak.plugins.index.lucene.IndexDefinition.collectIndexRules(IndexDefinition.java:555) at org.apache.jackrabbit.oak.plugins.index.lucene.IndexDefinition.<init>(IndexDefinition.java:240) at org.apache.jackrabbit.oak.plugins.index.lucene.IndexDefinition.<init>(IndexDefinition.java:217) at 
org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorContext.<init>(LuceneIndexEditorContext.java:143) at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditor.<init>(LuceneIndexEditor.java:134) at org.apache.jackrabbit.oak.plugins.index.lucene.LuceneIndexEditorProvider.getIndexEditor(LuceneIndexEditorProvider.java:63) at org.apache.jackrabbit.oak.plugins.index.CompositeIndexEditorProvider.getIndexEditor(CompositeIndexEditorProvider.java:74) at org.apache.jackrabbit.oak.spi.whiteboard.WhiteboardIndexEditorProvider.getIndexEditor(WhiteboardIndexEditorProvider.java:52) at org.apache.jackrabbit.oak.plugins.index.IndexUpdate.collectIndexEditors(IndexUpdate.ja {code} * Stop and Restart the repository ** The repository will fail to start with the following exception {code} [org.apache.jackrabbit.oak.api.jmx.RepositoryManagementMBean]] ServiceEvent REGISTERED 05.11.2015 16:48:28.238 *ERROR* [FelixStartLevel] com.adobe.granite.repository.impl.SlingRepositoryManager start: Uncaught Throwable trying to access Repository, calling stopRepository() java.lang.IllegalMonitorStateException: attempt to unlock read lock, not locked by current thread at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.unmatchedUnlockException(ReentrantReadWriteLock.java:444) at java.util.concurrent.locks.ReentrantReadWriteLock$Sync.tryReleaseShared(ReentrantReadWriteLock.java:428) at java.util.concurrent.locks.AbstractQueuedSynchronizer.releaseShared(AbstractQueuedSynchronizer.java:1341) at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.unlock(ReentrantReadWriteLock.java:881) {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
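The first stack trace suggests that the definition is rejected because `isRegexp` and `nullCheckEnabled` are both true on the same property definition. The sketch below is an assumption drawn from that trace, not a copy of Oak's `PropertyDefinition.validate`; the class name, method names and message text are hypothetical. It shows a way to test such a combination up front so that a bad definition can be reported rather than surfacing later as an `IllegalStateException`.

```java
/** Hypothetical pre-check for the property flags listed in the report. */
class PropertyDefinitionSketch {
    /** Throws if the two flags from the reproduction steps are combined. */
    static void validate(String name, boolean isRegexp, boolean nullCheckEnabled) {
        if (isRegexp && nullCheckEnabled) {
            // Message text is illustrative only, not the one logged by Oak.
            throw new IllegalStateException(
                    "nullCheckEnabled combined with isRegexp on property definition: " + name);
        }
    }

    /** Non-throwing variant, e.g. for checking a definition before saving it. */
    static boolean isValid(String name, boolean isRegexp, boolean nullCheckEnabled) {
        try {
            validate(name, isRegexp, nullCheckEnabled);
            return true;
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(isValid("testProp", true, true));  // false
        System.out.println(isValid("testProp", false, true)); // true
    }
}
```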