Hi all,

We have a servlet in place that exports redirects to Apache using rewrite
maps [1]. That servlet runs a query [2] against a large repository
that holds ~2 million nodes of the primary type cq:PageContent (as referenced
in the query). We have a Lucene index defined for the property redirectTarget
that holds around 1 million documents when checked via JMX [3] (the custom
index also covers the properties sling:alias and sling:vanityPath, which are
not strictly needed for this query but for another use case; see [7] for the
exact definition). When checking the query with the explain query tool, it
always uses the index www_redirectmanager as desired. The number of nodes
that have the property redirectTarget set is ~150,000. The servlet usually
returns within 1-2 minutes, which is totally fine (it is called once per
hour).

Since upgrading to Oak 1.8.7 (we had 1.4.3 before without problems), we get
the error [6] in around 2% of the cases (so most of the time it works, but
sometimes we get the error and the servlet fails; it is *not*
deterministic). I suppose this is connected to the change in [5]. We have
already increased queryLimitInMemory and queryLimitReads (see [4];
PID org.apache.jackrabbit.oak.query.QueryEngineSettingsService) to 500,000
(from default 200,000), but we still get the error every now and then. We
once had one node that always (deterministically) returned the error [6];
after reindexing [7] we were back to the non-deterministic 2% of the queries
(but even while the problem was deterministic on that node, explain query
always reported that index as the one being used).

I have the following understanding:
1. The settings queryLimitInMemory and queryLimitReads are both evaluated
*after* the results from the index are retrieved (so the query engine asks
the index for nodes, gets ~150,000, reads those and then applies
further criteria to filter the result set; the limits are in place to avoid
large result sets during this filtering)
2. Having multiple properties in the index [7] should not really make a
difference for this particular problem, since no matter how many properties
are held in the index, the result set for query [2] is always the same
3. No matter whether the assumptions from 1. and 2. are true, the problem
should be deterministic

Has anyone else run into a similar problem? Are the assumptions above
correct? Obviously the query [2] could be split up to run many queries for
sub paths, or we could even traverse all paths for the property, but
conceptually it should really be possible to do this in one query IMHO.
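
For reference, a minimal sketch of what I mean by splitting the query up per
sub path. It only builds the query strings; the class name and the example
sub paths (/content/site-a, /content/site-b) are made up for illustration,
and I am assuming the paths contain no characters that would need escaping
inside the [...] path syntax. Each generated statement would then be run
separately through the normal JCR query API, so that each run stays well
below the read limit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of our servlet: builds one copy of query [2]
// per subtree instead of a single query over all of /content.
public class RedirectQuerySplitter {

    // Same statement as query [2], restricted to the given subtree.
    // Assumption: path is a plain absolute path without ']' or quotes.
    static String queryFor(String path) {
        return "SELECT * FROM [cq:PageContent] AS s"
             + " WHERE ISDESCENDANTNODE([" + path + "])"
             + " AND s.[redirectTarget] IS NOT NULL";
    }

    // One query per sub path, in the order given.
    static List<String> queriesFor(List<String> subPaths) {
        List<String> queries = new ArrayList<>();
        for (String path : subPaths) {
            queries.add(queryFor(path));
        }
        return queries;
    }

    public static void main(String[] args) {
        // Example sub paths are placeholders for the real children of /content.
        List<String> subPaths = Arrays.asList("/content/site-a", "/content/site-b");
        for (String query : queriesFor(subPaths)) {
            System.out.println(query);
        }
    }
}
```

Note that ISDESCENDANTNODE does not match the sub path node itself, only
nodes below it, which should be fine as long as the sub paths are pages and
not cq:PageContent nodes.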

-Georg

[1] https://httpd.apache.org/docs/2.4/rewrite/rewritemap.html
[2] SELECT * FROM [cq:PageContent] AS s WHERE ISDESCENDANTNODE([/content])
AND s.[redirectTarget] IS NOT NULL
[3]
/system/console/jmx/org.apache.jackrabbit.oak%3Aname%3DLucene+Index+statistics%2Ctype%3DLuceneIndex
[4]
https://jackrabbit.apache.org/oak/docs/query/query-engine.html#Slow_Queries_and_Read_Limits
[5] https://issues.apache.org/jira/browse/OAK-6875
[6] 07.12.2018 11:01:22.408 *WARN* [192.168.166.72 [1544176801343] GET
/bin/www/redirectmap/redirecttarget HTTP/1.1]
org.apache.jackrabbit.oak.query.FilterIterators The query read or traversed
more than 500000 nodes.
java.lang.UnsupportedOperationException: The query read or traversed more
than 500000 nodes. To avoid affecting other tasks, processing was stopped.

[7]
    <www_redirectmanager
        jcr:primaryType="oak:QueryIndexDefinition"
        async="async"
        compatVersion="{Long}2"
        evaluatePathRestrictions="{Boolean}true"
        excludedPaths="[/var,/system,/apps,/libs,/content/dam,/etc,/jcr:system]"
        reindex="{Boolean}false"
        reindexCount="{Long}7"
        type="lucene">
        <aggregates jcr:primaryType="nt:unstructured">
            <cq:PageContent jcr:primaryType="nt:unstructured">
                <include0
                    jcr:primaryType="nt:unstructured"
                    path="*"
                    relativeNode="{Boolean}false"/>
            </cq:PageContent>
        </aggregates>
        <facets jcr:primaryType="nt:unstructured"/>
        <indexRules jcr:primaryType="nt:unstructured">
            <cq:PageContent jcr:primaryType="nt:unstructured">
                <properties jcr:primaryType="nt:unstructured">
                    <redirectTarget
                        jcr:primaryType="nt:unstructured"
                        name="redirectTarget"
                        notNullCheckEnabled="{Boolean}true"
                        propertyIndex="{Boolean}true"/>
                    <alias
                        jcr:primaryType="nt:unstructured"
                        name="sling:alias"
                        notNullCheckEnabled="{Boolean}true"
                        propertyIndex="{Boolean}true"/>
                    <vanityPath
                        jcr:primaryType="nt:unstructured"
                        name="sling:vanityPath"
                        notNullCheckEnabled="{Boolean}true"
                        propertyIndex="{Boolean}true"/>
                </properties>
            </cq:PageContent>
        </indexRules>
    </www_redirectmanager>
