Hi all,
sorry for cross-posting, but I didn't get an answer on the users list.
I think the change made with OAK-6875 does not always have the desired
effect (for sure there is some non-deterministic behaviour for large
content sets accessed via a Lucene index, which should at least be
deterministic & explainable). See the email below for details (if somebody
could confirm or reject my assumptions, that would already help a lot!)
Also, in general: wouldn't it make sense to introduce a query option along
the lines of [1] to disable the read/memory limits for one particular
query? The limits would then just be a safety net for queries that
unexpectedly exceed them, while for special use cases like the one
described below they could be turned off.
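To illustrate, such an option could look something like the following
(the OPTION(...) syntax is borrowed from the existing index-tag option
[1]; the "read limits off" keyword is purely hypothetical and does not
exist in Oak today):

```sql
SELECT * FROM [cq:PageContent] AS s
WHERE ISDESCENDANTNODE([/content]) AND s.[redirectTarget] IS NOT NULL
OPTION(read limits off)
```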
-Georg
[1]
https://jackrabbit.apache.org/oak/docs/query/query-engine.html#Query_Option_Index_Tag
Original Message
Subject: Problem with read limits & a query using a lucene index with
many results (but below setting queryLimitReads)
Date: 2019-02-07 01:46
From: Georg Henzler
To: us...@jackrabbit.apache.org
Reply-To: us...@jackrabbit.apache.org
Hi all,
We have a servlet in place that exports redirects to Apache httpd using
rewrite maps [1]. That servlet runs a query [2] against a large
repository that holds ~2 million nodes of the primary type cq:PageContent
(as referenced in the query). We have a Lucene index defined for the
property redirectTarget that holds around 1 million documents when
checked via JMX [3] (the custom index also covers the properties
sling:alias and sling:vanityPath, which are not strictly needed for this
query but for another use case; see [7] for the exact definition). When
checking the query with the explain query tool, it always uses the index
www_redirectmanager as desired. The number of nodes that have the
property redirectTarget set is ~150,000. The servlet usually returns
within 1-2 minutes, which is totally fine (it is called once per hour).
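For context, the servlet emits a plain-text rewrite map in the key/value
format that mod_rewrite's txt: map type expects [1] (the entries below
are made-up examples, not our real content):

```
# one "lookup-key value" pair per line
/old/products/widget    /content/site/en/products/widget.html
/campaign-2018          /content/site/en/campaigns/winter.html
```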
Since upgrading to Oak 1.8.7 (we ran 1.4.3 before without any problems),
we get the error [6] in around 2% of the cases (so most of the time it
works, but sometimes we get the error and the servlet fails; it is *not*
deterministic). I suppose this is connected to the change in [5]. We have
already increased queryLimitInMemory and queryLimitReads
(PID org.apache.jackrabbit.oak.query.QueryEngineSettingsService) to
500,000 (from the default of 200,000), but we still get the error every
now and then. At one point we had one node that always (deterministically)
returned the error [6]; after reindexing [7] we were back to the
non-deterministic 2% of the queries (but even while the problem was
deterministic on that node, explain query always reported that the index
was being used).
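For reference, the increased limits in a Sling OSGi configuration file
for the PID above would look roughly like this (property names from the
QueryEngineSettingsService PID; the Long value syntax is my assumption):

```
# org.apache.jackrabbit.oak.query.QueryEngineSettingsService.config
queryLimitInMemory=L"500000"
queryLimitReads=L"500000"
```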
I have the following understanding:
1. The settings queryLimitInMemory and queryLimitReads are both evaluated
*after* the results from the index are retrieved: the query engine asks
the index for nodes, gets ~150,000 back, reads those, and then applies
further criteria to filter the result set; the limits exist to keep this
filtering from producing overly large result sets.
2. Having multiple properties in the index [3] should not really make a
difference for this particular problem, since no matter how many
properties the index holds, the result set for query [2] is always the
same.
3. Regardless of whether assumptions 1 and 2 are true, the problem should
be deterministic.
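To make assumption 3 concrete: my mental model of the read limit is a
simple counting wrapper around the cursor of index results, which for a
fixed repository state is inherently deterministic. A minimal sketch
(this is my own illustration of that mental model, not Oak's actual
FilterIterators code):

```java
import java.util.Iterator;

public class ReadLimitSketch {

    // Wraps any iterator and fails once more than 'limit' elements have
    // been read. The count depends only on the number of rows the index
    // hands back, which is why I would expect the failure to be
    // deterministic for a fixed repository state.
    public static <T> Iterator<T> withLimit(Iterator<T> source, long limit) {
        return new Iterator<T>() {
            private long read;

            @Override
            public boolean hasNext() {
                return source.hasNext();
            }

            @Override
            public T next() {
                if (++read > limit) {
                    // mirrors the message seen in [6]
                    throw new UnsupportedOperationException(
                        "The query read or traversed more than " + limit + " nodes.");
                }
                return source.next();
            }
        };
    }

    public static void main(String[] args) {
        // Simulate an index returning 10 rows with a read limit of 5.
        Iterator<Integer> it =
            withLimit(java.util.stream.IntStream.range(0, 10).iterator(), 5);
        long delivered = 0;
        try {
            while (it.hasNext()) {
                it.next();
                delivered++;
            }
        } catch (UnsupportedOperationException e) {
            System.out.println("aborted after " + delivered + " reads: "
                + e.getMessage());
        }
    }
}
```

Under this model, re-running the same query against the same content
should hit (or not hit) the limit every time, which is what makes the
observed 2% failure rate so puzzling.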
Has anyone else run into a similar problem? Are the assumptions above
correct? Obviously the query [2] could be split up into many queries for
sub paths, or one could even traverse all paths for the property, but
conceptually it should really be possible to do this in one query IMHO.
-Georg
[1] https://httpd.apache.org/docs/2.4/rewrite/rewritemap.html
[2] SELECT * FROM [cq:PageContent] AS s
    WHERE ISDESCENDANTNODE([/content]) AND s.[redirectTarget] IS NOT NULL
[3]
/system/console/jmx/org.apache.jackrabbit.oak%3Aname%3DLucene+Index+statistics%2Ctype%3DLuceneIndex
[4]
https://jackrabbit.apache.org/oak/docs/query/query-engine.html#Slow_Queries_and_Read_Limits
[5] https://issues.apache.org/jira/browse/OAK-6875
[6] 07.12.2018 11:01:22.408 *WARN* [192.168.166.72 [1544176801343] GET
/bin/www/redirectmap/redirecttarget HTTP/1.1]
org.apache.jackrabbit.oak.query.FilterIterators The query read or
traversed more than 50 nodes.
java.lang.UnsupportedOperationException: The query read or traversed more
than 50 nodes. To avoid affecting other tasks, processing was stopped.
[7]