[ 
https://issues.apache.org/jira/browse/SLING-8407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836946#comment-16836946
 ] 

Thomas Mueller commented on SLING-8407:
---------------------------------------

[~jebailey] I'm afraid I'm not familiar with the specific requirements of the 
job manager. But looking at the code, running a query seems to make a lot of 
sense. There are potentially many filter conditions, and running a query allows 
an index to be used. Assuming the index definition is correct, using an index 
reduces the reads from O(n) (where n is the number of jobs in the repository) 
to roughly O(1), assuming the consumer only iterates over a constant number 
of entries of the result (which I hope is the case). So you would in fact 
sacrifice a lot of speed and efficiency if you traversed and filtered yourself.
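
As a rough illustration of the O(n) versus O(1) point, here is a toy Java 
sketch. It is not Sling or Oak code: the Job class, the topic names and the 
in-memory "index" are all made up for this example. It only counts how many 
job entries are read when the consumer wants the first few matches for a topic.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: counts entries read by "traverse and filter" vs. an index lookup. */
final class JobLookupSketch {

    /** Hypothetical job entry; only the topic matters here. */
    static final class Job {
        final int id;
        final String topic;
        Job(int id, String topic) { this.id = id; this.topic = topic; }
    }

    /** Number of job entries read by the most recent lookup. */
    static int reads;

    /** Traverse and filter: every entry is read until enough matches are found. */
    static List<Job> findByScan(List<Job> all, String topic, int limit) {
        reads = 0;
        List<Job> out = new ArrayList<>();
        for (Job j : all) {
            reads++;
            if (j.topic.equals(topic)) {
                out.add(j);
                if (out.size() == limit) break;
            }
        }
        return out;
    }

    /** Stand-in for an index: topic -> jobs, built once up front. */
    static Map<String, List<Job>> buildIndex(List<Job> all) {
        Map<String, List<Job>> index = new HashMap<>();
        for (Job j : all) {
            index.computeIfAbsent(j.topic, k -> new ArrayList<>()).add(j);
        }
        return index;
    }

    /** Index lookup: only the entries the consumer actually iterates over are read. */
    static List<Job> findByIndex(Map<String, List<Job>> index, String topic, int limit) {
        reads = 0;
        List<Job> out = new ArrayList<>();
        for (Job j : index.getOrDefault(topic, List.of())) {
            if (out.size() == limit) break;
            reads++;
            out.add(j);
        }
        return out;
    }

    /** n jobs; only the last five use the rare topic (worst case for a scan). */
    static List<Job> sampleJobs(int n) {
        List<Job> jobs = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            jobs.add(new Job(i, i >= n - 5 ? "topic/rare" : "topic/common"));
        }
        return jobs;
    }

    public static void main(String[] args) {
        List<Job> jobs = sampleJobs(100_000);
        findByScan(jobs, "topic/rare", 3);
        System.out.println("scan reads:  " + reads);   // 99998
        findByIndex(buildIndex(jobs), "topic/rare", 3);
        System.out.println("index reads: " + reads);   // 3
    }
}
```

The scan cost grows with the number of jobs, while the index lookup cost 
depends only on how many results the consumer reads.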

> which utilizes a tree traversal, which is not the same as an oak traversal

I assume that by "oak traversal" you mean "tree traversal within the query 
engine", and by "tree traversal" you mean traversing a tree using the JCR API 
(getNodes, ...). Speed-wise, there is no difference. The only difference is 
that running a query logs a warning if it traverses many nodes, while using 
the JCR API does not. (Unfortunately we can't easily implement a warning in 
this case: the JCR API is very fine-grained, which makes this hard.)
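
For reference, the issue description quoted below proposes adding 
"option(traversal fail)" to the query. A minimal Java sketch of appending that 
option to a query statement could look like this. The helper name and the 
example statement are assumptions for illustration only, not what 
JobManagerImpl actually builds.

```java
import java.util.Locale;

/** Sketch only: appends the traversal option proposed in the issue to a query statement. */
final class TraversalOptionSketch {

    static final String TRAVERSAL_FAIL = "option(traversal fail)";

    /** Returns the statement with the option appended, unless a traversal option is already present. */
    static String withTraversalFail(String statement) {
        String trimmed = statement.trim();
        if (trimmed.toLowerCase(Locale.ROOT).contains("option(traversal")) {
            return trimmed;
        }
        return trimmed + " " + TRAVERSAL_FAIL;
    }

    public static void main(String[] args) {
        // Hypothetical statement; the real path and conditions come from the job manager.
        String query = "/jcr:root/var/eventing/jobs//element(*, slingevent:Job)"
                + "[@event.job.topic = 'my/topic']";
        System.out.println(withTraversalFail(query));
    }
}
```

With the option in place, a query that cannot use an index would fail instead 
of traversing, and the caller (for example a health check) would have to catch 
the resulting exception.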

> JobManagerImpl.findJobs should prevent traversal
> ------------------------------------------------
>
>                 Key: SLING-8407
>                 URL: https://issues.apache.org/jira/browse/SLING-8407
>             Project: Sling
>          Issue Type: Improvement
>          Components: Event
>            Reporter: Thomas Mueller
>            Priority: Major
>
> The method 
> [JobManagerImpl.findJobs|https://github.com/apache/sling-org-apache-sling-event/blob/master/src/main/java/org/apache/sling/event/impl/jobs/JobManagerImpl.java#L373]
>  runs a JCR query to find all jobs for a topic.
> It is possible that such a query is running while the repository isn't 
> initialized yet, meaning while the index isn't available yet. What is 
> happening in this case is that the query is traversing all nodes below that 
> path, triggering a warning that the query doesn't use an index. It is 
> sometimes happening when a health check is running before the repository is 
> initialized (ReplicationQueueHealthCheck and DistributionQueueHealthCheck).
> It doesn't make sense that the query traverses the nodes. It should use an 
> index. If the index isn't available yet, it should fail. Therefore, the query 
> should use "option(traversal fail)". That would result in an exception that 
> can be caught.  I will log a related issue to change the health checks to 
> process this exception and return HEALTH_CHECK_ERROR for this case.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
