Hey Julian,

OK, cool. In my case the context is a single page in AEM: I am querying below one cq:Page node, so most of the time that is at most 10-20 nodes. So what you are saying is that, performance-wise, it shouldn’t really matter whether I traverse the nodes manually or run a query to check whether a specific property name exists on the page, because Oak will most likely do a traversal behind the scenes anyway, right?
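For concreteness, here is a rough sketch of the two options I am weighing. It is only an illustration under assumptions: `SimpleNode` is a made-up in-memory stand-in for a JCR node or Sling resource (not a real API), and the page path and the property name `myProp` are placeholders. In real code the statement string would be handed to `QueryManager.createQuery(statement, Query.JCR_SQL2)`.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PropertySearchSketch {

    /** Hypothetical stand-in for a JCR Node / Sling Resource; not a real API. */
    static final class SimpleNode {
        final String path;
        final Map<String, Object> properties = new LinkedHashMap<>();
        final List<SimpleNode> children = new ArrayList<>();

        SimpleNode(String path) {
            this.path = path;
        }

        SimpleNode addChild(String name) {
            SimpleNode child = new SimpleNode(path + "/" + name);
            children.add(child);
            return child;
        }
    }

    /** Option 1: walk the descendants of root and collect the paths of nodes that have the property. */
    static List<String> findByTraversal(SimpleNode root, String propertyName) {
        List<String> hits = new ArrayList<>();
        collect(root, propertyName, hits);
        return hits;
    }

    private static void collect(SimpleNode node, String propertyName, List<String> hits) {
        for (SimpleNode child : node.children) {
            if (child.properties.containsKey(propertyName)) {
                hits.add(child.path);
            }
            collect(child, propertyName, hits); // depth-first into grandchildren
        }
    }

    /**
     * Option 2: the JCR-SQL2 statement for the same lookup.
     * No escaping is done here, so this assumes a trusted path and property name.
     */
    static String buildQuery(String pagePath, String propertyName) {
        return "SELECT * FROM [nt:base] AS n"
                + " WHERE ISDESCENDANTNODE(n, '" + pagePath + "')"
                + " AND n.[" + propertyName + "] IS NOT NULL";
    }

    public static void main(String[] args) {
        // Tiny page-like tree, three levels deep (placeholder content).
        SimpleNode page = new SimpleNode("/content/site/page");
        SimpleNode par = page.addChild("jcr:content").addChild("par");
        par.addChild("text").properties.put("myProp", "x");
        par.addChild("image");

        System.out.println(findByTraversal(page, "myProp"));
        System.out.println(buildQuery(page.path, "myProp"));
    }
}
```

Both variants look only at descendants of the page node, so they should return the same set of paths for the same subtree.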
Thanks!
Roy

> On 20 Jun 2016, at 15:43, Julian Sedding <[email protected]> wrote:
>
> Hi Roy
>
> From you question ("hard to put an index to it") I assume that you are
> running on an Oak repository. If that is incorrect, my answer does not
> apply.
>
> Oak will always consider traversal as an alternative to existing
> indexes. For most queries the cost of traversal is so high that an
> index is chosen. However, if no suitable index exists (and
> theoretically also if the traversal is cheaper than a lookup in a
> matching index), it will do a traversal behind the scenes. Note that
> traversal logs a warning every 10000 traversed nodes. So if you plan
> to traverse more than that you should really consider creating an
> index.
>
> In short: with Oak using a query on a small subtree should give you
> what you want, even without an index.
>
> Regards
> Julian
>
>
> On Thu, Jun 16, 2016 at 4:44 PM, Steven Walters <[email protected]> wrote:
>> Hopefully other people chime in here, I've only had bad experiences
>> with utilizing queries and have often resulted in personally never
>> using them - so I always end up iterating/navigating myself.
>>
>> Theoretically if you have a REALLY GOOD index then you may get some
>> similar performances, but if your index(es) are inefficient, then it's
>> just wasted CPU cycles (you'd wish those CPU cycles were going to a
>> good cause, but they're not).
>>
>> the transition of Sling (and AEM) to Oak from Jackrabbit 2.x made this
>> experience worse with the awkward indexing policies/process in Oak,
>> and the fact that Oak never seemed to ever use multiple indexes.
>> Oak always seemed to calculates the costs of the entire query against
>> all the available indexes and only chooses the ONE best index.
>> This sounds like a good idea in theory, but then most DBMS I've used
>> in the past utilize ALL the indexes they can - not just one.
>>
>> So basically i guess this comes to be "If you have a good index (in
>> that it can apply to ALL the conditions/attributes/properties of your
>> query) then using a query should be fine, otherwise iterate yourself"
>> having any condition missing from the index can be fatal in
>> performance, such as lacking the evaluatePathRestrictions = true,
>> which without it is basically death of the system if you have a lot of
>> content.
>>
>> But really, I hope some other people with more positive experiences
>> can provide some better advice.
>>
>> On Thu, Jun 16, 2016 at 11:08 PM, Roy Teeuwen <[email protected]> wrote:
>>> Ok, it would be handy to have an estimate on the approximate amount /
>>> levels of resources when to go for iterating vs querying :).
>>>
>>> Greets
>>> Roy
>>>> On 16 Jun 2016, at 16:06, Steven Walters <[email protected]> wrote:
>>>>
>>>> if you know there are that few resources, then I say iterating would be
>>>> better performing than XPath / JCR-SQL2 queries.
>>>> This is primarily from past experience speaking in that queries have
>>>> generally turned out (often MUCH) slower than directly iterating if you
>>>> know what you're actually looking for.
>>>>
>>>>
>>>> On Thu, Jun 16, 2016 at 10:28 PM, Roy Teeuwen <[email protected]> wrote:
>>>>
>>>>> Hello all,
>>>>>
>>>>> Lets say I got a resource with around 10-20 child/grand-child resources,
>>>>> not going deeper than 3 levels max. What is the most performant when
>>>>> searching for the child resources containing a specific property (the
>>>>> property is configurable with OSGi, so hard to put an index on it).
>>>>> Iterating the child / grand-child resources until you find it or making an
>>>>> xpath/jcr-sql2 query? When would one option start to be more performant
>>>>> than the other.
>>>>>
>>>>> Thanks!
>>>>> Roy
>>>
