Re: Query Performance and Optimization

Christoph Kiehl Wed, 14 Mar 2007 10:10:34 -0800

Marcel Reutegger wrote:

Christoph Kiehl wrote:

I've created a jira issue: http://issues.apache.org/jira/browse/JCR-791


Are you working on this issue? Or should I try to implement something?


I just started working on it ;)


Great news ;)

Now that you are working on implementing this cache on a per index reader basis, I've got another suggestion for improvement ;)

As I understand it, in DescendantSelfAxisQuery.DescendantSelfAxisScorer the contextHits are used to filter the subHits result to only include nodes of the given context. The context is something like /foo/bar//*, which means all descendants of /foo/bar. Is that right?

In our application the context for most of our queries is the same, so it would make a lot of sense to cache the contextHits for this context. There is already a todo in the constructor of DescendantSelfAxisScorer which probably aims at this.

I would go even further and not only cache these contextHits, but cache contextHits per _node_ in a hierarchy, which means there is a BitSet for /foo/bar/bla[1], /foo/bar/bla[2] and so on. If I need the BitSet for /foo/bar//* I could just join the BitSets of the descendants. This would allow reusing the BitSets for different contexts. What do you think about this? It should improve performance a lot, the more so the larger the result set is and the less specific your context is.
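To make the per-node idea a bit more concrete, here is a rough sketch using plain java.util.BitSet. The class ContextHitsCache and its methods put, join and filter are made up for illustration and are not existing Jackrabbit code. The sketch also assumes that each cached BitSet holds the document numbers of one node's whole subtree and that document numbers stay stable for the index reader the cache belongs to.

```java
import java.util.Arrays;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch, not Jackrabbit code: one BitSet of document numbers per node path.
final class ContextHitsCache {

    // Assumed layout: node path -> document numbers of that node's subtree.
    private final Map<String, BitSet> hitsPerNode = new HashMap<>();

    // Stores a defensive copy so later changes by the caller do not corrupt the cache.
    void put(String nodePath, BitSet subtreeHits) {
        hitsPerNode.put(nodePath, (BitSet) subtreeHits.clone());
    }

    // Builds the context hits for a broader context by OR-ing the cached per-node BitSets.
    BitSet join(List<String> nodePaths) {
        BitSet joined = new BitSet();
        for (String path : nodePaths) {
            BitSet cached = hitsPerNode.get(path);
            if (cached == null) {
                throw new IllegalStateException("no cached hits for " + path);
            }
            joined.or(cached);
        }
        return joined;
    }

    // Keeps only those sub hits that lie inside the context.
    static BitSet filter(BitSet subHits, BitSet contextHits) {
        BitSet filtered = (BitSet) subHits.clone();
        filtered.and(contextHits);
        return filtered;
    }

    public static void main(String[] args) {
        ContextHitsCache cache = new ContextHitsCache();

        BitSet bla1 = new BitSet();
        bla1.set(1);
        bla1.set(2);
        BitSet bla2 = new BitSet();
        bla2.set(5);
        cache.put("/foo/bar/bla[1]", bla1);
        cache.put("/foo/bar/bla[2]", bla2);

        // Context /foo/bar//* as the union of the children's subtrees.
        BitSet context = cache.join(Arrays.asList("/foo/bar/bla[1]", "/foo/bar/bla[2]"));

        BitSet subHits = new BitSet();
        subHits.set(2);
        subHits.set(5);
        subHits.set(9);

        System.out.println(filter(subHits, context)); // prints {2, 5}
    }
}
```

The join is a plain OR over the cached sets, so a broader context costs one pass per cached child instead of a new index lookup. Invalidation when nodes are added, moved or removed is left out of the sketch entirely.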

It seems like if I rewrite the following query from

/foo/*[@foo:bar!='john' and @foo:bar!='doe']

to

/foo/*[not(@foo:bar='john' or @foo:bar='doe')]

I get better performance. Can you confirm this?

Yes, I can. Basically because any != comparison is translated into: get all nodes with the given property, then exclude the ones that match the literal. Which is obviously much more expensive than just: get all nodes that match a given literal.
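The two evaluation strategies can be written down as set arithmetic over document numbers. This is only a sketch of the description above, not how Jackrabbit builds its Lucene queries: the class NotEqualsSketch, its method names and the maxDoc parameter are hypothetical, and not(...) is modelled here simply as the complement over all documents.

```java
import java.util.BitSet;

// Hypothetical sketch of the set arithmetic, not Jackrabbit code.
final class NotEqualsSketch {

    // != as described above: all nodes with the property, minus those matching the literal.
    static BitSet notEquals(BitSet hasProperty, BitSet matchesLiteral) {
        BitSet result = (BitSet) hasProperty.clone();
        result.andNot(matchesLiteral);
        return result;
    }

    // not(=), assumed here to be the complement of the literal matches over all documents.
    static BitSet notOfEquals(int maxDoc, BitSet matchesLiteral) {
        BitSet result = new BitSet(maxDoc);
        result.set(0, maxDoc);
        result.andNot(matchesLiteral);
        return result;
    }

    public static void main(String[] args) {
        int maxDoc = 6;

        BitSet hasProperty = new BitSet();
        hasProperty.set(0, 4); // documents 0..3 carry the property

        BitSet isJohn = new BitSet();
        isJohn.set(1); // document 1 has the value 'john'

        System.out.println(notEquals(hasProperty, isJohn));  // prints {0, 2, 3}
        System.out.println(notOfEquals(maxDoc, isJohn));     // prints {0, 2, 3, 4, 5}
    }
}
```

The first form has to collect every node that has the property at all before it can exclude anything, which is where the extra cost comes from. In this model the two forms only return the same nodes when every candidate node carries the property.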

Wouldn't it make sense to rewrite all @foo:bar!='john' queries to not(@foo:bar='john') by default instead of creating a MatchAllQuery?


Cheers,
Christoph
