Marcel Reutegger wrote:
Christoph Kiehl wrote:
I've created a jira issue: http://issues.apache.org/jira/browse/JCR-791
Are you working on this issue? Or should I try to implement something?
I just started working on it ;)
Great news ;)
Now that you are working on implementing this cache on a per-index-reader basis,
I have another suggestion for improvement ;)
As I understand it, in DescendantSelfAxisQuery.DescendantSelfAxisScorer the
contextHits are used to filter the subHits result so that it only includes nodes
of the given context. The context is something like /foo/bar//*, which means all
descendants of /foo/bar. Is that right?
In our application the context for most of our queries is the same, so it would
make a lot of sense to cache the contextHits for this context. There is already
a todo in the constructor of DescendantSelfAxisScorer which probably aims at this.
I would go even further and not only cache these contextHits, but cache
contextHits per _node_ in the hierarchy, which means there is a BitSet for
/foo/bar/bla[1], /foo/bar/bla[2] and so on. If I need the BitSet for /foo/bar//*
I could just join the BitSets of the descendants. This would allow reusing the
BitSets for different contexts. What do you think about this? It should improve
performance a lot, and more so the larger the result set is and the less
specific your context is.
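
To make the per-node idea concrete, here is a rough sketch of what I have in mind. It is only an illustration: ContextHitsCache, addNode, contextHits and filter are made-up names, not existing Jackrabbit classes or methods, and the tree is modelled with plain maps from node paths to document numbers instead of the real index.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; none of these names exist in Jackrabbit.
class ContextHitsCache {
    // Child paths per parent path (stand-in for the real hierarchy lookup).
    private final Map<String, List<String>> children = new HashMap<>();
    // Index document number per node path (assumed to be known).
    private final Map<String, Integer> docNumbers = new HashMap<>();
    // Cached BitSet of all descendant document numbers, one per node.
    private final Map<String, BitSet> descendantHits = new HashMap<>();

    void addNode(String parentPath, String path, int docNumber) {
        docNumbers.put(path, docNumber);
        children.computeIfAbsent(parentPath, k -> new ArrayList<>()).add(path);
        // Simplistic invalidation: drop everything when the tree changes.
        descendantHits.clear();
    }

    // BitSet for path//*: join each child's own bit with the child's cached BitSet.
    // The returned BitSet is shared with the cache and must not be modified.
    BitSet contextHits(String path) {
        BitSet cached = descendantHits.get(path);
        if (cached != null) {
            return cached;
        }
        BitSet hits = new BitSet();
        for (String child : children.getOrDefault(path, Collections.<String>emptyList())) {
            hits.set(docNumbers.get(child));
            hits.or(contextHits(child));
        }
        descendantHits.put(path, hits);
        return hits;
    }

    // Keep only those subHits that lie inside the given context.
    BitSet filter(BitSet subHits, String contextPath) {
        BitSet result = (BitSet) subHits.clone();
        result.and(contextHits(contextPath));
        return result;
    }
}
```

With something like this, asking for /foo/bar//* after /foo/bar/bla[1]//* has already been computed only needs an or() over the cached child BitSets. How to invalidate the cache when nodes are added, moved or removed is left open here; the sketch simply clears everything.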
It seems like if I rewrite the following query from
/foo/*[@foo:bar!='john' and @foo:bar!='doe']
to
/foo/*[not(@foo:bar='john' or @foo:bar='doe')]
I get better performance. Can you confirm this?
Yes, I can. Basically, any != comparison is translated into: get all
nodes with the given property, then exclude the ones that match the
literal. That is obviously much more expensive than just getting all
nodes that match a given literal.
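
The difference between the two forms can be sketched with plain BitSets of document numbers. This is only an illustration under assumptions: NotEqualsSketch and its helpers are made-up names rather than Jackrabbit API, the hit sets are supplied by hand instead of coming from the index, and every child of /foo is assumed to carry foo:bar (nodes without the property are not modelled).

```java
import java.util.BitSet;

// Hypothetical sketch; these helpers are not Jackrabbit API.
class NotEqualsSketch {

    // Build a BitSet from a list of document numbers.
    static BitSet bits(int... docs) {
        BitSet b = new BitSet();
        for (int d : docs) {
            b.set(d);
        }
        return b;
    }

    // "!=" as described: all nodes with the property, minus the literal matches.
    static BitSet notEquals(BitSet hasProperty, BitSet equalsLiteral) {
        BitSet result = (BitSet) hasProperty.clone();
        result.andNot(equalsLiteral);
        return result;
    }

    // /foo/*[@foo:bar!='john' and @foo:bar!='doe']
    // Needs the (potentially huge) hasProperty set once per comparison.
    static BitSet andOfNotEquals(BitSet hasProperty, BitSet john, BitSet doe) {
        BitSet result = notEquals(hasProperty, john);
        result.and(notEquals(hasProperty, doe));
        return result;
    }

    // /foo/*[not(@foo:bar='john' or @foo:bar='doe')]
    // Needs only the two literal match sets and the candidate nodes.
    static BitSet notOfOr(BitSet candidates, BitSet john, BitSet doe) {
        BitSet matches = (BitSet) john.clone();
        matches.or(doe);
        BitSet result = (BitSet) candidates.clone();
        result.andNot(matches);
        return result;
    }

    public static void main(String[] args) {
        BitSet childrenOfFoo = bits(0, 1, 2, 3); // assumed: all have foo:bar
        BitSet john = bits(0);
        BitSet doe = bits(1);
        System.out.println(andOfNotEquals(childrenOfFoo, john, doe));
        System.out.println(notOfOr(childrenOfFoo, john, doe));
    }
}
```

Under the stated assumption both methods select the same documents; the second one just never has to enumerate every node that has the property.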
Wouldn't it make sense to rewrite all @foo:bar!='john' queries to
not(@foo:bar='john') by default instead of creating a MatchAllQuery?
Cheers,
Christoph