Well, the problem with that approach is the following:

Assume you have a tree of nodes under /a, let's say 10 million nodes. Then a user renames /a to /b. With that approach, the index would have to re-index all 10 million nodes. The rename is currently very efficient and takes just a couple of milliseconds, because the nodes in the index are linked only by a parent UUID. Renaming a node simply means updating one node (document) in the index.
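
To illustrate why the parent-UUID link keeps a rename cheap, here is a minimal in-memory sketch. It is not Jackrabbit code: the class ParentLinkIndexSketch, its Doc type and its methods are hypothetical stand-ins for index documents that store only a name and the UUID of their parent.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for an index whose documents store a name and a parent UUID. */
class ParentLinkIndexSketch {

    /** One index document: the node name plus the UUID of its parent (null for the root). */
    static final class Doc {
        String name;
        final String parentUuid;

        Doc(String name, String parentUuid) {
            this.name = name;
            this.parentUuid = parentUuid;
        }
    }

    private final Map<String, Doc> docs = new HashMap<>();

    void add(String uuid, String name, String parentUuid) {
        docs.put(uuid, new Doc(name, parentUuid));
    }

    /** Renames a node and returns the number of documents touched, which is always 1. */
    int rename(String uuid, String newName) {
        docs.get(uuid).name = newName;
        return 1;
    }

    /** Resolves the path at query time by walking the parent links up to the root. */
    String path(String uuid) {
        StringBuilder sb = new StringBuilder();
        for (Doc d = docs.get(uuid); d != null && d.parentUuid != null; d = docs.get(d.parentUuid)) {
            sb.insert(0, "/" + d.name);
        }
        return sb.length() == 0 ? "/" : sb.toString();
    }
}
```

No descendant document stores the name of an ancestor, so renaming /a changes a single entry however many nodes sit below it. The price is that paths have to be resolved by walking the links at query time.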

But I agree with both of you that there is a lot of potential in optimizing path/hierarchy resolution in the Lucene query handler in Jackrabbit. Some optimization is already done by caching the child->parent link information, e.g. see: http://svn.apache.org/repos/asf/jackrabbit/tags/1.2.3/jackrabbit-core/src/main/java/org/apache/jackrabbit/core/query/lucene/CachingIndexReader.java
(-> the field called 'parents')

In the end, that is what ChildAxisQuery and DescendantSelfAxisQuery use.
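
As a rough illustration of how a cached child->parent mapping can answer child-axis and descendant-axis checks, here is a small sketch. It is an assumption-laden simplification, not the actual CachingIndexReader code: the class ParentCacheSketch, the int[] layout (parent document number per document, -1 for the root) and both method names are hypothetical.

```java
/** Hypothetical cache: parents[doc] holds the document number of doc's parent, or -1 for the root. */
class ParentCacheSketch {

    private final int[] parents;

    ParentCacheSketch(int[] parents) {
        this.parents = parents.clone();
    }

    /** Child axis: true if 'doc' is a direct child of 'parent'. */
    boolean isChildOf(int doc, int parent) {
        return parents[doc] == parent;
    }

    /** Descendant-or-self axis: walk the cached links upwards until the ancestor or the root is reached. */
    boolean isDescendantOrSelf(int doc, int ancestor) {
        for (int d = doc; d != -1; d = parents[d]) {
            if (d == ancestor) {
                return true;
            }
        }
        return false;
    }
}
```

With the links held in memory, each check costs at most one array lookup per level of the hierarchy and needs no extra index access.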

regards
 marcel

Michael Neale wrote:
Yeah, I would +1 to that, it's something I do fairly often (there is often a
lot of info in a path that is relevant to a query, given that we have gone
ahead and nicely partitioned our content!).

On 3/13/07, David Johnson <[EMAIL PROTECTED]> wrote:

As another example, for each node, perhaps every potential parent path could
be added to the index. For example, a node at /a/b/c/d/e/f/g would have
index entries:

path1: /a
path2: /a/b
path3: /a/b/c
path4: /a/b/c/d
path5: /a/b/c/d/e
path6: /a/b/c/d/e/f

So queries for specific sub-paths, e.g. select * from my:type where
jcr:path like '/a/b/c/%', could be mapped to a direct Lucene match query,
i.e.:
path3 = /a/b/c

The index entry to use for the Lucene query could be determined easily by
simple parsing of the path specified in the query.
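
A minimal sketch of the scheme described above, for illustration only. The class AncestorPathFieldsSketch and its methods are hypothetical and not part of Jackrabbit; the field names path1, path2, ... follow the example entries listed earlier, and the caller is assumed to have already stripped the trailing '/%' from the query pattern.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that derives per-depth ancestor path fields for a node. */
class AncestorPathFieldsSketch {

    /** Maps field name (path1, path2, ...) to each proper ancestor path of an absolute path. */
    static Map<String, String> ancestorFields(String path) {
        Map<String, String> fields = new LinkedHashMap<>();
        String[] segments = path.substring(1).split("/");
        StringBuilder prefix = new StringBuilder();
        // Stop before the last segment: the node's own path is not an ancestor entry.
        for (int i = 0; i < segments.length - 1; i++) {
            prefix.append('/').append(segments[i]);
            fields.put("path" + (i + 1), prefix.toString());
        }
        return fields;
    }

    /** Picks the field to query for a path prefix such as "/a/b/c" by counting its segments. */
    static String fieldForPrefix(String prefix) {
        return "path" + prefix.substring(1).split("/").length;
    }
}
```

Every entry produced this way embeds the names of all ancestors, so each descendant document carries a copy of the top-level node name.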

Perhaps something like this is already in the code. Are ChildAxisQuery and
DescendantSelfAxisQuery currently used for cases like this?

-Dave
