Re: Bulk load of several nodes

Kevin Müller Mon, 16 Jan 2012 04:47:14 -0800

Thanks for your answer Alex.

"No, there will be one query. This acts against the lucene search index.
For each result (= row = node) based on the search index, the node will be
loaded (= fetched from the persistence manager).
That last step can be done lazily - i.e. only a number X of results is
fetched at the beginning, the rest will be fetched when you iterate that
far through the results (see resultFetchSize [0])."

The second step is what I was talking about: not the Lucene query, but the SQL 
query that fetches the actual data for each node (I'm using a 
DatabasePersistenceManager). One separate query is executed for each result 
node.

"Now for the nodes themselves: if you use a bundle persistence manager, nodes
are stored as "node bundles", which consist of all properties (except for
larger binaries in the data store). Thus if a node is fetched from the
persistence manager, it will already have all properties in-memory."

Yes, that means I can get all properties of ONE node in ONE database query, but 
I still can't get the properties of N nodes in ONE query. If we talk to a 
non-local database and get, say, 100 search results, that is 100 round trips 
instead of one, so I imagine a bulk load could be quite a speedup.
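
To make the round-trip argument concrete, here is a minimal, self-contained sketch. It does not use any Jackrabbit API: FakeStore, loadOne and loadMany are hypothetical names, and a bulk method like loadMany is exactly what I am asking about, not something I know to exist today. The sketch only counts simulated database round trips.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class RoundTripSketch {

    /** Hypothetical stand-in for a remote database; it only counts round trips. */
    static class FakeStore {
        private final Map<String, Map<String, String>> bundles =
                new HashMap<String, Map<String, String>>();
        int roundTrips = 0;

        void put(String id, Map<String, String> props) {
            bundles.put(id, props);
        }

        /** One simulated round trip per node (what I observe today). */
        Map<String, String> loadOne(String id) {
            roundTrips++;
            return bundles.get(id);
        }

        /** Hypothetical bulk load: one simulated round trip for all ids. */
        Map<String, Map<String, String>> loadMany(List<String> ids) {
            roundTrips++;
            Map<String, Map<String, String>> result =
                    new HashMap<String, Map<String, String>>();
            for (String id : ids) {
                result.put(id, bundles.get(id));
            }
            return result;
        }
    }

    /** Builds the ids "node0" .. "node(n-1)". */
    static List<String> ids(int n) {
        List<String> ids = new ArrayList<String>();
        for (int i = 0; i < n; i++) {
            ids.add("node" + i);
        }
        return ids;
    }

    /** Fills a store with n nodes, each carrying prop2 and prop3. */
    static FakeStore storeWith(int n) {
        FakeStore store = new FakeStore();
        for (String id : ids(n)) {
            Map<String, String> props = new HashMap<String, String>();
            props.put("prop2", "a");
            props.put("prop3", "b");
            store.put(id, props);
        }
        return store;
    }

    /** Loads n nodes one at a time and returns the round-trip count. */
    static int roundTripsOneByOne(int n) {
        FakeStore store = storeWith(n);
        for (String id : ids(n)) {
            store.loadOne(id);
        }
        return store.roundTrips;
    }

    /** Loads n nodes with a single bulk call and returns the round-trip count. */
    static int roundTripsBulk(int n) {
        FakeStore store = storeWith(n);
        store.loadMany(ids(n));
        return store.roundTrips;
    }

    public static void main(String[] args) {
        System.out.println("one by one: " + roundTripsOneByOne(100));
        System.out.println("bulk:       " + roundTripsBulk(100));
    }
}
```

With a network latency of a few milliseconds per round trip, the difference between the two counts is what I would hope to save.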

-----Original Message-----
From: Alexander Klimetschek [mailto:[email protected]] 
Sent: Monday, 16 January 2012 13:25
To: [email protected]
Subject: Re: Bulk load of several nodes

On 13.01.12 11:01, "Kevin Müller" <[email protected]> wrote:

>Hi,
>
>I think in some cases it would be useful to get all or some properties of
>a set of nodes in one database query.
>Can somebody tell me if something like this is planned for future
>releases ?
>
>One example of this use case would be:
>// qm is the workspace's javax.jcr.query.QueryManager
>Map<String, Map<String, String>> res =
>        new HashMap<String, Map<String, String>>();
>for (RowIterator it =
>qm.createQuery("//*[@prop='some_value']/(@prop2|@prop3)",
>Query.XPATH).execute().getRows(); it.hasNext(); ) {
>        Row row = it.nextRow();
>        Map<String, String> map = new HashMap<String, String>();
>        for (String key : Arrays.asList("prop2", "prop3")) {
>                Value val = row.getValue(key);
>                map.put(key, val != null ? val.getString() : null);
>        }
>        res.put(row.getPath(), map);
>}
>
>Wouldn't it be nice if this could be done with one database roundtrip?
>Right now (2.2.10) there seem to be at least n roundtrips (n == number of
>results in the query).

No, there will be one query. This acts against the lucene search index.
For each result (= row = node) based on the search index, the node will be
loaded (= fetched from the persistence manager). This needs to be done not
only for returning the node, but also for checking ACLs (i.e. if it can be
put in the result, because the user has read access). Note that the search
results do not store the results in any way other than using the plain JCR
nodes - the Row interface is just a wrapper around the Node in
Jackrabbit's search implementation.

That last step can be done lazily - i.e. only a number X of results is
fetched at the beginning, the rest will be fetched when you iterate that
far through the results (see resultFetchSize [0]).
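
As a rough illustration of that lazy behaviour, here is a small self-contained sketch. It is not Jackrabbit's actual implementation, which may differ in detail; LazyResults, fetchSize and the "node:" strings are made-up names for this example, with a string standing in for a node fetched from the persistence manager.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LazyFetchSketch {

    /** Iterates over index hits, loading nodes in batches of fetchSize on demand. */
    static class LazyResults implements Iterator<String> {
        private final List<String> hitIds;  // ids returned by the index query
        private final int fetchSize;        // plays the role of resultFetchSize
        private final List<String> loaded = new ArrayList<String>();
        private int position = 0;
        int nodesLoaded = 0;                // how many nodes were fetched so far

        LazyResults(List<String> hitIds, int fetchSize) {
            if (fetchSize < 1) {
                throw new IllegalArgumentException("fetchSize must be >= 1");
            }
            this.hitIds = hitIds;
            this.fetchSize = fetchSize;
            fetchNextBatch(); // the first batch is fetched up front
        }

        private void fetchNextBatch() {
            int start = loaded.size();
            int end = Math.min(start + fetchSize, hitIds.size());
            for (int i = start; i < end; i++) {
                // Stands in for loading the node from the persistence manager.
                loaded.add("node:" + hitIds.get(i));
                nodesLoaded++;
            }
        }

        public boolean hasNext() {
            return position < hitIds.size();
        }

        public String next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            if (position >= loaded.size()) {
                fetchNextBatch(); // only load more once the caller gets this far
            }
            return loaded.get(position++);
        }

        public void remove() {
            throw new UnsupportedOperationException();
        }
    }

    /** Returns how many nodes have been loaded after consuming some results. */
    static int loadedAfter(int hits, int fetchSize, int consumed) {
        List<String> ids = new ArrayList<String>();
        for (int i = 0; i < hits; i++) {
            ids.add("id" + i);
        }
        LazyResults results = new LazyResults(ids, fetchSize);
        for (int i = 0; i < consumed; i++) {
            results.next();
        }
        return results.nodesLoaded;
    }

    public static void main(String[] args) {
        System.out.println(loadedAfter(10, 3, 0));
        System.out.println(loadedAfter(10, 3, 4));
        System.out.println(loadedAfter(10, 3, 10));
    }
}
```

In this sketch a caller that stops after the first few rows never pays for loading the remaining nodes.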

Now for the nodes themselves: if you use a bundle persistence manager, nodes
are stored as "node bundles", which consist of all properties (except for
larger binaries in the data store). Thus if a node is fetched from the
persistence manager, it will already have all properties in-memory.

[0] http://wiki.apache.org/jackrabbit/Search

HTH,
Alex

-- 
Alexander Klimetschek
Developer // Adobe (Day) // Berlin - Basel
