Re: How do I do a join between multiple model.listStatments calls?

Andy Seaborne Mon, 14 Nov 2016 01:14:17 -0800


Niels Andersen wrote:

If the query above returned 10,000 children in iterator1, then
iterator2 will be called 10,000 times. This does not seem to be very
efficient.


Compared with what?

If the pattern for iterator2, without the information from iterator1, returns 10,000,000 items (which it would in the hash join case), then it would perform worse.

To the best of my knowledge, TDB already has indexed lists of OSP,
POS and SPO. I would have thought that there was a way to run the
second query by just passing an ordered list of the objects returned
in the first query. This provides for far better matching than having
to run the same query many times.


SPO, POS, OSP are not lists (they are B+Trees with range scans).

TDB usually uses an index join (there are hash joins as well).

It does it efficiently, not retrieving the RDF term representation (which would require persistent storage access, although it is heavily cached) but using the internal numbers used in the index.
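
To illustrate the idea of an index join over node ids, here is a rough sketch in plain Java. It is a toy model and not TDB's code: the class IndexJoinSketch, its methods, and the use of a sorted in-memory set in place of a B+Tree are all assumptions made for illustration. Each binding of ?X from the first pattern drives one range scan for the second pattern, and only numeric ids are touched.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Toy model only (hypothetical names): a sorted set of (s, p, o) id triples
// stands in for a B+Tree SPO index. This is not how TDB is implemented.
class IndexJoinSketch {
    // Triples are held as node ids (longs), never as RDF terms.
    private final NavigableSet<long[]> spo =
            new TreeSet<long[]>(IndexJoinSketch::compareTriples);

    // Lexicographic order on (s, p, o), which is what makes prefix scans possible.
    private static int compareTriples(long[] a, long[] b) {
        for (int i = 0; i < 3; i++) {
            int c = Long.compare(a[i], b[i]);
            if (c != 0) {
                return c;
            }
        }
        return 0;
    }

    void add(long s, long p, long o) {
        spo.add(new long[] {s, p, o});
    }

    // Range scan: every object id stored under the fixed (s, p) prefix.
    List<Long> objects(long s, long p) {
        List<Long> out = new ArrayList<>();
        long[] lo = {s, p, Long.MIN_VALUE};
        long[] hi = {s, p, Long.MAX_VALUE};
        for (long[] t : spo.subSet(lo, true, hi, true)) {
            out.add(t[2]);
        }
        return out;
    }

    // Index join for { parent child ?X . ?X label ?Y }:
    // one range scan for the first pattern, then one per ?X binding.
    List<long[]> indexJoin(long parent, long child, long label) {
        List<long[]> rows = new ArrayList<>();
        for (long x : objects(parent, child)) {
            for (long y : objects(x, label)) {
                rows.add(new long[] {x, y});
            }
        }
        return rows;
    }
}
```

In this sketch, label triples whose subject is not a child of the given parent are never visited, which is the contrast with scanning the whole second pattern for a hash join.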


{ :nodeResource :child ?X .
  ?X rdfs:label ?Y
}

TDB will, in the absence of a stats file, execute in that order. If you swap them, it will still start at ":nodeResource :child ?X ."

I don't see where filtering "< 5" fits into this example. rdfs:labels are typically strings.


FILTER(?Z > 12345) is faster if done by TDB than in API code.

If you are calling the pattern repeatedly with different :nodeResource values, then you will incur overhead.


    Andy
