Jena has APIs for both local and remote SPARQL access.

Many large installations are a SPARQL triple store with a business logic layer on top.

On 14/11/16 19:10, Niels Andersen wrote:
Andy is answering my original question about joins, he stated that
Jena ARQ is using the Jena API, Graph.find and listStatement (you
included this in your response).

I said it uses Graph.find or something faster.

TDB cuts through Graph.find and listStatements to work on the indexes themselves.

Again, if I understand this
correctly, then Jena ARQ does not implement a join algorithm based on
two sorted lists, so the join must be performed using lookups for
each element returned from the first list (like I showed in my
example). While this is OK for small datasets, it becomes problematic
for large datasets. Do I understand this correctly?

It's called an index join, and in TDB it does not work with RDF terms but with internal ids (which are a fixed 8 bytes long). The representation of the RDF terms is left on disk unless needed later ("if you do not need data, do not touch it.").
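
To illustrate the idea of joining on fixed-width ids rather than on RDF terms, here is a minimal sketch in Python. It is not TDB's code: the NodeTable class, its methods and the ex: terms are all hypothetical, and a real store keeps this table on disk rather than in a dict.

```python
import struct


class NodeTable:
    """Hypothetical dictionary mapping RDF terms to integer ids (illustration only)."""

    def __init__(self):
        self._id_of = {}    # term -> id
        self._term_of = []  # id -> term

    def encode(self, term):
        # Assign the next free id the first time a term is seen.
        if term not in self._id_of:
            self._id_of[term] = len(self._term_of)
            self._term_of.append(term)
        return self._id_of[term]

    def decode(self, node_id):
        # Only called when the term itself is actually needed, e.g. for output.
        return self._term_of[node_id]

    @staticmethod
    def pack(node_id):
        # Fixed-width 8-byte big-endian form, as an index entry might store it.
        return struct.pack(">Q", node_id)


nodes = NodeTable()
triples = [
    ("ex:sensor1", "ex:locatedIn", "ex:room1"),
    ("ex:sensor2", "ex:locatedIn", "ex:room2"),
    ("ex:room1", "ex:partOf", "ex:buildingA"),
]

# Indexes and joins see only tuples of ids; the strings stay in the node table.
encoded = [tuple(nodes.encode(term) for term in triple) for triple in triples]
```

Comparing encoded[0][2] with encoded[2][0] is an integer comparison; the string "ex:room1" is only fetched via decode() if a result row has to be shown.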

If the first set is small, an index join is faster than a merge join. A merge join still needs to traverse the whole of both sides if it does not use sideways information passing ... in which case it becomes a form of index join. Due to caching, an index lookup is not necessarily expensive.
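
A rough sketch of the difference, in Python. This is an illustration under stated assumptions, not how ARQ or TDB implement joins: the function names are made up, a sorted list probed with bisect stands in for an on-disk index, and the keys on the left side are assumed to be unique and sorted.

```python
from bisect import bisect_left


def index_join(left_keys, sorted_index):
    """Probe the sorted index once per key from the left side."""
    out = []
    probes = 0
    for key in left_keys:
        probes += 1
        # (key,) sorts before any (key, value), so this finds the first match.
        pos = bisect_left(sorted_index, (key,))
        while pos < len(sorted_index) and sorted_index[pos][0] == key:
            out.append((key, sorted_index[pos][1]))
            pos += 1
    return out, probes


def merge_join(left_sorted, right_sorted):
    """Walk both sorted inputs in step; assumes unique keys on the left."""
    out = []
    steps = 0
    i = j = 0
    while i < len(left_sorted) and j < len(right_sorted):
        steps += 1
        left_key = left_sorted[i]
        right_key = right_sorted[j][0]
        if left_key < right_key:
            i += 1
        elif left_key > right_key:
            j += 1
        else:
            out.append((left_key, right_sorted[j][1]))
            j += 1
    return out, steps


# Small left side, large right side.
right = [(k, k * 10) for k in range(1000)]
left = [3, 500, 999]

index_result, probes = index_join(left, right)
merge_result, steps = merge_join(left, right)
```

Both functions return the same rows. The index join does one probe per left key, while the merge join has to step through the right side to reach the matching entries, so its step count grows with the size of the larger input.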

I would still like to hear what you are intending to use RDF for. What features of the semantic web, or RDF, are you exploiting? Your email address suggests an IoT application.

        Andy
