On 11/05/2021 16:54, Kimball, Adam wrote:
> I know that I’ve asked this question before, but I am still struggling to
> understand how I might handle this case:
>
> I have a Jena DB of event entries. One common way to view the events is to
> page through them. Normally this is done by seeing the most recent 50 events
> and then paging to the next 50 most recent and so on.
>
> In pure SPARQL, I don’t really see an efficient way to accomplish this. With
> limit and offset, I don’t really save anything other than i/o since the whole
> result set will need to be ordered before this limit/offset has an effect. And
> that is killing us now.
>
> My guess is we will need to implement some caching or possibly index the graph
> with Lucene or something. It is doable but definitely not ideal. Maybe I can
> use the quad position to facilitate this? I am assuming this cannot be
> optimized within Jena itself?
>
> Best,
> Adam

Hi Adam,
No - there isn't a better way in standard SPARQL. If you think the app is
going to process all the results, reading the whole result set into some
local cache is a way to go.
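
As a rough illustration of the local-cache idea, here is a minimal Python
sketch. It assumes the application can supply a function that runs the sorted
query once and returns every row; `PagedResultCache` and `fetch_all` are
hypothetical names, not part of Jena or any SPARQL client library.

```python
from typing import Callable, Dict, List, Optional

Row = Dict[str, str]


class PagedResultCache:
    """Holds one fully fetched, already sorted result set and serves pages from memory."""

    def __init__(self, fetch_all: Callable[[], List[Row]], page_size: int = 50):
        # fetch_all is a hypothetical callable that runs the ORDER BY query once
        # against the store and returns all rows, newest first.
        self._fetch_all = fetch_all
        self._page_size = page_size
        self._rows: Optional[List[Row]] = None  # filled on first access

    def page(self, number: int) -> List[Row]:
        """Return page `number` (0-based); the store is queried only on the first call."""
        if self._rows is None:
            self._rows = self._fetch_all()
        start = number * self._page_size
        return self._rows[start:start + self._page_size]
```

The sort cost is paid once, and every later page is a list slice. The cache
would have to be discarded whenever the data is updated.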
The proper solution is an overhaul of the SPARQL protocol.
Also, HTTP/2 may offer some interesting possibilities.
Specific to ARQ: query execution often has a predictable and stable order.
There aren't many places where - absent concurrent updates - the order
will be different from call to call.
FWIW, Jena does optimize "top k" sorts (SELECT-sort-LIMIT/OFFSET) up to
(from memory) k=1000 items.
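
One way an application might keep every page inside that bound is to avoid
OFFSET and filter on the last timestamp already shown, so each request stays a
small SELECT-sort-LIMIT query. The Python sketch below only builds the query
string. The `ex:timestamp` predicate, the `http://example.org/` namespace and
the function name are assumptions for illustration, and this filter-based
paging is a substitute technique, not something ARQ does for you.

```python
from typing import Optional

PAGE_SIZE = 50


def recent_events_query(before: Optional[str] = None, limit: int = PAGE_SIZE) -> str:
    """Build a SELECT-sort-LIMIT query for one page of events, newest first.

    `before` is the xsd:dateTime lexical form of the oldest event on the
    previous page, or None for the first page. It is inserted into the query
    text as-is, so it must come from trusted data (e.g. an earlier result).
    """
    time_filter = ""
    if before is not None:
        # Only events strictly older than the last one already displayed.
        time_filter = f'  FILTER(?time < "{before}"^^xsd:dateTime)\n'
    return (
        "PREFIX ex: <http://example.org/>\n"  # hypothetical namespace
        "PREFIX xsd: <http://www.w3.org/2001/XMLSchema#>\n"
        "SELECT ?event ?time WHERE {\n"
        "  ?event ex:timestamp ?time .\n"  # hypothetical predicate
        f"{time_filter}"
        "}\n"
        "ORDER BY DESC(?time)\n"
        f"LIMIT {limit}"
    )
```

The first page is `recent_events_query()`; the next page passes the `?time`
value of the last row received. Events that share exactly the same timestamp
across a page boundary would be skipped by the strict `<`, so a real
implementation would need a tie-breaker.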
> Maybe I can use the quad position to facilitate this?
Not sure what the idea is here.
Andy