Hi
I am a little confused by this discussion.
I understand the original poster's question: what they are describing
is an incredibly common use case. Run a query, look at the first
1-200 results, add filters, re-run the new query, rinse and repeat.
If this were JDBC, the client would run one query with no limits/offsets;
in this example, that is the query with the results ordered by date. The
client UI would then implement the paging itself. The client would cache
past pages so that users can page backward, and forward paging is just
standard JDBC result streaming. Admittedly, if you were clever you could
also do a dance with a scrolling result set, although personally I don't
find them all that useful.
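To make the client-side approach concrete, here is a minimal sketch of a page cache sitting on top of a streaming result iterator. `PagedResults` and `rowsFetched` are hypothetical names invented for this illustration, not part of JDBC or Jena. The sketch assumes the query already returns rows in the desired order and that the underlying result stream stays open while the user pages.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper (not a JDBC or Jena class): serves fixed-size pages
// from a streaming iterator and remembers every row it has already read.
class PagedResults<T> {
    private final Iterator<T> source;                 // ordered, streaming results
    private final int pageSize;
    private final List<T> cache = new ArrayList<>();  // all rows pulled so far

    PagedResults(Iterator<T> source, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.source = source;
        this.pageSize = pageSize;
    }

    // Returns page `index` (0-based). Paging backward is served from the
    // cache; paging forward pulls only the extra rows that are needed.
    List<T> page(int index) {
        if (index < 0) {
            throw new IllegalArgumentException("index must be non-negative");
        }
        int from = index * pageSize;
        int to = from + pageSize;
        while (cache.size() < to && source.hasNext()) {
            cache.add(source.next());
        }
        if (from >= cache.size()) {
            return Collections.emptyList();  // past the end of the results
        }
        return new ArrayList<>(cache.subList(from, Math.min(to, cache.size())));
    }

    // Number of rows read from the source so far (for illustration only).
    int rowsFetched() {
        return cache.size();
    }
}
```

Since a Jena `ResultSet` is an `Iterator<QuerySolution>`, it could presumably be passed in as the source, provided the query execution is kept open for as long as the user is paging. The cache grows with every page viewed, so this only suits cases where users look at a modest number of pages.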
What isn't clear to me is why you can't do the same thing with SPARQL?
In other words I am unclear where the implementation issues with
SPARQL/Jena are occurring.
thanks
graham
I don't know what Jena's policy regarding diverging from the SPARQL
standards is, but
On 14/05/21 7:32 am, Andy Seaborne wrote:
On 11/05/2021 16:54, Kimball, Adam wrote:
I know that I’ve asked this question before, but I am still
struggling to understand how I might handle this case:
I have a Jena DB of event entries. One common way to view the events
is to page through them. Normally this is done by seeing the most
recent 50 events and then paging to the next 50 most recent and so on.
In pure SPARQL, I don’t really see an efficient way to accomplish
this. With limit and offset, I don’t really save anything other than
i/o since the whole result set will need to be ordered before this
limit/offset has an effect. And that is killing us now.
My guess is we will need to implement some caching or possibly index
the graph with Lucene or something. It is doable but definitely not
ideal. Maybe I can use the quad position to facilitate this? I am
assuming this cannot be optimized within Jena itself?
Best,
Adam
Hi Adam,
No - there isn't a better way in std SPARQL. If you think the app is
going to process all the results, reading the whole thing into some
local cache is a way to go.
The proper solution is an overhaul of the SPARQL protocol.
Also, HTTP/2 may offer some interesting possibilities.
Specific to ARQ: the order of query results is often predictable and
stable. There aren't many places where, absent concurrent updates,
the order will be different from call to call.
FWIW Jena does optimize "top k" sorts (SELECT-sort-LIMIT/OFFSET) up to
(from memory) k=1000 items.
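For readers unfamiliar with the idea, the general "top k" technique can be sketched with a bounded heap: only the best k rows seen so far are kept, so the full result never has to be sorted. This is a generic illustration and is not taken from Jena's code. `TopK`, `topK` and `page` are hypothetical names, and a LIMIT/OFFSET page is treated here as needing the top (offset + limit) rows.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Generic bounded-heap top-k selection (illustrative; not Jena's implementation).
class TopK {
    // Returns the first k rows under `order`, sorted, without sorting the
    // whole input: at most k rows are held in memory at any time.
    static <T> List<T> topK(Iterable<T> rows, int k, Comparator<T> order) {
        if (k <= 0) {
            return new ArrayList<>();
        }
        // Reversed comparator, so the head of the heap is the worst row kept.
        PriorityQueue<T> heap = new PriorityQueue<>(k, order.reversed());
        for (T row : rows) {
            if (heap.size() < k) {
                heap.add(row);
            } else if (order.compare(row, heap.peek()) < 0) {
                heap.poll();    // evict the current worst row
                heap.add(row);  // keep the better one
            }
        }
        List<T> result = new ArrayList<>(heap);
        result.sort(order);     // sorts k rows only
        return result;
    }

    // A LIMIT/OFFSET page: select the top (offset + limit) rows, then skip `offset`.
    static <T> List<T> page(Iterable<T> rows, int offset, int limit, Comparator<T> order) {
        List<T> top = topK(rows, offset + limit, order);
        if (offset >= top.size()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(top.subList(offset, top.size()));
    }
}
```

Every row still has to be read once, so this saves sorting work and memory rather than I/O, and the benefit shrinks as the offset grows, because k is offset + limit.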
> Maybe I can use the quad position to facilitate this?
Not sure what the idea is here.
Andy
--
Doubt is a pain too lonely to know that faith is his twin brother. - Kahlil
Gibran