Honestly - probably because of lack of knowledge - I don't see how that
can happen with the text index. You have a single triple pattern that
queries the Lucene index for the given pattern and returns, by default,
at most 10 000 documents.
text:query (skos:prefLabel skos:altLabel "\"xx yy\"" "lang:en" )
translates to
( (prefLabel:"\"xx yy\"" OR altLabel:"\"xx yy\"") AND lang:en)
which can indeed return duplicate documents, since a separate document
is created and indexed for each triple.
I still don't get how a query that returns 560 results with limit 1000
then doesn't return 100 with limit 100.
Currently, I find your results quite counterintuitive, but I still have
a lot to learn when it comes to RDF, SPARQL and Jena.
Can you please share some data to reproduce this?
What happens for a single property only? Pagination should work the way
you're doing it: the Lucene query is executed once internally and then
cached, so later requests should reuse the same Lucene document hits.
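If duplicate hits per subject do turn out to be involved, one thing to
try is collapsing them before paging. This is only a sketch under that
assumption, not a confirmed explanation of the counts you see: it groups
by ?s, keeps the best score per subject, and adds ?s as a tie-breaker so
the order stays stable between pages. The variable name ?sc is just
something I introduced for the raw score.

```sparql
PREFIX text: <http://jena.apache.org/text#>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Assumption: the same ?s may match once per indexed triple
# (e.g. via both prefLabel and altLabel), so group before paging.
SELECT ?s (MAX(?sc) AS ?score)
WHERE {
  (?s ?sc) text:query (skos:prefLabel skos:altLabel "\"xx yy\"" "lang:en") .
}
GROUP BY ?s
ORDER BY DESC(?score) ?s   # ?s breaks ties so pages do not overlap
OFFSET 0
LIMIT 100
```

For the next page you would keep the query identical and only raise
OFFSET by the page size (100 here).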
On 19.10.22 08:21, Mikael Pesonen wrote:
Hi,
yes, the same select run as the only query returns exactly the limit
amount of results.
On 18/10/2022 16.48, Lorenz Buehmann wrote:
Did you get those results when running only this subquery? AFAIK, the
default limit of the Lucene text query is at most 10 000 documents -
and I don't think that the outer LIMIT would make it to the Lucene
request.
On 18.10.22 13:35, Mikael Pesonen wrote:
I have a bigger query that starts with an inner select:
{ SELECT ?s ?score WHERE {
(?s ?score) text:query (skos:prefLabel skos:altLabel "\"xx yy\""
"lang:en" ) .
} order by desc(?score) offset 0 limit 1000 }
There are about 10000 results. Limit 1000 returns ~560 results and
limit 100 returns ~75. How do I page results correctly?