Hi,
over at RDFLib we got a bug report [1] that our SPARQL engine produces
unexpected results for aggregate queries over solutions with unbound (UNDEF)
variables, such as this one:
  SELECT ?x (SAMPLE(?y) as ?ys) (SAMPLE(?z) as ?zs) WHERE {
    VALUES (?x ?y ?z) {
      (2 6 UNDEF)
      (2 UNDEF 10)
      (3 UNDEF 15)
      (3 9 UNDEF)
      (5 UNDEF 25)
    }
  }
  GROUP BY ?x
Intuitively, I would expect this to produce the following result set:
  VALUES (?x ?ys ?zs) {
    (2 6 10)
    (3 9 15)
    (5 UNDEF 25)
  }
I fixed this in [2], but there seems to be confusion about what the correct
result is (and not only among the RDFLib devs):
- Virtuoso returns the above [1]
- Jena returns { (2 UNDEF UNDEF) (3 UNDEF UNDEF) (5 UNDEF 25) } [2]
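
To make the two behaviours concrete, here is a minimal pure-Python sketch. It
is not RDFLib, Virtuoso or Jena code, and it makes no claim about what the spec
requires. The names (UNDEF, sample_skip_unbound, sample_poisoned, aggregate) are
made up for illustration, and "reading B" is only a hypothetical rule that
happens to reproduce the Jena output listed above; I don't know whether Jena
works that way internally.

```python
from collections import defaultdict

UNDEF = None  # stand-in for an unbound variable (assumption for this sketch)

# The rows of the VALUES block from the query above: (?x, ?y, ?z)
ROWS = [
    (2, 6, UNDEF),
    (2, UNDEF, 10),
    (3, UNDEF, 15),
    (3, 9, UNDEF),
    (5, UNDEF, 25),
]


def sample_skip_unbound(values):
    """Reading A: ignore unbound values and pick any bound one."""
    bound = [v for v in values if v is not UNDEF]
    return bound[0] if bound else UNDEF


def sample_poisoned(values):
    """Reading B (hypothetical): one unbound value leaves the result unbound."""
    if any(v is UNDEF for v in values):
        return UNDEF
    return values[0] if values else UNDEF


def aggregate(rows, sample):
    """GROUP BY ?x, then apply `sample` separately to the ?y and ?z columns."""
    groups = defaultdict(list)
    for x, y, z in rows:
        groups[x].append((y, z))
    result = []
    for x, pairs in sorted(groups.items()):
        ys = sample([y for y, _ in pairs])
        zs = sample([z for _, z in pairs])
        result.append((x, ys, zs))
    return result


reading_a = aggregate(ROWS, sample_skip_unbound)
reading_b = aggregate(ROWS, sample_poisoned)
print(reading_a)
print(reading_b)
```

Under these assumptions, reading A gives the result set I expected above and
reading B gives the Jena-style result, so the question boils down to how SAMPLE
should treat unbound values within a group.
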
So what's the correct answer?
Cheers,
Jörn
PS: The relevant sections seem to be [3], [4] and [5], but to me they don't
seem explicit enough to answer the question above.
PPS: I also couldn't find any tests in [6] that cover this case.
[1]: https://github.com/RDFLib/rdflib/issues/563
[2]: https://github.com/RDFLib/rdflib/pull/567
[3]: https://www.w3.org/TR/sparql11-query/#defn_aggSample
[4]: https://www.w3.org/TR/sparql11-query/#aggregateExample2
[5]: https://www.w3.org/TR/sparql11-query/#aggregateAlgebra
[6]: https://www.w3.org/2009/sparql/docs/tests/