Shawn Smith created JENA-1771: --------------------------------- Summary: Spilling combined with DISTINCT .. ORDER BY returns rows in the wrong order Key: JENA-1771 URL: https://issues.apache.org/jira/browse/JENA-1771 Project: Apache Jena Issue Type: Bug Components: ARQ Affects Versions: Jena 3.13.1 Reporter: Shawn Smith
It looks like Jena assumes that OpDistinct preserves order, but order is not preserved when spilling occurs. This is only a problem when the ARQ.spillToDiskThreshold setting has been configured. Consider the following query: {code:java} SELECT DISTINCT * WHERE { ?x :p ?v } ORDER BY ASC(?v) {code} Jena executes the ORDER BY ASC(?v) before the DISTINCT, relying on the SPARQL requirement: bq. The order of Distinct(Ψ) must preserve any ordering given by OrderBy. But, when spilling, QueryIterDistinct (which executes OpDistinct) creates a DistinctDataBag with a BindingComparator without any sort conditions. As a result, the DISTINCT operation doesn't preserve the ORDER BY ASC(?v) requirement. You can reproduce this by injecting the following into QueryTest.java then running the ARQTestRefEngine tests: {code:java} void runTestSelect(Query query, QueryExecution qe) { qe.getContext().set(ARQ.spillToDiskThreshold, 4); // add this line ... {code} For example, "ARQTestRefEngine -> Algebra optimizations -> QueryTest.opt-top-05" will fail with: {code} Query: PREFIX : <http://example/> SELECT DISTINCT * WHERE { ?x :p ?v } ORDER BY ASC(?v) LIMIT 5 Got: 5 -------------------------------- ------------- | x | v | ============= | :x1 | 1 | | :x2 | 2 | | :x10 | 10 | | :x11 | 11 | | :x12 | 12 | ------------- Expected: 5 ----------------------------- ------------ | x | v | ============ | :x1 | 1 | | :x2 | 2 | | :x3 | 3 | | :x3a | 3 | | :x4 | 4 | ------------ junit.framework.AssertionFailedError: Results do not match at junit.framework.Assert.fail(Assert.java:57) at junit.framework.Assert.assertTrue(Assert.java:22) at junit.framework.TestCase.assertTrue(TestCase.java:192) at org.apache.jena.sparql.junit.QueryTest.runTestSelect(QueryTest.java:284) at org.apache.jena.sparql.junit.QueryTest.runTestForReal(QueryTest.java:201) at org.apache.jena.sparql.junit.EarlTestCase.runTest(EarlTestCase.java:88) at junit.framework.TestCase.runBare(TestCase.java:141) at junit.framework.TestResult$1.protect(TestResult.java:122) at junit.framework.TestResult.runProtected(TestResult.java:142) at junit.framework.TestResult.run(TestResult.java:125) at junit.framework.TestCase.run(TestCase.java:129) {code} -- This message was sent by Atlassian Jira (v8.3.4#803005)