[ https://issues.apache.org/jira/browse/MARMOTTA-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052943#comment-16052943 ]
Xavier Sumba commented on MARMOTTA-603:
---------------------------------------

A possibly related issue with a sub-select in OPTIONAL.

h1. Data

h2. Data for test case 1:
{code:sparql}
<urn:s1> a <urn:C> .
<urn:s2> a <urn:C> .
<urn:s3> a <urn:C> .
<urn:s4> a <urn:C> .
<urn:s5> a <urn:C> .
<urn:s6> a <urn:C> .
<urn:s7> a <urn:C> .
<urn:s8> a <urn:C> .
<urn:s9> a <urn:C> .
<urn:s10> a <urn:C> .
<urn:s11> a <urn:C> .
<urn:s12> a <urn:C> .
<urn:s1> <urn:p> "01" .
<urn:s2> <urn:p> "02" .
<urn:s3> <urn:p> "03" .
<urn:s4> <urn:p> "04" .
<urn:s5> <urn:p> "05" .
<urn:s6> <urn:p> "06" .
<urn:s7> <urn:p> "07" .
<urn:s8> <urn:p> "08" .
<urn:s9> <urn:p> "09" .
<urn:s10> <urn:p> "10" .
<urn:s11> <urn:p> "11" .
<urn:s12> <urn:p> "12" .
{code}

h2. Data for test case 2:
{code:sparql}
<u:1> <u:r> <u:subject> .
<u:1> <u:v> 1 .
<u:1> <u:x> <u:x1> .
<u:2> <u:r> <u:subject> .
<u:2> <u:v> 2 .
<u:2> <u:x> <u:x2> .
<u:3> <u:r> <u:subject> .
<u:3> <u:v> 3 .
<u:3> <u:x> <u:x3> .
<u:4> <u:r> <u:subject> .
<u:4> <u:v> 4 .
<u:4> <u:x> <u:x4> .
<u:5> <u:r> <u:subject> .
<u:5> <u:v> 5 .
<u:5> <u:x> <u:x5> .
{code}

h1. Tests

h2. Test case 1:
The sub-select should only yield the labels "01" and "02", but the query returns unexpected results.
{code:sparql}
SELECT ?s ?label WHERE {
  ?s a <urn:C> .
  OPTIONAL {
    { SELECT ?label WHERE {
        ?s <urn:p> ?label .
      } ORDER BY ?label LIMIT 2
    }
  }
}
ORDER BY ?s
LIMIT 10
{code}

Query translated to SQL:
{code:sql}
SELECT S2.V2 AS V2, P1.subject AS V1
FROM triples P1
INNER JOIN nodes AS P1_subject_V1 ON P1.subject = P1_subject_V1.id
LEFT JOIN (SELECT P1.object AS V2, P1.subject AS V1
           FROM triples P1
           INNER JOIN nodes AS P1_object_V2 ON P1.object = P1_object_V2.id
           WHERE P1.deleted = false
             AND P1.predicate = 876129878216290304
           ORDER BY P1_object_V2.svalue ASC
          ) AS S2 ON (P1.subject = S2.V1)
WHERE P1.deleted = false
  AND P1.predicate = 876129878635720704
  AND P1.object = 876129878069489664
ORDER BY P1_subject_V1.svalue ASC
LIMIT 10
{code}

Expected results:
{code:sparql}
[s=urn:s1;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s1;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s10;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s10;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s11;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s11;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s12;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s12;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s2;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s2;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
{code}

Results obtained:
{code:sparql}
[s=urn:s1;label="01"^^xsd:string]
[s=urn:s10;label="10"^^xsd:string]
[s=urn:s11;label="11"^^xsd:string]
[s=urn:s12;label="12"^^xsd:string]
[s=urn:s2;label="02"^^xsd:string]
[s=urn:s3;label="03"^^xsd:string]
[s=urn:s4;label="04"^^xsd:string]
[s=urn:s5;label="05"^^xsd:string]
[s=urn:s6;label="06"^^xsd:string]
[s=urn:s7;label="07"^^xsd:string]
{code}

h2. Test case 2:
Error in the translation of the query.
{code:sparql}
select ?x {
  { select ?v { ?v <u:r> <u:subject> filter (?v = <u:1>) } }.
  optional { select ?val { ?v <u:v> ?val .} }
  ?v <u:x> ?x
}
{code}

Query translated to SQL:
{code:sql}
SELECT NULL AS V1, NULL AS V3, P2.object AS V2
FROM triples P2
CROSS JOIN (SELECT P1.subject AS V1
            FROM triples P1
            INNER JOIN nodes AS P1_subject_V1 ON P1.subject = P1_subject_V1.id
            WHERE P1.deleted = false
              AND P1.predicate = 876144626856243200
              AND P1.object = 876144626914963456
              AND P1.subject = 876144626751385600
           ) AS S1
LEFT JOIN (SELECT NULL AS V1, P1.object AS V2
           FROM triples P1
           WHERE P1.deleted = false
             AND P1.predicate = 876144627028209664
          ) AS S3
WHERE P2.deleted = false
  AND P2.predicate = 876144627326005248
  AND P2.subject = S1.V1
{code}

Expected result:
{code:sparql}
[x=u:x1]
[x=u:x1]
[x=u:x1]
[x=u:x1]
[x=u:x1]
{code}

Results obtained:
{code:java}
ERROR: syntax error at or near "WHERE"
LINE 20: WHERE P2.deleted = false
{code}

This was found in a new test case while migrating Marmotta to Sesame 2.8.11 [1]. The error persists in the master and develop branches. For now, the affected test cases are ignored in MARMOTTA-659 [2].

[1] https://issues.apache.org/jira/browse/MARMOTTA-659
[2] https://github.com/gmora1223/marmotta/commit/e0c9879c93471a475fd0366386e31667be148e6b

> SPARQL OPTIONAL issues
> ----------------------
>
>                 Key: MARMOTTA-603
>                 URL: https://issues.apache.org/jira/browse/MARMOTTA-603
>             Project: Marmotta
>          Issue Type: Bug
>          Components: KiWi Triple Store
>    Affects Versions: 3.3.0
>            Reporter: Rupert Westenthaler
>            Priority: Critical
>
> The SPARQL implementation of the KiWi triple store seems to have issues with
> the evaluation of OPTIONAL segments of SPARQL queries. Test data and test
> queries are provided below.
> h2. Data
> {code}
> <urn:test.org:place.1> rdf:type schema:Palce ;
>     schema:geo <urn:test.org:geo.1> ;
>     schema:name "Place 1" .
> <urn:test.org:geo.1> rdf:type schema:GeoCoordinates ;
>     schema:latitude "16"^^xsd:double ;
>     schema:longitude "17"^^xsd:double ;
>     schema:elevation "123"^^xsd:int .
> <urn:test.org:place.2> rdf:type schema:Palce ;
>     schema:geo <urn:test.org:geo.2> ;
>     schema:name "Place 2" .
> <urn:test.org:geo.2> rdf:type schema:GeoCoordinates ;
>     schema:latitude "15"^^xsd:double ;
>     schema:longitude "16"^^xsd:double ;
>     schema:elevation "99"^^xsd:int .
> <urn:test.org:place.3> rdf:type schema:Palce ;
>     schema:geo <urn:test.org:geo.3> ;
>     schema:name "Place 3" .
> <urn:test.org:geo.3> rdf:type schema:GeoCoordinates ;
>     schema:latitude "15"^^xsd:double ;
>     schema:longitude "17"^^xsd:double .
> <urn:test.org:place.4> rdf:type schema:Palce ;
>     schema:geo <urn:test.org:geo.4> ;
>     schema:name "Place 4" .
> <urn:test.org:geo.4> rdf:type schema:GeoCoordinates ;
>     schema:longitude "17"^^xsd:double ;
>     schema:elevation "123"^^xsd:int .
> {code}
> Note that `geo.1` and `geo.2` define latitude, longitude, and elevation.
> `geo.3` has no elevation, and `geo.4` is missing the latitude to simulate
> invalid geo coordinate data.
> h2. Test Case 1
> The following query uses an OPTIONAL graph pattern that includes
> `schema:latitude` and `schema:longitude`. This assumes a user just wants
> lat/long values of locations that define both.
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
>   ?entity schema:geo ?location
>   OPTIONAL {
>     ?location schema:latitude ?lat .
>     ?location schema:longitude ?long .
>   }
> }
> {code}
> translates to the algebra
> {code}
> (base <http://example/base/>
>   (prefix ((schema: <http://schema.org/>)
>            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>     (leftjoin
>       (bgp (triple ?entity schema:geo ?location))
>       (bgp
>         (triple ?location schema:latitude ?lat)
>         (triple ?location schema:longitude ?long)
>       ))))
> {code}
> The expected results are
> {code}
> entity,location,lat,long
> urn:test.org:place.1,urn:test.org:geo.1,16,17
> urn:test.org:place.2,urn:test.org:geo.2,15,16
> urn:test.org:place.3,urn:test.org:geo.3,15,17
> urn:test.org:place.4,urn:test.org:geo.4,,
> {code}
> All four locations are expected in the result set as the `OPTIONAL` graph
> pattern is translated to a `leftjoin` with `triple ?entity schema:geo
> ?location`.
> However, for `geo.4` no value is expected for `?lat` and `?long`, as this
> resource only defines a longitude and therefore does not match
> {code}
> (bgp
>   (triple ?location schema:latitude ?lat)
>   (triple ?location schema:longitude ?long)
> )
> {code}
> Marmotta responds with
> {code}
> entity,location,lat,long
> urn:test.org:place.1,urn:test.org:geo.1,16,17
> urn:test.org:place.2,urn:test.org:geo.2,15,16
> urn:test.org:place.3,urn:test.org:geo.3,15,17
> urn:test.org:place.4,urn:test.org:geo.4,,17
> {code}
> Note that the longitude is returned for the resource `geo.4`.
> h2. Test Case 2
> As a variation, we now also include `schema:elevation` in the OPTIONAL
> graph pattern.
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
>   ?entity schema:geo ?location
>   OPTIONAL {
>     ?location schema:latitude ?lat .
>     ?location schema:longitude ?long .
>     ?location schema:elevation ?alt .
>   }
> }
> {code}
> This query translates to the following algebra
> {code}
> (base <http://example/base/>
>   (prefix ((schema: <http://schema.org/>)
>            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>     (leftjoin
>       (bgp (triple ?entity schema:geo ?location))
>       (bgp
>         (triple ?location schema:latitude ?lat)
>         (triple ?location schema:longitude ?long)
>         (triple ?location schema:elevation ?alt)
>       ))))
> {code}
> The expected result has 4 rows, where `lat`, `long` and `alt` values are
> only provided for `geo.1` and `geo.2`.
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,16,17,123
> urn:test.org:place.2,urn:test.org:geo.2,15,16,99
> urn:test.org:place.3,urn:test.org:geo.3,,,
> urn:test.org:place.4,urn:test.org:geo.4,,,
> {code}
> With this query Marmotta behaves very strangely, as the results depend on
> the ordering of the triple patterns in the `OPTIONAL` graph pattern. I will
> not include all variations but just provide two examples:
> {code}
> OPTIONAL {
>   ?location schema:latitude ?lat .
>   ?location schema:longitude ?long .
>   ?location schema:elevation ?alt .
> }
> {code}
> gives
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,1.6E1,1.7E1,123
> urn:test.org:place.2,urn:test.org:geo.2,1.5E1,1.6E1,99
> urn:test.org:place.4,urn:test.org:geo.4,,1.7E1,123
> {code}
> while
> {code}
> OPTIONAL {
>   ?location schema:longitude ?long .
>   ?location schema:latitude ?lat .
>   ?location schema:elevation ?alt .
> }
> {code}
> gives
> {code}
> entity,location,long,lat,alt
> urn:test.org:place.1,urn:test.org:geo.1,1.7E1,1.6E1,123
> urn:test.org:place.2,urn:test.org:geo.2,1.6E1,1.5E1,99
> {code}
> This behavior further indicates that `OPTIONAL` patterns are processed
> incorrectly.
> h2. Test Case 3
> Modifying the query to
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
>   ?entity schema:geo ?location
>   OPTIONAL {
>     ?location schema:latitude ?lat .
>     ?location schema:longitude ?long .
>   }
>   OPTIONAL {
>     ?location schema:elevation ?alt .
>   }
> }
> {code}
> gives a result similar to _Test Case 1_: there are 4 rows, but for `geo.4`
> we again get the unexpected value for `?long`.
> h2. Test Case 4
> This test case assumes that the user requires `lat` and `long` and
> optionally wants the `alt`, but only for resources that have a valid
> location.
> {code}
> PREFIX schema: <http://schema.org/>
> PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
> SELECT * WHERE {
>   ?entity schema:geo ?location
>   OPTIONAL {
>     ?location schema:latitude ?lat .
>     ?location schema:longitude ?long .
>     OPTIONAL {
>       ?location schema:elevation ?alt .
>     }
>   }
> }
> {code}
> This translates to the following algebra
> {code}
> (base <http://example/base/>
>   (prefix ((schema: <http://schema.org/>)
>            (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>     (leftjoin
>       (bgp (triple ?entity schema:geo ?location))
>       (leftjoin
>         (bgp
>           (triple ?location schema:latitude ?lat)
>           (triple ?location schema:longitude ?long)
>         )
>         (bgp (triple ?location schema:elevation ?alt))))))
> {code}
> So the `lat` and `long` values are combined with `alt` in a `leftjoin`.
> Then that result takes part in another `leftjoin` with the results of
> `?entity schema:geo ?location`.
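
The nested `leftjoin` in the Test Case 4 algebra can be replayed outside the triple store. The following is a minimal Python sketch, not Marmotta or KiWi code: the dictionaries are a hand-written copy of the issue's test data with shortened identifiers, and the helper names `bgp`, `join` and `left_join` are made up for illustration. It assumes the usual SPARQL compatibility rule, where two solutions combine only if they agree on every shared variable.

```python
# Hand-copied test data from the issue, with shortened identifiers (assumption).
GEO = {"place.1": "geo.1", "place.2": "geo.2", "place.3": "geo.3", "place.4": "geo.4"}
LAT = {"geo.1": 16, "geo.2": 15, "geo.3": 15}
LONG = {"geo.1": 17, "geo.2": 16, "geo.3": 17, "geo.4": 17}
ALT = {"geo.1": 123, "geo.2": 99, "geo.4": 123}


def bgp(var, table):
    """Solutions of one triple pattern of the form `?location <predicate> ?var`."""
    return [{"location": loc, var: value} for loc, value in table.items()]


def compatible(a, b):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())


def join(left, right):
    """Inner join: keep only merged pairs of compatible solutions."""
    return [{**l, **r} for l in left for r in right if compatible(l, r)]


def left_join(left, right):
    """Left join: keep every left solution, extended where a match exists."""
    rows = []
    for l in left:
        matches = [{**l, **r} for r in right if compatible(l, r)]
        rows.extend(matches or [l])
    return rows


# (bgp (triple ?entity schema:geo ?location))
entities = [{"entity": e, "location": loc} for e, loc in GEO.items()]
# (bgp (triple ?location schema:latitude ?lat) (triple ?location schema:longitude ?long))
lat_long = join(bgp("lat", LAT), bgp("long", LONG))
# inner (leftjoin ... (bgp (triple ?location schema:elevation ?alt)))
inner = left_join(lat_long, bgp("alt", ALT))
# outer leftjoin with the entity pattern
rows = left_join(entities, inner)

for row in rows:
    print(row)
```

Because `geo.4` has no latitude, it never becomes a solution of the inner pattern, so under these assumptions the row for `place.4` carries no binding for `lat`, `long` or `alt`.
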
> The expected results are therefore as follows
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,16,17,123
> urn:test.org:place.2,urn:test.org:geo.2,15,16,99
> urn:test.org:place.3,urn:test.org:geo.3,15,17,
> urn:test.org:place.4,urn:test.org:geo.4,,,
> {code}
> Marmotta however returns
> {code}
> entity,location,lat,long,alt
> urn:test.org:place.1,urn:test.org:geo.1,16,17,123
> urn:test.org:place.2,urn:test.org:geo.2,15,16,99
> urn:test.org:place.3,urn:test.org:geo.3,15,17,
> urn:test.org:place.4,urn:test.org:geo.4,,17,123
> {code}
> All test cases show that OPTIONAL query segments are not correctly evaluated
> by the SPARQL implementation of the KiWi triple store.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)