[ 
https://issues.apache.org/jira/browse/MARMOTTA-603?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16052943#comment-16052943
 ] 

Xavier Sumba commented on MARMOTTA-603:
---------------------------------------

A possibly related issue with a sub-select inside OPTIONAL.
h1. Data
h2. Data for test case 1
{code:sparql}
 <urn:s1> a <urn:C> .  
 <urn:s2> a <urn:C> .  
 <urn:s3> a <urn:C> .  
 <urn:s4> a <urn:C> .  
 <urn:s5> a <urn:C> .  
 <urn:s6> a <urn:C> .  
 <urn:s7> a <urn:C> .  
 <urn:s8> a <urn:C> .  
 <urn:s9> a <urn:C> .  
 <urn:s10> a <urn:C> .  
 <urn:s11> a <urn:C> .  
 <urn:s12> a <urn:C> .  
 <urn:s1> <urn:p> "01" .  
 <urn:s2> <urn:p> "02" .  
 <urn:s3> <urn:p> "03" .  
 <urn:s4> <urn:p> "04" .  
 <urn:s5> <urn:p> "05" .  
 <urn:s6> <urn:p> "06" .  
 <urn:s7> <urn:p> "07" .  
 <urn:s8> <urn:p> "08" .  
 <urn:s9> <urn:p> "09" .  
 <urn:s10> <urn:p> "10" .  
 <urn:s11> <urn:p> "11" .  
 <urn:s12> <urn:p> "12" .  
{code}

h2. Data for test case 2:

{code:sparql}
<u:1> <u:r> <u:subject> .
<u:1> <u:v> 1 .
<u:1> <u:x> <u:x1> .
<u:2> <u:r> <u:subject> .
<u:2> <u:v> 2 .
<u:2> <u:x> <u:x2> .
<u:3> <u:r> <u:subject> .
<u:3> <u:v> 3 .
<u:3> <u:x> <u:x3> .
<u:4> <u:r> <u:subject> .
<u:4> <u:v> 4 .
<u:4> <u:x> <u:x4> .
<u:5> <u:r> <u:subject> .
<u:5> <u:v> 5 .
<u:5> <u:x> <u:x5> .
{code}

h1. Tests
h2. Test case 1:

The sub-select should return only the labels "01" and "02", but the query 
returns unexpected results.
{code:sparql}
SELECT ?s ?label
WHERE {
  ?s a <urn:C> .
  OPTIONAL {
    { SELECT ?label WHERE {
        ?s <urn:p> ?label .
      } ORDER BY ?label LIMIT 2
    }
  }
}
ORDER BY ?s
LIMIT 10
{code}
Query translated to SQL

{code:sql}
SELECT S2.V2 AS V2, P1.subject AS V1
 FROM triples P1
    INNER JOIN nodes AS P1_subject_V1 ON P1.subject = P1_subject_V1.id 
 LEFT JOIN 
  (SELECT P1.object AS V2, P1.subject AS V1
 FROM triples P1
    INNER JOIN nodes AS P1_object_V2 ON P1.object = P1_object_V2.id 
 WHERE P1.deleted = false
      AND P1.predicate = 876129878216290304
 
 ORDER BY P1_object_V2.svalue ASC 

 ) AS S2 ON (P1.subject = S2.V1)
 WHERE P1.deleted = false
      AND P1.predicate = 876129878635720704
      AND P1.object = 876129878069489664
 
 ORDER BY P1_subject_V1.svalue ASC 

 LIMIT 10 
{code}
Expected results
{code:sparql}
[s=urn:s1;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s1;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s10;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s10;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s11;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s11;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s12;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s12;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s2;label="01"^^<http://www.w3.org/2001/XMLSchema#string>]
[s=urn:s2;label="02"^^<http://www.w3.org/2001/XMLSchema#string>]
{code}
Results obtained:
{code:sparql}
[s=urn:s1;label="01"^^xsd:string]
[s=urn:s10;label="10"^^xsd:string]
[s=urn:s11;label="11"^^xsd:string]
[s=urn:s12;label="12"^^xsd:string]
[s=urn:s2;label="02"^^xsd:string]
[s=urn:s3;label="03"^^xsd:string]
[s=urn:s4;label="04"^^xsd:string]
[s=urn:s5;label="05"^^xsd:string]
[s=urn:s6;label="06"^^xsd:string]
[s=urn:s7;label="07"^^xsd:string]
{code}
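
As a rough cross-check of the expected results above, here is a minimal Python sketch of how the query would be evaluated bottom-up under SPARQL 1.1 semantics: the sub-select is evaluated first and only its projected variable (?label) is visible outside, so the OPTIONAL joins every ?s with both labels. This is not Marmotta code; the helper names (sub_select, left_join, evaluate) are made up for illustration, and the stable sort is an assumption used to reproduce the listed row order.

```python
# Sketch of SPARQL bottom-up evaluation for test case 1 (hypothetical helpers).
subjects = [f"urn:s{i}" for i in range(1, 13)]
# (subject, label) pairs for <urn:p>, mirroring "Data for test case 1"
labels = [(f"urn:s{i}", f"{i:02d}") for i in range(1, 13)]


def sub_select(limit=2):
    """SELECT ?label ... ORDER BY ?label LIMIT 2; ?s is not projected."""
    ordered = sorted(label for _, label in labels)
    return [{"label": label} for label in ordered[:limit]]


def left_join(left, right):
    """Merge compatible solutions; keep left solutions that have no match."""
    out = []
    for left_row in left:
        matches = [
            {**left_row, **right_row}
            for right_row in right
            if all(left_row[k] == right_row[k]
                   for k in left_row.keys() & right_row.keys())
        ]
        out.extend(matches or [left_row])
    return out


def evaluate():
    outer = [{"s": s} for s in subjects]      # ?s a <urn:C>
    joined = left_join(outer, sub_select())   # OPTIONAL { sub-select }
    joined.sort(key=lambda row: row["s"])     # ORDER BY ?s (stable sort)
    return joined[:10]                        # LIMIT 10


for row in evaluate():
    print(f'[s={row["s"]};label="{row["label"]}"]')
```

Because the two sides share no variable, every solution pair is compatible, which is why each subject appears once per label.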

h2. Test Case 2:

Error in the translation of the query.
{code:sparql}
select ?x { 
 { select ?v { ?v <u:r> <u:subject> filter (?v = <u:1>) } }.
  optional {  select ?val { ?v <u:v> ?val .} }
  ?v <u:x> ?x 
}
{code}
Query translated to SQL

{code:sql}
SELECT NULL AS V1, NULL AS V3, P2.object AS V2
 FROM triples P2
 CROSS JOIN 
  (SELECT P1.subject AS V1
 FROM triples P1
    INNER JOIN nodes AS P1_subject_V1 ON P1.subject = P1_subject_V1.id 
 WHERE P1.deleted = false
      AND P1.predicate = 876144626856243200
      AND P1.object = 876144626914963456
       AND P1.subject = 876144626751385600
 
 ) AS S1
 LEFT JOIN 
  (SELECT NULL AS V1, P1.object AS V2
 FROM triples P1
 WHERE P1.deleted = false
      AND P1.predicate = 876144627028209664
 
 ) AS S3
 WHERE P2.deleted = false
      AND P2.predicate = 876144627326005248
       AND P2.subject = S1.V1
{code}

Expected result:

{code:sparql}
[x=u:x1]
[x=u:x1]
[x=u:x1]
[x=u:x1]
[x=u:x1]
{code}


Results obtained

{code:java}
ERROR:  syntax error at or near "WHERE"
LINE 20:  WHERE P2.deleted = false
{code}
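
To show where the five expected rows come from, the following Python sketch evaluates the query by hand as Join(LeftJoin(S1, S3), BGP), which is how I read the SPARQL algebra for this group pattern. It is only an illustration under that assumption, not the KiWi translation; compatible, join, left_join and evaluate are hypothetical helpers.

```python
# Triples from "Data for test case 2" as (subject, predicate, object).
triples = []
for i in range(1, 6):
    triples += [
        (f"u:{i}", "u:r", "u:subject"),
        (f"u:{i}", "u:v", i),
        (f"u:{i}", "u:x", f"u:x{i}"),
    ]


def compatible(a, b):
    """Two solutions are compatible if they agree on all shared variables."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())


def join(left, right):
    return [{**l, **r} for l in left for r in right if compatible(l, r)]


def left_join(left, right):
    out = []
    for l in left:
        matches = [{**l, **r} for r in right if compatible(l, r)]
        out.extend(matches or [l])
    return out


def evaluate():
    # { select ?v { ?v <u:r> <u:subject> filter (?v = <u:1>) } }
    s1 = [{"v": s} for s, p, o in triples
          if p == "u:r" and o == "u:subject" and s == "u:1"]
    # optional { select ?val { ?v <u:v> ?val } }: only ?val is projected
    s3 = [{"val": o} for s, p, o in triples if p == "u:v"]
    # ?v <u:x> ?x
    bgp = [{"v": s, "x": o} for s, p, o in triples if p == "u:x"]
    solutions = join(left_join(s1, s3), bgp)
    return [row["x"] for row in solutions]    # select ?x


for x in evaluate():
    print(f"[x={x}]")
```

The optional sub-select shares no variable with S1, so the left join behaves like a cross product and yields one row per ?val, all bound to the same ?x.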

This was found in a new test case while migrating Marmotta to Sesame 2.8.11 
[1]. The error persists in the master and develop branches. For now, the 
affected test cases are ignored in MARMOTTA-659 [2].

[1] https://issues.apache.org/jira/browse/MARMOTTA-659
[2] https://github.com/gmora1223/marmotta/commit/e0c9879c93471a475fd0366386e31667be148e6b


> SPARQL OPTIONAL issues
> ----------------------
>
>                 Key: MARMOTTA-603
>                 URL: https://issues.apache.org/jira/browse/MARMOTTA-603
>             Project: Marmotta
>          Issue Type: Bug
>          Components: KiWi Triple Store
>    Affects Versions: 3.3.0
>            Reporter: Rupert Westenthaler
>            Priority: Critical
>
> The SPARQL implementation of the KiWi triple store seems to have issues with 
> the evaluation of OPTIONAL segments of SPARQL queries. Test data and test 
> queries are provided below.
> h2. Data
> {code}
>       <urn:test.org:place.1> rdf:type schema:Palce ;
>               schema:geo <urn:test.org:geo.1> ;
>               schema:name "Place 1" .
>       <urn:test.org:geo.1> rdf:type schema:GeoCoordinates ;
>               schema:latitude "16"^^xsd:double ;
>               schema:longitude "17"^^xsd:double ;
>               schema:elevation "123"^^xsd:int .
>       <urn:test.org:place.2> rdf:type schema:Palce ;
>               schema:geo <urn:test.org:geo.2> ;
>               schema:name "Place 2" .
>       <urn:test.org:geo.2> rdf:type schema:GeoCoordinates ;
>               schema:latitude "15"^^xsd:double ;
>               schema:longitude "16"^^xsd:double ;
>               schema:elevation "99"^^xsd:int .
>       <urn:test.org:place.3> rdf:type schema:Palce ;
>               schema:geo <urn:test.org:geo.3> ;
>               schema:name "Place 3" .
>       <urn:test.org:geo.3> rdf:type schema:GeoCoordinates ;
>               schema:latitude "15"^^xsd:double ;
>               schema:longitude "17"^^xsd:double .
>       <urn:test.org:place.4> rdf:type schema:Palce ;
>               schema:geo <urn:test.org:geo.4> ;
>               schema:name "Place 4" .
>       <urn:test.org:geo.4> rdf:type schema:GeoCoordinates ;
>               schema:longitude "17"^^xsd:double ;
>               schema:elevation "123"^^xsd:int .
> {code}
> Note that `geo.1` and `geo.2` define latitude, longitude and elevation. 
> `geo.3` has no elevation, and `geo.4` is missing the latitude to simulate 
> invalid geo coordinate data.
> h2. Test Case 1
> The following query uses an OPTIONAL graph pattern that includes both 
> `schema:latitude` and `schema:longitude`. This assumes a user only wants 
> lat/long values for locations that define both.
> {code}
>     PREFIX schema: <http://schema.org/>
>     PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>     SELECT * WHERE {
>         ?entity schema:geo ?location
>         OPTIONAL {
>             ?location schema:latitude ?lat .
>             ?location schema:longitude ?long .
>         }
>     }
> {code}
> translates to the algebra
> {code}
>     (base <http://example/base/>
>         (prefix ((schema: <http://schema.org/>)
>                 (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>             (leftjoin
>             (bgp (triple ?entity schema:geo ?location))
>             (bgp
>                 (triple ?location schema:latitude ?lat)
>                 (triple ?location schema:longitude ?long)
>              ))))
> {code}
> The expected results are
> {code}
>     entity,location,lat,long
>     urn:test.org:place.1,urn:test.org:geo.1,16,17
>     urn:test.org:place.2,urn:test.org:geo.2,15,16
>     urn:test.org:place.3,urn:test.org:geo.3,15,17
>     urn:test.org:place.4,urn:test.org:geo.4,,
> {code}
> All four locations are expected in the result set as the `OPTIONAL` graph 
> pattern is translated to a `leftjoin` with `triple ?entity schema:geo 
> ?location`.
> However, for `geo.4` no value is expected for `?lat` or `?long`, as this 
> resource only defines a longitude and therefore does not match
> {code}
>     (bgp
>         (triple ?location schema:latitude ?lat)
>         (triple ?location schema:longitude ?long)
>     )
> {code}
> Marmotta responds with
> {code}
>     entity,location,lat,long
>     urn:test.org:place.1,urn:test.org:geo.1,16,17
>     urn:test.org:place.2,urn:test.org:geo.2,15,16
>     urn:test.org:place.3,urn:test.org:geo.3,15,17
>     urn:test.org:place.4,urn:test.org:geo.4,,17
> {code}
> Note that the longitude is returned for the resource `geo.4`.
> h2. Test Case 2
> As a variation we now also include the `schema:elevation` in the OPTIONAL 
> graph pattern.
> {code}
>     PREFIX schema: <http://schema.org/>
>     PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>     SELECT * WHERE {
>         ?entity schema:geo ?location
>         OPTIONAL {
>             ?location schema:latitude ?lat .
>             ?location schema:longitude ?long .
>             ?location schema:elevation ?alt .
>         }
>     }
> {code}
> This query translates to the following algebra
> {code}
>     (base <http://example/base/>
>         (prefix ((schema: <http://schema.org/>)
>                    (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>             (leftjoin
>             (bgp (triple ?entity schema:geo ?location))
>             (bgp
>                 (triple ?location schema:latitude ?lat)
>                 (triple ?location schema:longitude ?long)
>                 (triple ?location schema:elevation ?alt)
>             ))))
> {code}
> The expected result would have 4 result rows where `lat`, `long` and `alt` 
> values are only provided for `geo.1` and `geo.2`.
> {code}
>     entity,location,lat,long,alt
>     urn:test.org:place.1,urn:test.org:geo.1,16,17,123
>     urn:test.org:place.2,urn:test.org:geo.2,15,16,99
>     urn:test.org:place.3,urn:test.org:geo.3,,,
>     urn:test.org:place.4,urn:test.org:geo.4,,,
> {code}
> With this query Marmotta behaves very strangely, as the results depend on the 
> ordering of the triple patterns in the `OPTIONAL` graph pattern. I will not 
> include all variations but just provide two examples:
> {code}
>         OPTIONAL {
>             ?location schema:latitude ?lat .
>             ?location schema:longitude ?long .
>             ?location schema:elevation ?alt .
>         }
> {code}
> gives
> {code}
>     entity,location,lat,long,alt
>     urn:test.org:place.1,urn:test.org:geo.1,1.6E1,1.7E1,123
>     urn:test.org:place.2,urn:test.org:geo.2,1.5E1,1.6E1,99
>     urn:test.org:place.4,urn:test.org:geo.4,,1.7E1,123
> {code}
> while
> {code}
>         OPTIONAL {
>             ?location schema:longitude ?long .
>             ?location schema:latitude ?lat .
>             ?location schema:elevation ?alt .
>         }
> {code}
> gives
> {code}
>     entity,location,long,lat,alt
>     urn:test.org:place.1,urn:test.org:geo.1,1.7E1,1.6E1,123
>     urn:test.org:place.2,urn:test.org:geo.2,1.6E1,1.5E1,99
> {code}
> This behavior further indicates that `OPTIONAL` graph patterns are processed 
> incorrectly.
> h2. Test Case 3
> Modifying the query to 
> {code}
>     PREFIX schema: <http://schema.org/>
>     PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>     SELECT * WHERE {
>         ?entity schema:geo ?location
>         OPTIONAL {
>             ?location schema:latitude ?lat .
>             ?location schema:longitude ?long .
>         }
>         OPTIONAL {
>             ?location schema:elevation ?alt .
>         }
>     }
> {code}
> results in a similar result to _Test Case 1_ where we have 4 results, but for 
> `geo.4` we do get the unexpected value for `?long`.
> h2. Test Case 4
> This test case assumes that the user requires `lat` and `long` and optionally 
> wants the `alt` but only for resources that do have a valid location.
> {code}
>     PREFIX schema: <http://schema.org/>
>     PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
>     SELECT * WHERE {
>         ?entity schema:geo ?location
>         OPTIONAL {
>             ?location schema:latitude ?lat .
>             ?location schema:longitude ?long .
>             OPTIONAL {
>                 ?location schema:elevation ?alt .
>             }
>         }
>     }
> {code}
> This translates to the following algebra
> {code}
>     (base <http://example/base/>
>         (prefix ((schema: <http://schema.org/>)
>                    (rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>))
>             (leftjoin
>                 (bgp (triple ?entity schema:geo ?location))
>                 (leftjoin
>                     (bgp
>                         (triple ?location schema:latitude ?lat)
>                         (triple ?location schema:longitude ?long)
>                     )
>                         (bgp (triple ?location schema:elevation ?alt))))))
> {code}
> So the `lat` and `long` values are combined with `alt` in a `leftjoin`. Then 
> the result is combined in another `leftjoin` with the results of 
> `?entity schema:geo ?location`. The expected results are therefore as follows
> {code}
>     entity,location,lat,long,alt
>     urn:test.org:place.1,urn:test.org:geo.1,16,17,123
>     urn:test.org:place.2,urn:test.org:geo.2,15,16,99
>     urn:test.org:place.3,urn:test.org:geo.3,,,
>     urn:test.org:place.4,urn:test.org:geo.4,,,
> {code}
> Marmotta however returns
> {code}
>     entity,location,lat,long,alt
>     urn:test.org:place.1,urn:test.org:geo.1,16,17,123
>     urn:test.org:place.2,urn:test.org:geo.2,15,16,99
>     urn:test.org:place.3,urn:test.org:geo.3,15,17,
>     urn:test.org:place.4,urn:test.org:geo.4,,17,123
> {code}
> All test cases show that OPTIONAL query segments are not correctly evaluated 
> by the SPARQL implementation of the KiWi triple store.
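
For the quoted issue description, the expected tables of Test Cases 1 and 2 can be reproduced with a small Python sketch of the all-or-nothing behaviour of a basic graph pattern inside OPTIONAL. It is a simplification that assumes each location has at most one value per property, as in the test data; optional_bgp and evaluate are hypothetical names, and none of this reflects how KiWi evaluates the query.

```python
# Geo resources from the quoted test data, keyed by property local name.
geo = {
    "urn:test.org:geo.1": {"latitude": 16.0, "longitude": 17.0, "elevation": 123},
    "urn:test.org:geo.2": {"latitude": 15.0, "longitude": 16.0, "elevation": 99},
    "urn:test.org:geo.3": {"latitude": 15.0, "longitude": 17.0},
    "urn:test.org:geo.4": {"longitude": 17.0, "elevation": 123},
}
# ?entity schema:geo ?location
places = {f"urn:test.org:place.{i}": f"urn:test.org:geo.{i}" for i in range(1, 5)}


def optional_bgp(location, wanted):
    """Bind the variables only if every triple pattern of the BGP matches."""
    props = geo[location]
    if all(prop in props for prop, _ in wanted):
        return {var: props[prop] for prop, var in wanted}
    return {}  # no partial bindings: the left-hand solution is kept as-is


def evaluate(*optionals):
    """Left-join each OPTIONAL BGP onto the ?entity/?location solutions."""
    rows = []
    for entity, location in sorted(places.items()):
        row = {"entity": entity, "location": location}
        for wanted in optionals:
            row.update(optional_bgp(location, wanted))
        rows.append(row)
    return rows


# Test Case 1: lat and long in one OPTIONAL
for row in evaluate([("latitude", "lat"), ("longitude", "long")]):
    print(row)
```

Passing a third pair, ("elevation", "alt"), in the same list corresponds to Test Case 2, where only `geo.1` and `geo.2` receive bindings.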



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)