Andy -

Yes, we are talking about the same thing. I understand the scoping of
FILTER in OPTIONAL and when it applies over the join rather than over the
inner operator.
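
As a rough illustration of that scoping, here is a minimal Python sketch. It
is not dotNetRDF or ARQ code: solutions are plain dicts, and the names
left_join, same_terms and not_bound_s0 are made up for this example. It
assumes an unbound variable inside sameTerm is an error, so the filter
rejects the solution.

```python
# Test data: graphs <http://a> and <http://b> hold the same two triples.
A = [{"s": "r1", "p": "r1", "o": "r1"}, {"s": "r2", "p": "r2", "o": "r2"}]
B = [{"s0": "r1", "p0": "r1", "o0": "r1"}, {"s0": "r2", "p0": "r2", "o0": "r2"}]


def compatible(m1, m2):
    """Two solutions are compatible if they agree on every shared variable."""
    return all(m2[v] == t for v, t in m1.items() if v in m2)


def same_terms(mu):
    """The sameTerm conjunction; an unbound variable counts as a rejection."""
    try:
        return mu["s"] == mu["s0"] and mu["p"] == mu["p0"] and mu["o"] == mu["o0"]
    except KeyError:
        return False


def left_join(left, right, cond):
    """Keep the bare LHS row only when no RHS row joins and passes cond."""
    out = []
    for m1 in left:
        passed = [{**m1, **m2} for m2 in right if compatible(m1, m2)]
        passed = [m for m in passed if cond(m)]
        out.extend(passed if passed else [m1])
    return out


def not_bound_s0(rows):
    """The outer FILTER(!BOUND(?s0))."""
    return [m for m in rows if "s0" not in m]


# OPTIONAL { GRAPH <b> {...} FILTER(...) }: the filter is the join condition
# and can see ?s from the left-hand side.
as_condition = left_join(A, B, same_terms)

# OPTIONAL { { GRAPH <b> {...} FILTER(...) } }: the filter runs inside the
# inner group, where ?s is not in scope, so nothing on the RHS survives.
inner = [m for m in B if same_terms(m)]
as_inner_filter = left_join(A, inner, lambda mu: True)

print(len(not_bound_s0(as_condition)))     # 0
print(len(not_bound_s0(as_inner_filter)))  # 2
```

The second count matches the 2 rows that ARQ reports for the query with the
extra {} inside the OPTIONAL.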

Mike -

Thanks for confirming my suspicion. It turns out to be a trivial bug in
dotNetRDF's handling of left joins that only occurs when there is a cross
product. The normal join case was already handling this part of the spec
correctly; I just somehow missed it in the cross product case.
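
For the record, the difference can be sketched in a few lines of Python. This
is only a guess at the shape of the bug, not the actual dotNetRDF code, and
the names left_join_spec and left_join_buggy are hypothetical. Because the two
sides share no variables, every pair is compatible and the join is a cross
product, so the compatibility check is omitted here.

```python
# Both graphs hold the same two triples, as in the test data.
A = [{"s": "r1", "p": "r1", "o": "r1"}, {"s": "r2", "p": "r2", "o": "r2"}]
B = [{"s0": "r1", "p0": "r1", "o0": "r1"}, {"s0": "r2", "p0": "r2", "o0": "r2"}]


def cond(mu):
    """The sameTerm conjunction used as the LeftJoin condition."""
    return mu["s"] == mu["s0"] and mu["p"] == mu["p0"] and mu["o"] == mu["o0"]


def left_join_spec(left, right, cond):
    """Keep a bare LHS row only if NO RHS row passes the condition with it."""
    out = []
    for m1 in left:
        passed = [{**m1, **m2} for m2 in right if cond({**m1, **m2})]
        out.extend(passed if passed else [m1])
    return out


def left_join_buggy(left, right, cond):
    """Assumed bug: every failing pair falls back to the bare LHS row."""
    out = []
    for m1 in left:
        for m2 in right:
            merged = {**m1, **m2}
            out.append(merged if cond(merged) else m1)
    return out


def not_bound_s0(rows):
    """The outer FILTER(!BOUND(?s0))."""
    return [m for m in rows if "s0" not in m]


print(len(not_bound_s0(left_join_spec(A, B, cond))))   # 0
print(len(not_bound_s0(left_join_buggy(A, B, cond))))  # 2
```

The first count is what ARQ returns for the original query; the second is the
2 results dotNetRDF was returning.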

Cheers,

Rob

On 27/11/2013 15:39, "Andy Seaborne" <[email protected]> wrote:

>Hi Rob,
>
>Partial answer - I'm about to go into a RDF-WG telecon but I'll work
>through the details later.  I just wanted to check we are talking about
>the same thing because "OPTIONAL{ ... FILTER ... }" is special.
>
>You'll see in the algebra there is no (filter) in this part.
>
> >   (leftjoin
> >     (graph <http://a>
> >      (bgp (triple ?s ?p ?o)))
> >     (graph <http://b>
> >      (bgp (triple ?s0 ?p0 ?o0)))
> >     (&& (&& (sameTerm ?s ?s0) (sameTerm ?p ?p0)) (sameTerm ?o ?o0)))))
>
>The (&&...) is the 3rd argument to the leftJoin operation and forms part
>of the join condition, not a filter over the GRAPH <http://b> { ?s0 ?p0
>?o0 . } nor applied after the LeftJoin - in SQL terms, it's the ON
>condition for a leftjoin.  Scope-wise, the (&&) can see the ?s which
>it could not otherwise.
>
>For example this is a different query:
>
>SELECT *
>WHERE
>{
>   GRAPH <http://a>
>   {
>     ?s ?p ?o .
>   }
>   OPTIONAL
>   {
>     {
>       GRAPH <http://b> { ?s0 ?p0 ?o0 . }
>       FILTER (SAMETERM(?s, ?s0) && SAMETERM(?p, ?p0) && SAMETERM(?o, ?o0))
>     }
>   }
>   FILTER(!BOUND(?s0))
>}
>
>there is an additional {} inside the OPTIONAL {}.
>
>(filter (! (bound ?s0))
>   (leftjoin
>     (graph <http://a>
>       (bgp (triple ?s ?p ?o)))
>     (filter (&& (&& (sameTerm ?s ?s0) (sameTerm ?p ?p0))
>                 (sameTerm ?o ?o0))
>       (graph <http://b>
>         (bgp (triple ?s0 ?p0 ?o0))))))
>
>The algebra now has two (filter) operators.
>
>ARQ then gives 2 rows
>
>----------------------------------------------------------
>| s           | p           | o           | s0 | p0 | o0 |
>==========================================================
>| <http://r2> | <http://r2> | <http://r2> |    |    |    |
>| <http://r1> | <http://r1> | <http://r1> |    |    |    |
>----------------------------------------------------------
>
>The &&-filter never passes (?s is not in scope inside the inner group)
>
>Is that what dotNetRDF returns?
>
>I get this with the normal and ref query engines in ARQ.
>
>       Andy
>
><http://r1> <http://r1> <http://r1> <http://a> .
><http://r2> <http://r2> <http://r2> <http://a> .
><http://r1> <http://r1> <http://r1> <http://b> .
><http://r2> <http://r2> <http://r2> <http://b> .
>
>
>On 27/11/13 11:07, Rob Vesse wrote:
>> Hey Andy
>>
>> This is prompted by a bug originally reported for dotNetRDF (CORE-386 [1]).
>> I initially rejected it as Invalid based on my understanding of how
>> LeftJoin behaves, then reopened it because the user reporting it gets
>> different behaviour in ARQ (which I have reproduced). So I am unclear which
>> of dotNetRDF or ARQ is doing things wrong based on my understanding of the
>> specification.
>>
>> The test data is the trivial Turtle document as follows:
>>
>> <http://r1> <http://r1> <http://r1> .
>> <http://r2> <http://r2> <http://r2> .
>>
>> And the query is as follows:
>>
>> SELECT *
>> WHERE
>> {
>>    GRAPH <http://a>
>>    {
>>      ?s ?p ?o .
>>    }
>>    OPTIONAL
>>    {
>>      GRAPH <http://b> { ?s0 ?p0 ?o0 . }
>>      FILTER (SAMETERM(?s, ?s0) && SAMETERM(?p, ?p0) && SAMETERM(?o, ?o0))
>>    }
>>    FILTER(!BOUND(?s0))
>> }
>>
>>
>> And for reference the unoptimised algebra is as follows:
>>
>> (base <http://example/base/>
>>   (filter (! (bound ?s0))
>>    (leftjoin
>>     (graph <http://a>
>>      (bgp (triple ?s ?p ?o)))
>>     (graph <http://b>
>>      (bgp (triple ?s0 ?p0 ?o0)))
>>     (&& (&& (sameTerm ?s ?s0) (sameTerm ?p ?p0)) (sameTerm ?o ?o0)))))
>>
>>
>> The intent of the query is to calculate the delta of the graphs, i.e. the
>> triples that are present in <http://a> that are not present in <http://b>.
>> So given two identical graphs it was intended to return 0 results;
>> however, dotNetRDF returns 2 results whereas ARQ returns 0 results.
>>
>> My belief was that dotNetRDF is correct and I'll explain why.  I think I
>> may be wrong, and if so I'd love to understand why.  My understanding of
>> the flow of execution is as follows:
>>
>> Step 1 - Execute the LHS of the left join, which finds all triples in
>> graph <http://a> and thus returns the following:
>>
>> s = r1, p = r1, o = r1
>> s = r2, p = r2, o = r2
>>
>> Step 2 - Execute the RHS of the left join, which finds all triples in
>> graph <http://b> and thus returns the following:
>>
>> s0 = r1, p0 = r1, o0 = r1
>> s0 = r2, p0 = r2, o0 = r2
>>
>>
>> Step 3 - Calculate the possible joins (a cross product here, since the
>> two sides share no variables)
>>
>> s = r1, p = r1, o = r1, s0 = r1, p0 = r1, o0 = r1
>> s = r1, p = r1, o = r1, s0 = r2, p0 = r2, o0 = r2
>> s = r2, p = r2, o = r2, s0 = r1, p0 = r1, o0 = r1
>> s = r2, p = r2, o = r2, s0 = r2, p0 = r2, o0 = r2
>>
>>
>> Step 4 - Apply the filter on the left join
>>
>>
>> s = r1, p = r1, o = r1
>> s = r1, p = r1, o = r1, s0 = r1, p0 = r1, o0 = r1
>> s = r2, p = r2, o = r2
>> s = r2, p = r2, o = r2, s0 = r2, p0 = r2, o0 = r2
>>
>> This I think is where ARQ and dotNetRDF differ in behaviour and where I
>> suspect my implementation is wrong.  Where the FILTER fails for some (but
>> not all) of the rows joined to a LHS solution, I retain the LHS whereas
>> ARQ does not.  I'm guessing that I'm missing some bit of the SPARQL
>> specification for LeftJoin that says that where there is at least one
>> valid joinable solution for a LHS solution then the LHS does not need to
>> be preserved on its own?
>>
>>
>> If you could point me to this I would much appreciate it.
>>
>> Step 5 - Apply the outer filter
>>
>>
>> s = r1, p = r1, o = r1
>> s = r2, p = r2, o = r2
>>
>>
>> So dotNetRDF returns 2 results but ARQ returns 0 results for this query.
>> Am I correct in thinking I've got a bug in my LeftJoin implementation
>> over in dotNetRDF?  Or is this actually a subtle bug in ARQ?
>>
>> Thanks,
>>
>> Rob
>>
>> p.s. code for this test case and variations on it is committed as
>> TestGraphDeltas
>>
>> [1] http://dotnetrdf.org/tracker/Issues/IssueDetail.aspx?id=386
>>
>>
>>
>>
>



