unsubscribe me please.

On Fri, Jun 3, 2022 at 10:30 AM Andy Seaborne <a...@apache.org> wrote:

> Probably a bug then.
>
> Are you going to be making improvements to query
> transformation/optimization as part of your work on the enhanced SERVICE
> handling on the active PR?
>
>      Andy
>
> On 03/06/2022 10:39, Claus Stadler wrote:
> > Hi again,
> >
> >
> > I think the point was missed; what I was actually after is that in the
> > following query a "join" is optimized into a "sequence", and I wonder
> > whether this is the correct behavior if a LIMIT/OFFSET is present.
> >
> > So running the following query with optimize enabled/disabled gives
> > different results:
> >
> > SELECT * {
> >    SERVICE <https://dbpedia.org/sparql> { SELECT * { ?s a
> > <http://dbpedia.org/ontology/MusicalArtist> } LIMIT 5 }
> >    SERVICE <https://dbpedia.org/sparql> { SELECT * { ?s
> > <http://www.w3.org/2000/01/rdf-schema#label> ?x } LIMIT 1 }
> > }
> >
> >
> > ➜  bin ./arq --query service-query.rq
> >
> >    (sequence !!!!!
> >
> >      (service <https://dbpedia.org/sparql>
> >        (slice _ 5
> >          (bgp (triple ?s
> > <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> > <http://dbpedia.org/ontology/MusicalArtist>))))
> >      (service <https://dbpedia.org/sparql>
> >        (slice _ 1
> >          (bgp (triple ?s <http://www.w3.org/2000/01/rdf-schema#label>
> > ?x)))))
> >
> >
> > -------------------------------------------------------------------------------
> > | s                                                   | x                     |
> > ===============================================================================
> > | <http://dbpedia.org/resource/Aarti_Mukherjee>       | "Aarti Mukherjee"@en  |
> > | <http://dbpedia.org/resource/Abatte_Barihun>        | "Abatte Barihun"@en   |
> > | <http://dbpedia.org/resource/Abby_Abadi>            | "Abby Abadi"@en       |
> > | <http://dbpedia.org/resource/Abd_al_Malik_(rapper)> | "Abd al Malik"@de     |
> > | <http://dbpedia.org/resource/Abdul_Wahid_Khan>      | "Abdul Wahid Khan"@en |
> > -------------------------------------------------------------------------------
> >
> >
> >
> > ./arq --explain --optimize=no --query service-query.rq
> >    (join !!!!!
> >      (service <https://dbpedia.org/sparql>
> >        (slice _ 5
> >          (bgp (triple ?s
> > <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> > <http://dbpedia.org/ontology/MusicalArtist>))))
> >      (service <https://dbpedia.org/sparql>
> >        (slice _ 1
> >          (bgp (triple ?s <http://www.w3.org/2000/01/rdf-schema#label>
> > ?x)))))
> > ---------
> > | s | x |
> > =========
> > ---------
> >
> >
> > Cheers,
> >
> > Claus
> >
> >
> > On 03.06.22 10:22, Andy Seaborne wrote:
> >>
> >>
> >> On 02/06/2022 21:19, Claus Stadler wrote:
> >>> Hi,
> >>>
> >>> I noticed some interesting results when using SERVICE with a sub
> >>> query with a slice (limit / offset).
> >>>
> >>>
> >>> Preliminary Remark:
> >>>
> >>> Because SPARQL semantics is bottom up, a query such as the following
> >>> will not yield bindings for ?x:
> >>>
> >>> SELECT * {
> >>>    SERVICE <https://dbpedia.org/sparql> { SELECT * { ?s a
> >>> <http://dbpedia.org/ontology/MusicalArtist> } LIMIT 5 }
> >>>    SERVICE <https://dbpedia.org/sparql> { BIND(?s AS ?x) }
> >>> }
> >>
> >> The query plan for that is:
> >>
> >> (join
> >>   (service <https://dbpedia.org/sparql>
> >>     (slice _ 5
> >>       (bgp (triple ?s
> >> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type>
> >> <http://dbpedia.org/ontology/MusicalArtist>))))
> >>   (service <https://dbpedia.org/sparql>
> >>     (extend ((?x ?s))
> >>       (table unit))))
> >>
> >> which has not had any optimization applied.  ARQ checks scopes before
> >> doing any transformation.
> >>
> >> Change BIND(?s AS ?x) to BIND(?s1 AS ?x)
> >>
> >> and it will have (join) replaced by (sequence)
> >>
> >> -----------------------------------------------------------
> >> | s                                                   | x |
> >> ===========================================================
> >> | <http://dbpedia.org/resource/Aarti_Mukherjee>       |   |
> >> | <http://dbpedia.org/resource/Abatte_Barihun>        |   |
> >> | <http://dbpedia.org/resource/Abby_Abadi>            |   |
> >> | <http://dbpedia.org/resource/Abd_al_Malik_(rapper)> |   |
> >> | <http://dbpedia.org/resource/Abdul_Wahid_Khan>      |   |
> >> -----------------------------------------------------------
> >>
> >> LIMIT 1 is a no-op - the second SERVICE always evals to one row of no
> >> columns. Which makes the second SERVICE the join identity and the
> >> result is the first SERVICE.
> >>
> >> Column ?x is only in the display because it is in "SELECT *"
> >>
> >>> Query engines, such as Jena, attempt to optimize execution. For
> >>> instance, in the following query, instead of retrieving all labels,
> >>> Jena uses each binding for a Musical Artist to perform a lookup at
> >>> the service.
> >>>
> >>> The result is semantically equivalent to bottom up evaluation
> >>> (without result set limits) - just much faster.
> >>>
> >>> SELECT * {
> >>>    SERVICE <https://dbpedia.org/sparql> { SELECT * { ?s a
> >>> <http://dbpedia.org/ontology/MusicalArtist> } LIMIT 5 }
> >>>    SERVICE <https://dbpedia.org/sparql> { ?s
> >>> <http://www.w3.org/2000/01/rdf-schema#label> ?x }
> >>> }
> >>>
> >>>
> >>> The main point:
> >>>
> >>> However, the following query with ARQ interestingly yields one
> >>> binding for every musical artist - which contradicts the bottom-up
> >>> paradigm:
> >>>
> >>> SELECT * {
> >>>    SERVICE <https://dbpedia.org/sparql> { SELECT * { ?s a
> >>> <http://dbpedia.org/ontology/MusicalArtist> } LIMIT 5 }
> >>>    SERVICE <https://dbpedia.org/sparql> { SELECT * { ?s
> >>> <http://www.w3.org/2000/01/rdf-schema#label> ?x } LIMIT 1 }
> >>> }
> >>>
> >>>
> >>> <http://dbpedia.org/resource/Aarti_Mukherjee> "Aarti Mukherjee"@en
> >>> <http://dbpedia.org/resource/Abatte_Barihun> "Abatte Barihun"@en
> >>> ... 3 more results ...
> >>>
> >>>
> >>> With bottom-up semantics, the second service clause would only fetch
> >>> a single binding, so in the unlikely event that it happens to join
> >>> with a musical artist I'd expect at most one binding in the overall
> >>> result set.
> >>>
> >>> Now I wonder whether this is a bug or a feature.
> >>>
> >>> I know that Jena's VarFinder is used to decide whether to perform a
> >>> bottom-up evaluation using OpJoin or a correlated join using
> >>> OpSequence which results in the different outcomes.
> >>>
> >>> The SPARQL spec doesn't say much about the semantics of Service
> >>> (https://www.w3.org/TR/sparql11-query/#sparqlAlgebraEval)
> >>
> >> It isn't about the semantics of SERVICE.  It's about the (join) on
> >> the local side.
> >>
> >>> So I wonder which behavior is expected when using SERVICE with
> >>> SLICE'd queries.
> >>
> >> "SERVICE { pattern }" executes "SELECT * { pattern }" at the far end,
> >> LIMITS and all.
> >>
> >>     Andy
> >>
> >>>
> >>>
> >>> Cheers,
> >>>
> >>> Claus
> >>>
> >>>
>
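
The difference between (join) and (sequence) discussed in the thread can be
illustrated with a small toy model. This is only a sketch, not ARQ code: the
data, the function names (artists, labels, join, sequence) and the in-memory
"endpoint" are all hypothetical. It also assumes that the correlated
evaluation substitutes the current ?s binding inside the sliced pattern
before the LIMIT applies, which is what the observed results suggest.

```python
# Toy data standing in for the remote endpoint (hypothetical, not DBpedia).
ARTISTS = ["a1", "a2", "a3"]
LABELS = [("x0", "Label 0"), ("a1", "Label 1"), ("a2", "Label 2"), ("a3", "Label 3")]


def artists(limit):
    """Models: SERVICE { SELECT * { ?s a :MusicalArtist } LIMIT n }."""
    return [{"s": s} for s in ARTISTS[:limit]]


def labels(limit, binding=None):
    """Models: SERVICE { SELECT * { ?s rdfs:label ?x } LIMIT n }.

    If `binding` fixes ?s, rows are filtered before the LIMIT is applied,
    modelling substitution of the variable into the remote request.
    """
    rows = [{"s": s, "x": x} for s, x in LABELS]
    if binding is not None and "s" in binding:
        rows = [r for r in rows if r["s"] == binding["s"]]
    return rows[:limit]


def compatible(a, b):
    """Two bindings are compatible if they agree on all shared variables."""
    return all(a[k] == b[k] for k in a.keys() & b.keys())


def join(left, right):
    """Bottom-up join: both sides are evaluated independently, then merged."""
    return [{**l, **r} for l in left for r in right if compatible(l, r)]


def sequence(left, right_fn):
    """Correlated evaluation: the right side is re-evaluated per left binding."""
    out = []
    for l in left:
        for r in right_fn(l):
            if compatible(l, r):
                out.append({**l, **r})
    return out


bottom_up = join(artists(3), labels(1))
correlated = sequence(artists(3), lambda b: labels(1, b))
print(len(bottom_up), len(correlated))
```

With this toy data the bottom-up join is empty, because the single label row
fetched under LIMIT 1 does not belong to any of the artists, while the
correlated evaluation returns one row per artist. That mirrors the empty
result with --optimize=no and the five rows with the optimizer enabled.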
