This is a change in 4.2.0.

https://issues.apache.org/jira/browse/JENA-2150

OpVars now processes BIND variables (an improvement), so the optimizer generates the 4.2.0 form because it is a BIND.

Different patterns have different forms:

SELECT * {
  GRAPH ?g {
    ?s ?p ?o .
    FILTER ( <http://graphs/1> = ?g )
  }
}

==>

(assign ((?g ?*g0))
  (filter (= <http://graphs/1> ?g)
    (quadpattern (quad ?*g0 ?s ?p ?o))))

because ?g is not in scope in the FILTER.
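
To make the scoping point concrete, here is a toy model in plain Java that evaluates the algebra above over the two-quad dataset quoted below. It is not Jena code: the class and method names are made up for illustration, and it assumes only that a FILTER over an unbound variable rejects the row, as in standard SPARQL filter evaluation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: not Jena code, all names are hypothetical.
final class FilterScopeSketch {
    static final String G1 = "<http://graphs/1>";

    // The two quads from the example dataset, as {s, p, o, g}.
    static final String[][] QUADS = {
        {"<http://a>", "<http://b>", "<http://c>", "<http://graphs/1>"},
        {"<http://d>", "<http://e>", "<http://c>", "<http://graphs/2>"},
    };

    // (quadpattern (quad ?*g0 ?s ?p ?o)): the graph name is bound to ?*g0, not ?g.
    static List<Map<String, String>> quadPattern() {
        List<Map<String, String>> rows = new ArrayList<>();
        for (String[] q : QUADS) {
            Map<String, String> row = new HashMap<>();
            row.put("?s", q[0]);
            row.put("?p", q[1]);
            row.put("?o", q[2]);
            row.put("?*g0", q[3]);
            rows.add(row);
        }
        return rows;
    }

    // (filter (= <http://graphs/1> ?g) ...): ?g is unbound at this point, which is
    // an evaluation error, and a FILTER drops the row on error (assumption stated above).
    static List<Map<String, String>> filter(List<Map<String, String>> in) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : in) {
            String g = row.get("?g");
            if (g != null && g.equals(G1)) {
                out.add(row);
            }
        }
        return out;
    }

    // (assign ((?g ?*g0)) ...): only now is the graph name copied into ?g.
    static List<Map<String, String>> assign(List<Map<String, String>> in) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : in) {
            Map<String, String> copy = new HashMap<>(row);
            copy.put("?g", row.get("?*g0"));
            out.add(copy);
        }
        return out;
    }

    static List<Map<String, String>> evaluate() {
        return assign(filter(quadPattern()));
    }

    public static void main(String[] args) {
        System.out.println(evaluate().size() + " solution(s)");
    }
}
```

In this model the filter sees no ?g at all, so every row is dropped before the assign runs and the result is empty.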

    Andy

On 01/12/2021 11:20, Rob Vesse wrote:
Hey Folks

So I’ve been dragging some old code back up to date with Jena 4 and noticed an 
interesting behavioural change around quad form algebra generation and 
optimization that kinda took me aback when I first found it.  I think the 
algebras are semantically equivalent but at initial glance I wondered whether 
there is a potential scoping issue here.

Consider the following trivial dataset:

<http://a> <http://b> <http://c> <http://graphs/1> .
<http://d> <http://e> <http://c> <http://graphs/2> .

And the following query:

SELECT *
WHERE
{
   GRAPH ?g {
     ?s ?p ?o .
     BIND(<http://graphs/1> AS ?g)
   }
}

Obviously this is a somewhat odd query, but it’s a simplification of the more 
general pattern of calculating the desired graph name inside of a GRAPH ?var 
block in order to produce only results from some subset of graphs.

When translated into quads form with Jena 3.x I get the following:

(assign ((?g ?*g0))
   (extend ((?g <http://graphs/1>))
     (quadpattern (quad ?*g0 ?s ?p ?o))))

But with Jena 4.x I get this instead:

(extend ((?g <http://graphs/1>))
   (quadpattern (quad ?g ?s ?p ?o)))

Note that in 3.x it rewrites the graph node in the inner quad pattern and uses 
assign to filter those after the extend, but in 4.x it does not rewrite the 
graph node.  While semantically these appear equivalent, and loading the test 
data into TDB 2 and using tdb2.tdbquery to run a quad mode execution yields the 
same results in both cases, I do wonder if there’s a potential corner case here 
where scoping gets screwed up on more complex queries?
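
For reference, the 3.x form can be traced by hand with a toy model in plain Java over the two-quad dataset above. This is not Jena code and the names are hypothetical. It assumes, as described above, that an assign onto an already-bound variable acts as a filter: the row survives only when the existing value matches.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: not Jena code, all names are hypothetical.
final class AssignAsFilterSketch {
    static final String G1 = "<http://graphs/1>";

    // The two quads from the example dataset, as {s, p, o, g}.
    static final String[][] QUADS = {
        {"<http://a>", "<http://b>", "<http://c>", "<http://graphs/1>"},
        {"<http://d>", "<http://e>", "<http://c>", "<http://graphs/2>"},
    };

    // (quadpattern (quad ?*g0 ?s ?p ?o)): graph name goes into the renamed ?*g0.
    static List<Map<String, String>> quadPattern() {
        List<Map<String, String>> rows = new ArrayList<>();
        for (String[] q : QUADS) {
            Map<String, String> row = new HashMap<>();
            row.put("?s", q[0]);
            row.put("?p", q[1]);
            row.put("?o", q[2]);
            row.put("?*g0", q[3]);
            rows.add(row);
        }
        return rows;
    }

    // (extend ((?g <http://graphs/1>)) ...): ?g is new here, so it is simply added.
    static List<Map<String, String>> extend(List<Map<String, String>> in) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : in) {
            Map<String, String> copy = new HashMap<>(row);
            copy.put("?g", G1);
            out.add(copy);
        }
        return out;
    }

    // (assign ((?g ?*g0)) ...): ?g is already bound, so (assumption) the row is
    // kept only if the existing ?g equals ?*g0.
    static List<Map<String, String>> assign(List<Map<String, String>> in) {
        List<Map<String, String>> out = new ArrayList<>();
        for (Map<String, String> row : in) {
            String g0 = row.get("?*g0");
            String g = row.get("?g");
            if (g == null) {
                Map<String, String> copy = new HashMap<>(row);
                copy.put("?g", g0);
                out.add(copy);
            } else if (g.equals(g0)) {
                out.add(row);
            }
        }
        return out;
    }

    static List<Map<String, String>> evaluate() {
        return assign(extend(quadPattern()));
    }

    public static void main(String[] args) {
        for (Map<String, String> row : evaluate()) {
            System.out.println(row.get("?g") + " " + row.get("?s"));
        }
    }
}
```

Under that assumption only the quad in <http://graphs/1> survives, because the row from <http://graphs/2> has ?g and ?*g0 disagreeing.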

It seems in general that Jena 4.x less aggressively rewrites the graph name 
when translating OpGraph into quads form and maybe that’s perfectly fine but 
just wanted to check if this was an intended change and whether anybody else 
had encountered this.

For example rewriting the query to put the BIND first inside the GRAPH clause 
yields the following algebra on 3.x:

(assign ((?g ?*g0))
   (sequence
     (extend ((?g <http://graphs/1>))
       (table unit))
     (quadpattern (quad ?*g0 ?s ?p ?o))))

And on 4.x:

(sequence
   (extend ((?g <http://graphs/1>))
     (table unit))
   (quadpattern (quad ?g ?s ?p ?o)))

So again I think semantically equivalent but not rewriting the graph name.

Any thoughts/opinions on this?

Cheers,

Rob

