Initial Bindings in Query Evaluation

Rob Vesse Fri, 02 Aug 2013 09:55:14 -0700

Hi All

Holger's question 
(http://mail-archives.apache.org/mod_mbox/jena-users/201308.mbox/%[email protected]%3e)
 about a regression in ARQ's treatment of initial bindings raises an interesting 
disconnect between the interpretation of SPARQL and the Initial Bindings API.


Initial bindings in their current form allow users to change the semantics of 
a query in a non-intuitive way.  Take his example query:

ASK { FILTER(?a = ?b) }

Intuitively that query MUST always return false, since nothing in the pattern 
binds ?a or ?b, yet with initial bindings in the mix the query can be made to 
return true, at least prior to 2.10.2, which introduces a new optimizer that 
includes special-case recognition for this.

The problem is that using initial bindings can fundamentally change the 
semantics of queries, whereas I believe the intention of the API was merely to 
allow for improved performance by guiding the engine.
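
To make the disconnect concrete, here is a toy Java model of the ASK query 
above.  It is only a sketch: the class and method names are hypothetical and 
none of this is ARQ code.  askUnoptimized seeds the single solution of the 
empty pattern with the initial binding, while askOptimized reasons from the 
query text alone, which is an assumed stand-in for the kind of special-case 
recognition an optimizer could apply.

```java
import java.util.Map;
import java.util.Set;

// Toy model only: hypothetical names, not part of ARQ.
final class InitialBindingDemo {

    // FILTER(?a = ?b) on one solution. An unbound variable makes the
    // comparison an error, and a filter error rejects the solution.
    static boolean filterAEqualsB(Map<String, String> solution) {
        String a = solution.get("a");
        String b = solution.get("b");
        if (a == null || b == null) {
            return false;
        }
        return a.equals(b);
    }

    // ASK { FILTER(?a = ?b) } with the initial binding used as the starting
    // solution of the empty group pattern.
    static boolean askUnoptimized(Map<String, String> initialBinding) {
        return filterAEqualsB(initialBinding);
    }

    // Same query, but first folded by looking at the query alone: the pattern
    // binds no variables, so the filter is judged unable to pass.
    static boolean askOptimized(Map<String, String> initialBinding) {
        Set<String> varsBoundByPattern = Set.of();
        if (!varsBoundByPattern.containsAll(Set.of("a", "b"))) {
            return false; // the initial binding is never consulted
        }
        return filterAEqualsB(initialBinding);
    }

    public static void main(String[] args) {
        Map<String, String> initial = Map.of("a", "true", "b", "true");
        System.out.println(askUnoptimized(Map.of())); // false
        System.out.println(askUnoptimized(initial));  // true
        System.out.println(askOptimized(initial));    // false
    }
}
```

The same query and the same initial binding give different answers depending 
only on whether the folding step runs.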

To me this suggests that initial bindings as currently implemented are 
fundamentally flawed, and I would suggest that we think about re-architecting 
this feature in a future release (not the next release).  I believe there are 
probably several ways of doing this:

1 – Remove support for initial bindings on queries entirely (as we already did 
for updates) in favor of using ParameterizedSparqlString

2 – Change initial bindings to be a pre-optimization algebra transformation of 
the query

As we've discussed previously in the context of ParameterizedSparqlString there 
is potential to do the substitution at the algebra tree level rather than at 
the textual level.  This allows for stronger syntax checking and actually 
changes the query appropriately.  The problem with this is that it doesn't work 
if we want to inject multiple values for a variable, hence Option 3.
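
As a rough illustration of substitution on a tree rather than on text, the 
sketch below uses a toy expression tree.  All of the types here (Expr, Var, 
Const, Eq) are hypothetical and are not ARQ's algebra classes; terms are kept 
as plain strings to keep the example short.

```java
import java.util.Map;

// Toy expression tree; hypothetical names, not ARQ classes.
final class SubstitutionSketch {

    interface Expr {
        // Return a copy of this expression with bound variables replaced.
        Expr substitute(Map<String, String> binding);
    }

    static final class Var implements Expr {
        final String name;
        Var(String name) { this.name = name; }
        public Expr substitute(Map<String, String> binding) {
            String value = binding.get(name);
            // Variables without a binding are left in place.
            return value == null ? this : new Const(value);
        }
        @Override public String toString() { return "?" + name; }
    }

    static final class Const implements Expr {
        final String lexical;
        Const(String lexical) { this.lexical = lexical; }
        public Expr substitute(Map<String, String> binding) { return this; }
        @Override public String toString() { return lexical; }
    }

    static final class Eq implements Expr {
        final Expr left;
        final Expr right;
        Eq(Expr left, Expr right) { this.left = left; this.right = right; }
        public Expr substitute(Map<String, String> binding) {
            return new Eq(left.substitute(binding), right.substitute(binding));
        }
        @Override public String toString() { return "(" + left + " = " + right + ")"; }
    }

    public static void main(String[] args) {
        Expr filter = new Eq(new Var("a"), new Var("b"));
        // Prints (true = true)
        System.out.println(filter.substitute(Map.of("a", "true", "b", "true")));
    }
}
```

Because the binding maps each variable name to exactly one value, the rewritten 
tree has no way to express several candidate values for the same variable.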

3 – Change initial bindings to be done by injection of VALUES clauses

This approach is again by algebra transform and would involve inserting VALUES 
clauses at each leaf of the algebra tree.  So Holger's query with initial 
bindings applied would be rewritten like so:

ASK
{
  VALUES ( ?a ?b ) { ( true true ) }
  FILTER (?a = ?b)
}

However, this approach might get rather complex for larger queries and also 
runs into issues of scope: what if we insert the VALUES clause inside a 
sub-query which doesn't propagate those initial bindings outside of it?
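
To show the shape of the injected clause when a variable has more than one 
candidate value, here is a small Java helper that renders a VALUES block from 
rows of terms.  It is a hypothetical sketch that works on text purely for 
illustration; the proposal above is an algebra transform, and the helper 
assumes the terms are already valid SPARQL syntax.

```java
import java.util.List;
import java.util.StringJoiner;

// Hypothetical helper, not part of ARQ: renders a VALUES clause as text.
final class ValuesClauseSketch {

    // vars are variable names without the leading '?'; each row supplies one
    // already-serialized term per variable, in the same order.
    static String render(List<String> vars, List<List<String>> rows) {
        StringJoiner header = new StringJoiner(" ", "VALUES ( ", " )");
        for (String v : vars) {
            header.add("?" + v);
        }
        StringJoiner body = new StringJoiner(" ", " { ", " }");
        for (List<String> row : rows) {
            if (row.size() != vars.size()) {
                throw new IllegalArgumentException("row width must match variable count");
            }
            body.add("( " + String.join(" ", row) + " )");
        }
        return header.toString() + body.toString();
    }

    public static void main(String[] args) {
        // Prints VALUES ( ?a ?b ) { ( true true ) ( false false ) }
        System.out.println(render(
                List.of("a", "b"),
                List.of(List.of("true", "true"), List.of("false", "false"))));
    }
}
```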

4 – Skip optimization when initial bindings are involved

This is the easiest approach but we can't enforce this on other query engine 
implementations and it could seriously harm performance for those that use 
initial bindings extensively.

There may also be other approaches I haven't thought of, so please suggest 
anything that makes sense.  Bottom line is that initial bindings in their 
current form seem fundamentally broken to me and we should be thinking about 
how to fix this in the future.

Rob
