On 8/3/2013 9:50, Rob Vesse wrote:
> Side Note - Initial bindings for updates was removed because it was a barrier to streaming updates (http://markmail.org/message/bazwh2exmcc5vmoh). Also as others noted in the discussion there initial bindings is a little murkier for updates since does it apply only to WHERE clauses, to all portions of requests, etc? Keeping the API as-is is always an option, if this ends up being the preference of the community then we definitely need to improve the documentation to note that there can be unintended interactions with other parts of the query engine such as the optimizer when initial bindings are used.
Yet I believe the removal for UPDATEs was handled a bit hastily. I don't have the whole background, so apologies if I am missing something obvious, but it sounds like there was no real deprecation cycle, and the feature was dropped because it complicated some *modes* of using UPDATEs (streaming, remote). However, these are just some modes among others. It might have been possible to document the restriction for those modes and throw an exception if someone uses a streaming update with initial bindings. Right now the API is inconsistent, because initial bindings are still present for Queries, and under that approach all our usages of initial bindings with UPDATEs would have continued to work. Unfortunately we were unable to upgrade to newer Jena versions earlier, because there were other show-stopper bugs in the code. I am now using the SNAPSHOT to try to detect such problems before a release, and I may have more reports in the next couple of days as I run through manual and automated test scenarios.
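To make the "throw an exception" idea concrete, here is a rough sketch in plain Java. It is only an illustration of the guard I have in mind: the class `UpdateGuard`, the enum `Mode` and the method `checkInitialBindings` are made-up names and do not exist in Jena or ARQ, and variable bindings are simplified to a map of strings.

```java
import java.util.Map;

// Hypothetical sketch: none of these types exist in Jena/ARQ.
final class UpdateGuard {

    // Assumed execution modes of an update request (names are invented).
    enum Mode { IN_MEMORY, STREAMING, REMOTE }

    // Rejects initial bindings only for the modes that cannot support them,
    // instead of removing the feature for every mode.
    static void checkInitialBindings(Mode mode, Map<String, String> initialBindings) {
        boolean hasBindings = initialBindings != null && !initialBindings.isEmpty();
        if (hasBindings && mode != Mode.IN_MEMORY) {
            throw new UnsupportedOperationException(
                "Initial bindings are not supported for " + mode + " updates");
        }
    }

    private UpdateGuard() {
        // Utility class, not meant to be instantiated.
    }
}
```

With a check like this, a non-streaming update with pre-bound variables would pass through unchanged, while a streaming or remote update with bindings would fail early with a clear message.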
> What you are talking about sounds much more like cached execution plans in SQL. I understand the analogy of a SPARQL query to a function but SPARQL variables were not intended to be function arguments, that you choose to treat them as such and that initial bindings lets you treat them as such is a perhaps unintended consequence of ARQ's API.
Yep, unintended things are often the best, because they open new doors. You may not be aware of how central SPARQL (and this feature) has become for our software stack and commercial product portfolio. There are literally thousands of SPARQL queries of varying complexity in our products, and they all execute within contexts of pre-bound variables. Even user interfaces are constructed by running hundreds of queries per page request, so speed is crucial, and so far it has always been good enough.
So, whatever change is made, I would strongly favor a solution that preserves the semantics and doesn't negatively affect performance (of running many small queries).
Thanks,
Holger
