[ 
https://issues.apache.org/jira/browse/JENA-780?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14616748#comment-14616748
 ] 

Rob Vesse commented on JENA-780:
--------------------------------

Regression for JENA-779 is now addressed.  This optimisation is currently 
disabled by default and must be explicitly enabled like so:

{noformat}
ARQ.getContext().set(ARQ.optInlineAssignments, true);
// Optionally make it aggressive
ARQ.getContext().set(ARQ.optInlineAssignmentsAggressive, true);
{noformat}

Or with the command line tools:

{noformat}
--set arq:optInlineAssignments=true
{noformat}

There are some potential corner cases I have not yet explored which need more 
consideration.

For example, what happens if the single use is via a {{HAVING}}, which of course 
becomes a {{filter}} in the algebra?  Is that likely to cause any issues 
because the expression moves from within the aggregation to outside of it? 
I don't think it would, since the optimisation only inlines assignments 
which are currently in scope, and anything that an assignment references would 
either be valid in the same scope or be unbound (in which case the value of the 
expression is not going to change depending on where it is used).

The other possible corner case I've thought of is around assignments that occur 
in a specific branch of an operator (e.g. {{union}}, {{optional}}, {{join}}), 
since moving such an assignment could cause it to yield a different value.  
For example:

{noformat}
SELECT ?s
WHERE
{
  { ?s ?p ?o }
  UNION
  { BIND(true AS ?x) }
  FILTER(?x)
}
{noformat}

Rewriting this query as we do currently would appear to change its semantics.  
As written, the query should return only a single empty row, because the 
solutions from the LHS of the {{UNION}} never bind {{?x}} and so can never 
satisfy the {{FILTER}}.  If inlined, the {{FILTER}} becomes {{FILTER(true)}}, 
which keeps all solutions.  This implies that we need further checking to 
validate whether an assignment can be safely moved.
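
To make the difference concrete, here is a minimal sketch in plain Java that models solutions as simple maps. It does not use any ARQ classes; the class name, method names, and the two sample triples are all assumptions made up for illustration, and the filter logic only approximates SPARQL semantics (an unbound {{?x}} is treated as failing the filter).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: a solution is a map from variable name to value.
// Nothing here is an ARQ API; all names are hypothetical.
class UnionInlineSketch {

    // Solutions of { ?s ?p ?o } UNION { BIND(true AS ?x) } over made-up data.
    static List<Map<String, Object>> unionSolutions() {
        List<Map<String, Object>> rows = new ArrayList<>();
        // LHS branch: two illustrative triples; ?x is never bound here.
        for (String s : new String[] {"ex:a", "ex:b"}) {
            Map<String, Object> row = new HashMap<>();
            row.put("s", s);
            row.put("p", "ex:p");
            row.put("o", "ex:o");
            rows.add(row);
        }
        // RHS branch: only ?x is bound.
        Map<String, Object> rhs = new HashMap<>();
        rhs.put("x", Boolean.TRUE);
        rows.add(rhs);
        return rows;
    }

    // FILTER(?x): rows where ?x is unbound do not pass the filter.
    static List<Map<String, Object>> filterOnX(List<Map<String, Object>> rows) {
        List<Map<String, Object>> kept = new ArrayList<>();
        for (Map<String, Object> row : rows) {
            if (Boolean.TRUE.equals(row.get("x"))) {
                kept.add(row);
            }
        }
        return kept;
    }

    // FILTER(true): what the filter becomes after inlining; keeps every row.
    static List<Map<String, Object>> filterOnTrue(List<Map<String, Object>> rows) {
        return new ArrayList<>(rows);
    }

    public static void main(String[] args) {
        List<Map<String, Object>> rows = unionSolutions();
        System.out.println("FILTER(?x):   " + filterOnX(rows).size() + " row(s)");
        System.out.println("FILTER(true): " + filterOnTrue(rows).size() + " row(s)");
    }
}
```

With the sample data, the original filter keeps only the row from the {{BIND}} branch (where {{?s}} is unbound), while the inlined filter keeps the rows from both branches.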

> Single use extend expressions could be substituted directly for their later 
> usage
> ---------------------------------------------------------------------------------
>
>                 Key: JENA-780
>                 URL: https://issues.apache.org/jira/browse/JENA-780
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ, Optimizer
>    Affects Versions: Jena 2.12.0
>            Reporter: Rob Vesse
>            Assignee: Rob Vesse
>            Priority: Minor
>         Attachments: JENA-780.patch
>
>
> This RFE is a follow-on from JENA-779.  The query with a sub-optimal plan 
> there uses a {{BIND}} to create a value which is then only used once in a 
> subsequent filter.
> Actually that query uses it twice, but I think the general approach I am 
> trying to describe in this RFE bears consideration.  In this case it seems 
> like it would be possible to substitute the extend expression for the bound 
> variable in the filter expression.
> Simplified variant of original query such that the bound value is only used 
> once:
> {noformat}
> SELECT DISTINCT ?uri
> {
>   { ?uri ?p ?o }
>   UNION
>   {
>     ?sub ?p ?uri
>     FILTER(isIRI(?uri))
>   }
>   BIND(str(?uri) as ?s)
>   FILTER(STRSTARTS(?s, "http://"))
> }
> {noformat}
> Rewritten query:
> {noformat}
> SELECT DISTINCT ?uri
> {
>   { ?uri ?p ?o }
>   UNION
>   {
>     ?sub ?p ?uri
>     FILTER(isIRI(?uri))
>   }
>   FILTER(STRSTARTS(str(?uri), "http://"))
> }
> {noformat}
> This avoids an extend expression whose value is only used once and will 
> ultimately be thrown away.
> From a {{Transform}} standpoint this is likely awkward to implement as a pure 
> transform, since it requires knowledge about the query structure above the 
> {{FILTER}}, i.e. whether the bound variable is used elsewhere.  It would 
> therefore need before and after visitors to track that additional state, but I 
> think this is a feasible optimisation.
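
The usage tracking described in the issue can be sketched roughly as follows. This is plain Java over a toy representation (each later expression is reduced to the list of variable names it mentions); it is not the ARQ implementation or the attached patch, and the class and method names are hypothetical. The assumption that a projected variable must not be inlined is also mine, on the grounds that its value is still needed in the results.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, no ARQ classes involved.
class SingleUseSketch {

    // Count how often each variable is mentioned across the later expressions.
    static Map<String, Integer> countUses(List<List<String>> laterExprVars) {
        Map<String, Integer> uses = new HashMap<>();
        for (List<String> vars : laterExprVars) {
            for (String v : vars) {
                uses.merge(v, 1, Integer::sum);
            }
        }
        return uses;
    }

    // An assigned variable is a candidate for inlining when it is mentioned
    // exactly once later on and is not projected out of the query.
    static boolean isInlineCandidate(String assigned,
                                     List<List<String>> laterExprVars,
                                     List<String> projected) {
        if (projected.contains(assigned)) {
            return false;
        }
        return countUses(laterExprVars).getOrDefault(assigned, 0) == 1;
    }

    public static void main(String[] args) {
        // BIND(str(?uri) AS ?s) followed by FILTER(STRSTARTS(?s, "http://")),
        // projecting only ?uri: ?s is mentioned once.
        List<List<String>> oneUse = Arrays.asList(Arrays.asList("s"));
        // A variant where ?s is mentioned in two later filters.
        List<List<String>> twoUses =
                Arrays.asList(Arrays.asList("s"), Arrays.asList("s", "uri"));
        List<String> projected = Arrays.asList("uri");

        System.out.println(isInlineCandidate("s", oneUse, projected));
        System.out.println(isInlineCandidate("s", twoUses, projected));
    }
}
```

A count like this says nothing about the branch-scoping problem raised in the comment above, so it would be at most a first check before deciding whether an assignment can be moved.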



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
