[ 
https://issues.apache.org/jira/browse/JENA-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693533#comment-13693533
 ] 

ASF subversion and git services commented on JENA-473:
------------------------------------------------------

Commit 1496684 from [~rvesse]
[ https://svn.apache.org/r1496684 ]

Greatly simplify how TransformImplicitLeftJoin handles && conditions 
(ExprList.splitConjunction() is a life saver here) and update tests 
appropriately (JENA-473)
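
For readers unfamiliar with the idea, splitting a conjunction means flattening 
nested && expressions into a flat list of conjuncts that can each be inspected 
on its own. The following is only a minimal sketch of that idea, not Jena code: 
the Var, Eq and And classes and the split_conjunction function are hypothetical 
stand-ins, and nothing here reflects the actual signature or behaviour of 
ExprList.splitConjunction().

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical, simplified expression nodes (not Jena's Expr classes).
@dataclass(frozen=True)
class Var:
    name: str


@dataclass(frozen=True)
class Eq:
    left: Var
    right: Var


@dataclass(frozen=True)
class And:
    left: "Expr"
    right: "Expr"


Expr = Union[Var, Eq, And]


def split_conjunction(expr: Expr) -> List[Expr]:
    """Flatten nested And nodes into a list of conjuncts, left to right."""
    if isinstance(expr, And):
        return split_conjunction(expr.left) + split_conjunction(expr.right)
    # Anything that is not an And is treated as a single conjunct.
    return [expr]
```

With a flat list, a transform can look at each ?a = ?b condition independently 
instead of walking the nested && tree itself.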
                
> ARQ should be able to optimize implicit joins and implicit left joins
> ---------------------------------------------------------------------
>
>                 Key: JENA-473
>                 URL: https://issues.apache.org/jira/browse/JENA-473
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Rob Vesse
>            Assignee: Rob Vesse
>              Labels: optimization, sparql
>             Fix For: Jena 2.10.2
>
>         Attachments: impl-join.csv, impl-join-opt.csv, 
> impl-join-opt-linearized.csv
>
>
> There is a class of useful optimizations, usually referred to as implicit 
> joins, that ARQ currently does not even attempt to apply.
> A trivial example is as follows:
> SELECT *
> WHERE
> {
>   ?x ?p1 ?o1 .
>   ?y ?p2 ?o2 .
>   FILTER(?x = ?y)
> }
> Currently this requires us to compute a cross product and then apply the 
> filter; even with streaming evaluation this can be extremely costly.  The aim 
> of this optimization is to produce a query like the following:
> SELECT *
> WHERE
> {
>   ?x ?p1 ?o1 .
>   ?x ?p2 ?o2 .
>   BIND(?x AS ?y)
> }
> This optimization can also be applied to some left joins where the implicit 
> join applies across the join, e.g.
> SELECT *
> WHERE
> {
>   ?x ?p1 ?o1 .
>   OPTIONAL
>   {
>     ?y ?p2 ?o2 .
>     FILTER(?x = ?y)
>   }
> }
> This can be thought of as a generalization of TransformFilterEquality that 
> covers the case where both operands are variables.  Because both are 
> variables, we need to be careful about when we apply this optimization: when 
> = is used, we must guarantee that substituting one variable for the other 
> does not alter the semantics of the query.
> I believe the optimization is safe to apply provided that we can guarantee 
> (as far as possible) that one variable is non-literal.  This can be done by 
> inspecting the positions in which the mentioned variables are used and 
> ensuring that at least one of the variables occurs in the graph, subject, or 
> predicate position.
> Safety for left joins is a little more complex, since we must ensure that at 
> least one of the variables occurs in the RHS, and we can only make the 
> substitution in the RHS, as otherwise we change the join semantics.
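
The safety condition in the quoted description for the plain (non-left) join 
case can be sketched roughly as follows. This is an illustrative sketch only 
and not how TransformImplicitJoin or TransformImplicitLeftJoin is implemented: 
quads are modelled as plain tuples of strings, variables are strings starting 
with "?", the function names are hypothetical, and the choice to always keep 
the first variable and replace the second is an assumption taken from the 
example in the description. The left join case is not covered.

```python
from typing import List, Optional, Set, Tuple

# Hypothetical model: (graph, subject, predicate, object); variables start with "?".
Quad = Tuple[str, str, str, str]


def non_literal_vars(quads: List[Quad]) -> Set[str]:
    """Variables seen in graph, subject or predicate position (cannot be literals)."""
    found = set()
    for g, s, p, _o in quads:
        for term in (g, s, p):
            if term.startswith("?"):
                found.add(term)
    return found


def rewrite_implicit_join(
    quads: List[Quad], x: str, y: str
) -> Optional[Tuple[List[Quad], Tuple[str, str]]]:
    """Rewrite the patterns for FILTER(x = y).

    Returns (rewritten quads, (kept, replaced)), where the pair stands for
    BIND(kept AS replaced), or None if neither variable is known to be
    non-literal and the rewrite is therefore not provably safe.
    """
    safe = non_literal_vars(quads)
    if x not in safe and y not in safe:
        return None
    # Assumption: always keep x and substitute it for y, as in the example.
    rewritten = [tuple(x if term == y else term for term in quad) for quad in quads]
    return rewritten, (x, y)
```

For the trivial example in the description, both ?x and ?y occur in subject 
position, so the check passes and the second pattern becomes ?x ?p2 ?o2 with 
an accompanying BIND(?x AS ?y).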

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
