[ https://issues.apache.org/jira/browse/JENA-473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13693342#comment-13693342 ]
Hudson commented on JENA-473:
-----------------------------

Integrated in Jena__Development_Test #720 (See [https://builds.apache.org/job/Jena__Development_Test/720/])
Some code cleanup and TODO comments, update Release Notes (JENA-473) (Revision 1496611)

Result = SUCCESS
rvesse :
Files :
* /jena/trunk/jena-arq/ReleaseNotes.txt
* /jena/trunk/jena-arq/src/main/java/com/hp/hpl/jena/sparql/algebra/optimize/TransformFilterEquality.java
* /jena/trunk/jena-arq/src/main/java/com/hp/hpl/jena/sparql/algebra/optimize/TransformFilterImplicitJoin.java
* /jena/trunk/jena-arq/src/main/java/com/hp/hpl/jena/sparql/algebra/optimize/TransformImplicitLeftJoin.java
* /jena/trunk/jena-arq/src/test/java/com/hp/hpl/jena/sparql/algebra/optimize/TestTransformFilters.java

> ARQ should be able to optimize implicit joins and implicit left joins
> ---------------------------------------------------------------------
>
>                 Key: JENA-473
>                 URL: https://issues.apache.org/jira/browse/JENA-473
>             Project: Apache Jena
>          Issue Type: Improvement
>          Components: ARQ
>            Reporter: Rob Vesse
>            Assignee: Rob Vesse
>              Labels: optimization, sparql
>             Fix For: Jena 2.10.2
>
>         Attachments: impl-join.csv, impl-join-opt.csv,
>                      impl-join-opt-linearized.csv
>
>
> There is a class of useful optimizations that currently ARQ does not even
> attempt to apply which are usually referred to as implicit joins.
> A trivial example is as follows:
>
> SELECT *
> WHERE
> {
>   ?x ?p1 ?o1 .
>   ?y ?p2 ?o2 .
>   FILTER(?x = ?y)
> }
>
> Currently this requires us to compute a cross product and then apply the
> filter; even with streaming evaluation this can be extremely costly. The aim
> of this optimization is to produce a query like the following:
>
> SELECT *
> WHERE
> {
>   ?x ?p1 ?o1 .
>   ?x ?p2 ?o2 .
>   BIND(?x AS ?y)
> }
>
> This optimization can also be applied to some left joins where the implicit
> join applies across the join, e.g.
>
> SELECT *
> WHERE
> {
>   ?x ?p1 ?o1 .
>   OPTIONAL
>   {
>     ?y ?p2 ?o2 .
>     FILTER(?x = ?y)
>   }
> }
>
> This can be thought of as a generalization of TransformFilterEquality except
> covering the case where both items are variables. Since both things are
> variables, we need to be careful about when we apply this optimization: when
> = is used we need to guarantee that substituting one variable for the other
> does not alter the semantics of the query.
> I believe the optimization is safe to apply providing that we can guarantee
> (as far as possible) that one variable is non-literal. This can be done by
> inspecting the positions in which the mentioned variables are used and
> ensuring that at least one of the variables occurs in the graph, subject or
> predicate position.
> Safety for left joins is a little more complex since we must ensure that at
> least one of the variables occurs in the RHS, and we can only make the
> substitution in the RHS as otherwise we change the join semantics.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
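
For illustration only, the position-based safety check described in the issue could be sketched roughly as follows. This is not the code in TransformFilterImplicitJoin: the class ImplicitJoinSafetySketch, its TriplePattern type and both method names are hypothetical, variables are modelled as plain strings beginning with "?", and the graph position is left out to keep the sketch short.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from ARQ.
public class ImplicitJoinSafetySketch {

    /** A triple pattern whose terms are plain strings; "?name" denotes a variable. */
    public static final class TriplePattern {
        final String subject;
        final String predicate;
        final String object;

        public TriplePattern(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
    }

    /** True if the variable is used as a subject or predicate, where a literal cannot be bound. */
    public static boolean occursInNonLiteralPosition(String var, List<TriplePattern> patterns) {
        for (TriplePattern t : patterns) {
            if (var.equals(t.subject) || var.equals(t.predicate)) {
                return true;
            }
        }
        return false;
    }

    /** FILTER(?a = ?b) is treated as substitutable if at least one variable is guaranteed non-literal. */
    public static boolean isSafeToSubstitute(String a, String b, List<TriplePattern> patterns) {
        return occursInNonLiteralPosition(a, patterns) || occursInNonLiteralPosition(b, patterns);
    }

    public static void main(String[] args) {
        // The trivial example from the issue: ?x and ?y are both subjects.
        List<TriplePattern> subjects = Arrays.asList(
                new TriplePattern("?x", "?p1", "?o1"),
                new TriplePattern("?y", "?p2", "?o2"));
        System.out.println(isSafeToSubstitute("?x", "?y", subjects));

        // Both variables appear only as objects, so either could be bound to a literal.
        List<TriplePattern> objects = Arrays.asList(
                new TriplePattern("?s", "?p", "?a"),
                new TriplePattern("?s", "?q", "?b"));
        System.out.println(isSafeToSubstitute("?a", "?b", objects));
    }
}
```

The sketch covers only the plain join case. The left join case would additionally need to know which patterns belong to the RHS, since the issue requires that at least one variable occurs there and that substitution is confined to it.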