Note that I have made some changes to the decorrelation logic to call findBestExp() *after* decorrelation is done, supplying it a set of rules that includes FilterJoinRule. This does push the join condition into one part of the tree, but not into the other parts where that join may have been copied during decorrelation. The main point is: we need to do the filter pushdown early rather than late.
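To make the ordering concern concrete, here is a toy sketch in plain Python. It is not Calcite code: the rows and the `join` and `distinct` helpers are made up for illustration, and `distinct` only loosely stands in for what createDistinct() does with the correlated column (EMPNO). It compares building the distinct key set over the join while it still has a TRUE condition against building it after the DEPTNO equality has been pushed into the join.

```python
from itertools import product

# Hypothetical toy rows standing in for EMP and DEPT (not the real tables).
EMP = [
    {"EMPNO": 1, "DEPTNO": 10, "SAL": 100},
    {"EMPNO": 2, "DEPTNO": 10, "SAL": 200},
    {"EMPNO": 3, "DEPTNO": 20, "SAL": 300},
    {"EMPNO": 4, "DEPTNO": 99, "SAL": 400},  # no matching DEPT row
]
DEPT = [
    {"DEPTNO0": 10, "NAME": "Sales"},
    {"DEPTNO0": 20, "NAME": "Marketing"},
    {"DEPTNO0": 30, "NAME": "Ops"},
]


def join(left, right, condition):
    """Nested-loop inner join; an always-true condition yields a cartesian product."""
    return [{**l, **r} for l, r in product(left, right) if condition(l, r)]


def distinct(rows, column):
    """Distinct values of one column, sorted for stable output."""
    return sorted({row[column] for row in rows})


# Late pushdown: the join still has condition=[true] when the distinct is built on it.
cartesian = join(EMP, DEPT, lambda l, r: True)
late_keys = distinct(cartesian, "EMPNO")

# Early pushdown: =(DEPTNO, DEPTNO0) is already inside the join.
filtered = join(EMP, DEPT, lambda l, r: l["DEPTNO"] == r["DEPTNO0"])
early_keys = distinct(filtered, "EMPNO")

print(len(cartesian), late_keys)
print(len(filtered), early_keys)
```

On this toy data the cartesian variant feeds 12 rows into the distinct and keeps EMPNO 4, which has no matching DEPT row; the filtered variant feeds 3 rows and drops it.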
Aman

On Mon, May 11, 2015 at 10:16 AM, Aman Sinha <[email protected]> wrote:

> I want to be able to push the join condition (=($7, $9)) highlighted into
> the LogicalJoin that is right below the LogicalCorrelate. What's the right
> way to do it ?
>
> The current method of first decorrelating and then pushing the filter (via
> the FilterJoinRule) is not quite right because once decorrelation is done,
> it may be too late to push the filter into the join. During decorrelation
> we take that LogicalJoin (with its TRUE condition) and push it into other
> places - for instance we call createDistinct() to build a distinct row set
> on the result of this join but since the join has a true condition, the
> distinct is created on a cartesian join.
>
> What I really need is something like a FilterJoinRule that allows pushing
> it past a LogicalCorrelate.
>
> LogicalProject(EXPR$0=[1]): rowcount = 1.0, cumulative cost = 10.25, id = 53
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10], EXPR$0=[$11]): rowcount = 1.0, cumulative cost = 9.25, id = 71
>     *LogicalFilter(condition=[AND(=($7, $9), >($5, $11))]): rowcount = 1.0, cumulative cost = 8.25, id = 68*
>       LogicalCorrelate(correlation=[$cor0], joinType=[LEFT], requiredColumns=[{0}]): rowcount = 1.0, cumulative cost = 7.25, id = 61
>         LogicalJoin(condition=[true], joinType=[inner]): rowcount = 1.0, cumulative cost = 1.0, id = 42
>           LogicalTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 1.0, cumulative cost = 0.0, id = 11
>           LogicalTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 1.0, cumulative cost = 0.0, id = 12
>         LogicalAggregate(group=[{}], EXPR$0=[AVG($5)]): rowcount = 1.0, cumulative cost = 2.125, id = 47
>           LogicalFilter(condition=[=($cor0.EMPNO, $0)]): rowcount = 1.0, cumulative cost = 1.0, id = 45
>             LogicalTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 1.0, cumulative cost = 0.0, id = 14
>
> Thanks,
> Aman
