Seems a bit of a stretch, since Join has other ways to represent SEMI and ANTI. Maybe a Correlate could have both a JoinType and a SemiJoinType?
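To make the mismatch concrete, here is a rough sketch of how the two enums line up. It is illustrative Python, not Calcite code: the SemiJoinType values are the ones quoted further down in this thread, while the JoinType values and the helpers to_join_type and join_rule_applies are assumptions made up for the example.

```python
from enum import Enum
from typing import Optional


class JoinType(Enum):
    # Assumed set of ordinary join types, for illustration only.
    INNER = "inner"
    LEFT = "left"
    RIGHT = "right"
    FULL = "full"


class SemiJoinType(Enum):
    # Values as quoted in the thread for Correlate.joinType.
    INNER = "inner"
    LEFT = "left"
    SEMI = "semi"
    ANTI = "anti"


# Only INNER and LEFT have a direct counterpart among ordinary join types.
_TO_JOIN = {
    SemiJoinType.INNER: JoinType.INNER,
    SemiJoinType.LEFT: JoinType.LEFT,
}


def to_join_type(t: SemiJoinType) -> Optional[JoinType]:
    """Hypothetical helper: the matching JoinType, or None for SEMI/ANTI."""
    return _TO_JOIN.get(t)


def join_rule_applies(t: SemiJoinType) -> bool:
    """Hypothetical guard: a Join-style rule is only safe if a mapping exists."""
    return to_join_type(t) is not None


for t in SemiJoinType:
    print(t.name, "->", to_join_type(t))
```

A rule shared between Join and Correlate would need a guard of this kind, or separate handling for SEMI and ANTI, before it could reuse the Join logic.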
Can you & Vladimir find a compromise for how to restore the missing functionality with no more copy-paste than necessary? It would help if we had a full list of rules which ought to work for Correlate.

Julian

On May 11, 2015, at 5:27 PM, Jinfeng Ni <[email protected]> wrote:

> Can we extend Join.JoinType, so that it includes the SemiJoinType (SEMI,
> ANTI) represented by Correlate? That way, we could leverage the rules for
> Join and apply them to Correlate as well, just like the way it used to
> work. Otherwise, we have to come up with a new set of rules for Correlate,
> to make things work again.
>
> On Mon, May 11, 2015 at 5:02 PM, Julian Hyde <[email protected]> wrote:
>
>> This comment in Correlate seems to express Vladimir’s motivation:
>>
>>> Correlate is not a join since: typical rules should not match Correlate.
>>
>> I agree with him. For instance, Correlate.joinType is enum SemiJoinType {
>> INNER, LEFT, SEMI, ANTI } and therefore has different semantics to
>> Join.joinType.
>>
>> It’s unfortunate that FilterJoinRule got broken. We should fix it. Any
>> other rules that would be needed? Probably ProjectJoinTransposeRule,
>> AggregateJoinTransposeRule.
>>
>> Julian
>>
>> On May 11, 2015, at 4:17 PM, Aman Sinha <[email protected]> wrote:
>>
>>> As part of CALCITE-483, the class hierarchy of CorrelateRel was changed
>>> such that the new LogicalCorrelate is not a derived class of Join anymore.
>>> Thus, any rule such as FilterJoinRule that used to push the filter down
>>> into the Join (or a derived class of Join) does not apply anymore for the
>>> LogicalCorrelate.
>>>
>>> I am continuing down the path of my proposal to have a version of the push
>>> filter rule that allows pushing into/past a LogicalCorrelate. But perhaps
>>> Vladimir can shed some light on the motivation for changing the class
>>> hierarchy.
>>>
>>> thanks,
>>> Aman
>>>
>>> On Mon, May 11, 2015 at 10:21 AM, Aman Sinha <[email protected]> wrote:
>>>
>>>> Note that I have made some changes to the decorrelation logic to call
>>>> findBestExp() *after* the decorrelation is done and supply it the set of
>>>> rules including FilterJoinRule. This does push the join condition into one
>>>> part of the tree but it does not push it into all other parts where that
>>>> join may have been copied during decorrelation. The main point is: we
>>>> need to do the filter pushdown early rather than late.
>>>>
>>>> Aman
>>>>
>>>> On Mon, May 11, 2015 at 10:16 AM, Aman Sinha <[email protected]> wrote:
>>>>
>>>>> I want to be able to push the join condition (=($7, $9)) highlighted into
>>>>> the LogicalJoin that is right below the LogicalCorrelate. What's the right
>>>>> way to do it?
>>>>>
>>>>> The current method of first decorrelating and then pushing the filter
>>>>> (via the FilterJoinRule) is not quite right because once decorrelation is
>>>>> done, it may be too late to push the filter into the join. During
>>>>> decorrelation we take that LogicalJoin (with its TRUE condition) and push
>>>>> it into other places - for instance we call createDistinct() to build a
>>>>> distinct row set on the result of this join but since the join has a true
>>>>> condition, the distinct is created on a cartesian join.
>>>>>
>>>>> What I really need is something like a FilterJoinRule that allows pushing
>>>>> it past a LogicalCorrelate.
>>>>>
>>>>> LogicalProject(EXPR$0=[1]): rowcount = 1.0, cumulative cost = 10.25, id = 53
>>>>>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[$9], NAME=[$10], EXPR$0=[$11]): rowcount = 1.0, cumulative cost = 9.25, id = 71
>>>>>     *LogicalFilter(condition=[AND(=($7, $9), >($5, $11))]): rowcount = 1.0, cumulative cost = 8.25, id = 68*
>>>>>       LogicalCorrelate(correlation=[$cor0], joinType=[LEFT], requiredColumns=[{0}]): rowcount = 1.0, cumulative cost = 7.25, id = 61
>>>>>         LogicalJoin(condition=[true], joinType=[inner]): rowcount = 1.0, cumulative cost = 1.0, id = 42
>>>>>           LogicalTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 1.0, cumulative cost = 0.0, id = 11
>>>>>           LogicalTableScan(table=[[CATALOG, SALES, DEPT]]): rowcount = 1.0, cumulative cost = 0.0, id = 12
>>>>>         LogicalAggregate(group=[{}], EXPR$0=[AVG($5)]): rowcount = 1.0, cumulative cost = 2.125, id = 47
>>>>>           LogicalFilter(condition=[=($cor0.EMPNO, $0)]): rowcount = 1.0, cumulative cost = 1.0, id = 45
>>>>>             LogicalTableScan(table=[[CATALOG, SALES, EMP]]): rowcount = 1.0, cumulative cost = 0.0, id = 14
>>>>>
>>>>> Thanks,
>>>>> Aman
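
For reference, the pushdown requested in the quoted plan comes down to splitting the filter's AND condition by which input each conjunct touches. Below is a rough sketch in illustrative Python, not Calcite code. The names Conjunct and split_for_left_input are hypothetical, and the left field count of 11 is read off the plan above (EMP supplies $0 to $8, DEPT supplies $9 and $10, and $11 comes from the aggregate on the right).

```python
from typing import FrozenSet, List, NamedTuple, Tuple


class Conjunct(NamedTuple):
    """One AND-ed term of a filter, with the field ordinals it references."""
    text: str
    fields: FrozenSet[int]


def split_for_left_input(
    conjuncts: List[Conjunct], left_field_count: int
) -> Tuple[List[Conjunct], List[Conjunct]]:
    """Separate terms that touch only left-input fields from the rest.

    Terms that only use ordinals below left_field_count can move below the
    correlate, onto its left input; everything else stays where it is.
    """
    pushable, remaining = [], []
    for c in conjuncts:
        if all(f < left_field_count for f in c.fields):
            pushable.append(c)
        else:
            remaining.append(c)
    return pushable, remaining


# The highlighted filter: AND(=($7, $9), >($5, $11))
condition = [
    Conjunct("=($7, $9)", frozenset({7, 9})),
    Conjunct(">($5, $11)", frozenset({5, 11})),
]

pushable, remaining = split_for_left_input(condition, left_field_count=11)
print("push below correlate:", [c.text for c in pushable])
print("keep above correlate:", [c.text for c in remaining])
```

With this split, =($7, $9) would land on the LogicalJoin of EMP and DEPT, while >($5, $11) has to stay above the LogicalCorrelate because $11 is produced by the correlated aggregate.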
