I run some test in Calcite.

SQL:
select d.deptno, e.deptno
from sales.dept d, sales.emp e
where d.deptno = e.deptno

RelNode:
LogicalProject(DEPTNO=[$0], DEPTNO0=[$9])
  LogicalFilter(condition=[=($0, $9)])
    LogicalJoin(condition=[true], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])

After CoreRules.FILTER_INTO_JOIN and 
            CoreRules.JOIN_PUSH_EXPRESSIONS hep optimize

The RelNode is:
LogicalProject(DEPTNO=[$0], DEPTNO0=[$9])
  LogicalJoin(condition=[=($0, $9)], joinType=[inner])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
    LogicalTableScan(table=[[CATALOG, SALES, EMP]])

I agree with James Starr, The FilterJoinRule seems to be valid for this scene.

JingDas.


> 2023年7月28日 上午12:38,P.F. ZHAN <dethr...@gmail.com> 写道:
> 
> Suppose there is a sql like this:
> 
> select sum(sal), sum(case when hiredate = '2020-01-01' then sal end)
> from ( select hiredate, sal, job
>    from emp ) a
> where 1 = 1 and hiredate = ( select max(hiredate) from emp )
> 
> The optimized plan as follows:
> 
> LogicalAggregate(group=[{}], EXPR$0=[SUM($0)], EXPR$1=[SUM($1)])
>  LogicalProject(SAL=[$1], $f1=[CASE(=($0, 2020-01-01 00:00:00), $1,
> null:INTEGER)])
>    LogicalProject(HIREDATE=[$0], SAL=[$1], JOB=[$2])
>      LogicalFilter(condition=[=($0, $3)])
>        LogicalJoin(condition=[true], joinType=[left])
>          LogicalProject(HIREDATE=[$4], SAL=[$5], JOB=[$2])
>            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>          LogicalAggregate(group=[{}], EXPR$0=[MAX($0)])
>            LogicalProject(HIREDATE=[$4])
>              LogicalTableScan(table=[[CATALOG, SALES, EMP]])
> 
> Maybe we can push the LogicalFilter to the LogicalJoin, and get the
> LogicalJoin(condition=[=($0,
> $3)], joinType=[inner])
> It seems this is much better.
> 
> Can I do this? If so I will create an issue to resolve this problem.

Reply via email to