Suppose there is a sql like this:

select sum(sal), sum(case when hiredate = '2020-01-01' then sal end)
from ( select hiredate, sal, job
    from emp ) a
where 1 = 1 and hiredate = ( select max(hiredate) from emp )

The optimized plan as follows:

LogicalAggregate(group=[{}], EXPR$0=[SUM($0)], EXPR$1=[SUM($1)])
  LogicalProject(SAL=[$1], $f1=[CASE(=($0, 2020-01-01 00:00:00), $1,
null:INTEGER)])
    LogicalProject(HIREDATE=[$0], SAL=[$1], JOB=[$2])
      LogicalFilter(condition=[=($0, $3)])
        LogicalJoin(condition=[true], joinType=[left])
          LogicalProject(HIREDATE=[$4], SAL=[$5], JOB=[$2])
            LogicalTableScan(table=[[CATALOG, SALES, EMP]])
          LogicalAggregate(group=[{}], EXPR$0=[MAX($0)])
            LogicalProject(HIREDATE=[$4])
              LogicalTableScan(table=[[CATALOG, SALES, EMP]])

Maybe we can push the LogicalFilter to the LogicalJoin, and get the
LogicalJoin(condition=[=($0,
$3)], joinType=[inner])
It seems this is much better.

Can I do this? If so I will create an issue to resolve this problem.

Reply via email to