Hi,
I am following up on JIRA-4100.  I am trying to understand the plan generated 
by the SQL query select  e.empno, e.sal, e.deptno emp_dept, d.deptno dep_dept  
from emp e 
left join
dept d
    on e.deptno = (
         select max(sal)
         from emp
         where deptno = e.deptno)

The plan produced is as follows
LogicalProject(EMPNO=[$0], SAL=[$5], EMP_DEPT=[$7], DEP_DEPT=[$9])
  LogicalJoin(condition=[=($0, $7)], joinType=[left])
    LogicalTableScan(table=[[CATALOG, SALES, EMP]])
    LogicalTableScan(table=[[CATALOG, SALES, DEPT]])

It seems to me that the MAX operator is still needed in the correlated subquery 
before the join, but it is dropped in the aggregator evaluation.  

Interestingly, a similar query using EXISTS retains the aggregator operator:
Select * from emp where exists select 1 from dept where emp.deptno=dept.deptno

This results in the following plan
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8], DEPTNO0=[CAST($9):INTEGER], 
$f1=[CAST($10):BOOLEAN])
    LogicalJoin(condition=[=($7, $9)], joinType=[inner])
      LogicalTableScan(table=[[CATALOG, SALES, EMP]])
      LogicalAggregate(group=[{0}], agg#0=[MIN($1)])
        LogicalProject(DEPTNO=[$0], $f0=[true])
          LogicalTableScan(table=[[CATALOG, SALES, DEPT]])


Based on the exists example, I expect the logical plan should also contain 
something like after the scan on dept:
      LogicalAggregate(group=[{0}], agg#0=[MAX($5)])
        LogicalProject(DEPTNO=[$9], $7=[true])

Thanks in advance for your input,
Sean


Reply via email to