[jira] [Commented] (CALCITE-1483) Suboptimal plan for NOT IN query

Vineet Garg (JIRA) Tue, 15 Nov 2016 14:40:15 -0800

    [ 
https://issues.apache.org/jira/browse/CALCITE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15668608#comment-15668608
 ]


Vineet Garg commented on CALCITE-1483:
--------------------------------------

[~julianhyde] Here are a few more examples where the inner query is guaranteed 
to return non-null values:

{noformat}
select * from scott.emp where deptno not in (select deptno from scott.dept where deptno IS NOT NULL);
{noformat}
{noformat}
select * from scott.emp where deptno not in (select count(*) from scott.dept);
{noformat}
{noformat}
select * from scott.emp where deptno not in (select 1+1 from scott.dept);
{noformat}
{noformat}
select * from scott.emp where deptno not in (select deptno from scott.dept sd where sd.deptno IN (select distinct deptno from scott.dept));
{noformat}
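
To illustrate why the non-null guarantee matters, here is a minimal sketch that uses Python's sqlite3 module rather than Calcite. The emp and dept tables below are hypothetical stand-ins for the SCOTT schema, and dept.deptno is deliberately left nullable for the demonstration. It compares NOT IN over a nullable inner query, NOT IN over a null-filtered inner query, and a NOT EXISTS anti-join.

```python
import sqlite3


def run(conn, sql):
    """Return the first column of every row produced by sql."""
    return [row[0] for row in conn.execute(sql)]


def demo():
    conn = sqlite3.connect(":memory:")
    # Hypothetical data: dept.deptno is nullable here, unlike a real key column.
    conn.executescript("""
        CREATE TABLE dept (deptno INTEGER);
        CREATE TABLE emp (empno INTEGER, deptno INTEGER);
        INSERT INTO dept VALUES (10), (20), (NULL);
        INSERT INTO emp VALUES (1, 10), (2, 30), (3, 40);
    """)
    # Inner query may return NULL: the predicate is UNKNOWN for non-matching rows.
    nullable = run(conn, """
        SELECT empno FROM emp
        WHERE deptno NOT IN (SELECT deptno FROM dept)
        ORDER BY empno""")
    # Inner query is guaranteed non-null because of the IS NOT NULL filter.
    non_null = run(conn, """
        SELECT empno FROM emp
        WHERE deptno NOT IN (SELECT deptno FROM dept WHERE deptno IS NOT NULL)
        ORDER BY empno""")
    # Plain anti-join, which needs only one pass over dept.
    anti = run(conn, """
        SELECT empno FROM emp e
        WHERE NOT EXISTS (SELECT 1 FROM dept d WHERE d.deptno = e.deptno)
        ORDER BY empno""")
    conn.close()
    return nullable, non_null, anti


if __name__ == "__main__":
    print(demo())
```

With this data, the first query returns no rows, while the second and third both return employees 2 and 3. The equivalence with the anti-join also relies on emp.deptno containing no NULLs in this sample.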

I'm trying to come up with more examples.

> Suboptimal plan for NOT IN query
> --------------------------------
>
>                 Key: CALCITE-1483
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1483
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>            Reporter: Vineet Garg
>            Assignee: Julian Hyde
>
> The following query generates a sub-optimal plan:
> {code}
> explain plan for select * from scott.emp where deptno not in (select deptno from scott.dept where deptno = 20);
> {code}
> The plan is as follows:
> {code}
> EnumerableCalc(expr#0..11=[{inputs}], expr#12=[0], expr#13=[=($t8, $t12)], expr#14=[false], expr#15=[IS NOT NULL($t11)], expr#16=[true], expr#17=[IS NULL($t7)], expr#18=[null], expr#19=[<($t9, $t8)], expr#20=[CASE($t13, $t14, $t15, $t16, $t17, $t18, $t19, $t16, $t14)], expr#21=[NOT($t20)], proj#0..7=[{exprs}], $condition=[$t21])
>   EnumerableJoin(condition=[=($7, $10)], joinType=[left])
>     EnumerableCalc(expr#0..9=[{inputs}], EMPNO=[$t2], ENAME=[$t3], JOB=[$t4], MGR=[$t5], HIREDATE=[$t6], SAL=[$t7], COMM=[$t8], DEPTNO=[$t9], c=[$t0], ck=[$t1])
>       EnumerableJoin(condition=[true], joinType=[inner])
>         JdbcToEnumerableConverter
>           JdbcAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)])
>             JdbcFilter(condition=[=(CAST($0):INTEGER NOT NULL, 20)])
>               JdbcTableScan(table=[[SCOTT, DEPT]])
>         JdbcToEnumerableConverter
>           JdbcTableScan(table=[[SCOTT, EMP]])
>     JdbcToEnumerableConverter
>       JdbcAggregate(group=[{0, 1}])
>         JdbcProject(DEPTNO=[$0], i=[true])
>           JdbcFilter(condition=[=(CAST($0):INTEGER NOT NULL, 20)])
>             JdbcTableScan(table=[[SCOTT, DEPT]])
> {code}
> As Julian pointed out in the discussion on the mailing list, one scan of 
> DEPT is sufficient instead of two, because DEPTNO is clearly never null.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
