[ https://issues.apache.org/jira/browse/CALCITE-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715312#comment-17715312 ]
Julian Hyde commented on CALCITE-5669: -------------------------------------- Generally I think it's worthwhile to expand the "correlating expression" as much as possible. (We may already do this; I haven't checked.) In this case, don't correlate on "{{e.empno}}" but instead correlate on "{{e.empno = null}}". Then that is an expression that can be more easily constant-reduced. (One counterpoint to that: in another case (CALCITE-5644) I am dealing correlated sub-query that has "{{t.a is not distinct from u.b}}", which is expanded to "{{t.a is null and u.b is null or t.a = t.b}}", and somehow "{{t.a is null}}" ends up being computed as a separate boolean column in the left input of the {{Correlate}} operator. Thus the correlation condition is split between two operators, and it is very hard to perform further simplification. The answer in this case is probably to keep {{is not distinct from}} intact, but even so, it emphasizes the importance of choosing the right grain for correlation variables.) But even with the best intentions we may end up with trivial {{Correlate}} operators. Eliminating variables that are not used, and converting {{Correlate}} operators with no variables into {{Join}} operators, seem to be simple and efficient rules. So, my vote would be to do that solution. > Remove trivial correlates from the query plan > --------------------------------------------- > > Key: CALCITE-5669 > URL: https://issues.apache.org/jira/browse/CALCITE-5669 > Project: Calcite > Issue Type: Improvement > Components: core > Reporter: Stamatis Zampetakis > Assignee: Stamatis Zampetakis > Priority: Major > Labels: pull-request-available > > Consider the following query correlated query. > {code:sql} > select * from emp as e where exists (select 1 from dept as d where e.empno = > null) > {code} > The query basically returns an empty result because {{e.empno = null}} is > always false. > The plan for the query after applying the sub-query remove rule is shown > below: > {noformat} > LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], > SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8]) > LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], > SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8]) > LogicalCorrelate(correlation=[$cor0], joinType=[inner], > requiredColumns=[{0}]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0}]) > LogicalProject(i=[true]) > LogicalFilter(condition=[=($cor0.EMPNO, null)]) > LogicalTableScan(table=[[CATALOG, SALES, DEPT]]) > {noformat} > After applying the reduce expressions rule the filter with the correlated > condition will become false and the resulting plan would be the following. > {noformat} > LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], > SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8]) > LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], > SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8]) > LogicalCorrelate(correlation=[$cor0], joinType=[inner], > requiredColumns=[{0}]) > LogicalTableScan(table=[[CATALOG, SALES, EMP]]) > LogicalAggregate(group=[{0}]) > LogicalProject(i=[true]) > LogicalValues(tuples=[[]]) > {noformat} > Observe that now we have a {{LogicalCorrelate}} but there is no real > correlation in the plan since the correlation variable on the right side > disappeared. Depending on how rules are applied and which rules are used > similar "trivial" correlates may appear. > The goal of this ticket is to provide the means to get rid of them. > One option would be to add a new rule (e.g., {{CorrelateToJoinRule}}) which > detects that a correlate does not have correlations in the right side and > turn the correlation to a join; then we could employ other existing rules > (such as PruneEmptyRules) for joins and remove the newly created join > altogether. > Another option, would be to introduce new pruning rule(s) for correlate > (similar to those for joins) that will remove the correlate when its input is > an empty values expression. -- This message was sent by Atlassian Jira (v8.20.10#820010)