[ 
https://issues.apache.org/jira/browse/CALCITE-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715867#comment-17715867
 ] 

Stamatis Zampetakis commented on CALCITE-5669:
----------------------------------------------

Thanks for the review Julian! Indeed the {{CorrelateToJoinRule}} does not work 
in Volcano for the reason you mentioned.

The initial {{RelSet}} created for the 
{{LogicalFilter(condition=[=($cor0.EMPNO, null)])}} has a correlation variable. 
When the reduction rule applies and removes the filter along with the 
correlation the new {{LogicalValues}} expression will end up in the same 
{{{}RelSet{}}}. When the {{CorrelateToJoinRule}} fires and checks if there are 
correlations on the right input, the {{RelSet}} which contains both the 
{{LogicalFilter}} and the {{LogicalValues}} will say yes so the rule will not 
perform the transformation to {{{}Join{}}}. I enriched the PR 
([https://github.com/apache/calcite/pull/3170/commits/1fe8cb06d84e5f6df39df7da9ec12a9352e09506])
 with some tests showing the problem.

I am reluctant to change the Volcano internals putting seemingly equivalent 
expressions to different sets when they differ in the correlation. This would 
be a change with quite big impact and I don't have sufficient use-cases to 
justify this at this point.

The alternative proposal, adding pruning rules for Correlate when one of its 
inputs are empty, seems more promising 
([https://github.com/apache/calcite/pull/3170/commits/fe838385e13dd05bd6e8abc8c5b5dc359928f23c]).
 The pruning rules are more efficient since they have a more restrictive 
matching pattern and they do not require a full traversal of the right 
sub-tree. Additionally, they allow removing trivial correlates when the left 
input is empty. Finally, they work fine both with Hep and Volcano planner since 
the resulting transformation depends only on the direct inputs of the 
correlate. The issue of having relations with/without correlation variables in 
the same RelSet is still there but it is not something specific to this change.

> Remove trivial correlates from the query plan
> ---------------------------------------------
>
>                 Key: CALCITE-5669
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5669
>             Project: Calcite
>          Issue Type: Improvement
>          Components: core
>            Reporter: Stamatis Zampetakis
>            Assignee: Stamatis Zampetakis
>            Priority: Major
>              Labels: pull-request-available
>
> Consider the following query correlated query.
> {code:sql}
> select * from emp as e where exists (select 1 from dept as d where e.empno = 
> null)
> {code}
> The query basically returns an empty result because {{e.empno = null}} is 
> always false.
> The plan for the query after applying the sub-query remove rule is shown 
> below:
> {noformat}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0}])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalAggregate(group=[{0}])
>         LogicalProject(i=[true])
>           LogicalFilter(condition=[=($cor0.EMPNO, null)])
>             LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> After applying the reduce expressions rule the filter with the correlated 
> condition will become false and the resulting plan would be the following.
> {noformat}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>     LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0}])
>       LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>       LogicalAggregate(group=[{0}])
>         LogicalProject(i=[true])
>           LogicalValues(tuples=[[]])
> {noformat}
> Observe that now we have a {{LogicalCorrelate}} but there is no real 
> correlation in the plan since the correlation variable on the right side 
> disappeared. Depending on how rules are applied and which rules are used 
> similar "trivial" correlates may appear.
> The goal of this ticket is to provide the means to get rid of them.
> One option would be to add a new rule (e.g., {{CorrelateToJoinRule}}) which 
> detects that a correlate does not have correlations in the right side and 
> turn the correlation to a join; then we could employ other existing rules 
> (such as PruneEmptyRules) for joins and remove the newly created join 
> altogether.
> Another option, would be to introduce new pruning rule(s) for correlate 
> (similar to those for joins) that will remove the correlate when its input is 
> an empty values expression.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to