[jira] [Updated] (CALCITE-5664) Support CONVERT function with "USING" keyword

2023-04-22 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5664:

Labels: pull-request-available  (was: )

> Support CONVERT function with "USING" keyword
> -
>
> Key: CALCITE-5664
> URL: https://issues.apache.org/jira/browse/CALCITE-5664
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.34.0
>Reporter: ZheHu
>Assignee: ZheHu
>Priority: Minor
>  Labels: pull-request-available
>
> After CALCITE-111, we now support CONVERT(charValue, srcCharsetName, 
> destCharsetName) that works on converting character sets.
> CONVERT(charValue USING transcodingName) also fits the SQL standard, and 
> MySQL has  implement it. For example:
>  * select convert('employ_id' using utf16) as alia from employee
> Noted: Calcite already supports this from the parser.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5390) RelDecorrelator throws NullPointerException

2023-04-22 Thread Benchao Li (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715365#comment-17715365
 ] 

Benchao Li commented on CALCITE-5390:
-

Variables deduplication was done in CALCITE-4913, and I think it's outcome is 
correct because the two sub-queries is indeed referencing the same {{RelNode}}. 
For the {{SQL}} pasted above by Julian, the plan after converting to 
{{RelNode}} is below.

{code:xml}
LogicalProject(variablesSet=[[$cor0]], T1=[$SCALAR_QUERY({
LogicalSort(fetch=[1])
  LogicalProject(EXPR$0=[1])
LogicalFilter(condition=[=($2, $cor0.JOB)])
  LogicalTableScan(table=[[scott, EMP]])
})], T2=[$SCALAR_QUERY({
LogicalSort(fetch=[1])
  LogicalProject(EXPR$0=[=($cor0.JOB, 'PRESIDENT')])
LogicalTableScan(table=[[scott, EMP]])
})])
  LogicalTableScan(table=[[scott, EMP]])
{code}

After {{SubQueryRemoveRule}}, it would be transformed to below, which I think 
it's not correct since {{cor0}} is not clear. It may should be rewritten to 
different variables again when we are doing the sub-query removing.
{code:xml|}
LogicalProject(T1=[$8], T2=[$9])
  LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{2}])
LogicalCorrelate(correlation=[$cor0], joinType=[left], 
requiredColumns=[{2}])
  LogicalTableScan(table=[[scott, EMP]])
  LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
LogicalSort(fetch=[1])
  LogicalProject(EXPR$0=[1])
LogicalFilter(condition=[=($2, $cor0.JOB)])
  LogicalTableScan(table=[[scott, EMP]])
LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
  LogicalSort(fetch=[1])
LogicalProject(EXPR$0=[=($cor0.JOB, 'PRESIDENT')])
  LogicalTableScan(table=[[scott, EMP]])
{code}

> RelDecorrelator throws NullPointerException
> ---
>
> Key: CALCITE-5390
> URL: https://issues.apache.org/jira/browse/CALCITE-5390
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Dmitry Sysolyatin
>Assignee: Dan Zou
>Priority: Major
>
> The current query throws NullPointerException
> {code:java}
> SELECT
>   (SELECT 1 FROM emp d WHERE d.job = a.job LIMIT 1) AS t1,
>   (SELECT a.job = 'PRESIDENT' FROM emp s LIMIT 1) as t2
> FROM emp a;
> {code}
> Test case - 
> [https://github.com/apache/calcite/commit/46fe9bc456f2d34cf7dccd29829c9e85abe69d5f]
> Logical plan before it fails:
> {code:java}
> LogicalProject(T1=[$8], T2=[$9])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f09=[$9])
>     LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f00=[$10])
>       LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{9}])
>         LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f9=[=($2, 
> 'PRESIDENT')])
>           LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{2}])
>             LogicalTableScan(table=[[scott, EMP]])
>             LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>               LogicalSort(fetch=[1])
>                 LogicalProject(EXPR$0=[1])
>                   LogicalFilter(condition=[=($2, $cor0.JOB)])
>                     LogicalTableScan(table=[[scott, EMP]])
>         LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>           LogicalSort(fetch=[1])
>             LogicalProject(EXPR$0=[$cor0.$f9])
>               LogicalTableScan(table=[[scott, EMP]]) {code}
> Stack trace:
> {code:java}
>  Caused by: java.lang.NullPointerException
>   at java.util.Objects.requireNonNull(Objects.java:203)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.createValueGenerator(RelDecorrelator.java:833)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateInputWithValueGenerator(RelDecorrelator.java:1028)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:764)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:738)
>   at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.getInvoke(RelDecorrelator.java:707)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:464)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.jav

[jira] [Commented] (CALCITE-5390) RelDecorrelator throws NullPointerException

2023-04-22 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715343#comment-17715343
 ] 

Julian Hyde commented on CALCITE-5390:
--

[~zabetak], I'm investigating the NullPointerException caused by the first 
query. It seems related to the changes that you made in CALCITE-4560 (pushing 
correlation expressions) and also the logic to deduplicate correlation 
variables. Here's the intermediate plan:
{noformat}
LogicalProject(T1=[$8], T2=[$9])
  LogicalCorrelate(correlation=[$cor0], joinType=[left], requiredColumns=[{2}])
LogicalCorrelate(correlation=[$cor0], joinType=[left], 
requiredColumns=[{2}])
  LogicalTableScan(table=[[scott, EMP]])
  LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
LogicalSort(fetch=[1])
  LogicalProject(EXPR$0=[1])
LogicalFilter(condition=[=($2, $cor0.JOB)])
  LogicalTableScan(table=[[scott, EMP]])
LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
  LogicalSort(fetch=[1])
LogicalProject(EXPR$0=[=($cor0.JOB, 'PRESIDENT')])
  LogicalTableScan(table=[[scott, EMP]])
{noformat}

Note that $cor0 occurs on lines 2 and 3. But I believe it has different types 
on these lines - the occurrence on line 2 has more fields than line 3. When 
field #9 is referenced in a location that has only 7 fields, that's when the 
NPE occurs.

Do you agree that the different types are a problem?

By the way, you can reproduce by pasting the following into {{dummy.iq}}:
{noformat}
!use scott
# Throws: java.lang.NullPointerException
#   at java.base/java.util.Objects.requireNonNull(Objects.java:208)
#   at 
org.apache.calcite.sql2rel.RelDecorrelator.createValueGenerator(RelDecorrelator.java:833)
SELECT
  (SELECT 1 FROM emp AS d WHERE d.job = a.job LIMIT 1) AS t1,
  (SELECT a.job = 'PRESIDENT' FROM emp AS s LIMIT 1) as t2
FROM emp AS a;
!ok
{noformat}


> RelDecorrelator throws NullPointerException
> ---
>
> Key: CALCITE-5390
> URL: https://issues.apache.org/jira/browse/CALCITE-5390
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Dmitry Sysolyatin
>Assignee: Dan Zou
>Priority: Major
>
> The current query throws NullPointerException
> {code:java}
> SELECT
>   (SELECT 1 FROM emp d WHERE d.job = a.job LIMIT 1) AS t1,
>   (SELECT a.job = 'PRESIDENT' FROM emp s LIMIT 1) as t2
> FROM emp a;
> {code}
> Test case - 
> [https://github.com/apache/calcite/commit/46fe9bc456f2d34cf7dccd29829c9e85abe69d5f]
> Logical plan before it fails:
> {code:java}
> LogicalProject(T1=[$8], T2=[$9])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f09=[$9])
>     LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f00=[$10])
>       LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{9}])
>         LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f9=[=($2, 
> 'PRESIDENT')])
>           LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{2}])
>             LogicalTableScan(table=[[scott, EMP]])
>             LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>               LogicalSort(fetch=[1])
>                 LogicalProject(EXPR$0=[1])
>                   LogicalFilter(condition=[=($2, $cor0.JOB)])
>                     LogicalTableScan(table=[[scott, EMP]])
>         LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>           LogicalSort(fetch=[1])
>             LogicalProject(EXPR$0=[$cor0.$f9])
>               LogicalTableScan(table=[[scott, EMP]]) {code}
> Stack trace:
> {code:java}
>  Caused by: java.lang.NullPointerException
>   at java.util.Objects.requireNonNull(Objects.java:203)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.createValueGenerator(RelDecorrelator.java:833)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateInputWithValueGenerator(RelDecorrelator.java:1028)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:764)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:738)
>   at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.getInvoke(RelDecorrelator.java:707)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:464)
>   at

[jira] [Resolved] (CALCITE-5656) RelToSqlConverter has an error using ordinal for ORDER BY

2023-04-22 Thread luoping.zhang (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5656?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

luoping.zhang resolved CALCITE-5656.

Resolution: Duplicate

CALCITE-5530 will solve this

> RelToSqlConverter has an error using ordinal for ORDER BY
> -
>
> Key: CALCITE-5656
> URL: https://issues.apache.org/jira/browse/CALCITE-5656
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.34.0
>Reporter: luoping.zhang
>Assignee: luoping.zhang
>Priority: Critical
>  Labels: pull-request-available
> Attachments: image-2023-04-19-12-26-32-719.png
>
>  Time Spent: 50m
>  Remaining Estimate: 0h
>
>  For example:
> Query SQL is 
> {code:java}
>  select a from table order by coalesce(a,b){code}
> When we convert this sql to relNode and convert it back to SQL by 
> RelToConverter, the result is:
> {code:java}
> select a from table order by 2{code}
> But this sql syntax is wrong because there is no ordinal 2



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-5669) Remove trivial correlates from the query plan

2023-04-22 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715316#comment-17715316
 ] 

Julian Hyde edited comment on CALCITE-5669 at 4/22/23 7:17 PM:
---

There's a potential problem with the Volcano planner. Suppose that the 
right-hand input of a {{Correlate}} is a {{RelSet}} with several {{RelNode}} 
instances, some of which use the variable and some of which do not. The fact 
that one {{RelNode}} does not use the variable proves that it is not necessary. 
But nevertheless, the other {{RelNode}} instances are now invalid, because they 
reference a variable that is not defined.

I believe that the right-hand {{RelNode}} will need to be moved into a new 
{{RelSet}} (with one fewer variable used) at some point between when constant 
reduction occurs and when it is put under the {{Join}}. It's not clear from the 
code that that is happening. Please check that it is.


was (Author: julianhyde):
There's a potential problem with the Volcano planner. Suppose that the 
right-hand input of a Correlate is a RelSet with several RelNode instances, 
some of which use the variable and some of which do not. The fact that one 
RelNode does not use the variable proves that it is not necessary. But 
nevertheless, the other RelNode instances are now invalid, because they 
reference a variable that is not defined.

I believe that the right-hand {{RelNode}} will need to be moved into a new 
{{RelSet}} (with one fewer variable used) at some point between when constant 
reduction occurs and when it is put under the Join. It's not clear from the 
code that that is happening. Please check that it is.

> Remove trivial correlates from the query plan
> -
>
> Key: CALCITE-5669
> URL: https://issues.apache.org/jira/browse/CALCITE-5669
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>
> Consider the following query correlated query.
> {code:sql}
> select * from emp as e where exists (select 1 from dept as d where e.empno = 
> null)
> {code}
> The query basically returns an empty result because {{e.empno = null}} is 
> always false.
> The plan for the query after applying the sub-query remove rule is shown 
> below:
> {noformat}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0}])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalAggregate(group=[{0}])
> LogicalProject(i=[true])
>   LogicalFilter(condition=[=($cor0.EMPNO, null)])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> After applying the reduce expressions rule the filter with the correlated 
> condition will become false and the resulting plan would be the following.
> {noformat}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0}])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalAggregate(group=[{0}])
> LogicalProject(i=[true])
>   LogicalValues(tuples=[[]])
> {noformat}
> Observe that now we have a {{LogicalCorrelate}} but there is no real 
> correlation in the plan since the correlation variable on the right side 
> disappeared. Depending on how rules are applied and which rules are used 
> similar "trivial" correlates may appear.
> The goal of this ticket is to provide the means to get rid of them.
> One option would be to add a new rule (e.g., {{CorrelateToJoinRule}}) which 
> detects that a correlate does not have correlations in the right side and 
> turn the correlation to a join; then we could employ other existing rules 
> (such as PruneEmptyRules) for joins and remove the newly created join 
> altogether.
> Another option, would be to introduce new pruning rule(s) for correlate 
> (similar to those for joins) that will remove the correlate when its input is 
> an empty values expression.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5669) Remove trivial correlates from the query plan

2023-04-22 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715316#comment-17715316
 ] 

Julian Hyde commented on CALCITE-5669:
--

There's a potential problem with the Volcano planner. Suppose that the 
right-hand input of a Correlate is a RelSet with several RelNode instances, 
some of which use the variable and some of which do not. The fact that one 
RelNode does not use the variable proves that it is not necessary. But 
nevertheless, the other RelNode instances are now invalid, because they 
reference a variable that is not defined.

I believe that the right-hand {{RelNode}} will need to be moved into a new 
{{RelSet}} (with one fewer variable used) at some point between when constant 
reduction occurs and when it is put under the Join. It's not clear from the 
code that that is happening. Please check that it is.

> Remove trivial correlates from the query plan
> -
>
> Key: CALCITE-5669
> URL: https://issues.apache.org/jira/browse/CALCITE-5669
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>
> Consider the following query correlated query.
> {code:sql}
> select * from emp as e where exists (select 1 from dept as d where e.empno = 
> null)
> {code}
> The query basically returns an empty result because {{e.empno = null}} is 
> always false.
> The plan for the query after applying the sub-query remove rule is shown 
> below:
> {noformat}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0}])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalAggregate(group=[{0}])
> LogicalProject(i=[true])
>   LogicalFilter(condition=[=($cor0.EMPNO, null)])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> After applying the reduce expressions rule the filter with the correlated 
> condition will become false and the resulting plan would be the following.
> {noformat}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0}])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalAggregate(group=[{0}])
> LogicalProject(i=[true])
>   LogicalValues(tuples=[[]])
> {noformat}
> Observe that now we have a {{LogicalCorrelate}} but there is no real 
> correlation in the plan since the correlation variable on the right side 
> disappeared. Depending on how rules are applied and which rules are used 
> similar "trivial" correlates may appear.
> The goal of this ticket is to provide the means to get rid of them.
> One option would be to add a new rule (e.g., {{CorrelateToJoinRule}}) which 
> detects that a correlate does not have correlations in the right side and 
> turn the correlation to a join; then we could employ other existing rules 
> (such as PruneEmptyRules) for joins and remove the newly created join 
> altogether.
> Another option, would be to introduce new pruning rule(s) for correlate 
> (similar to those for joins) that will remove the correlate when its input is 
> an empty values expression.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5669) Remove trivial correlates from the query plan

2023-04-22 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715312#comment-17715312
 ] 

Julian Hyde commented on CALCITE-5669:
--

Generally I think it's worthwhile to expand the "correlating expression" as 
much as possible. (We may already do this; I haven't checked.) In this case, 
don't correlate on "{{e.empno}}" but instead correlate on "{{e.empno = null}}". 
Then that is an expression that can be more easily constant-reduced.

(One counterpoint to that: in another case (CALCITE-5644) I am dealing 
correlated sub-query that has "{{t.a is not distinct from u.b}}", which is 
expanded to "{{t.a is null and u.b is null or t.a = t.b}}", and somehow "{{t.a 
is null}}" ends up being computed as a separate boolean column in the left 
input of the {{Correlate}} operator. Thus the correlation condition is split 
between two operators, and it is very hard to perform further simplification. 
The answer in this case is probably to keep {{is not distinct from}} intact, 
but even so, it emphasizes the importance of choosing the right grain for 
correlation variables.)

But even with the best intentions we may end up with trivial {{Correlate}} 
operators. Eliminating variables that are not used, and converting 
{{Correlate}} operators with no variables into {{Join}} operators, seem to be 
simple and efficient rules. So, my vote would be to do that solution.

> Remove trivial correlates from the query plan
> -
>
> Key: CALCITE-5669
> URL: https://issues.apache.org/jira/browse/CALCITE-5669
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>
> Consider the following query correlated query.
> {code:sql}
> select * from emp as e where exists (select 1 from dept as d where e.empno = 
> null)
> {code}
> The query basically returns an empty result because {{e.empno = null}} is 
> always false.
> The plan for the query after applying the sub-query remove rule is shown 
> below:
> {noformat}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0}])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalAggregate(group=[{0}])
> LogicalProject(i=[true])
>   LogicalFilter(condition=[=($cor0.EMPNO, null)])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> After applying the reduce expressions rule the filter with the correlated 
> condition will become false and the resulting plan would be the following.
> {noformat}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0}])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalAggregate(group=[{0}])
> LogicalProject(i=[true])
>   LogicalValues(tuples=[[]])
> {noformat}
> Observe that now we have a {{LogicalCorrelate}} but there is no real 
> correlation in the plan since the correlation variable on the right side 
> disappeared. Depending on how rules are applied and which rules are used 
> similar "trivial" correlates may appear.
> The goal of this ticket is to provide the means to get rid of them.
> One option would be to add a new rule (e.g., {{CorrelateToJoinRule}}) which 
> detects that a correlate does not have correlations in the right side and 
> turn the correlation to a join; then we could employ other existing rules 
> (such as PruneEmptyRules) for joins and remove the newly created join 
> altogether.
> Another option, would be to introduce new pruning rule(s) for correlate 
> (similar to those for joins) that will remove the correlate when its input is 
> an empty values expression.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5669) Remove trivial correlates from the query plan

2023-04-22 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated CALCITE-5669:
-
Description: 
Consider the following query correlated query.
{code:sql}
select * from emp as e where exists (select 1 from dept as d where e.empno = 
null)
{code}
The query basically returns an empty result because {{e.empno = null}} is 
always false.

The plan for the query after applying the sub-query remove rule is shown below:
{noformat}
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
requiredColumns=[{0}])
  LogicalTableScan(table=[[CATALOG, SALES, EMP]])
  LogicalAggregate(group=[{0}])
LogicalProject(i=[true])
  LogicalFilter(condition=[=($cor0.EMPNO, null)])
LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{noformat}
After applying the reduce expressions rule the filter with the correlated 
condition will become false and the resulting plan would be the following.
{noformat}
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
requiredColumns=[{0}])
  LogicalTableScan(table=[[CATALOG, SALES, EMP]])
  LogicalAggregate(group=[{0}])
LogicalProject(i=[true])
  LogicalValues(tuples=[[]])
{noformat}
Observe that now we have a {{LogicalCorrelate}} but there is no real 
correlation in the plan since the correlation variable on the right side 
disappeared. Depending on how rules are applied and which rules are used 
similar "trivial" correlates may appear.

The goal of this ticket is to provide the means to get rid of them.

One option would be to add a new rule (e.g., {{CorrelateToJoinRule}}) which 
detects that a correlate does not have correlations in the right side and turn 
the correlation to a join; then we could employ other existing rules (such as 
PruneEmptyRules) for joins and remove the newly created join altogether.

Another option, would be to introduce new pruning rule(s) for correlate 
(similar to those for joins) that will remove the correlate when its input is 
an empty values expression.

  was:
Consider the following query correlated query.
{code:sql}
select * from emp e where exists (select 1 from dept where empno=null)
{code}
The query basically returns an empty result cause {{empno=null}} is always 
false.

The plan for the query after applying the sub-query remove rule is shown below:
{noformat}
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
requiredColumns=[{0}])
  LogicalTableScan(table=[[CATALOG, SALES, EMP]])
  LogicalAggregate(group=[{0}])
LogicalProject(i=[true])
  LogicalFilter(condition=[=($cor0.EMPNO, null)])
LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
{noformat}
After applying the reduce expressions rule the filter with the correlated 
condition will become false and the resulting plan would be the following.
{noformat}
LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
  LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
requiredColumns=[{0}])
  LogicalTableScan(table=[[CATALOG, SALES, EMP]])
  LogicalAggregate(group=[{0}])
LogicalProject(i=[true])
  LogicalValues(tuples=[[]])
{noformat}
Observe that now we have a {{LogicalCorrelate}} but there is no real 
correlation in the plan since the correlation variable on the right side 
disappeared. Depending on how rules are applied and which rules are used 
similar "trivial" correlates may appear.

The goal of this ticket is to provide the means to get rid of them.

One option would be to add a new rule (e.g., {{CorrelateToJoinRule}}) which 
detects that a correlate does not have correlations in the right side and turn 
the correlation to a join; then we could employ other existing rules (such as 
PruneEmptyRules) for joins and remove the newly created join altogether.

Another option, would be to introduce new pruning rule(s) for correlate 
(similar to those for joins) that will remove the correlate when its input is 
an empty values expressio