[jira] [Commented] (CALCITE-5674) cast expr to target type should respect nullable when the type is array

2023-04-24 Thread jackylau (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17716064#comment-17716064
 ] 

jackylau commented on CALCITE-5674:
---

hi [~julianhyde] [~snuyanzin] [~libenchao]  do you have time to have a look , i 
can create pr to fix it

> cast expr to target type should respect nullable when the type is array
> ---
>
> Key: CALCITE-5674
> URL: https://issues.apache.org/jira/browse/CALCITE-5674
> Project: Calcite
>  Issue Type: Improvement
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
> Fix For: 1.35.0
>
>
> {code:java}
> // code placeholder
>  
> // code placeholder
> sql("select cast(array[1,null,2] as int array) from (values (1))")
> .columnType("INTEGER NOT NULL ARRAY NOT NULL");
> it is not correct, it should return INTEGER ARRAY NOT NULL   {code}
>  
> according the sql standard, cast to type, which can not be nulable and 
> nullable is tabel level attribute which can not changed by cast.
> current the calcite cast will using nullable of expr to replace target type , 
> it is true. but it shoud also set array element type
> {code:java}
> // code placeholder
> /** Derives the type of "CAST(expression AS targetType)". */
> public static RelDataType deriveType(RelDataTypeFactory typeFactory,
> RelDataType expressionType, RelDataType targetType, boolean safe) {
>   return typeFactory.createTypeWithNullability(targetType,
>   expressionType.isNullable() || safe);
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5674) cast expr to target type should respect nullable when the type is array

2023-04-24 Thread jackylau (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jackylau updated CALCITE-5674:
--
Description: 
{code:java}
// code placeholder
 
// code placeholder
sql("select cast(array[1,null,2] as int array) from (values (1))")
.columnType("INTEGER NOT NULL ARRAY NOT NULL");

it is not correct, it should return INTEGER ARRAY NOT NULL   {code}
 

according the sql standard, cast to type, which can not be nulable and nullable 
is tabel level attribute which can not changed by cast.

current the calcite cast will using nullable of expr to replace target type , 
it is true. but it shoud also set array element type
{code:java}
// code placeholder
/** Derives the type of "CAST(expression AS targetType)". */
public static RelDataType deriveType(RelDataTypeFactory typeFactory,
RelDataType expressionType, RelDataType targetType, boolean safe) {
  return typeFactory.createTypeWithNullability(targetType,
  expressionType.isNullable() || safe);
} {code}
 

  was:
{code:java}
// code placeholder
sql("select cast(array[1,null,2] as int array) from (values (1))")
.columnType("INTEGER NOT NULL ARRAY NOT NULL");

it is not correct, it should return INTEGER ARRAY NOT NULL  {code}


> cast expr to target type should respect nullable when the type is array
> ---
>
> Key: CALCITE-5674
> URL: https://issues.apache.org/jira/browse/CALCITE-5674
> Project: Calcite
>  Issue Type: Improvement
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
> Fix For: 1.35.0
>
>
> {code:java}
> // code placeholder
>  
> // code placeholder
> sql("select cast(array[1,null,2] as int array) from (values (1))")
> .columnType("INTEGER NOT NULL ARRAY NOT NULL");
> it is not correct, it should return INTEGER ARRAY NOT NULL   {code}
>  
> according the sql standard, cast to type, which can not be nulable and 
> nullable is tabel level attribute which can not changed by cast.
> current the calcite cast will using nullable of expr to replace target type , 
> it is true. but it shoud also set array element type
> {code:java}
> // code placeholder
> /** Derives the type of "CAST(expression AS targetType)". */
> public static RelDataType deriveType(RelDataTypeFactory typeFactory,
> RelDataType expressionType, RelDataType targetType, boolean safe) {
>   return typeFactory.createTypeWithNullability(targetType,
>   expressionType.isNullable() || safe);
> } {code}
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5674) cast expr to target type should respect nullable when the type is array

2023-04-24 Thread jackylau (Jira)
jackylau created CALCITE-5674:
-

 Summary: cast expr to target type should respect nullable when the 
type is array
 Key: CALCITE-5674
 URL: https://issues.apache.org/jira/browse/CALCITE-5674
 Project: Calcite
  Issue Type: Improvement
Affects Versions: 1.35.0
 Environment: {code:java}
// code placeholder
sql("select cast(array[1,null,2] as int array) from (values (1))")
.columnType("INTEGER NOT NULL ARRAY NOT NULL");

it is not correct, it should return INTEGER ARRAY NOT NULL {code}
Reporter: jackylau
 Fix For: 1.35.0






--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5674) cast expr to target type should respect nullable when the type is array

2023-04-24 Thread jackylau (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jackylau updated CALCITE-5674:
--
Description: 
{code:java}
// code placeholder
sql("select cast(array[1,null,2] as int array) from (values (1))")
.columnType("INTEGER NOT NULL ARRAY NOT NULL");

it is not correct, it should return INTEGER ARRAY NOT NULL  {code}

> cast expr to target type should respect nullable when the type is array
> ---
>
> Key: CALCITE-5674
> URL: https://issues.apache.org/jira/browse/CALCITE-5674
> Project: Calcite
>  Issue Type: Improvement
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
> Fix For: 1.35.0
>
>
> {code:java}
> // code placeholder
> sql("select cast(array[1,null,2] as int array) from (values (1))")
> .columnType("INTEGER NOT NULL ARRAY NOT NULL");
> it is not correct, it should return INTEGER ARRAY NOT NULL  {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5674) cast expr to target type should respect nullable when the type is array

2023-04-24 Thread jackylau (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5674?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jackylau updated CALCITE-5674:
--
Environment: (was: {code:java}
// code placeholder
sql("select cast(array[1,null,2] as int array) from (values (1))")
.columnType("INTEGER NOT NULL ARRAY NOT NULL");

it is not correct, it should return INTEGER ARRAY NOT NULL {code})

> cast expr to target type should respect nullable when the type is array
> ---
>
> Key: CALCITE-5674
> URL: https://issues.apache.org/jira/browse/CALCITE-5674
> Project: Calcite
>  Issue Type: Improvement
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
> Fix For: 1.35.0
>
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5655) Wrong plan for multiple IN/SOME sub-queries with OR predicate

2023-04-24 Thread Runkang He (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5655?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17716037#comment-17716037
 ] 

Runkang He commented on CALCITE-5655:
-

[~julianhyde] Thanks for the clarification. I will keep this principle for 
jiras in the future.

> Wrong plan for multiple IN/SOME sub-queries with OR predicate
> -
>
> Key: CALCITE-5655
> URL: https://issues.apache.org/jira/browse/CALCITE-5655
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Runkang He
>Assignee: Runkang He
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>  Time Spent: 2h 20m
>  Remaining Estimate: 0h
>
> When the query contains multiple IN/SOME sub-queries connected with OR 
> predicate in WHERE clause, the result is wrong. The minimal reproducer is 
> below:
> SQL:
> {code:sql}
> select empno from sales.empnullables
> where deptno in (
>   select deptno from sales.deptnullables where name = 'dept1')
> or deptno in (
>   select deptno from sales.deptnullables where name = 'dept2')
> {code}
> The Plan generated by calcite master branch: (Notice the bold part of *<>($2, 
> 0)* in the downstream LogicalFilter)
> {code:sql}
> LogicalProject(EMPNO=[$0])
>   LogicalProject(EMPNO=[$0], DEPTNO=[$1])
> LogicalFilter(condition=[OR(AND(<>($2, 0), IS NOT NULL($5), IS NOT 
> NULL($1)), AND(***<>($2, 0)***, IS NOT NULL($9), IS NOT NULL($1)))])
>   LogicalJoin(condition=[=($1, $8)], joinType=[left])
> LogicalJoin(condition=[true], joinType=[inner])
>   LogicalJoin(condition=[=($1, $4)], joinType=[left])
> LogicalJoin(condition=[true], joinType=[inner])
>   LogicalProject(EMPNO=[$0], DEPTNO=[$7])
> LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
>   LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)])
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[=($1, 'dept1')])
> LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
> LogicalProject(DEPTNO=[$0], i=[true])
>   LogicalFilter(condition=[=($1, 'dept1')])
> LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
>   LogicalAggregate(group=[{}], c=[COUNT()], ck=[COUNT($0)])
> LogicalProject(DEPTNO=[$0])
>   LogicalFilter(condition=[=($1, 'dept2')])
> LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
> LogicalProject(DEPTNO=[$0], i=[true])
>   LogicalFilter(condition=[=($1, 'dept2')])
> LogicalTableScan(table=[[CATALOG, SALES, DEPTNULLABLES]])
> {code}
> The wrong part is that when build the downstream LogicalFilter for the two 
> sub-queries, the filter for the second sub-query is AND(<>($2, 0), IS NOT 
> NULL($9), IS NOT NULL($1)), notice that *$2 should be the second sub-query's 
> intermediate table field ct.c(which field index is $6), but now the actual 
> reference is the first sub-query's*, this leads to wrong plan, and wrong 
> result.
> The root cause is that intermediate table alias is the same as the previous 
> sub-query's, but when lookup intermediate table field, it always returns the 
> previous one which is not belong to the current subquery.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Resolved] (CALCITE-5646) JoinDeriveIsNotNullFilterRule incorrectly handles COALESCE in join condition

2023-04-24 Thread Stamatis Zampetakis (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stamatis Zampetakis resolved CALCITE-5646.
--
Fix Version/s: 1.35.0
   Resolution: Fixed

Fixed in 
[https://github.com/apache/calcite/commit/8db5403d2e2278d25d98aa5c2f0a299574234b7c.]
 Thanks for the PR [~lchistov1987] and [~julianhyde] for the extra pair of eyes 
on this one!

> JoinDeriveIsNotNullFilterRule incorrectly handles COALESCE in join condition
> 
>
> Key: CALCITE-5646
> URL: https://issues.apache.org/jira/browse/CALCITE-5646
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.34.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>  Time Spent: 1h
>  Remaining Estimate: 0h
>
> Consider query
> {code:java}
> select t1.deptno from empnullables t1 inner join
> empnullables t2 on coalesce(t1.ename, t2.ename) = 'abc' {code}
> When JoinDeriveIsNotNullFilterRule is applied to it, it is incorrectly 
> transformed to query plan
> {code:java}
> LogicalProject(DEPTNO=[$7])
>  LogicalJoin(condition=[=(CASE(IS NOT NULL($1), $1, $10), 'abc')], 
> joinType=[inner])
>    LogicalFilter(condition=[IS NOT NULL($1)])
>  LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]])
>    LogicalFilter(condition=[IS NOT NULL($1)])
>  LogicalTableScan(table=[[CATALOG, SALES, EMPNULLABLES]]) {code}
> It is not valid to deduce that join keys from the both sides cannot have null 
> values. All that we can deduce from the join condition, is that they cannot 
> be null in the same time.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5390) RelDecorrelator throws NullPointerException

2023-04-24 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715923#comment-17715923
 ] 

Stamatis Zampetakis commented on CALCITE-5390:
--

[~julianhyde]  I didn't debug the example you provided yet but it is likely 
that the problem is somewhat related to field trimming. I remember some 
[discussion|https://github.com/apache/calcite/pull/2623#discussion_r823790744] 
with [~korlov] where he pointed out a problem in RelFieldTrimmer.

In PR#2623, there was also a 
[part|https://github.com/apache/calcite/pull/2623/commits/8fc0f2bc687142141ab04d63a67952b9022f2cf0#diff-a57af2470b3215be0ce7d94a226f1997d9417c613fd7f32097fd07543b30633a)]
 adding logic in {{SubQueryRemoveRule}} to do something similar to what 
[~libenchao] suggested above (rewrite to different variables when removing the 
sub-query) but was abandoned given that the root cause seemed to be in the 
trimmer.

> RelDecorrelator throws NullPointerException
> ---
>
> Key: CALCITE-5390
> URL: https://issues.apache.org/jira/browse/CALCITE-5390
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Dmitry Sysolyatin
>Assignee: Dan Zou
>Priority: Major
>
> The current query throws NullPointerException
> {code:java}
> SELECT
>   (SELECT 1 FROM emp d WHERE d.job = a.job LIMIT 1) AS t1,
>   (SELECT a.job = 'PRESIDENT' FROM emp s LIMIT 1) as t2
> FROM emp a;
> {code}
> Test case - 
> [https://github.com/apache/calcite/commit/46fe9bc456f2d34cf7dccd29829c9e85abe69d5f]
> Logical plan before it fails:
> {code:java}
> LogicalProject(T1=[$8], T2=[$9])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f09=[$9])
>     LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f00=[$10])
>       LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{9}])
>         LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f9=[=($2, 
> 'PRESIDENT')])
>           LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{2}])
>             LogicalTableScan(table=[[scott, EMP]])
>             LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>               LogicalSort(fetch=[1])
>                 LogicalProject(EXPR$0=[1])
>                   LogicalFilter(condition=[=($2, $cor0.JOB)])
>                     LogicalTableScan(table=[[scott, EMP]])
>         LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>           LogicalSort(fetch=[1])
>             LogicalProject(EXPR$0=[$cor0.$f9])
>               LogicalTableScan(table=[[scott, EMP]]) {code}
> Stack trace:
> {code:java}
>  Caused by: java.lang.NullPointerException
>   at java.util.Objects.requireNonNull(Objects.java:203)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.createValueGenerator(RelDecorrelator.java:833)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateInputWithValueGenerator(RelDecorrelator.java:1028)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:764)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:738)
>   at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.getInvoke(RelDecorrelator.java:707)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:464)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.getInvoke(RelDecorrelator.java:707)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:512)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:495)
>   at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUt

[jira] [Commented] (CALCITE-5669) Remove trivial correlates from the query plan

2023-04-24 Thread Stamatis Zampetakis (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5669?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715867#comment-17715867
 ] 

Stamatis Zampetakis commented on CALCITE-5669:
--

Thanks for the review Julian! Indeed the {{CorrelateToJoinRule}} does not work 
in Volcano for the reason you mentioned.

The initial {{RelSet}} created for the 
{{LogicalFilter(condition=[=($cor0.EMPNO, null)])}} has a correlation variable. 
When the reduction rule applies and removes the filter along with the 
correlation the new {{LogicalValues}} expression will end up in the same 
{{{}RelSet{}}}. When the {{CorrelateToJoinRule}} fires and checks if there are 
correlations on the right input, the {{RelSet}} which contains both the 
{{LogicalFilter}} and the {{LogicalValues}} will say yes so the rule will not 
perform the transformation to {{{}Join{}}}. I enriched the PR 
([https://github.com/apache/calcite/pull/3170/commits/1fe8cb06d84e5f6df39df7da9ec12a9352e09506])
 with some tests showing the problem.

I am reluctant to change the Volcano internals putting seemingly equivalent 
expressions to different sets when they differ in the correlation. This would 
be a change with quite big impact and I don't have sufficient use-cases to 
justify this at this point.

The alternative proposal, adding pruning rules for Correlate when one of its 
inputs are empty, seems more promising 
([https://github.com/apache/calcite/pull/3170/commits/fe838385e13dd05bd6e8abc8c5b5dc359928f23c]).
 The pruning rules are more efficient since they have a more restrictive 
matching pattern and they do not require a full traversal of the right 
sub-tree. Additionally, they allow removing trivial correlates when the left 
input is empty. Finally, they work fine both with Hep and Volcano planner since 
the resulting transformation depends only on the direct inputs of the 
correlate. The issue of having relations with/without correlation variables in 
the same RelSet is still there but it is not something specific to this change.

> Remove trivial correlates from the query plan
> -
>
> Key: CALCITE-5669
> URL: https://issues.apache.org/jira/browse/CALCITE-5669
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Stamatis Zampetakis
>Assignee: Stamatis Zampetakis
>Priority: Major
>  Labels: pull-request-available
>
> Consider the following query correlated query.
> {code:sql}
> select * from emp as e where exists (select 1 from dept as d where e.empno = 
> null)
> {code}
> The query basically returns an empty result because {{e.empno = null}} is 
> always false.
> The plan for the query after applying the sub-query remove rule is shown 
> below:
> {noformat}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0}])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalAggregate(group=[{0}])
> LogicalProject(i=[true])
>   LogicalFilter(condition=[=($cor0.EMPNO, null)])
> LogicalTableScan(table=[[CATALOG, SALES, DEPT]])
> {noformat}
> After applying the reduce expressions rule the filter with the correlated 
> condition will become false and the resulting plan would be the following.
> {noformat}
> LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], SLACKER=[$8])
> LogicalCorrelate(correlation=[$cor0], joinType=[inner], 
> requiredColumns=[{0}])
>   LogicalTableScan(table=[[CATALOG, SALES, EMP]])
>   LogicalAggregate(group=[{0}])
> LogicalProject(i=[true])
>   LogicalValues(tuples=[[]])
> {noformat}
> Observe that now we have a {{LogicalCorrelate}} but there is no real 
> correlation in the plan since the correlation variable on the right side 
> disappeared. Depending on how rules are applied and which rules are used 
> similar "trivial" correlates may appear.
> The goal of this ticket is to provide the means to get rid of them.
> One option would be to add a new rule (e.g., {{CorrelateToJoinRule}}) which 
> detects that a correlate does not have correlations in the right side and 
> turn the correlation to a join; then we could employ other existing rules 
> (such as PruneEmptyRules) for joins and remove the newly created join 
> altogether.
> Another option, would be to introduce new pruning rule(s) for correlate 
> (similar to those for joins) that will remo

[jira] [Updated] (CALCITE-5642) Add SHA256, SHA512 functions (enabled in BigQuery and PostgreSQL libraries)

2023-04-24 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5642:
---
Summary: Add SHA256, SHA512 functions (enabled in BigQuery and PostgreSQL 
libraries)  (was: Add SHA256, SHA512 functions)

> Add SHA256, SHA512 functions (enabled in BigQuery and PostgreSQL libraries)
> ---
>
> Key: CALCITE-5642
> URL: https://issues.apache.org/jira/browse/CALCITE-5642
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Dan Zou
>Assignee: Dan Zou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add SHA256, SHA512 functions and enable them for BigQuery.
> SHA256: Computes the hash of the input using the [SHA-256 
> algorithm|https://en.wikipedia.org/wiki/SHA-2]. This function returns 32 
> bytes.
> SHA512: Computes the hash of the input using the [SHA-512 
> algorithm|https://en.wikipedia.org/wiki/SHA-2]. This function returns 64 
> bytes.
> See more details in [BigQuery 
> Doc|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#sha256]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5642) Add SHA256, SHA512 functions

2023-04-24 Thread Dan Zou (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715635#comment-17715635
 ] 

Dan Zou commented on CALCITE-5642:
--

[~rubenql] I have updated my PR, please have a look when you have time.

> Add SHA256, SHA512 functions
> 
>
> Key: CALCITE-5642
> URL: https://issues.apache.org/jira/browse/CALCITE-5642
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Dan Zou
>Assignee: Dan Zou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add SHA256, SHA512 functions and enable them for BigQuery.
> SHA256: Computes the hash of the input using the [SHA-256 
> algorithm|https://en.wikipedia.org/wiki/SHA-2]. This function returns 32 
> bytes.
> SHA512: Computes the hash of the input using the [SHA-512 
> algorithm|https://en.wikipedia.org/wiki/SHA-2]. This function returns 64 
> bytes.
> See more details in [BigQuery 
> Doc|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#sha256]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5673) FilterIntoJoinRule cannot pushdown filter to TableScan

2023-04-24 Thread Kelun Chai (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5673?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Kelun Chai updated CALCITE-5673:

Description: 
The current rule can only push down the join condition from one side to table 
scan. If the join condition comes from both sides, it cannot push down the 
filter, but extracts the relevant columns to Join through scan for calculation.

SQL Query:
{code:sql}
SELECT A.name AS cName, B.name AS fName FROM vehicles AS A JOIN dimTable as B 
ON ST_Contains(B.fence, ST_MakePoint(A.x, A.y));{code}
Query Plan:
{code:bash}
== Abstract Syntax Tree ==
LogicalProject(cName=[$1], fName=[$5])
+- LogicalJoin(condition=[ST_Contains($6, ST_MakePoint($2, $3))], 
joinType=[inner])
   :- LogicalTableScan(table=[[default_catalog, default_database, vehicles]])
   +- LogicalTableScan(table=[[default_catalog, default_database, 
dimTable]]){code}
The same query in postgres behaves as follows (w/ & w/o index):
{code:bash}
Nested Loop  (cost=0.00..18071570.38 rows=722 width=64)
   Join Filter: st_contains(b.fence, a.location)
   ->  Seq Scan on vehicles a  (cost=0.00..18.50 rows=850 width=64)
   ->  Materialize  (cost=0.00..22.75 rows=850 width=64)
         ->  Seq Scan on fences b  (cost=0.00..18.50 rows=850 width=64)
# Using GIST Index
Nested Loop  (cost=0.13..84.50 rows=1 width=64)
   ->  Seq Scan on fences b  (cost=0.00..1.03 rows=3 width=64)
   ->  Index Scan using g_idx on vehicles a  (cost=0.13..27.81 rows=1 width=64)
         Index Cond: (location @ b.fence)
         Filter: st_contains(b.fence, location){code}
We created an in-memory based index in TableScan, is there a way to convert the 
spatial join condition to NestedLoopJoin (LogicalCorrelate) and push down to 
the TableScan node?

If Calcite does not support such a design, can I ask what are the 
considerations/concerns?

  was:
The current rule can only push down the join condition from one side to table 
scan. If the join condition comes from both sides, it cannot push down the 
filter, but extracts the relevant columns to Join through scan for calculation.

SQL Query:
{code:sql}
SELECT A.name AS cName, B.name AS fName FROM vehicles AS A JOIN dimTable as B 
ON ST_Contains(B.fence, ST_MakePoint(A.x, A.y));{code}
Query Plan:
{code:bash}
== Abstract Syntax Tree ==
LogicalProject(cName=[$1], fName=[$5])
+- LogicalJoin(condition=[ST_Contains($6, ST_MakePoint($2, $3))], 
joinType=[inner])
   :- LogicalTableScan(table=[[default_catalog, default_database, vehicles]])
   +- LogicalTableScan(table=[[default_catalog, default_database, 
dimTable]]){code}
The same query in postgres behaves as follows (w/ & w/o index):
{code:bash}
Nested Loop  (cost=0.00..18071570.38 rows=722 width=64)
   Join Filter: st_contains(b.fence, a.location)
   ->  Seq Scan on vehicles a  (cost=0.00..18.50 rows=850 width=64)
   ->  Materialize  (cost=0.00..22.75 rows=850 width=64)
         ->  Seq Scan on fences b  (cost=0.00..18.50 rows=850 width=64)
# Using GIST Index
Nested Loop  (cost=0.13..84.50 rows=1 width=64)
   ->  Seq Scan on fences b  (cost=0.00..1.03 rows=3 width=64)
   ->  Index Scan using g_idx on vehicles a  (cost=0.13..27.81 rows=1 width=64)
         Index Cond: (location @ b.fence)
         Filter: st_contains(b.fence, location){code}
We created an in-memory based index in Tablescan, is there a way to convert the 
spatial join condition to NestedLoopJoin (LogicalCorrelate) and push down to 
the Tablescan node?


> FilterIntoJoinRule cannot pushdown filter to TableScan
> --
>
> Key: CALCITE-5673
> URL: https://issues.apache.org/jira/browse/CALCITE-5673
> Project: Calcite
>  Issue Type: Bug
>  Components: core, spatial
>Affects Versions: 1.26.0
>Reporter: Kelun Chai
>Priority: Major
>  Labels: features
>
> The current rule can only push down the join condition from one side to table 
> scan. If the join condition comes from both sides, it cannot push down the 
> filter, but extracts the relevant columns to Join through scan for 
> calculation.
> SQL Query:
> {code:sql}
> SELECT A.name AS cName, B.name AS fName FROM vehicles AS A JOIN dimTable as B 
> ON ST_Contains(B.fence, ST_MakePoint(A.x, A.y));{code}
> Query Plan:
> {code:bash}
> == Abstract Syntax Tree ==
> LogicalProject(cName=[$1], fName=[$5])
> +- LogicalJoin(condition=[ST_Contains($6, ST_MakePoint($2, $3))], 
> joinType=[inner])
>    :- LogicalTableScan(table=[[default_catalog, default_database, vehicles]])
>    +- LogicalTableScan(table=[[default_catalog, default_database, 
> dimTable]]){code}
> The same query in postgres behaves as follows (w/ & w/o index):
> {code:bash}
> Nested Loop  (cost=0.00..18071570.38 rows=722 width=64)
>    Join Filter: st_contains(b.fence, a.location)
>    ->  Seq Scan on vehicles a  (cost=0.00..18.50 rows=850 width=64)
>    ->  Materialize  (c

[jira] [Created] (CALCITE-5673) FilterIntoJoinRule cannot pushdown filter to TableScan

2023-04-24 Thread Kelun Chai (Jira)
Kelun Chai created CALCITE-5673:
---

 Summary: FilterIntoJoinRule cannot pushdown filter to TableScan
 Key: CALCITE-5673
 URL: https://issues.apache.org/jira/browse/CALCITE-5673
 Project: Calcite
  Issue Type: Bug
  Components: core, spatial
Affects Versions: 1.26.0
Reporter: Kelun Chai


The current rule can only push down the join condition from one side to table 
scan. If the join condition comes from both sides, it cannot push down the 
filter, but extracts the relevant columns to Join through scan for calculation.

SQL Query:
{code:sql}
SELECT A.name AS cName, B.name AS fName FROM vehicles AS A JOIN dimTable as B 
ON ST_Contains(B.fence, ST_MakePoint(A.x, A.y));{code}
Query Plan:
{code:bash}
== Abstract Syntax Tree ==
LogicalProject(cName=[$1], fName=[$5])
+- LogicalJoin(condition=[ST_Contains($6, ST_MakePoint($2, $3))], 
joinType=[inner])
   :- LogicalTableScan(table=[[default_catalog, default_database, vehicles]])
   +- LogicalTableScan(table=[[default_catalog, default_database, 
dimTable]]){code}
The same query in postgres behaves as follows (w/ & w/o index):
{code:bash}
Nested Loop  (cost=0.00..18071570.38 rows=722 width=64)
   Join Filter: st_contains(b.fence, a.location)
   ->  Seq Scan on vehicles a  (cost=0.00..18.50 rows=850 width=64)
   ->  Materialize  (cost=0.00..22.75 rows=850 width=64)
         ->  Seq Scan on fences b  (cost=0.00..18.50 rows=850 width=64)
# Using GIST Index
Nested Loop  (cost=0.13..84.50 rows=1 width=64)
   ->  Seq Scan on fences b  (cost=0.00..1.03 rows=3 width=64)
   ->  Index Scan using g_idx on vehicles a  (cost=0.13..27.81 rows=1 width=64)
         Index Cond: (location @ b.fence)
         Filter: st_contains(b.fence, location){code}
We created an in-memory based index in Tablescan, is there a way to convert the 
spatial join condition to NestedLoopJoin (LogicalCorrelate) and push down to 
the Tablescan node?



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5642) Add SHA256, SHA512 functions

2023-04-24 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5642:
---
Fix Version/s: 1.35.0

> Add SHA256, SHA512 functions
> 
>
> Key: CALCITE-5642
> URL: https://issues.apache.org/jira/browse/CALCITE-5642
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Dan Zou
>Assignee: Dan Zou
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add SHA256, SHA512 functions and enable them for BigQuery.
> SHA256: Computes the hash of the input using the [SHA-256 
> algorithm|https://en.wikipedia.org/wiki/SHA-2]. This function returns 32 
> bytes.
> SHA512: Computes the hash of the input using the [SHA-512 
> algorithm|https://en.wikipedia.org/wiki/SHA-2]. This function returns 64 
> bytes.
> See more details in [BigQuery 
> Doc|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#sha256]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5642) Add SHA256, SHA512 functions

2023-04-24 Thread Dan Zou (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17715610#comment-17715610
 ] 

Dan Zou commented on CALCITE-5642:
--

[~rubenql] Thanks for you review. We should enable these two functions for 
PostgreSQL after I recheck it, I will improve my PR soon. (I searched them on 
the wrong page, which led me to mistakenly believe that PostgreSQL does not 
support these two functions).

> Add SHA256, SHA512 functions
> 
>
> Key: CALCITE-5642
> URL: https://issues.apache.org/jira/browse/CALCITE-5642
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Dan Zou
>Assignee: Dan Zou
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Add SHA256, SHA512 functions and enable them for BigQuery.
> SHA256: Computes the hash of the input using the [SHA-256 
> algorithm|https://en.wikipedia.org/wiki/SHA-2]. This function returns 32 
> bytes.
> SHA512: Computes the hash of the input using the [SHA-512 
> algorithm|https://en.wikipedia.org/wiki/SHA-2]. This function returns 64 
> bytes.
> See more details in [BigQuery 
> Doc|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#sha256]



--
This message was sent by Atlassian Jira
(v8.20.10#820010)