[jira] [Commented] (CALCITE-5953) AggregateCaseToFilterRule may make inaccurate SUM transformation

2023-08-23 Thread Zoltan Haindrich (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758383#comment-17758383
 ] 

Zoltan Haindrich commented on CALCITE-5953:
---

I would also find it more natural if SUM would follow the mathematical 
definition (so that an empty sum is 0)  I believe when `SUM` was placed next to 
the other aggregates like `MIN` they have made some generalizations.

I was thinking about possible alternatives to fixing this:

* compute it from multiple aggs
{code:sql}
consider: select SUM(CASE WHEN x=1 THEN 2 ELSE 0 END from t;

select CASE WHEN s>0 THEN 2*s ELSE NULLIF(cnt,0) END
from (select count(1) filter (where x=1) as s,count(1) as cnt from t) tt;

or with ANY_VALUE:

select CASE WHEN s>0 THEN 2*s ELSE a END
from (select count(1) filter (where x=1) as s,any_value(0) as a from t) tt;

{code}
** so we are exchanging 1 agg with a complex CASE into 2 simpler aggs + a CASE 
on top of it
** pro of this approach:
*** it could be extended very easily to also cover the case when ELSE is not 0
*** possibly handle more branches for the same variable - at the cost of 1 more 
aggregation
*** it could possibly extended to rewrite multi branch cases - but every branch 
will cost 1 more aggregation
* consider only `SUM0` to optimize
** ProjectAggregateMergeRule could prepare SUM0 for this
** this fixes all above issues
*** but also disables it from the more natural usages

I'm still thinking about alternatives - let me know what you think about the 
above ; or if you have any other ideas!


> AggregateCaseToFilterRule may make inaccurate SUM transformation
> 
>
> Key: CALCITE-5953
> URL: https://issues.apache.org/jira/browse/CALCITE-5953
> Project: Calcite
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> consider: {{sum(case when x = 1 then 2 else 0 end) as b}}
> notice that this expression may only be null if there are no rows in the table
> {{AggregateCaseToFilterRule}} rewrites the above expression to {{sum(1) 
> filter (where x=2)}} which broadens when it could be `null` to when there are 
> no matches to the filter
> * *A* is *0* correctly in this case; but I think it will be still *0* in case 
> there are 0 input rows
> * The result for *B* supposed to be *0* but since there are no matches by the 
> filter it becomes *null*
> * *C* is not touched
> {code}
> # Convert CASE to FILTER without matches
> select  sum(case when x = 1 then 1 else 0 end) as a,
> sum(case when x = 1 then 2 else 0 end) as b,
> sum(case when x = 1 then 3 else -1 end) as c
> from (values 0, null, 0, 2) as t(x);
> +---+---++
> | A | B | C  |
> +---+---++
> | 0 | 0 | -4 |
> +---+---++
> (1 row)
> !ok
> EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], 
> B=[$t1], C=[$t2])
>   EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER 
> $3], C=[SUM($0)])
> EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], 
> expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], 
> expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6])
>   EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]])
> !plan
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5953) AggregateCaseToFilterRule may make inaccurate SUM transformation

2023-08-23 Thread Benchao Li (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758313#comment-17758313
 ] 

Benchao Li commented on CALCITE-5953:
-

bq. I think the SQL designers made a mistake when they decided the SUM over 0 
rows should return NULL

[~julianhyde], I'm not sure about this, per my understanding, {{NULL}} could 
exactly mean that all input is {{NULL}} (not necessarily 0 rows), but 0 can 
only mean that the sum of all values (maybe include zero or negative values) 
are 0. So returning {{NULL}} could give us more information than 0. Please 
correct me if I'm wrong.

> AggregateCaseToFilterRule may make inaccurate SUM transformation
> 
>
> Key: CALCITE-5953
> URL: https://issues.apache.org/jira/browse/CALCITE-5953
> Project: Calcite
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> consider: {{sum(case when x = 1 then 2 else 0 end) as b}}
> notice that this expression may only be null if there are no rows in the table
> {{AggregateCaseToFilterRule}} rewrites the above expression to {{sum(1) 
> filter (where x=2)}} which broadens when it could be `null` to when there are 
> no matches to the filter
> * *A* is *0* correctly in this case; but I think it will be still *0* in case 
> there are 0 input rows
> * The result for *B* supposed to be *0* but since there are no matches by the 
> filter it becomes *null*
> * *C* is not touched
> {code}
> # Convert CASE to FILTER without matches
> select  sum(case when x = 1 then 1 else 0 end) as a,
> sum(case when x = 1 then 2 else 0 end) as b,
> sum(case when x = 1 then 3 else -1 end) as c
> from (values 0, null, 0, 2) as t(x);
> +---+---++
> | A | B | C  |
> +---+---++
> | 0 | 0 | -4 |
> +---+---++
> (1 row)
> !ok
> EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], 
> B=[$t1], C=[$t2])
>   EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER 
> $3], C=[SUM($0)])
> EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], 
> expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], 
> expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6])
>   EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]])
> !plan
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5950) Default column constraint is erroneously processed.

2023-08-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758261#comment-17758261
 ] 

Julian Hyde commented on CALCITE-5950:
--

Which issues do you think are false positives? We have a fairly low level of 
flakiness (and yes: error-prone, checker-framework, autostyle, checkstyle, and 
lint are pedantic but we like it that way).

> Default column constraint is erroneously processed.
> ---
>
> Key: CALCITE-5950
> URL: https://issues.apache.org/jira/browse/CALCITE-5950
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Evgeny Stanilovsky
>Assignee: Evgeny Stanilovsky
>Priority: Major
>  Labels: pull-request-available
>
> II change table.iq a bit and found a problem with processing default column 
> constraint:
> {code:java}
> create table tdef (i int not null, j int default 100);
> (0 rows modified)
> !update
> insert into tdef values (1, DEFAULT);
> (1 row modified)
> !update
> insert into tdef(i) values (2);
> (1 row modified)
> !update
> select * from tdef order by i;
> +---+-+
> | I | J   |
> +---+-+
> | 1 | 100 |
> | 2 | 100 |
> +---+-+
> (2 rows)
> !ok
> but obtain from calcite:
> +---+-+
> | I | J   |
> +---+-+
> | 1 | |
> | 2 | 100 |
> +---+-+
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5914) Cache compiled regular expressions in SQL function runtime

2023-08-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5914:

Labels: pull-request-available  (was: )

> Cache compiled regular expressions in SQL function runtime
> --
>
> Key: CALCITE-5914
> URL: https://issues.apache.org/jira/browse/CALCITE-5914
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Priority: Major
>  Labels: pull-request-available
>
> Cache compiled regular expressions (and other amortized work) in SQL function 
> runtime. Compiling a regular expression to a pattern is expensive (compared 
> to the cost of matching, given an existing pattern) and therefore caching the 
> compiled form will yield performance benefits if the regular expression is 
> constant or has a small number of values.
> Consider the following query:
> {code:java}
> SELECT ename, job, RLIKE(ename, 'A.*'), RLIKE(ename, job || '.*')
> FROM emp
> {code}
> The first regular expression, '{{A.\*}}', is constant and can be compiled at 
> prepare time or at the start of execution; the second regular expression, 
> {{job || '.\*'}}, might vary from one row to the next. However if the {{job}} 
> column has a small number of values it still might be beneficial to cache the 
> compiled regular expression.
> If {{SqlFunctions.rlike}} could use a cache (mapping from {{String}} to 
> {{{}java.util.regex.Pattern{}}}) then it would achieve benefits in both the 
> constant and non-constant cases.
> The cache needs to:
>  * be thread-safe (in case queries are executing using multiple threads),
>  * return thread-safe objects (as is {{{}Pattern{}}}),
>  * have bounded space (so that a query doesn't blow memory with 1 million 
> distinct regular expressions),
>  * disposed after the query has terminated,
>  * (ideally) share with regexes of the same language in the same query,
>  * not conflict with regexes of different languages in the same query.
> One possible implementation is to add an {{interface FunctionState}}, with 
> subclasses including {{class RegexpCache}}, and if argument 1 of a function 
> is a subclass of {{FunctionState}} the compiler would initialize the state in 
> the generated code. The function can rely on the state argument being 
> initialized, and being the same object from one call to the next. Example:
> {code:java}
> interface FunctionState {
> }
> class RegexpCache implements FunctionState {
>   final Cache cache = ...;
> }
> {code}
> This change should install the cache for all applicable functions, including 
> LIKE, ILIKE, RLIKE, SIMILAR, posix regex, REGEXP_REPLACE (MySQL, Oracle), 
> REGEXP_CONTAINS (BigQuery), other BigQuery REGEXP_ functions, PARSE_URL, 
> JSON_REPLACE, PARSE_TIMESTAMP.
> It can also be used for functions that have mutable state, e.g. {{RANDOM}} 
> with and without a seed.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5953) AggregateCaseToFilterRule may make inaccurate SUM transformation

2023-08-23 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758220#comment-17758220
 ] 

Julian Hyde commented on CALCITE-5953:
--

Good one. (I think the SQL designers made a mistake when they decided the SUM 
over 0 rows should return NULL, but we have to live with their decision.)

> AggregateCaseToFilterRule may make inaccurate SUM transformation
> 
>
> Key: CALCITE-5953
> URL: https://issues.apache.org/jira/browse/CALCITE-5953
> Project: Calcite
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> consider: {{sum(case when x = 1 then 2 else 0 end) as b}}
> notice that this expression may only be null if there are no rows in the table
> {{AggregateCaseToFilterRule}} rewrites the above expression to {{sum(1) 
> filter (where x=2)}} which broadens when it could be `null` to when there are 
> no matches to the filter
> * *A* is *0* correctly in this case; but I think it will be still *0* in case 
> there are 0 input rows
> * The result for *B* supposed to be *0* but since there are no matches by the 
> filter it becomes *null*
> * *C* is not touched
> {code}
> # Convert CASE to FILTER without matches
> select  sum(case when x = 1 then 1 else 0 end) as a,
> sum(case when x = 1 then 2 else 0 end) as b,
> sum(case when x = 1 then 3 else -1 end) as c
> from (values 0, null, 0, 2) as t(x);
> +---+---++
> | A | B | C  |
> +---+---++
> | 0 | 0 | -4 |
> +---+---++
> (1 row)
> !ok
> EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], 
> B=[$t1], C=[$t2])
>   EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER 
> $3], C=[SUM($0)])
> EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], 
> expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], 
> expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6])
>   EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]])
> !plan
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5949) RexExecutable correct handling of invalid constant expressions

2023-08-23 Thread Claude Brisson (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758147#comment-17758147
 ] 

Claude Brisson commented on CALCITE-5949:
-

Summary updated.

> RexExecutable correct handling of invalid constant expressions
> --
>
> Key: CALCITE-5949
> URL: https://issues.apache.org/jira/browse/CALCITE-5949
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Claude Brisson
>Priority: Major
>  Labels: pull-request-available
>
> While reducing, when encountering an invalid expression in the list of 
> constant expressions, RexExecutor is meant to return all initial expressions 
> unchanged.
> It fails to do so, because already handled correct expressions have already 
> been added to the returned list, which can be greater than the input list.
> For instance, when given the list \{ LN(2), LN(-2) }, the RexExecutor will 
> output a list of length 3.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5953) AggregateCaseToFilterRule may make inaccurate SUM transformation

2023-08-23 Thread Zoltan Haindrich (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Zoltan Haindrich updated CALCITE-5953:
--
Description: 
consider: {{sum(case when x = 1 then 2 else 0 end) as b}}
notice that this expression may only be null if there are no rows in the table

{{AggregateCaseToFilterRule}} rewrites the above expression to {{sum(1) filter 
(where x=2)}} which broadens when it could be `null` to when there are no 
matches to the filter

* *A* is *0* correctly in this case; but I think it will be still *0* in case 
there are 0 input rows
* The result for *B* supposed to be *0* but since there are no matches by the 
filter it becomes *null*
* *C* is not touched

{code}
# Convert CASE to FILTER without matches
select  sum(case when x = 1 then 1 else 0 end) as a,
sum(case when x = 1 then 2 else 0 end) as b,
sum(case when x = 1 then 3 else -1 end) as c
from (values 0, null, 0, 2) as t(x);
+---+---++
| A | B | C  |
+---+---++
| 0 | 0 | -4 |
+---+---++
(1 row)

!ok
EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], 
B=[$t1], C=[$t2])
  EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER $3], 
C=[SUM($0)])
EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], 
expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], 
expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6])
  EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]])
!plan
{code}

  was:
consider: `sum(case when x = 1 then 2 else 0 end) as b`
notice that this expression may only be null if there are no rows in the table

`AggregateCaseToFilterRule` rewrites the above expression to `sum(1) filter 
(where x=2)` which broadens when it could be `null` to when there are no 
matches to the filter

* `A` is `0` correctly in this case; but I think it will be still `0` in case 
there are 0 input rows
* The result for `B` supposed to be `0` but since there are no matches by the 
filter it becomes `null`
* `C` is not touched

```
# Convert CASE to FILTER without matches
select  sum(case when x = 1 then 1 else 0 end) as a,
sum(case when x = 1 then 2 else 0 end) as b,
sum(case when x = 1 then 3 else -1 end) as c
from (values 0, null, 0, 2) as t(x);
+---+---++
| A | B | C  |
+---+---++
| 0 | 0 | -4 |
+---+---++
(1 row)

!ok
EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], 
B=[$t1], C=[$t2])
  EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER $3], 
C=[SUM($0)])
EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], 
expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], 
expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6])
  EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]])
!plan
```


> AggregateCaseToFilterRule may make inaccurate SUM transformation
> 
>
> Key: CALCITE-5953
> URL: https://issues.apache.org/jira/browse/CALCITE-5953
> Project: Calcite
>  Issue Type: Bug
>Reporter: Zoltan Haindrich
>Assignee: Zoltan Haindrich
>Priority: Major
>
> consider: {{sum(case when x = 1 then 2 else 0 end) as b}}
> notice that this expression may only be null if there are no rows in the table
> {{AggregateCaseToFilterRule}} rewrites the above expression to {{sum(1) 
> filter (where x=2)}} which broadens when it could be `null` to when there are 
> no matches to the filter
> * *A* is *0* correctly in this case; but I think it will be still *0* in case 
> there are 0 input rows
> * The result for *B* supposed to be *0* but since there are no matches by the 
> filter it becomes *null*
> * *C* is not touched
> {code}
> # Convert CASE to FILTER without matches
> select  sum(case when x = 1 then 1 else 0 end) as a,
> sum(case when x = 1 then 2 else 0 end) as b,
> sum(case when x = 1 then 3 else -1 end) as c
> from (values 0, null, 0, 2) as t(x);
> +---+---++
> | A | B | C  |
> +---+---++
> | 0 | 0 | -4 |
> +---+---++
> (1 row)
> !ok
> EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], 
> B=[$t1], C=[$t2])
>   EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER 
> $3], C=[SUM($0)])
> EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], 
> expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], 
> expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6])
>   EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]])
> !plan
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Created] (CALCITE-5953) AggregateCaseToFilterRule may make inaccurate SUM transformation

2023-08-23 Thread Zoltan Haindrich (Jira)
Zoltan Haindrich created CALCITE-5953:
-

 Summary: AggregateCaseToFilterRule may make inaccurate SUM 
transformation
 Key: CALCITE-5953
 URL: https://issues.apache.org/jira/browse/CALCITE-5953
 Project: Calcite
  Issue Type: Bug
Reporter: Zoltan Haindrich
Assignee: Zoltan Haindrich


consider: `sum(case when x = 1 then 2 else 0 end) as b`
notice that this expression may only be null if there are no rows in the table

`AggregateCaseToFilterRule` rewrites the above expression to `sum(1) filter 
(where x=2)` which broadens when it could be `null` to when there are no 
matches to the filter

* `A` is `0` correctly in this case; but I think it will be still `0` in case 
there are 0 input rows
* The result for `B` supposed to be `0` but since there are no matches by the 
filter it becomes `null`
* `C` is not touched

```
# Convert CASE to FILTER without matches
select  sum(case when x = 1 then 1 else 0 end) as a,
sum(case when x = 1 then 2 else 0 end) as b,
sum(case when x = 1 then 3 else -1 end) as c
from (values 0, null, 0, 2) as t(x);
+---+---++
| A | B | C  |
+---+---++
| 0 | 0 | -4 |
+---+---++
(1 row)

!ok
EnumerableCalc(expr#0..2=[{inputs}], expr#3=[CAST($t0):INTEGER], A=[$t3], 
B=[$t1], C=[$t2])
  EnumerableAggregate(group=[{}], A=[COUNT() FILTER $1], B=[SUM($2) FILTER $3], 
C=[SUM($0)])
EnumerableCalc(expr#0=[{inputs}], expr#1=[1], expr#2=[=($t0, $t1)], 
expr#3=[3], expr#4=[-1], expr#5=[CASE($t2, $t3, $t4)], expr#6=[IS TRUE($t2)], 
expr#7=[2], $f2=[$t5], $f3=[$t6], $f4=[$t7], $f5=[$t6])
  EnumerableValues(tuples=[[{ 0 }, { null }, { 0 }, { 2 }]])
!plan
```



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5950) Default column constraint is erroneously processed.

2023-08-23 Thread Evgeny Stanilovsky (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758123#comment-17758123
 ] 

Evgeny Stanilovsky commented on CALCITE-5950:
-

PR is ready for review, besides some CI / CheckerFramework issues, seems they 
are false positive, did i need to fix all of them ?

> Default column constraint is erroneously processed.
> ---
>
> Key: CALCITE-5950
> URL: https://issues.apache.org/jira/browse/CALCITE-5950
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Evgeny Stanilovsky
>Assignee: Evgeny Stanilovsky
>Priority: Major
>  Labels: pull-request-available
>
> II change table.iq a bit and found a problem with processing default column 
> constraint:
> {code:java}
> create table tdef (i int not null, j int default 100);
> (0 rows modified)
> !update
> insert into tdef values (1, DEFAULT);
> (1 row modified)
> !update
> insert into tdef(i) values (2);
> (1 row modified)
> !update
> select * from tdef order by i;
> +---+-+
> | I | J   |
> +---+-+
> | 1 | 100 |
> | 2 | 100 |
> +---+-+
> (2 rows)
> !ok
> but obtain from calcite:
> +---+-+
> | I | J   |
> +---+-+
> | 1 | |
> | 2 | 100 |
> +---+-+
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5950) Default column constraint is erroneously processed.

2023-08-23 Thread ASF GitHub Bot (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5950?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5950:

Labels: pull-request-available  (was: )

> Default column constraint is erroneously processed.
> ---
>
> Key: CALCITE-5950
> URL: https://issues.apache.org/jira/browse/CALCITE-5950
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Evgeny Stanilovsky
>Assignee: Evgeny Stanilovsky
>Priority: Major
>  Labels: pull-request-available
>
> II change table.iq a bit and found a problem with processing default column 
> constraint:
> {code:java}
> create table tdef (i int not null, j int default 100);
> (0 rows modified)
> !update
> insert into tdef values (1, DEFAULT);
> (1 row modified)
> !update
> insert into tdef(i) values (2);
> (1 row modified)
> !update
> select * from tdef order by i;
> +---+-+
> | I | J   |
> +---+-+
> | 1 | 100 |
> | 2 | 100 |
> +---+-+
> (2 rows)
> !ok
> but obtain from calcite:
> +---+-+
> | I | J   |
> +---+-+
> | 1 | |
> | 2 | 100 |
> +---+-+
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-08-23 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758083#comment-17758083
 ] 

Ruben Q L commented on CALCITE-5952:


The rule-based test that you proposed is totally relevant and should be 
included. My idea was to complement it with another "result-based" test. 
I don't know if you can "force" the bug with a quidem test, maybe you can try. 
Otherwise, perhaps it is easier to reproduce the bug in somewhere like 
{{EnumerableJoinTest.java}}? (it already contains other RelBuilder-based join 
tests, some of them adding/removing specific rules, where the resulted rows are 
checked).

> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-08-23 Thread Leonid Chistov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17758074#comment-17758074
 ] 

Leonid Chistov commented on CALCITE-5952:
-

[~rubenql]

Can you suggest which Calcite infrastructure to use to write such test (or 
where to look for the analogous examples) that will do the planning and execute 
the plan?

I have tried to look on Quidem tests, but they seem to be more 
frontend-oriented, I didn't find a way to make them run with specific 
optimization enabled.

> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-5909) SqlParserTest.testNoUnintendedNewReservedKeywords fails in IDE while passes in command line

2023-08-23 Thread Ran Tao (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757960#comment-17757960
 ] 

Ran Tao edited comment on CALCITE-5909 at 8/23/23 12:02 PM:


[~libenchao] thanks. yes, this case always fail. agree with you. I have changed 
the commit name and Jira name.


was (Author: lemonjing):
[~libenchao] thanks. yes, this case always fail. agree with you.

> SqlParserTest.testNoUnintendedNewReservedKeywords fails in IDE while passes 
> in command line
> ---
>
> Key: CALCITE-5909
> URL: https://issues.apache.org/jira/browse/CALCITE-5909
> Project: Calcite
>  Issue Type: Bug
>  Components: tests
>Reporter: LakeShen
>Assignee: Ran Tao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2023-08-08-23-32-55-466.png
>
>
> When I run the SqlParserTest,the testNoUnintendedNewReservedKeywords method 
> failed,the exception like this:
> {code:java}
> java.lang.AssertionError: The parser has at least one new reserved keyword. 
> Are you sure it should be reserved? Difference: {code}
> The picture like this:
> !image-2023-08-08-23-32-55-466.png|width=1543,height=496!
> I could fix this problem.More importantly, why is this method failing, but 
> the Calcite pipeline is passing? I think we should look at something we missed



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5909) SqlParserTest.testNoUnintendedNewReservedKeywords fails in IDE while passes in command line

2023-08-23 Thread Ran Tao (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated CALCITE-5909:
-
Summary: SqlParserTest.testNoUnintendedNewReservedKeywords fails in IDE 
while passes in command line  (was: Sometimes 
SqlParserTest.testNoUnintendedNewReservedKeywords fails in the IDE but passes 
when run from the command line)

> SqlParserTest.testNoUnintendedNewReservedKeywords fails in IDE while passes 
> in command line
> ---
>
> Key: CALCITE-5909
> URL: https://issues.apache.org/jira/browse/CALCITE-5909
> Project: Calcite
>  Issue Type: Bug
>  Components: tests
>Reporter: LakeShen
>Assignee: Ran Tao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2023-08-08-23-32-55-466.png
>
>
> When I run the SqlParserTest,the testNoUnintendedNewReservedKeywords method 
> failed,the exception like this:
> {code:java}
> java.lang.AssertionError: The parser has at least one new reserved keyword. 
> Are you sure it should be reserved? Difference: {code}
> The picture like this:
> !image-2023-08-08-23-32-55-466.png|width=1543,height=496!
> I could fix this problem.More importantly, why is this method failing, but 
> the Calcite pipeline is passing? I think we should look at something we missed



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-5909) Sometimes SqlParserTest.testNoUnintendedNewReservedKeywords fails in the IDE but passes when run from the command line

2023-08-23 Thread Ran Tao (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757960#comment-17757960
 ] 

Ran Tao edited comment on CALCITE-5909 at 8/23/23 10:51 AM:


[~libenchao] thanks. yes, this case always fail. agree with you.


was (Author: lemonjing):
[~libenchao] thanks. agree with you. this case always fail.

> Sometimes SqlParserTest.testNoUnintendedNewReservedKeywords fails in the IDE 
> but passes when run from the command line
> --
>
> Key: CALCITE-5909
> URL: https://issues.apache.org/jira/browse/CALCITE-5909
> Project: Calcite
>  Issue Type: Bug
>  Components: tests
>Reporter: LakeShen
>Assignee: Ran Tao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2023-08-08-23-32-55-466.png
>
>
> When I run the SqlParserTest,the testNoUnintendedNewReservedKeywords method 
> failed,the exception like this:
> {code:java}
> java.lang.AssertionError: The parser has at least one new reserved keyword. 
> Are you sure it should be reserved? Difference: {code}
> The picture like this:
> !image-2023-08-08-23-32-55-466.png|width=1543,height=496!
> I could fix this problem.More importantly, why is this method failing, but 
> the Calcite pipeline is passing? I think we should look at something we missed



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5909) Sometimes SqlParserTest.testNoUnintendedNewReservedKeywords fails in the IDE but passes when run from the command line

2023-08-23 Thread Ran Tao (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757960#comment-17757960
 ] 

Ran Tao commented on CALCITE-5909:
--

[~libenchao] thanks. agree with you. this case always fail.

> Sometimes SqlParserTest.testNoUnintendedNewReservedKeywords fails in the IDE 
> but passes when run from the command line
> --
>
> Key: CALCITE-5909
> URL: https://issues.apache.org/jira/browse/CALCITE-5909
> Project: Calcite
>  Issue Type: Bug
>  Components: tests
>Reporter: LakeShen
>Assignee: Ran Tao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2023-08-08-23-32-55-466.png
>
>
> When I run the SqlParserTest,the testNoUnintendedNewReservedKeywords method 
> failed,the exception like this:
> {code:java}
> java.lang.AssertionError: The parser has at least one new reserved keyword. 
> Are you sure it should be reserved? Difference: {code}
> The picture like this:
> !image-2023-08-08-23-32-55-466.png|width=1543,height=496!
> I could fix this problem.More importantly, why is this method failing, but 
> the Calcite pipeline is passing? I think we should look at something we missed



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-08-23 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757957#comment-17757957
 ] 

Ruben Q L commented on CALCITE-5952:


I meant showing e.g. a query/plan with employee, department, bonus values, with 
the rows returned by the problematic plan vs the rows that should be returned.

> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-08-23 Thread Leonid Chistov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757955#comment-17757955
 ] 

Leonid Chistov edited comment on CALCITE-5952 at 8/23/23 10:41 AM:
---

[~rubenql] 

The expected result is mentioned in issue description (I have called it 
"original plan", meaning that plan that is constructed by RelBuilder is not 
supposed to be changed by SEMI_JOIN_JOIN_TRANSPOSE):

 
{code:java}
LogicalJoin(condition=[=($9, $12)], joinType=[semi])
  LogicalJoin(condition=[=($7, $8)], joinType=[left])
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
  LogicalTableScan(table=[[scott, BONUS]])  {code}


was (Author: JIRAUSER298393):
[~rubenql] 

The expected result is mentioned in issue description (I have called it 
"original plan", meaning that plan that is constructed by RelBuilder is not 
supposed to be changed by SEMI_JOIN_JOIN_TRANSPOSE):

 
{code:java}
LogicalJoin(condition=[=($9, $12)], joinType=[semi])
  LogicalJoin(condition=[=($7, $8)], joinType=[left])
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
  LogicalTableScan(table=[[scott, BONUS]])  {code}
 

 

> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Comment Edited] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-08-23 Thread Leonid Chistov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757955#comment-17757955
 ] 

Leonid Chistov edited comment on CALCITE-5952 at 8/23/23 10:41 AM:
---

[~rubenql] 

The expected result is mentioned in issue description (I have called it 
"original plan", meaning that plan that is constructed by RelBuilder is not 
supposed to be changed by SEMI_JOIN_JOIN_TRANSPOSE):
{code:java}
LogicalJoin(condition=[=($9, $12)], joinType=[semi])
  LogicalJoin(condition=[=($7, $8)], joinType=[left])
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
  LogicalTableScan(table=[[scott, BONUS]])  {code}


was (Author: JIRAUSER298393):
[~rubenql] 

The expected result is mentioned in issue description (I have called it 
"original plan", meaning that plan that is constructed by RelBuilder is not 
supposed to be changed by SEMI_JOIN_JOIN_TRANSPOSE):

 
{code:java}
LogicalJoin(condition=[=($9, $12)], joinType=[semi])
  LogicalJoin(condition=[=($7, $8)], joinType=[left])
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
  LogicalTableScan(table=[[scott, BONUS]])  {code}

> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-08-23 Thread Leonid Chistov (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757955#comment-17757955
 ] 

Leonid Chistov commented on CALCITE-5952:
-

[~rubenql] 

The expected result is mentioned in issue description (I have called it 
"original plan", meaning that plan that is constructed by RelBuilder is not 
supposed to be changed by SEMI_JOIN_JOIN_TRANSPOSE):

 
{code:java}
LogicalJoin(condition=[=($9, $12)], joinType=[semi])
  LogicalJoin(condition=[=($7, $8)], joinType=[left])
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
  LogicalTableScan(table=[[scott, BONUS]])  {code}
 

 

> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-08-23 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757919#comment-17757919
 ] 

Ruben Q L commented on CALCITE-5952:


Would it be possible to have a test showing the expected results vs actual 
results?

> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     LogicalTableScan(table=[[scott, DEPT]])
>   LogicalTableScan(table=[[scott, BONUS]]) {code}
> This is not correct - in general case it is not correct to push semi-join to 
> right side of left-join.
> The reason is the following:
> Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
> will have *nulls* for *DEPT* columns in the result of left-join and they will 
> be rejected by the top semi-join.
> But if we push semi-join to RHS of left-join, we are going to see rows from 
> *EMP* with *nulls* on the *DEPT* side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5732) EnumerableHashJoin and EnumerableMergeJoin on composite key return rows matching condition 'null = null'

2023-08-23 Thread Ruben Q L (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757918#comment-17757918
 ] 

Ruben Q L commented on CALCITE-5732:


If someone could please help me out reviewing the PR, I'd really like to 
correct this issue for the next release. Thanks.

> EnumerableHashJoin and EnumerableMergeJoin on composite key return rows 
> matching condition 'null = null'
> 
>
> Key: CALCITE-5732
> URL: https://issues.apache.org/jira/browse/CALCITE-5732
> Project: Calcite
>  Issue Type: Bug
>  Components: core, linq4j
>Reporter: Viggo Chen
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> In `EnumerableJoinTest#testMergeJoinWithCompositeKeyAndNullValues`, the query 
> is like 
> {code:java}
> select 
>   emps.empid
> from 
>   emps a join emps b
> on a.deptno = b.deptno
> and a.commission = b.commission;{code}
> and the data is like 
> {code:java}
>   INSERT INTO "emps" VALUES (100, 10, 'Bill', 1, 1000);
>   INSERT INTO "emps" VALUES (200, 20, 'Eric', 8000, 500);
>   INSERT INTO "emps" VALUES (150, 10, 'Sebastian', 7000, null);
>   INSERT INTO "emps" VALUES (110, 10, 'Theodore', 11500, 250); {code}
> And row with empid = 150 is in expected result. Is this the expected result 
> of join with null condition.
> Whats more hash join result with condition a.deptno = b.deptno and 
> a.commission = b.commission is same as merge join. And if there is just one 
> condition a.commission = b.commission, the result do not include empid = 150.
>  
> Here is a unit test for it
> {code:java}
> @Test void testHashJoinWithCompositeKeyAndNullValues() {
>   // Both join side 'commission' a limited to null, so a.commission = 
> b.commission should always be false.
>   // So all columns in right table b are expected to be null, this sql should 
> result in 0 rows.
>   final String sql = "select * from\n"
>   + " (select empid, salary, commission from emps where commission is 
> null) as a\n"
>   + " left join\n"
>   + " (select empid, salary, commission from emps where commission is 
> null) as b\n"
>   + " on a.salary = b.salary and a.commission = b.commission\n"
>   + " where b.empid is not null";
>   CalciteAssert.that()
>   .with(CalciteConnectionProperty.LEX, Lex.JAVA)
>   .with(CalciteConnectionProperty.FORCE_DECORRELATE, false)
>   .withSchema("s", new ReflectiveSchema(new HrSchemaBig()))
>   .query(sql)
>   .withHook(Hook.PLANNER, (Consumer) planner -> {
> planner.removeRule(EnumerableRules.ENUMERABLE_JOIN_RULE);
> planner.addRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE);
>   })
>   .explainContains("EnumerableHashJoin")
>   .returnsCount(0)
>   ;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-08-23 Thread Leonid Chistov (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonid Chistov updated CALCITE-5952:

Description: 
The following test will fail if added to RelOptRulesTest.java
{code:java}
@Test void testCanNotPushSemiJoinToRightJoinBranch() {
  final Function relFn = b -> b
  .scan("EMP")
  .scan("DEPT")
  .join(JoinRelType.LEFT,
  b.equals(
  b.field(2, 0, "DEPTNO"),
  b.field(2, 1, "DEPTNO"))
  )
  .scan("BONUS")
  // semi join only relates to RHS fields of left join
  .semiJoin(b.equals(
  b.field(2, 0, "DNAME"),
  b.field(2, 1, "JOB")))
  .build();
  relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
} {code}
Produced plan will look like:
{code:java}
LogicalJoin(condition=[=($7, $8)], joinType=[left])
  LogicalTableScan(table=[[scott, EMP]])
  LogicalJoin(condition=[=($1, $4)], joinType=[semi])
    LogicalTableScan(table=[[scott, DEPT]])
    LogicalTableScan(table=[[scott, BONUS]]) {code}
Which is different from the original plan:
{code:java}
LogicalJoin(condition=[=($9, $12)], joinType=[semi])
  LogicalJoin(condition=[=($7, $8)], joinType=[left])
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
  LogicalTableScan(table=[[scott, BONUS]]) {code}
This is not correct - in general case it is not correct to push semi-join to 
right side of left-join.

The reason is the following:

Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
will have *nulls* for *DEPT* columns in the result of left-join and they will 
be rejected by the top semi-join.

But if we push semi-join to RHS of left-join, we are going to see rows from 
*EMP* with *nulls* on the *DEPT* side in the final result.

  was:
The following test will fail if added to RelOptRulesTest.java
{code:java}
@Test void testCanNotPushSemiJoinToRightJoinBranch() {
  final Function relFn = b -> b
  .scan("EMP")
  .scan("DEPT")
  .join(JoinRelType.LEFT,
  b.equals(
  b.field(2, 0, "DEPTNO"),
  b.field(2, 1, "DEPTNO"))
  )
  .scan("BONUS")
  .semiJoin(b.equals(
  b.field(2, 0, "DNAME"),
  b.field(2, 1, "JOB")))
  .build();
  relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
} {code}
Produced plan will look like:
{code:java}
LogicalJoin(condition=[=($7, $8)], joinType=[left])
  LogicalTableScan(table=[[scott, EMP]])
  LogicalJoin(condition=[=($1, $4)], joinType=[semi])
    LogicalTableScan(table=[[scott, DEPT]])
    LogicalTableScan(table=[[scott, BONUS]]) {code}
Which is different from the original plan:
{code:java}
LogicalJoin(condition=[=($9, $12)], joinType=[semi])
  LogicalJoin(condition=[=($7, $8)], joinType=[left])
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
  LogicalTableScan(table=[[scott, BONUS]]) {code}
This is not correct - in general case it is not correct to push semi-join to 
right side of left-join.

The reason is the following:

Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
will have *nulls* for *DEPT* columns in the result of left-join and they will 
be rejected by the top semi-join.

But if we push semi-join to RHS of left-join, we are going to see rows from 
*EMP* with *nulls* on the *DEPT* side in the final result.


> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   // semi join only relates to RHS fields of left join
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)]

[jira] [Updated] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-08-23 Thread Leonid Chistov (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Leonid Chistov updated CALCITE-5952:

Description: 
The following test will fail if added to RelOptRulesTest.java
{code:java}
@Test void testCanNotPushSemiJoinToRightJoinBranch() {
  final Function relFn = b -> b
  .scan("EMP")
  .scan("DEPT")
  .join(JoinRelType.LEFT,
  b.equals(
  b.field(2, 0, "DEPTNO"),
  b.field(2, 1, "DEPTNO"))
  )
  .scan("BONUS")
  .semiJoin(b.equals(
  b.field(2, 0, "DNAME"),
  b.field(2, 1, "JOB")))
  .build();
  relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
} {code}
Produced plan will look like:
{code:java}
LogicalJoin(condition=[=($7, $8)], joinType=[left])
  LogicalTableScan(table=[[scott, EMP]])
  LogicalJoin(condition=[=($1, $4)], joinType=[semi])
    LogicalTableScan(table=[[scott, DEPT]])
    LogicalTableScan(table=[[scott, BONUS]]) {code}
Which is different from the original plan:
{code:java}
LogicalJoin(condition=[=($9, $12)], joinType=[semi])
  LogicalJoin(condition=[=($7, $8)], joinType=[left])
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
  LogicalTableScan(table=[[scott, BONUS]]) {code}
This is not correct - in general case it is not correct to push semi-join to 
right side of left-join.

The reason is the following:

Consider rows from *EMP* that have no matching rows in {*}DEPT{*}. These rows 
will have *nulls* for *DEPT* columns in the result of left-join and they will 
be rejected by the top semi-join.

But if we push semi-join to RHS of left-join, we are going to see rows from 
*EMP* with *nulls* on the *DEPT* side in the final result.

  was:
The following test will fail if added to RelOptRulesTest.java
{code:java}
@Test void testCanNotPushSemiJoinToRightJoinBranch() {
  final Function relFn = b -> b
  .scan("EMP")
  .scan("DEPT")
  .join(JoinRelType.LEFT,
  b.equals(
  b.field(2, 0, "DEPTNO"),
  b.field(2, 1, "DEPTNO"))
  )
  .scan("BONUS")
  .semiJoin(b.equals(
  b.field(2, 0, "DNAME"),
  b.field(2, 1, "JOB")))
  .build();
  relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
} {code}
Produced plan will look like:
{code:java}
LogicalJoin(condition=[=($7, $8)], joinType=[left])
  LogicalTableScan(table=[[scott, EMP]])
  LogicalJoin(condition=[=($1, $4)], joinType=[semi])
    LogicalTableScan(table=[[scott, DEPT]])
    LogicalTableScan(table=[[scott, BONUS]]) {code}
Which is different from the original plan:
{code:java}
LogicalJoin(condition=[=($9, $12)], joinType=[semi])
  LogicalJoin(condition=[=($7, $8)], joinType=[left])
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
  LogicalTableScan(table=[[scott, BONUS]]) {code}
This is not correct - in general case it is not correct to push semi-join to 
right side of left-join.

The reason is the following:

Consider rows from `EMP` that have no matching rows in DEPT. These rows will 
have `nulls` for `DEPT` columns in the result of left-join and they will be 
rejected by the top semi-join.

But if we push semi-join to RHS of left-join, we are going to see rows from 
`EMP` with `nulls` on the `DEPT` side in the final result.


> Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule
> ---
>
> Key: CALCITE-5952
> URL: https://issues.apache.org/jira/browse/CALCITE-5952
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.35.0
>Reporter: Leonid Chistov
>Assignee: Leonid Chistov
>Priority: Major
>
> The following test will fail if added to RelOptRulesTest.java
> {code:java}
> @Test void testCanNotPushSemiJoinToRightJoinBranch() {
>   final Function relFn = b -> b
>   .scan("EMP")
>   .scan("DEPT")
>   .join(JoinRelType.LEFT,
>   b.equals(
>   b.field(2, 0, "DEPTNO"),
>   b.field(2, 1, "DEPTNO"))
>   )
>   .scan("BONUS")
>   .semiJoin(b.equals(
>   b.field(2, 0, "DNAME"),
>   b.field(2, 1, "JOB")))
>   .build();
>   relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
> } {code}
> Produced plan will look like:
> {code:java}
> LogicalJoin(condition=[=($7, $8)], joinType=[left])
>   LogicalTableScan(table=[[scott, EMP]])
>   LogicalJoin(condition=[=($1, $4)], joinType=[semi])
>     LogicalTableScan(table=[[scott, DEPT]])
>     LogicalTableScan(table=[[scott, BONUS]]) {code}
> Which is different from the original plan:
> {code:java}
> LogicalJoin(condition=[=($9, $12)], joinType=[semi])
>   LogicalJoin(condition=[=($7, $8)], joinType=[left])
>     LogicalTableScan(table=[[scott, EMP]])
>     

[jira] [Created] (CALCITE-5952) Semi-Join incorrectly reordered with Left-Join by SemiJoinJoinTransposeRule

2023-08-23 Thread Leonid Chistov (Jira)
Leonid Chistov created CALCITE-5952:
---

 Summary: Semi-Join incorrectly reordered with Left-Join by 
SemiJoinJoinTransposeRule
 Key: CALCITE-5952
 URL: https://issues.apache.org/jira/browse/CALCITE-5952
 Project: Calcite
  Issue Type: Bug
Affects Versions: 1.35.0
Reporter: Leonid Chistov
Assignee: Leonid Chistov


The following test will fail if added to RelOptRulesTest.java
{code:java}
@Test void testCanNotPushSemiJoinToRightJoinBranch() {
  final Function relFn = b -> b
  .scan("EMP")
  .scan("DEPT")
  .join(JoinRelType.LEFT,
  b.equals(
  b.field(2, 0, "DEPTNO"),
  b.field(2, 1, "DEPTNO"))
  )
  .scan("BONUS")
  .semiJoin(b.equals(
  b.field(2, 0, "DNAME"),
  b.field(2, 1, "JOB")))
  .build();
  relFn(relFn).withRule(CoreRules.SEMI_JOIN_JOIN_TRANSPOSE).checkUnchanged();
} {code}
Produced plan will look like:
{code:java}
LogicalJoin(condition=[=($7, $8)], joinType=[left])
  LogicalTableScan(table=[[scott, EMP]])
  LogicalJoin(condition=[=($1, $4)], joinType=[semi])
    LogicalTableScan(table=[[scott, DEPT]])
    LogicalTableScan(table=[[scott, BONUS]]) {code}
Which is different from the original plan:
{code:java}
LogicalJoin(condition=[=($9, $12)], joinType=[semi])
  LogicalJoin(condition=[=($7, $8)], joinType=[left])
    LogicalTableScan(table=[[scott, EMP]])
    LogicalTableScan(table=[[scott, DEPT]])
  LogicalTableScan(table=[[scott, BONUS]]) {code}
This is not correct - in general case it is not correct to push semi-join to 
right side of left-join.

The reason is the following:

Consider rows from `EMP` that have no matching rows in DEPT. These rows will 
have `nulls` for `DEPT` columns in the result of left-join and they will be 
rejected by the top semi-join.

But if we push semi-join to RHS of left-join, we are going to see rows from 
`EMP` with `nulls` on the `DEPT` side in the final result.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5909) Sometimes SqlParserTest.testNoUnintendedNewReservedKeywords fails in the IDE but passes when run from the command line

2023-08-23 Thread Benchao Li (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17757903#comment-17757903
 ] 

Benchao Li commented on CALCITE-5909:
-

[~taoran] I see used a different commit message from the Jira title, usually we 
keep them the same, concisely describe the symptoms or what should be the 
expected behavior.

I presume that "Sometimes" is not needed, it should "always" fail if you run it 
in IDE directly, am I right? Then I propose to change the Jira title and commit 
message to "SqlParserTest.testNoUnintendedNewReservedKeywords fails in IDE 
while passes in command line", if you and others agree, I'll go on and merge it 
with this message.

> Sometimes SqlParserTest.testNoUnintendedNewReservedKeywords fails in the IDE 
> but passes when run from the command line
> --
>
> Key: CALCITE-5909
> URL: https://issues.apache.org/jira/browse/CALCITE-5909
> Project: Calcite
>  Issue Type: Bug
>  Components: tests
>Reporter: LakeShen
>Assignee: Ran Tao
>Priority: Minor
>  Labels: pull-request-available
> Attachments: image-2023-08-08-23-32-55-466.png
>
>
> When I run the SqlParserTest,the testNoUnintendedNewReservedKeywords method 
> failed,the exception like this:
> {code:java}
> java.lang.AssertionError: The parser has at least one new reserved keyword. 
> Are you sure it should be reserved? Difference: {code}
> The picture like this:
> !image-2023-08-08-23-32-55-466.png|width=1543,height=496!
> I could fix this problem.More importantly, why is this method failing, but 
> the Calcite pipeline is passing? I think we should look at something we missed



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Updated] (CALCITE-5732) EnumerableHashJoin and EnumerableMergeJoin on composite key return rows matching condition 'null = null'

2023-08-23 Thread Ruben Q L (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5732?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5732:
---
Component/s: core

> EnumerableHashJoin and EnumerableMergeJoin on composite key return rows 
> matching condition 'null = null'
> 
>
> Key: CALCITE-5732
> URL: https://issues.apache.org/jira/browse/CALCITE-5732
> Project: Calcite
>  Issue Type: Bug
>  Components: core, linq4j
>Reporter: Viggo Chen
>Assignee: Ruben Q L
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> In `EnumerableJoinTest#testMergeJoinWithCompositeKeyAndNullValues`, the query 
> is like 
> {code:java}
> select 
>   emps.empid
> from 
>   emps a join emps b
> on a.deptno = b.deptno
> and a.commission = b.commission;{code}
> and the data is like 
> {code:java}
>   INSERT INTO "emps" VALUES (100, 10, 'Bill', 1, 1000);
>   INSERT INTO "emps" VALUES (200, 20, 'Eric', 8000, 500);
>   INSERT INTO "emps" VALUES (150, 10, 'Sebastian', 7000, null);
>   INSERT INTO "emps" VALUES (110, 10, 'Theodore', 11500, 250); {code}
> And row with empid = 150 is in expected result. Is this the expected result 
> of join with null condition.
> Whats more hash join result with condition a.deptno = b.deptno and 
> a.commission = b.commission is same as merge join. And if there is just one 
> condition a.commission = b.commission, the result do not include empid = 150.
>  
> Here is a unit test for it
> {code:java}
> @Test void testHashJoinWithCompositeKeyAndNullValues() {
>   // Both join side 'commission' a limited to null, so a.commission = 
> b.commission should always be false.
>   // So all columns in right table b are expected to be null, this sql should 
> result in 0 rows.
>   final String sql = "select * from\n"
>   + " (select empid, salary, commission from emps where commission is 
> null) as a\n"
>   + " left join\n"
>   + " (select empid, salary, commission from emps where commission is 
> null) as b\n"
>   + " on a.salary = b.salary and a.commission = b.commission\n"
>   + " where b.empid is not null";
>   CalciteAssert.that()
>   .with(CalciteConnectionProperty.LEX, Lex.JAVA)
>   .with(CalciteConnectionProperty.FORCE_DECORRELATE, false)
>   .withSchema("s", new ReflectiveSchema(new HrSchemaBig()))
>   .query(sql)
>   .withHook(Hook.PLANNER, (Consumer) planner -> {
> planner.removeRule(EnumerableRules.ENUMERABLE_JOIN_RULE);
> planner.addRule(EnumerableRules.ENUMERABLE_MERGE_JOIN_RULE);
>   })
>   .explainContains("EnumerableHashJoin")
>   .returnsCount(0)
>   ;
> } {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)