[jira] [Commented] (CALCITE-5698) EXTRACT from INTERVAL partially does not follow the SQL standard

2023-05-16 Thread Evgeny Stanilovsky (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723347#comment-17723347
 ] 

Evgeny Stanilovsky commented on CALCITE-5698:
-

all done, thanks.

> EXTRACT from INTERVAL partially does not follow the SQL standard
> 
>
> Key: CALCITE-5698
> URL: https://issues.apache.org/jira/browse/CALCITE-5698
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Evgeny Stanilovsky
>Assignee: Evgeny Stanilovsky
>Priority: Major
>  Labels: pull-request-available
>
> From the SQL standard we can find:
> {noformat}
> ISO/IEC 9075-2:1999 (E)
> 6.17 <numeric value function>:
> If <extract field> is a <primary datetime field>, then the result is the 
> value of the datetime
> field identified by that <primary datetime field> and has the same sign as 
> the <extract source>.{noformat}
> Other databases follow this rule, e.g.:
> {noformat}
> SELECT
>   EXTRACT (MONTH FROM INTERVAL '-1' MONTH)
> FROM
>   DUAL;{noformat}
> returns: *-1*
> and another one:
> {noformat}
> SELECT EXTRACT(MONTH FROM INTERVAL '-1 MONTHS'){noformat}
> returns the same result,
> while Calcite:
> {noformat}
> SELECT EXTRACT(MONTH FROM INTERVAL -1 MONTHS) AS extracted;{noformat}
> {noformat}
> < +---------+
> < |      -1 |
> < +---------+
> ---
> > +-----------+
> > |        11 |
> > +-----------+{noformat}
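The -1 versus 11 difference in the report can be reproduced with plain integer arithmetic. The sketch below is only an illustration: it assumes a year-month interval is held as a signed count of months, and the class and method names are hypothetical, not Calcite's code. A truncating remainder keeps the sign of the source, as the standard requires, while a floor-style modulus turns -1 into 11. Whether Calcite's result came from this exact arithmetic is not stated in the issue.

```java
// Hypothetical sketch, not Calcite's implementation.
// Assumption: a year-month interval is represented as a signed number of months.
class IntervalExtractSketch {
  /** MONTH field that keeps the sign of the interval. */
  static int extractMonthKeepingSign(int totalMonths) {
    // Java's % truncates toward zero, so the result has the sign of totalMonths.
    return totalMonths % 12;
  }

  /** MONTH field computed with a floor modulus, which drops the sign. */
  static int extractMonthFloorMod(int totalMonths) {
    return Math.floorMod(totalMonths, 12);
  }

  public static void main(String[] args) {
    System.out.println(extractMonthKeepingSign(-1)); // -1
    System.out.println(extractMonthFloorMod(-1));    // 11
  }
}
```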



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (CALCITE-5704) Add ARRAY_EXCEPT, ARRAY_INTERSECT and ARRAY_UNION for Spark dialect

2023-05-16 Thread jackylau (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723334#comment-17723334
 ] 

jackylau commented on CALCITE-5704:
---

[~julianhyde] but I can't find the definitions of these functions in the SQL 
standard.

While developing it I found MULTISET_UNION/MULTISET_INTERSECT/..., so I 
referenced some of their contents.

> Add ARRAY_EXCEPT, ARRAY_INTERSECT and ARRAY_UNION for Spark dialect
> ---
>
> Key: CALCITE-5704
> URL: https://issues.apache.org/jira/browse/CALCITE-5704
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
> Fix For: 1.35.0
>
>
> array_union(array1, array2) - Returns an array of the elements in the union 
> of array1 and array2, without duplicates.
> array_intersect(array1, array2) - Returns an array of the elements in the 
> intersection of array1 and array2, without duplicates.
> array_except(array1, array2) - Returns an array of the elements in array1 but 
> not in array2, without duplicates.
> For more details
> [https://spark.apache.org/docs/latest/api/sql/index.html]
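The three descriptions above can be sketched with ordinary Java collections. This is an illustration of the stated semantics only, not the Calcite or Spark implementation. The class and method names are hypothetical. Keeping elements in order of first appearance and ignoring null arrays are assumptions made here for brevity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the described semantics.
class ArraySetOpsSketch {
  /** Elements of a or b, without duplicates. */
  static <T> List<T> arrayUnion(List<T> a, List<T> b) {
    Set<T> result = new LinkedHashSet<>(a);
    result.addAll(b);
    return new ArrayList<>(result);
  }

  /** Elements present in both a and b, without duplicates. */
  static <T> List<T> arrayIntersect(List<T> a, List<T> b) {
    Set<T> result = new LinkedHashSet<>(a);
    result.retainAll(new HashSet<>(b));
    return new ArrayList<>(result);
  }

  /** Elements of a that are not in b, without duplicates. */
  static <T> List<T> arrayExcept(List<T> a, List<T> b) {
    Set<T> result = new LinkedHashSet<>(a);
    result.removeAll(new HashSet<>(b));
    return new ArrayList<>(result);
  }

  public static void main(String[] args) {
    List<Integer> a = Arrays.asList(1, 2, 2, 3);
    List<Integer> b = Arrays.asList(3, 4, 4);
    System.out.println(arrayUnion(a, b));     // [1, 2, 3, 4]
    System.out.println(arrayIntersect(a, b)); // [3]
    System.out.println(arrayExcept(a, b));    // [1, 2]
  }
}
```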





[jira] [Commented] (CALCITE-5657) Add ARRAY_DISTINCT for Spark dialect

2023-05-16 Thread jackylau (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723329#comment-17723329
 ] 

jackylau commented on CALCITE-5657:
---

[~libenchao] thanks very much

> Add ARRAY_DISTINCT for Spark dialect
> 
>
> Key: CALCITE-5657
> URL: https://issues.apache.org/jira/browse/CALCITE-5657
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.34.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {{ARRAY_DISTINCT}} - Returns the input ARRAY with unique elements. If the 
> array itself is null, the function will return null. Keeps ordering of 
> elements.
> For more details
> https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html
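A minimal sketch of the behaviour described above: a null array yields null, duplicates are dropped, and the first occurrence of each element keeps its position. The class and method names are hypothetical, and this is not the code from the pull request.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;

// Hypothetical sketch of the described ARRAY_DISTINCT semantics.
class ArrayDistinctSketch {
  static <T> List<T> arrayDistinct(List<T> array) {
    if (array == null) {
      return null; // a null array gives a null result
    }
    // LinkedHashSet drops duplicates and preserves insertion order.
    return new ArrayList<>(new LinkedHashSet<>(array));
  }

  public static void main(String[] args) {
    System.out.println(arrayDistinct(Arrays.asList(3, 1, 3, 2, 1))); // [3, 1, 2]
  }
}
```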





[jira] [Commented] (CALCITE-5657) Add ARRAY_DISTINCT for Spark dialect

2023-05-16 Thread Benchao Li (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723321#comment-17723321
 ] 

Benchao Li commented on CALCITE-5657:
-

[~jackylau] I reopened the issue, and marked it as "resolved" instead of 
"closed" (that's what we usually do in Calcite). Besides, I've added your Jira 
account as Calcite Contributor, and assigned this issue to you.

> Add ARRAY_DISTINCT for Spark dialect
> 
>
> Key: CALCITE-5657
> URL: https://issues.apache.org/jira/browse/CALCITE-5657
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.34.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {{ARRAY_DISTINCT}} - Returns the input ARRAY with unique elements. If the 
> array itself is null, the function will return null. Keeps ordering of 
> elements.
> For more details
> https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html





[jira] [Assigned] (CALCITE-5657) Add ARRAY_DISTINCT for Spark dialect

2023-05-16 Thread Benchao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benchao Li reassigned CALCITE-5657:
---

Assignee: jackylau

> Add ARRAY_DISTINCT for Spark dialect
> 
>
> Key: CALCITE-5657
> URL: https://issues.apache.org/jira/browse/CALCITE-5657
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.34.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {{ARRAY_DISTINCT}} - Returns the input ARRAY with unique elements. If the 
> array itself is null, the function will return null. Keeps ordering of 
> elements.
> For more details
> https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html





[jira] [Resolved] (CALCITE-5657) Add ARRAY_DISTINCT for Spark dialect

2023-05-16 Thread Benchao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benchao Li resolved CALCITE-5657.
-
Fix Version/s: 1.35.0
   Resolution: Fixed

> Add ARRAY_DISTINCT for Spark dialect
> 
>
> Key: CALCITE-5657
> URL: https://issues.apache.org/jira/browse/CALCITE-5657
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.34.0
>Reporter: jackylau
>Assignee: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {{ARRAY_DISTINCT}} - Returns the input ARRAY with unique elements. If the 
> array itself is null, the function will return null. Keeps ordering of 
> elements.
> For more details
> https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html





[jira] [Reopened] (CALCITE-5657) Add ARRAY_DISTINCT for Spark dialect

2023-05-16 Thread Benchao Li (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Benchao Li reopened CALCITE-5657:
-

> Add ARRAY_DISTINCT for Spark dialect
> 
>
> Key: CALCITE-5657
> URL: https://issues.apache.org/jira/browse/CALCITE-5657
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.34.0
>Reporter: jackylau
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 1h 50m
>  Remaining Estimate: 0h
>
> {{ARRAY_DISTINCT}} - Returns the input ARRAY with unique elements. If the 
> array itself is null, the function will return null. Keeps ordering of 
> elements.
> For more details
> https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html





[jira] [Comment Edited] (CALCITE-5700) Add ARRAY_SIZE and ARRAY_REPEAT for Spark dialect

2023-05-16 Thread Jiajun Xie (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723315#comment-17723315
 ] 

Jiajun Xie edited comment on CALCITE-5700 at 5/17/23 2:01 AM:
--

[~jackylau], thanks for the PR.

Fixed in 
[48f51ea.|https://github.com/apache/calcite/commit/48f51ea5d9443fb11e070eea094bf551c8ff22fc]


was (Author: jiajunbernoulli):
[~jackylau], thanks for the PR.

Fixed in 
[48f51ea.|https://github.com/apache/calcite/commit/48f51ea5d9443fb11e070eea094bf551c8ff22fc]

 

> Add ARRAY_SIZE and ARRAY_REPEAT for Spark dialect
> -
>
> Key: CALCITE-5700
> URL: https://issues.apache.org/jira/browse/CALCITE-5700
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>
> ARRAY_SIZE - Returns the size of an array. 
> ARRAY_REPEAT - Returns the array containing element count times.
>  
> For more details
> https://spark.apache.org/docs/latest/api/sql/index.html
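The two functions described above can be sketched as follows. The names are hypothetical and this is not the committed implementation. Treating a negative count as zero is an assumption made for this sketch, and null handling is left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of the described ARRAY_SIZE and ARRAY_REPEAT semantics.
class ArraySizeRepeatSketch {
  /** Number of elements in the array. */
  static <T> int arraySize(List<T> array) {
    return array.size();
  }

  /** Array containing the element repeated count times. */
  static <T> List<T> arrayRepeat(T element, int count) {
    // Assumption: a negative count is treated like zero.
    return new ArrayList<>(Collections.nCopies(Math.max(count, 0), element));
  }

  public static void main(String[] args) {
    System.out.println(arraySize(Arrays.asList("a", "b", "c"))); // 3
    System.out.println(arrayRepeat("x", 3));                     // [x, x, x]
  }
}
```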





[jira] [Comment Edited] (CALCITE-5700) Add ARRAY_SIZE and ARRAY_REPEAT for Spark dialect

2023-05-16 Thread Jiajun Xie (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723315#comment-17723315
 ] 

Jiajun Xie edited comment on CALCITE-5700 at 5/17/23 2:01 AM:
--

[~jackylau], thanks for the PR.

Fixed in 
[48f51ea.|https://github.com/apache/calcite/commit/48f51ea5d9443fb11e070eea094bf551c8ff22fc]

 


was (Author: jiajunbernoulli):
Fixed in 
[48f51ea|https://github.com/apache/calcite/commit/48f51ea5d9443fb11e070eea094bf551c8ff22fc].

> Add ARRAY_SIZE and ARRAY_REPEAT for Spark dialect
> -
>
> Key: CALCITE-5700
> URL: https://issues.apache.org/jira/browse/CALCITE-5700
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>
> ARRAY_SIZE - Returns the size of an array. 
> ARRAY_REPEAT - Returns the array containing element count times.
>  
> For more details
> https://spark.apache.org/docs/latest/api/sql/index.html





[jira] [Updated] (CALCITE-5706) Add class PairList

2023-05-16 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde updated CALCITE-5706:
-
Description: 
Add a {{class PairList<K, V>}} which is an implementation of 
{{List<Map.Entry<K, V>>}} backed by a single list. Each entry to the 
{{PairList}} corresponds to two entries in the backing list, but we save 
ourselves the effort of creating {{Map.Entry}} wrappers.

A {{PairList}} can be used to build two lists in parallel (e.g. a list of field 
types and field names that will be converted to a struct type); it can also be 
used to build maps.

It has a {{forEach(BiConsumer)}} method to allow the list to be 
deconstructed without creating intermediate entries.

Potentially also {{toImmutableMap}} and {{toHashMap}} methods.

  was:
Add a {{class PairList}} which an implementation of {{List}} 
backed by a single list. It can be used to build two lists in parallel (e.g. a 
list of field types and field names that will be converted to a struct type); 
it can also be used to build maps.

It has a {{forEach(BiConsumer)}} method to allow the list to be 
deconstructed without creating intermediate entries.

Potentially also {{toImmutableMap}} and {{toHashMap}} methods.


> Add class PairList
> --
>
> Key: CALCITE-5706
> URL: https://issues.apache.org/jira/browse/CALCITE-5706
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>
> Add a {{class PairList<K, V>}} which is an implementation of 
> {{List<Map.Entry<K, V>>}} backed by a single list. Each entry to the 
> {{PairList}} corresponds to two entries in the backing list, but we save 
> ourselves the effort of creating {{Map.Entry}} wrappers.
> A {{PairList}} can be used to build two lists in parallel (e.g. a list of 
> field types and field names that will be converted to a struct type); it can 
> also be used to build maps.
> It has a {{forEach(BiConsumer)}} method to allow the list to be 
> deconstructed without creating intermediate entries.
> Potentially also {{toImmutableMap}} and {{toHashMap}} methods.
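The idea in the description, keys and values interleaved in one backing list and a forEach that hands both halves to a BiConsumer, might look roughly like the sketch below. It is a simplified illustration with hypothetical names. It does not implement List, and the class that is eventually added to Calcite may differ in every detail.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiConsumer;

// Hypothetical sketch of the idea; not the Calcite class.
class PairListSketch<K, V> {
  // One backing list: keys at even indexes, values at odd indexes.
  private final List<Object> backing = new ArrayList<>();

  /** Appends one pair, which occupies two slots of the backing list. */
  public void add(K key, V value) {
    backing.add(key);
    backing.add(value);
  }

  /** Number of pairs, i.e. half the size of the backing list. */
  public int size() {
    return backing.size() / 2;
  }

  /** Visits each pair without allocating a Map.Entry wrapper. */
  @SuppressWarnings("unchecked")
  public void forEach(BiConsumer<? super K, ? super V> consumer) {
    for (int i = 0; i < backing.size(); i += 2) {
      consumer.accept((K) backing.get(i), (V) backing.get(i + 1));
    }
  }

  public static void main(String[] args) {
    PairListSketch<String, Integer> fields = new PairListSketch<>();
    fields.add("empno", 0);
    fields.add("deptno", 1);
    fields.forEach((name, ordinal) -> System.out.println(name + "=" + ordinal));
  }
}
```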





[jira] [Resolved] (CALCITE-5700) Add ARRAY_SIZE and ARRAY_REPEAT for Spark dialect

2023-05-16 Thread Jiajun Xie (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jiajun Xie resolved CALCITE-5700.
-
Resolution: Fixed

> Add ARRAY_SIZE and ARRAY_REPEAT for Spark dialect
> -
>
> Key: CALCITE-5700
> URL: https://issues.apache.org/jira/browse/CALCITE-5700
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>
> ARRAY_SIZE - Returns the size of an array. 
> ARRAY_REPEAT - Returns the array containing element count times.
>  
> For more details
> https://spark.apache.org/docs/latest/api/sql/index.html





[jira] [Commented] (CALCITE-5700) Add ARRAY_SIZE and ARRAY_REPEAT for Spark dialect

2023-05-16 Thread Jiajun Xie (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723315#comment-17723315
 ] 

Jiajun Xie commented on CALCITE-5700:
-

Fixed in 
[48f51ea|https://github.com/apache/calcite/commit/48f51ea5d9443fb11e070eea094bf551c8ff22fc].

> Add ARRAY_SIZE and ARRAY_REPEAT for Spark dialect
> -
>
> Key: CALCITE-5700
> URL: https://issues.apache.org/jira/browse/CALCITE-5700
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>
> ARRAY_SIZE - Returns the size of an array. 
> ARRAY_REPEAT - Returns the array containing element count times.
>  
> For more details
> https://spark.apache.org/docs/latest/api/sql/index.html





[jira] [Assigned] (CALCITE-5706) Add class PairList

2023-05-16 Thread Julian Hyde (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5706?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Julian Hyde reassigned CALCITE-5706:


Assignee: Julian Hyde

> Add class PairList
> --
>
> Key: CALCITE-5706
> URL: https://issues.apache.org/jira/browse/CALCITE-5706
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
>
> Add a {{class PairList}} which is an implementation of {{List}} 
> backed by a single list. It can be used to build two lists in parallel (e.g. 
> a list of field types and field names that will be converted to a struct 
> type); it can also be used to build maps.
> It has a {{forEach(BiConsumer)}} method to allow the list to be 
> deconstructed without creating intermediate entries.
> Potentially also {{toImmutableMap}} and {{toHashMap}} methods.





[jira] [Created] (CALCITE-5706) Add class PairList

2023-05-16 Thread Julian Hyde (Jira)
Julian Hyde created CALCITE-5706:


 Summary: Add class PairList
 Key: CALCITE-5706
 URL: https://issues.apache.org/jira/browse/CALCITE-5706
 Project: Calcite
  Issue Type: Bug
Reporter: Julian Hyde


Add a {{class PairList}} which is an implementation of {{List}} 
backed by a single list. It can be used to build two lists in parallel (e.g. a 
list of field types and field names that will be converted to a struct type); 
it can also be used to build maps.

It has a {{forEach(BiConsumer)}} method to allow the list to be 
deconstructed without creating intermediate entries.

Potentially also {{toImmutableMap}} and {{toHashMap}} methods.





[jira] [Commented] (CALCITE-4555) Invalid zero literal value is used for TIMESTAMP WITH LOCAL TIME ZONE type in RexBuilder

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723302#comment-17723302
 ] 

Julian Hyde commented on CALCITE-4555:
--

You're right; sorry. I think you should commit a fix without a SQL test case.

> Invalid zero literal value is used for TIMESTAMP WITH LOCAL TIME ZONE type in 
> RexBuilder
> 
>
> Key: CALCITE-4555
> URL: https://issues.apache.org/jira/browse/CALCITE-4555
> Project: Calcite
>  Issue Type: Bug
>Reporter: Leonard Xu
>Priority: Major
>
>  The zero literal value for TIMESTAMP WITH LOCAL TIME ZONE type is used in 
> `org.apache.calcite.rex.RexBuilder`
> {code:java}
> case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
>   return new TimestampString(0, 0, 0, 0, 0, 0);
>//TimestampString(int year, int month, int day, int h, int m, int s)
> {code}
> The month and day should never be zero; I think the zero value should be 
> '1970-01-01 00:00:00' (epoch second 0).
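The point about month and day can be checked with java.time, which is used here only as a stand-in for Calcite's TimestampString. The class and method names are hypothetical. The sketch shows that an all-zero set of fields is not a valid date-time, while epoch second 0 corresponds to 1970-01-01 00:00:00.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.ZoneOffset;

// Hypothetical sketch; java.time stands in for TimestampString.
class ZeroTimestampSketch {
  /** Returns true if the six fields form a valid date-time. */
  static boolean isValidDateTime(int year, int month, int day,
      int hour, int minute, int second) {
    try {
      LocalDateTime.of(year, month, day, hour, minute, second);
      return true;
    } catch (DateTimeException e) {
      return false; // e.g. month 0 or day 0
    }
  }

  /** The date-time at epoch second 0, in UTC. */
  static LocalDateTime epochZero() {
    return LocalDateTime.ofEpochSecond(0, 0, ZoneOffset.UTC);
  }

  public static void main(String[] args) {
    System.out.println(isValidDateTime(0, 0, 0, 0, 0, 0)); // false
    System.out.println(epochZero());                       // 1970-01-01T00:00
  }
}
```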





[jira] [Commented] (CALCITE-5695) Add MAP_KEYS and MAP_VALUES for Spark dialect

2023-05-16 Thread jackylau (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723301#comment-17723301
 ] 

jackylau commented on CALCITE-5695:
---

Hi [~julianhyde], I fixed the description of the issue and the PR. Do you have 
time to review it?

> Add MAP_KEYS and MAP_VALUES for Spark dialect
> -
>
> Key: CALCITE-5695
> URL: https://issues.apache.org/jira/browse/CALCITE-5695
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>
> {{MAP_KEYS}} - Returns the keys of the map as an array, the order of the 
> entries is not defined
> {{MAP_VALUES}} - Returns the values of the map as an array, the order of the 
> entries is not defined
> For more details
> [https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html]
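As a rough illustration of the two descriptions above, the sketch below copies a map's keys and values into lists. The names are hypothetical and this is not the code under review. The order of the returned elements simply follows the map's iteration order here, so callers should not rely on it, in line with "the order of the entries is not defined".

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the described MAP_KEYS and MAP_VALUES semantics.
class MapKeysValuesSketch {
  /** Keys of the map as an array; the order is not defined. */
  static <K, V> List<K> mapKeys(Map<K, V> map) {
    return new ArrayList<>(map.keySet());
  }

  /** Values of the map as an array; the order is not defined. */
  static <K, V> List<V> mapValues(Map<K, V> map) {
    return new ArrayList<>(map.values());
  }

  public static void main(String[] args) {
    Map<String, Integer> map = new HashMap<>();
    map.put("a", 1);
    map.put("b", 2);
    System.out.println(mapKeys(map).size());   // 2
    System.out.println(mapValues(map).size()); // 2
  }
}
```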





[jira] [Updated] (CALCITE-5695) Add MAP_KEYS and MAP_VALUES for Spark dialect

2023-05-16 Thread jackylau (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jackylau updated CALCITE-5695:
--
Description: 
{{MAP_KEYS}} - Returns the keys of the map as an array, the order of the 
entries is not defined

{{MAP_VALUES}} - Returns the values of the map as an array, the order of the 
entries is not defined

For more details
[https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html]

  was:
{{MAP_KEYS}} - Returns the keys of the map as an array; the order of the 
entries is not defined

{{MAP_VALUES}} - Returns the values of the map as an array; the order of the 
entries is not defined

For more details
[https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html]


> Add MAP_KEYS and MAP_VALUES for Spark dialect
> -
>
> Key: CALCITE-5695
> URL: https://issues.apache.org/jira/browse/CALCITE-5695
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>
> {{MAP_KEYS}} - Returns the keys of the map as an array, the order of the 
> entries is not defined
> {{MAP_VALUES}} - Returns the values of the map as an array, the order of the 
> entries is not defined
> For more details
> [https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html]





[jira] [Updated] (CALCITE-5695) Add MAP_KEYS and MAP_VALUES for Spark dialect

2023-05-16 Thread jackylau (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jackylau updated CALCITE-5695:
--
Description: 
{{MAP_KEYS}} - Returns the keys of the map as an array; the order of the 
entries is not defined

{{MAP_VALUES}} - Returns the values of the map as an array; the order of the 
entries is not defined

For more details
[https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html]

  was:
{{MAP_KEYS}} - Returns an unordered array containing the keys of the map. If 
the map itself is null, the function will return null. 

{{MAP_VALUES}} - Returns an unordered array containing the values of the map. 
If the map itself is null, the function will return null. 

For more details
[https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html]


> Add MAP_KEYS and MAP_VALUES for Spark dialect
> -
>
> Key: CALCITE-5695
> URL: https://issues.apache.org/jira/browse/CALCITE-5695
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>
> {{MAP_KEYS}} - Returns the keys of the map as an array; the order of the 
> entries is not defined
> {{MAP_VALUES}} - Returns the values of the map as an array; the order of the 
> entries is not defined
> For more details
> [https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html]





[jira] [Commented] (CALCITE-4555) Invalid zero literal value is used for TIMESTAMP WITH LOCAL TIME ZONE type in RexBuilder

2023-05-16 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723290#comment-17723290
 ] 

Sergey Nuyanzin commented on CALCITE-4555:
--

Yeah, I also noticed that query.

However, further analysis shows that 
{{org.apache.calcite.rex.RexBuilder#makeZeroLiteral}} is called only from:
1. {{org.apache.calcite.rex.RexBuilder#makeCastExactToBoolean}}
2. {{org.apache.calcite.rex.RexBuilder#makeCastBooleanToExact}}
Both are called from {{org.apache.calcite.rex.RexBuilder#makeCast}} when 
casting a boolean to an exact number or an exact number to a boolean. This is 
what happens in the mentioned BigQuery SQL.

3. {{org.apache.calcite.sql2rel.RelFieldTrimmer#trimChildRestore}}, which is 
called from the +unused+ public method 
{{org.apache.calcite.sql2rel.RelFieldTrimmer#trimFields}}.

Thus there is no way to invoke this method for any query related to {{TIMESTAMP 
WITH LOCAL TIME ZONE}}, since in that case 
{{org.apache.calcite.rex.RexBuilder#makeAbstractCast}} is called instead.

> Invalid zero literal value is used for TIMESTAMP WITH LOCAL TIME ZONE type in 
> RexBuilder
> 
>
> Key: CALCITE-4555
> URL: https://issues.apache.org/jira/browse/CALCITE-4555
> Project: Calcite
>  Issue Type: Bug
>Reporter: Leonard Xu
>Priority: Major
>
>  The zero literal value for TIMESTAMP WITH LOCAL TIME ZONE type is used in 
> `org.apache.calcite.rex.RexBuilder`
> {code:java}
> case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
>   return new TimestampString(0, 0, 0, 0, 0, 0);
>//TimestampString(int year, int month, int day, int h, int m, int s)
> {code}
> The month and day should never be zero; I think the zero value should be 
> '1970-01-01 00:00:00' (epoch second 0).





[jira] [Updated] (CALCITE-5701) Add NAMED_STRUCT function (enabled in Spark library)

2023-05-16 Thread Guillaume Massé (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guillaume Massé updated CALCITE-5701:
-
Description: 
[https://spark.apache.org/docs/3.4.0/api/sql/index.html#named_struct]

 
{code:java}
spark.sql("""select named_struct("a", 1, "b", 2)""")
res4: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, 2): struct<a: int, b: int>]


Calcite:
SELECT named_struct('a', 1, 'b', 2);
type: row(a int not null, b int not null){code}
 

It's also possible to be nested:
{code:java}
spark.sql("""select named_struct("a", 1, "b", named_struct("c", 2))""")
res5: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, named_struct(c, 
2)): struct<a: int, b: struct<c: int>>] {code}
{code:java}
Calcite:
SELECT named_struct('a', 1, 'b', named_struct('c', 2));
type: row(a int not null, b row(c int not null) not null){code}

  was:
[https://spark.apache.org/docs/3.4.0/api/sql/index.html#named_struct]

 
{code:java}
spark.sql("""select named_struct("a", 1, "b", 2)""")
res4: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, 2): struct<a: int, b: int>]


Calcite:
SELECT named_struct("a", 1, "b", 2);
type: row(a int not null, b int not null){code}
 

It's also possible to be nested:
{code:java}
spark.sql("""select named_struct("a", 1, "b", named_struct("c", 2))""")
res5: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, named_struct(c, 
2)): struct<a: int, b: struct<c: int>>] {code}
{code:java}
Calcite:
SELECT named_struct("a", 1, "b", named_struct("c", 2));
type: row(a int not null, b row(c int not null) not null){code}


> Add NAMED_STRUCT function (enabled in Spark library)
> 
>
> Key: CALCITE-5701
> URL: https://issues.apache.org/jira/browse/CALCITE-5701
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Guillaume Massé
>Priority: Minor
>
> [https://spark.apache.org/docs/3.4.0/api/sql/index.html#named_struct]
>  
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", 2)""")
> res4: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, 2): struct<a: int, b: int>]
> Calcite:
> SELECT named_struct('a', 1, 'b', 2);
> type: row(a int not null, b int not null){code}
>  
> It's also possible to be nested:
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", named_struct("c", 2))""")
> res5: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, named_struct(c, 
> 2)): struct<a: int, b: struct<c: int>>] {code}
> {code:java}
> Calcite:
> SELECT named_struct('a', 1, 'b', named_struct('c', 2));
> type: row(a int not null, b row(c int not null) not null){code}
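The way alternating name and value arguments become a struct, including the nested case, can be sketched in plain Java. The names are hypothetical, and an insertion-ordered map stands in for the row type. This is an illustration of the argument layout only, not the proposed Calcite implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a LinkedHashMap stands in for a struct/row value.
class NamedStructSketch {
  /** Builds a struct from arguments laid out as name1, value1, name2, value2, ... */
  static Map<String, Object> namedStruct(Object... nameValuePairs) {
    if (nameValuePairs.length % 2 != 0) {
      throw new IllegalArgumentException("expected name/value pairs");
    }
    Map<String, Object> struct = new LinkedHashMap<>();
    for (int i = 0; i < nameValuePairs.length; i += 2) {
      struct.put((String) nameValuePairs[i], nameValuePairs[i + 1]);
    }
    return struct;
  }

  public static void main(String[] args) {
    // A value may itself be a struct, giving the nested form.
    System.out.println(namedStruct("a", 1, "b", namedStruct("c", 2)));
    // prints {a=1, b={c=2}}
  }
}
```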





[jira] [Commented] (CALCITE-5698) EXTRACT from INTERVAL partially does not follow the SQL standard

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5698?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723285#comment-17723285
 ] 

Julian Hyde commented on CALCITE-5698:
--

I reviewed, and suggested minor changes. "EXTRACT from INTERVAL should return 
negative numbers if interval is negative" would be a better summary and commit 
message, in my opinion.

> EXTRACT from INTERVAL partially does not follow the SQL standard
> 
>
> Key: CALCITE-5698
> URL: https://issues.apache.org/jira/browse/CALCITE-5698
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Evgeny Stanilovsky
>Assignee: Evgeny Stanilovsky
>Priority: Major
>  Labels: pull-request-available
>
> From the SQL standard we can find:
> {noformat}
> ISO/IEC 9075-2:1999 (E)
> 6.17 <numeric value function>:
> If <extract field> is a <primary datetime field>, then the result is the 
> value of the datetime
> field identified by that <primary datetime field> and has the same sign as 
> the <extract source>.{noformat}
> Other databases follow this rule, e.g.:
> {noformat}
> SELECT
>   EXTRACT (MONTH FROM INTERVAL '-1' MONTH)
> FROM
>   DUAL;{noformat}
> returns: *-1*
> and another one:
> {noformat}
> SELECT EXTRACT(MONTH FROM INTERVAL '-1 MONTHS'){noformat}
> returns the same result,
> while Calcite:
> {noformat}
> SELECT EXTRACT(MONTH FROM INTERVAL -1 MONTHS) AS extracted;{noformat}
> {noformat}
> < +---------+
> < |      -1 |
> < +---------+
> ---
> > +-----------+
> > |        11 |
> > +-----------+{noformat}





[jira] [Commented] (CALCITE-5564) Support 2-argument PERCENTILE_CONT, PERCENTILE_DISC aggregate functions (as in BigQuery)

2023-05-16 Thread Tanner Clary (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723284#comment-17723284
 ] 

Tanner Clary commented on CALCITE-5564:
---

[~julianhyde] Description should be updated now.

> Support 2-argument PERCENTILE_CONT, PERCENTILE_DISC aggregate functions (as 
> in BigQuery)
> 
>
> Key: CALCITE-5564
> URL: https://issues.apache.org/jira/browse/CALCITE-5564
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Tanner Clary
>Assignee: Tanner Clary
>Priority: Major
>
> Calcite currently has implementations for the {{PERCENTILE_CONT}} and 
> {{PERCENTILE_DISC}} functions. Their syntax may be found 
> [here|https://learn.microsoft.com/en-us/sql/t-sql/functions/percentile-cont-transact-sql?view=sql-server-ver16].
>  
> BigQuery offers these functions as well, but the syntax is slightly 
> different, and may be found 
> [here|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#percentile_cont].
>  The main difference is that instead of using a {{WITHIN GROUP}} clause, the 
> array is passed in directly as the first argument to the function.
> BigQuery Syntax Example: {{SELECT PERCENTILE_CONT(x, .5) OVER() FROM 
> UNNEST([1,2,3,4]) as x;}} would return the median, 2.5.
> Standard Syntax Example: {{SELECT PERCENTILE_CONT(.5) WITHIN GROUP (ORDER BY 
> [some column])}}
> Parsing and validation for the standard functions was added in CALCITE-4644.
> The actual implementation for both the standard and BigQuery forms is covered 
> under CALCITE-4666.
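The median example above, PERCENTILE_CONT over 1, 2, 3, 4 at .5 giving 2.5, follows from linear interpolation between the two nearest sorted values. The sketch below shows that arithmetic under hypothetical names. It is not Calcite's or BigQuery's implementation, and it ignores nulls and window framing.

```java
import java.util.Arrays;

// Hypothetical sketch of continuous-percentile interpolation.
class PercentileSketch {
  /** Interpolated percentile of the values, for a fraction between 0 and 1. */
  static double percentileCont(double[] values, double fraction) {
    if (values.length == 0) {
      throw new IllegalArgumentException("empty input");
    }
    if (fraction < 0 || fraction > 1) {
      throw new IllegalArgumentException("fraction must be in [0, 1]");
    }
    double[] sorted = values.clone();
    Arrays.sort(sorted);
    // Fractional position in the sorted array, counting from 0.
    double position = fraction * (sorted.length - 1);
    int lower = (int) Math.floor(position);
    int upper = (int) Math.ceil(position);
    double weight = position - lower;
    return sorted[lower] + weight * (sorted[upper] - sorted[lower]);
  }

  public static void main(String[] args) {
    System.out.println(percentileCont(new double[] {1, 2, 3, 4}, 0.5)); // 2.5
  }
}
```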





[jira] [Updated] (CALCITE-5564) Support 2-argument PERCENTILE_CONT, PERCENTILE_DISC aggregate functions (as in BigQuery)

2023-05-16 Thread Tanner Clary (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Tanner Clary updated CALCITE-5564:
--
Description: 
Calcite currently has implementations for the {{PERCENTILE_CONT}} and 
{{PERCENTILE_DISC}} functions. Their syntax may be found 
[here|https://learn.microsoft.com/en-us/sql/t-sql/functions/percentile-cont-transact-sql?view=sql-server-ver16].
 

BigQuery offers these functions as well, but the syntax is slightly different, 
and may be found 
[here|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#percentile_cont].
 The main difference is that instead of using a {{WITHIN GROUP}} clause, the 
array is passed in directly as the first argument to the function.

BigQuery Syntax Example: {{SELECT PERCENTILE_CONT(x, .5) OVER() FROM 
UNNEST([1,2,3,4]) as x;}} would return the median, 2.5.

Standard Syntax Example: {{SELECT PERCENTILE_CONT(.5) WITHIN GROUP (ORDER BY 
[some column])}}

Parsing and validation for the standard functions was added in CALCITE-4644.

The actual implementation for both the standard and BigQuery forms is covered 
under CALCITE-4666.

  was:
Calcite currently has implementations for the {{PERCENTILE_CONT}} and 
{{PERCENTILE_DISC}} functions. Their syntax may be found 
[here|https://learn.microsoft.com/en-us/sql/t-sql/functions/percentile-cont-transact-sql?view=sql-server-ver16].
 

BigQuery offers these functions as well, but the syntax is slightly different, 
and may be found 
[here|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#percentile_cont].
 The main difference is that instead of using a {{WITHIN GROUP}} clause, the 
array is passed in directly as the first argument to the function.


> Support 2-argument PERCENTILE_CONT, PERCENTILE_DISC aggregate functions (as 
> in BigQuery)
> 
>
> Key: CALCITE-5564
> URL: https://issues.apache.org/jira/browse/CALCITE-5564
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Tanner Clary
>Assignee: Tanner Clary
>Priority: Major
>
> Calcite currently has implementations for the {{PERCENTILE_CONT}} and 
> {{PERCENTILE_DISC}} functions. Their syntax may be found 
> [here|https://learn.microsoft.com/en-us/sql/t-sql/functions/percentile-cont-transact-sql?view=sql-server-ver16].
>  
> BigQuery offers these functions as well, but the syntax is slightly 
> different, and may be found 
> [here|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#percentile_cont].
>  The main difference is that instead of using a {{WITHIN GROUP}} clause, the 
> array is passed in directly as the first argument to the function.
> BigQuery Syntax Example: {{SELECT PERCENTILE_CONT(x, .5) OVER() FROM 
> UNNEST([1,2,3,4]) as x;}} would return the median, 2.5.
> Standard Syntax Example: {{SELECT PERCENTILE_CONT(.5) WITHIN GROUP (ORDER BY 
> [some column])}}
> Parsing and validation for the standard functions were added in CALCITE-4644.
> The actual implementation for both the standard and BigQuery forms is covered 
> under CALCITE-4666.





[jira] [Commented] (CALCITE-5699) Posix regex expressions failed while NOT operator is executed with null literals.

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723283#comment-17723283
 ] 

Julian Hyde commented on CALCITE-5699:
--

Your fix, using null policy, looks good. Can you merge the several queries in 
{{select.iq}} into one. Also add a test similar to 
{{SqlOperatorTest.testLikeOperator}}; it only needs to be about 6 lines long.
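
As a rough illustration of the behaviour a null policy gives these operators (a null operand yields a null result rather than an NPE), here is a minimal standalone sketch. It assumes plain {{java.util.regex}} matching and hypothetical helper names; it is not the code in {{SqlFunctions}} and does not translate POSIX character classes.

```java
import java.util.regex.Pattern;

/** Hypothetical illustration only; not Calcite's SqlFunctions. */
final class NullSafePosixRegexSketch {
  /** Returns null if either operand is null, else whether the regex matches somewhere in s. */
  static Boolean posixRegex(String s, String regex, boolean caseSensitive) {
    if (s == null || regex == null) {
      return null; // strict null handling: never reach Pattern.matcher(null)
    }
    int flags = caseSensitive ? 0 : Pattern.CASE_INSENSITIVE;
    return Pattern.compile(regex, flags).matcher(s).find();
  }

  /** SQL NOT under three-valued logic: NOT NULL is NULL. */
  static Boolean not(Boolean value) {
    return value == null ? null : !value;
  }

  public static void main(String[] args) {
    System.out.println(not(posixRegex(null, "ab[cd]", true)));   // like: null !~ 'ab[cd]'
    System.out.println(not(posixRegex("abcd", null, false)));    // like: 'abcd' !~* null
    System.out.println(not(posixRegex("abcd", "ab[cd]", true))); // like: 'abcd' !~ 'ab[cd]'
  }
}
```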

> Posix regex expressions failed while NOT operator is executed with null 
> literals.
> -
>
> Key: CALCITE-5699
> URL: https://issues.apache.org/jira/browse/CALCITE-5699
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Evgeny Stanilovsky
>Assignee: Evgeny Stanilovsky
>Priority: Major
>  Labels: pull-request-available
>
> Operations like:
> {noformat}
> SELECT null !~ 'ab[cd]'
> SELECT 'abcd' !~ null
> SELECT null !~ null
> SELECT null !~* 'ab[cd]'
> SELECT 'abcd' !~* null
> SELECT null !~* null{noformat}
> are not possible for now; an NPE is raised:
> {noformat}
> Caused by: java.lang.NullPointerException
>      at java.base/java.util.regex.Matcher.getTextLength(Matcher.java:1770)
>      at java.base/java.util.regex.Matcher.reset(Matcher.java:416)
>      at java.base/java.util.regex.Matcher.<init>(Matcher.java:253)
>      at java.base/java.util.regex.Pattern.matcher(Pattern.java:1134)
>      at 
> org.apache.calcite.runtime.SqlFunctions.posixRegex(SqlFunctions.java:864){noformat}





[jira] [Commented] (CALCITE-4555) Invalid zero literal value is used for TIMESTAMP WITH LOCAL TIME ZONE type in RexBuilder

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-4555?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723279#comment-17723279
 ] 

Julian Hyde commented on CALCITE-4555:
--

If I make {{RexBuilder.zeroValue}} throw unconditionally, then [one query in 
big-query.iq|https://github.com/apache/calcite/blob/b0b27e8872b33c5ab203e0e2365d267a594c80be/babel/src/test/resources/sql/big-query.iq#L991C7-L1012C3]
 fails. I think you could test this fix by modifying that query.
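
For context on why an all-zero timestamp is a problem, the sketch below uses {{java.time}} (an assumption for illustration; Calcite's own {{TimestampString}} is not involved) to show that month 0 and day 0 are rejected as a date, while epoch second 0 formats as the value proposed in the issue.

```java
import java.time.DateTimeException;
import java.time.LocalDateTime;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Hypothetical illustration only; uses java.time rather than Calcite classes. */
final class EpochZeroSketch {
  /** Returns true if the six fields form a valid calendar date-time. */
  static boolean isValidDateTime(int year, int month, int day, int h, int m, int s) {
    try {
      LocalDateTime.of(year, month, day, h, m, s);
      return true;
    } catch (DateTimeException e) {
      return false; // e.g. month 0 or day 0
    }
  }

  /** Formats epoch second 0 (in UTC) as a timestamp string. */
  static String epochZero() {
    return LocalDateTime.ofEpochSecond(0, 0, ZoneOffset.UTC)
        .format(DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss"));
  }

  public static void main(String[] args) {
    System.out.println(isValidDateTime(0, 0, 0, 0, 0, 0));
    System.out.println(epochZero());
  }
}
```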

> Invalid zero literal value is used for TIMESTAMP WITH LOCAL TIME ZONE type in 
> RexBuilder
> 
>
> Key: CALCITE-4555
> URL: https://issues.apache.org/jira/browse/CALCITE-4555
> Project: Calcite
>  Issue Type: Bug
>Reporter: Leonard Xu
>Priority: Major
>
>  The zero literal value for TIMESTAMP WITH LOCAL TIME ZONE type is used in 
> `org.apache.calcite.rex.RexBuilder`
> {code:java}
> case TIMESTAMP_WITH_LOCAL_TIME_ZONE:
>   return new TimestampString(0, 0, 0, 0, 0, 0);
>//TimestampString(int year, int month, int day, int h, int m, int s)
> {code}
> The month and day should never be zero; I think the zero value should be 
> '1970-01-01 00:00:00' (epoch second 0).





[jira] [Commented] (CALCITE-5687) lazy get scheme

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723274#comment-17723274
 ] 

Julian Hyde commented on CALCITE-5687:
--

I reviewed the PR. I think there's too much in it:
 * Adding Lazy variants in the example/csv directory will confuse people.
 * Adding case-sensitivity to the {{getTable}} API should be a different Jira 
case. It's not clear what the API should be. If you call 'getTable("foo", 
false)' then it might return more than one table.

Amongst all the other stuff, I couldn't see where the 'laziness' was added.
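
The two concerns above can be sketched separately from Calcite. The class below is purely hypothetical (none of these names exist in Calcite's schema API): table definitions are loaded only on first use, and a case-insensitive name match can legitimately return more than one candidate.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Supplier;

/** Hypothetical illustration only; not Calcite's Schema or AbstractSchema. */
final class LazyTableLookupSketch {
  private final Map<String, Supplier<String>> loaders = new LinkedHashMap<>();
  private final Map<String, String> loaded = new HashMap<>();
  int loadCount = 0; // how many table definitions were actually loaded

  void register(String name, Supplier<String> loader) {
    loaders.put(name, loader);
  }

  /** Case-sensitive lookup; the definition is loaded on first use only. */
  String getTable(String name) {
    Supplier<String> loader = loaders.get(name);
    if (loader == null) {
      return null;
    }
    return loaded.computeIfAbsent(name, n -> {
      loadCount++;
      return loader.get();
    });
  }

  /** Case-insensitive name match; may return several names and loads nothing. */
  List<String> matchingNames(String name) {
    List<String> result = new ArrayList<>();
    for (String candidate : loaders.keySet()) {
      if (candidate.equalsIgnoreCase(name)) {
        result.add(candidate);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    LazyTableLookupSketch schema = new LazyTableLookupSketch();
    schema.register("foo", () -> "definition of foo");
    schema.register("FOO", () -> "definition of FOO");
    System.out.println(schema.matchingNames("foo")); // two candidates, nothing loaded yet
    System.out.println(schema.getTable("foo"));      // loads exactly one definition
  }
}
```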

> lazy get scheme
> ---
>
> Key: CALCITE-5687
> URL: https://issues.apache.org/jira/browse/CALCITE-5687
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Affects Versions: 1.34.0
>Reporter: yiku123
>Assignee: yiku123
>Priority: Major
> Fix For: 1.35.0
>
>
> Scenario: I use Calcite in my web application.
> Usage and problem: when overriding the 
> org.apache.calcite.schema.impl.AbstractSchema#getTableMap method, I must 
> provide schema information for all tables, but I only want to provide a schema 
> when my SQL actually uses a table, because I don't want my application to load 
> many schemas even if they are not used.





[jira] [Comment Edited] (CALCITE-5390) RelDecorrelator throws NullPointerException

2023-05-16 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723271#comment-17723271
 ] 

Sergey Nuyanzin edited comment on CALCITE-5390 at 5/16/23 8:01 PM:
---

I noticed that for the first query it considers correlation as a trivial one at 
{{org.apache.calcite.sql2rel.CorrelateProjectExtractor#visit}}
{code:java}
Set callsWithCorrelationInRight =
findCorrelationDependentCalls(correlate.getCorrelationId(), right);
boolean isTrivialCorrelation =
callsWithCorrelationInRight.stream().allMatch(exp -> exp instanceof 
RexFieldAccess);
// Early exit condition
if (isTrivialCorrelation) {
  ...
}
{code}

At the same time, it is still questionable whether it could be trivial, since 
there is no condition for the {{t2}} column... ?
Also, disabling the {{early exit}} for this query makes the query work.
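
One general property of {{Stream.allMatch}} may be worth keeping in mind when reading the check above: it returns true for an empty stream. The sketch below uses hypothetical stand-in types, not Calcite's {{RexNode}} classes, and the thread does not establish whether the set is actually empty for this query.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical illustration only; the types are stand-ins, not Calcite's Rex classes. */
final class TrivialCorrelationSketch {
  interface Expr {}
  static final class FieldAccess implements Expr {}
  static final class Call implements Expr {}

  /** Mirrors the shape of the check: every expression is a plain field access. */
  static boolean isTrivial(List<? extends Expr> exprs) {
    return exprs.stream().allMatch(e -> e instanceof FieldAccess);
  }

  public static void main(String[] args) {
    // allMatch over an empty collection is vacuously true.
    System.out.println(isTrivial(Collections.<Expr>emptyList()));
    System.out.println(isTrivial(Arrays.<Expr>asList(new FieldAccess(), new Call())));
  }
}
```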


was (Author: sergey nuyanzin):
I noticed that for the first query it considers correlation as a trivial one at 
{{org.apache.calcite.sql2rel.CorrelateProjectExtractor#visit}}
{code:java}
Set callsWithCorrelationInRight =
findCorrelationDependentCalls(correlate.getCorrelationId(), right);
boolean isTrivialCorrelation =
callsWithCorrelationInRight.stream().allMatch(exp -> exp instanceof 
RexFieldAccess);
// Early exit condition
if (isTrivialCorrelation) {
  ...
}
{code}

at the same time it is still questionable whether it could be trivial since  
there is no condition for {{t2}} column... 
also disable of {{early exit}} for this query makes the query working

> RelDecorrelator throws NullPointerException
> ---
>
> Key: CALCITE-5390
> URL: https://issues.apache.org/jira/browse/CALCITE-5390
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Dmitry Sysolyatin
>Assignee: Dan Zou
>Priority: Major
>
> The following query throws a NullPointerException:
> {code:java}
> SELECT
>   (SELECT 1 FROM emp d WHERE d.job = a.job LIMIT 1) AS t1,
>   (SELECT a.job = 'PRESIDENT' FROM emp s LIMIT 1) as t2
> FROM emp a;
> {code}
> Test case - 
> [https://github.com/apache/calcite/commit/46fe9bc456f2d34cf7dccd29829c9e85abe69d5f]
> Logical plan before it fails:
> {code:java}
> LogicalProject(T1=[$8], T2=[$9])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f09=[$9])
>     LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f00=[$10])
>       LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{9}])
>         LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f9=[=($2, 
> 'PRESIDENT')])
>           LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{2}])
>             LogicalTableScan(table=[[scott, EMP]])
>             LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>               LogicalSort(fetch=[1])
>                 LogicalProject(EXPR$0=[1])
>                   LogicalFilter(condition=[=($2, $cor0.JOB)])
>                     LogicalTableScan(table=[[scott, EMP]])
>         LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>           LogicalSort(fetch=[1])
>             LogicalProject(EXPR$0=[$cor0.$f9])
>               LogicalTableScan(table=[[scott, EMP]]) {code}
> Stack trace:
> {code:java}
>  Caused by: java.lang.NullPointerException
>   at java.util.Objects.requireNonNull(Objects.java:203)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.createValueGenerator(RelDecorrelator.java:833)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateInputWithValueGenerator(RelDecorrelator.java:1028)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:764)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:738)
>   at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.getInvoke(RelDecorrelator.java:707)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:464)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)

[jira] [Commented] (CALCITE-5390) RelDecorrelator throws NullPointerException

2023-05-16 Thread Sergey Nuyanzin (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723271#comment-17723271
 ] 

Sergey Nuyanzin commented on CALCITE-5390:
--

I noticed that for the first query it considers correlation as a trivial one at 
{{org.apache.calcite.sql2rel.CorrelateProjectExtractor#visit}}
{code:java}
Set callsWithCorrelationInRight =
findCorrelationDependentCalls(correlate.getCorrelationId(), right);
boolean isTrivialCorrelation =
callsWithCorrelationInRight.stream().allMatch(exp -> exp instanceof 
RexFieldAccess);
// Early exit condition
if (isTrivialCorrelation) {
  ...
}
{code}

At the same time, it is still questionable whether it could be trivial, since 
there is no condition for the {{t2}} column... 
Also, disabling the {{early exit}} for this query makes the query work.

> RelDecorrelator throws NullPointerException
> ---
>
> Key: CALCITE-5390
> URL: https://issues.apache.org/jira/browse/CALCITE-5390
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: Dmitry Sysolyatin
>Assignee: Dan Zou
>Priority: Major
>
> The following query throws a NullPointerException:
> {code:java}
> SELECT
>   (SELECT 1 FROM emp d WHERE d.job = a.job LIMIT 1) AS t1,
>   (SELECT a.job = 'PRESIDENT' FROM emp s LIMIT 1) as t2
> FROM emp a;
> {code}
> Test case - 
> [https://github.com/apache/calcite/commit/46fe9bc456f2d34cf7dccd29829c9e85abe69d5f]
> Logical plan before it fails:
> {code:java}
> LogicalProject(T1=[$8], T2=[$9])
>   LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f09=[$9])
>     LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], HIREDATE=[$4], 
> SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f00=[$10])
>       LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{9}])
>         LogicalProject(EMPNO=[$0], ENAME=[$1], JOB=[$2], MGR=[$3], 
> HIREDATE=[$4], SAL=[$5], COMM=[$6], DEPTNO=[$7], $f0=[$8], $f9=[=($2, 
> 'PRESIDENT')])
>           LogicalCorrelate(correlation=[$cor0], joinType=[left], 
> requiredColumns=[{2}])
>             LogicalTableScan(table=[[scott, EMP]])
>             LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>               LogicalSort(fetch=[1])
>                 LogicalProject(EXPR$0=[1])
>                   LogicalFilter(condition=[=($2, $cor0.JOB)])
>                     LogicalTableScan(table=[[scott, EMP]])
>         LogicalAggregate(group=[{}], agg#0=[SINGLE_VALUE($0)])
>           LogicalSort(fetch=[1])
>             LogicalProject(EXPR$0=[$cor0.$f9])
>               LogicalTableScan(table=[[scott, EMP]]) {code}
> Stack trace:
> {code:java}
>  Caused by: java.lang.NullPointerException
>   at java.util.Objects.requireNonNull(Objects.java:203)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.createValueGenerator(RelDecorrelator.java:833)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateInputWithValueGenerator(RelDecorrelator.java:1028)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:764)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:738)
>   at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.getInvoke(RelDecorrelator.java:707)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:464)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.getInvoke(RelDecorrelator.java:707)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:512)
>   at 
> org.apache.calcite.sql2rel.RelDecorrelator.decorrelateRel(RelDecorrelator.java:495)
>   at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:498)
>   at org.apache.calcite.util.ReflectUtil$2.invoke(ReflectUtil.java:531)
>   at 
> 

[jira] [Commented] (CALCITE-5695) Add MAP_KEYS and MAP_VALUES for Spark dialect

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723264#comment-17723264
 ] 

Julian Hyde commented on CALCITE-5695:
--

I see. Maybe change {quote}Returns an unordered array containing the keys of 
the map. If the map itself is null, the function will return null.{quote} to 
{quote}Returns the key values of the map as an array; the order of the entries 
is not defined{quote}. There's no need to state that null input results in null 
output; that is assumed for all functions. 
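
A minimal sketch of the semantics being described, assuming a plain {{java.util.Map}} as the map value (hypothetical helper names, not the Calcite implementation): the keys or values come back as a list whose order is whatever the map's iteration order happens to be, and null input gives null output.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not Calcite's MAP_KEYS / MAP_VALUES implementation. */
final class MapKeysValuesSketch {
  /** Keys of the map as a list; the order of the entries is not defined. */
  static <K, V> List<K> mapKeys(Map<K, V> map) {
    return map == null ? null : new ArrayList<>(map.keySet());
  }

  /** Values of the map as a list; the order of the entries is not defined. */
  static <K, V> List<V> mapValues(Map<K, V> map) {
    return map == null ? null : new ArrayList<>(map.values());
  }

  public static void main(String[] args) {
    Map<String, Integer> map = new HashMap<>();
    map.put("a", 1);
    map.put("b", 2);
    System.out.println(mapKeys(map));
    System.out.println(mapValues(map));
  }
}
```

Because the order is not defined, a test for such functions would compare the result as a set (or sort it first) rather than as an ordered list.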

> Add MAP_KEYS and MAP_VALUES for Spark dialect
> -
>
> Key: CALCITE-5695
> URL: https://issues.apache.org/jira/browse/CALCITE-5695
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.35.0
>
>
> {{MAP_KEYS}} - Returns an unordered array containing the keys of the map. If 
> the map itself is null, the function will return null. 
> {{MAP_VALUES}} - Returns an unordered array containing the values of the map. 
> If the map itself is null, the function will return null. 
> For more details
> [https://spark.apache.org/docs/latest/sql-ref-functions-builtin.html]





[jira] [Commented] (CALCITE-5564) Support 2-argument PERCENTILE_CONT, PERCENTILE_DISC aggregate functions (as in BigQuery)

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723262#comment-17723262
 ] 

Julian Hyde commented on CALCITE-5564:
--

{quote}
I was wondering whether it makes sense to have this issue be the addition of 
parsing/validation for BigQuery syntax and then the implementation for both the 
standard and BigQuery could be added under CALCITE-4666.
{quote}
I agree. That's a good separation of work.

> Support 2-argument PERCENTILE_CONT, PERCENTILE_DISC aggregate functions (as 
> in BigQuery)
> 
>
> Key: CALCITE-5564
> URL: https://issues.apache.org/jira/browse/CALCITE-5564
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Tanner Clary
>Assignee: Tanner Clary
>Priority: Major
>
> Calcite currently has implementations for the {{PERCENTILE_CONT}} and 
> {{PERCENTILE_DISC}} functions. Their syntax may be found 
> [here|https://learn.microsoft.com/en-us/sql/t-sql/functions/percentile-cont-transact-sql?view=sql-server-ver16].
>  
> BigQuery offers these functions as well, but the syntax is slightly 
> different, and may be found 
> [here|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#percentile_cont].
>  The main difference is that instead of using a {{WITHIN GROUP}} clause, the 
> array is passed in directly as the first argument to the function.





[jira] [Commented] (CALCITE-5703) Reduce amount of generated runtime code

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723261#comment-17723261
 ] 

Julian Hyde commented on CALCITE-5703:
--

In the current PR I'm not seeing much reduction in code volume (just a few null 
casts removed). If there are significant changes in code generation I'd like to 
see these evidenced in tests.

I totally agree with the goal. I think the code we generate is much too 
verbose. However, there is good reason for the current state of the code. For 
example, consider common subexpression elimination in the following expression:
{code:sql}
case
when x < 10 then y + 1
when x is null then 0
when x < 20 or (y + 1) > 100 then 2
else 5 - (y + 1)
end 
{code}
After the 2nd branch, and in the 'then' of the first branch, we know that x is 
not null, and therefore we should skip the 'is null' check. The subexpression 
'y + 1' is used in three different places, but each place cannot guarantee that 
the previous occurrence has been executed.
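
To illustrate the point about 'y + 1', here is a hand-written Java rendering of the CASE expression, assuming a nullable x and a non-null y for simplicity. It uses lazy memoization as one possible way to share the subexpression across branches; this is only a sketch with hypothetical names, not what Calcite's code generator emits.

```java
import java.util.function.IntSupplier;

/** Hypothetical illustration only; not code generated by Calcite. */
final class CaseCseSketch {
  static int evaluations = 0; // counts how often 'y + 1' is really computed

  static int yPlusOne(int y) {
    evaluations++;
    return y + 1;
  }

  static int evalCase(Integer x, int y) {
    // Lazily memoized 'y + 1': computed at most once, and only on a path that needs it.
    int[] cached = new int[1];
    boolean[] computed = new boolean[1];
    IntSupplier yp1 = () -> {
      if (!computed[0]) {
        cached[0] = yPlusOne(y);
        computed[0] = true;
      }
      return cached[0];
    };
    if (x != null && x < 10) {
      return yp1.getAsInt();          // when x < 10 then y + 1
    }
    if (x == null) {
      return 0;                       // when x is null then 0
    }
    // From here on x is known to be non-null, so no further null check is needed.
    if (x < 20 || yp1.getAsInt() > 100) {
      return 2;                       // when x < 20 or (y + 1) > 100 then 2
    }
    return 5 - yp1.getAsInt();        // else 5 - (y + 1), reusing the cached value
  }

  public static void main(String[] args) {
    System.out.println(evalCase(25, 1));
    System.out.println(evaluations);
  }
}
```

The last path reads 'y + 1' twice but computes it once. The x < 20 path never computes it at all, which is why the value cannot simply be hoisted to the top of the method if evaluating it has a cost or can fail.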

> Reduce amount of generated runtime code
> ---
>
> Key: CALCITE-5703
> URL: https://issues.apache.org/jira/browse/CALCITE-5703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Evgeny Stanilovsky
>Assignee: Evgeny Stanilovsky
>Priority: Major
>  Labels: patch-available
>
> In some cases the runtime generates code like:
> {noformat}
> return case_when_value == null ? (String) null : some_operation();
> or
> return input_value == null ? (Long) null : Long.valueOf(...);
> {noformat}
> This redundant casting is probably not harmful, but there is another side: 
> the maximum method size. When it is exceeded, the JDK [1] and Janino [2] throw 
> *Code grows beyond 64 KB*. This PR reduces the code generated by the Calcite 
> runtime, so larger expressions can be executed.
> [1] 
> https://github.com/openjdk/jdk/blob/d22bcc813eea719b817d3d541a843594675c0ca9/src/jdk.compiler/share/classes/com/sun/tools/javac/jvm/ClassFile.java#L101
> [2] 
> https://github.com/janino-compiler/janino/blob/e69022f5aaabd36edc08a2074360d62514493a19/janino/src/main/java/org/codehaus/janino/CodeContext.java#L699





[jira] [Commented] (CALCITE-5704) Add ARRAY_EXCEPT, ARRAY_INTERSECT and ARRAY_UNION for Spark dialect

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723260#comment-17723260
 ] 

Julian Hyde commented on CALCITE-5704:
--

These should be functions in {{SqlLibraryOperators}}, but should desugar to 
operators such as {{ARRAY_UNION}} in {{SqlStdOperatorTable}}. Such operators 
should be in {{SqlStdOperatorTable}} because {{ARRAY UNION}} etc. are defined 
in the SQL standard. They are not currently implemented, but very similar 
operators {{SqlStdOperatorTable.MULTISET_UNION}} were added already (in 
CALCITE-2355).

> Add ARRAY_EXCEPT, ARRAY_INTERSECT and ARRAY_UNION for Spark dialect
> ---
>
> Key: CALCITE-5704
> URL: https://issues.apache.org/jira/browse/CALCITE-5704
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
> Fix For: 1.35.0
>
>
> array_union(array1, array2) - Returns an array of the elements in the union 
> of array1 and array2, without duplicates.
> array_intersect(array1, array2) - Returns an array of the elements in the 
> intersection of array1 and array2, without duplicates.
> array_except(array1, array2) - Returns an array of the elements in array1 but 
> not in array2, without duplicates.
> For more details
> [https://spark.apache.org/docs/latest/api/sql/index.html]
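
The three descriptions above can be read as ordinary set operations over lists. The sketch below is a hypothetical illustration using {{java.util}} collections, not the Calcite or Spark implementation. Keeping the first-occurrence order of array1 in the result is an assumption of the sketch; the description only says "without duplicates".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration only; not Calcite's or Spark's implementation. */
final class ArraySetOpsSketch {
  /** Elements in either list, without duplicates. */
  static <T> List<T> arrayUnion(List<T> array1, List<T> array2) {
    Set<T> result = new LinkedHashSet<>(array1);
    result.addAll(array2);
    return new ArrayList<>(result);
  }

  /** Elements in both lists, without duplicates. */
  static <T> List<T> arrayIntersect(List<T> array1, List<T> array2) {
    Set<T> result = new LinkedHashSet<>(array1);
    result.retainAll(array2);
    return new ArrayList<>(result);
  }

  /** Elements in array1 but not in array2, without duplicates. */
  static <T> List<T> arrayExcept(List<T> array1, List<T> array2) {
    Set<T> result = new LinkedHashSet<>(array1);
    result.removeAll(array2);
    return new ArrayList<>(result);
  }

  public static void main(String[] args) {
    List<Integer> a = Arrays.asList(1, 2, 2, 3);
    List<Integer> b = Arrays.asList(3, 4);
    System.out.println(arrayUnion(a, b));
    System.out.println(arrayIntersect(a, b));
    System.out.println(arrayExcept(a, b));
  }
}
```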





[jira] [Commented] (CALCITE-5690) ConcurrentModificationException during validation of table with DynamicRecordType

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723258#comment-17723258
 ] 

Julian Hyde commented on CALCITE-5690:
--

Your 'here' links got corrupted, but I think I got your point. It probably 
makes sense to stop interning {{DynamicRecordTypeImpl}}, but not for the 
reasons you say.

In your case, you seem to be using the same {{Table}} object and its embedded 
{{RelDataTypeHolder}} from two different queries, in different threads. As 
separate statements, those queries must have distinct type factories. And yet 
the {{Table}} is returning the same type object to both statements. There is a 
contradiction: that type object cannot belong to two different type factories.

The solution, I think, is to remove {{RelDataTypeFactory}} from 
{{RelDataTypeHolder}} and make its {{fields}} field thread-safe.

With other types, even though they are immutable, we should not assume that 
they are thread-safe; they probably are, but that wasn't a design goal. If 
there was some concrete goal - e.g. multi-threaded execution of VolcanoPlanner 
to optimize a particular query faster - then we should declare that as an 
explicit goal and write the necessary tests to ensure that it is met.
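
The failure mode in the stack trace (inserting a field into an {{ArrayList}} while it is being iterated) can be reproduced deterministically without threads. The sketch below does so and contrasts it with {{CopyOnWriteArrayList}}, which is just one possible thread-safe container chosen for illustration; the names are hypothetical and nothing here is taken from {{RelDataTypeHolder}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical illustration only; not Calcite's RelDataTypeHolder. */
final class FieldListSketch {
  /**
   * Iterates over the fields and appends COL3 in the middle of the iteration,
   * standing in for a second thread inserting a field during a lookup.
   * Returns true if the iteration fails with ConcurrentModificationException.
   */
  static boolean failsOnInsertWhileIterating(List<String> fields) {
    try {
      for (String field : fields) {
        if (field.equals("COL1")) {
          fields.add("COL3");
        }
      }
      return false;
    } catch (ConcurrentModificationException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    List<String> plain = new ArrayList<>(Arrays.asList("COL1", "COL2"));
    List<String> copyOnWrite = new CopyOnWriteArrayList<>(Arrays.asList("COL1", "COL2"));
    System.out.println(failsOnInsertWhileIterating(plain));
    System.out.println(failsOnInsertWhileIterating(copyOnWrite));
  }
}
```

A copy-on-write list iterates over a snapshot, so the insert does not disturb a running iteration. Whether that trade-off (copying on every insert) suits the holder is a separate design question.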

> ConcurrentModificationException during validation of table with 
> DynamicRecordType
> -
>
> Key: CALCITE-5690
> URL: https://issues.apache.org/jira/browse/CALCITE-5690
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.32.0
>Reporter: Abhishek Singh Chouhan
>Priority: Major
>
> When multiple threads are doing validation on a Table with 
> DynamicRecordTypeImpl, we run into a ConcurrentModificationException. One of 
> the instances where this happens is when two validations are happening in 
> parallel 
> Thread1 - 
> SELECT DS_GET_QUANTILE(DS_QUANTILES_SKETCH(COL1), 0.9) FROM TABLE1 GROUP BY 
> COL1, COL2, COL3
> Thread2 - SELECT DS_GET_QUANTILE(DS_QUANTILES_SKETCH(COL1), 0.9) FROM TABLE1 
> GROUP BY COL1, COL2
> While expanding the groupby in thread1 we intern the type at multiple places, 
> eg. DelegatingScope#fullyQualify -> SqlValidatorScope#rowType -> 
> RelDataTypeFactoryImpl#createTypeWithNullability -> canonize
> During the validation in thread1 we cache DynamicRecordTypeImpl with Col1 and 
> Col2 as the fields.
> The race condition happens when thread 2 gets hold of this (above-mentioned) 
> cached DynamicRecordTypeImpl during its validation and is iterating over the 
> fields, e.g. during SqlValidatorImpl.validateGroupItem, and thread 1 then goes 
> and inserts the Col3 field into the holder.
> {code:java}
> java.util.ConcurrentModificationException
>     at 
> java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1043)
>     at java.base/java.util.ArrayList$Itr.next(ArrayList.java:997)
>     at 
> org.apache.calcite.rel.type.RelDataTypeHolder.getFieldOrInsert(RelDataTypeHolder.java:56)
>     at 
> org.apache.calcite.rel.type.DynamicRecordTypeImpl.getField(DynamicRecordTypeImpl.java:59)
>     at 
> org.apache.calcite.sql.validate.SqlNameMatchers$BaseMatcher.field(SqlNameMatchers.java:126)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:6455)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:6360)
>     at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:324)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1869)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType(SqlValidatorImpl.java:1854)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateGroupItem(SqlValidatorImpl.java:4393)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateGroupClause(SqlValidatorImpl.java:4366)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3701)
>     at 
> org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:64)
>     at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:89)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1107)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:1078)
>     at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:248)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:1053)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:759)
>  {code}
>  





[jira] [Commented] (CALCITE-5690) ConcurrentModificationException during validation of table with DynamicRecordType

2023-05-16 Thread Abhishek Singh Chouhan (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5690?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723250#comment-17723250
 ] 

Abhishek Singh Chouhan commented on CALCITE-5690:
-

> Does {{RelDataTypeHolder}} claim to be thread-safe? Is its implementation 
>actually thread-safe? Does the code using it need to be thread-safe, and if 
>so, did it make a mistake using a non-thread-safe class?

RelDataTypeHolder is not thread-safe. None of the associated documentation 
suggests or claims thread-safety either. 

I believe RelDataTypeFactoryImpl assumes that the type is immutable 
([here|[https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/rel/type/RelDataTypeFactoryImpl.java#L402]]
 and 
[here|[https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/rel/type/RelDataTypeFactoryImpl.java#L421]).]
  The datatype cache is interning the type assuming immutable objects, however 
the DynamicRecordTypeImpl is mutable. Maybe we should skip interning 
DynamicRecordTypeImpl?
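
The hazard of interning a mutable object can be sketched in a few lines. This is a hypothetical stand-in (a list of field names as the "type", a {{HashMap}} as the cache), not {{RelDataTypeFactoryImpl}}: two callers that intern equal types receive the same instance, so a mutation by one is visible to the other, and the cache entry no longer matches the key it was stored under.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not Calcite's type cache. */
final class InterningSketch {
  private final Map<List<String>, List<String>> cache = new HashMap<>();

  /** Returns the canonical instance equal to the given type, caching it if new. */
  List<String> intern(List<String> type) {
    return cache.computeIfAbsent(type, key -> key);
  }

  boolean isCached(List<String> type) {
    return cache.containsKey(type);
  }

  public static void main(String[] args) {
    InterningSketch interner = new InterningSketch();
    List<String> first = interner.intern(new ArrayList<>(Arrays.asList("COL1", "COL2")));
    List<String> second = interner.intern(new ArrayList<>(Arrays.asList("COL1", "COL2")));
    first.add("COL3");                  // one caller mutates the shared instance
    System.out.println(second);         // the other caller sees the extra field
    System.out.println(interner.isCached(Arrays.asList("COL1", "COL2")));
  }
}
```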

> ConcurrentModificationException during validation of table with 
> DynamicRecordType
> -
>
> Key: CALCITE-5690
> URL: https://issues.apache.org/jira/browse/CALCITE-5690
> Project: Calcite
>  Issue Type: Bug
>Affects Versions: 1.32.0
>Reporter: Abhishek Singh Chouhan
>Priority: Major
>
> When multiple threads are doing validation on a Table with 
> DynamicRecordTypeImpl, we run into a ConcurrentModificationException. One of 
> the instances where this happens is when two validations are happening in 
> parallel 
> Thread1 - 
> SELECT DS_GET_QUANTILE(DS_QUANTILES_SKETCH(COL1), 0.9) FROM TABLE1 GROUP BY 
> COL1, COL2, COL3
> Thread2 - SELECT DS_GET_QUANTILE(DS_QUANTILES_SKETCH(COL1), 0.9) FROM TABLE1 
> GROUP BY COL1, COL2
> While expanding the groupby in thread1 we intern the type at multiple places, 
> eg. DelegatingScope#fullyQualify -> SqlValidatorScope#rowType -> 
> RelDataTypeFactoryImpl#createTypeWithNullability -> canonize
> During the validation in thread1 we cache DynamicRecordTypeImpl with Col1 and 
> Col2 as the fields.
> The race condition happens when thread 2 gets hold of this (above-mentioned) 
> cached DynamicRecordTypeImpl during its validation and is iterating over the 
> fields, e.g. during SqlValidatorImpl.validateGroupItem, and thread 1 then goes 
> and inserts the Col3 field into the holder.
> {code:java}
> java.util.ConcurrentModificationException
>     at 
> java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1043)
>     at java.base/java.util.ArrayList$Itr.next(ArrayList.java:997)
>     at 
> org.apache.calcite.rel.type.RelDataTypeHolder.getFieldOrInsert(RelDataTypeHolder.java:56)
>     at 
> org.apache.calcite.rel.type.DynamicRecordTypeImpl.getField(DynamicRecordTypeImpl.java:59)
>     at 
> org.apache.calcite.sql.validate.SqlNameMatchers$BaseMatcher.field(SqlNameMatchers.java:126)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:6455)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:6360)
>     at org.apache.calcite.sql.SqlIdentifier.accept(SqlIdentifier.java:324)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1869)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveType(SqlValidatorImpl.java:1854)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateGroupItem(SqlValidatorImpl.java:4393)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateGroupClause(SqlValidatorImpl.java:4366)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3701)
>     at 
> org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:64)
>     at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:89)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1107)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:1078)
>     at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:248)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:1053)
>     at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:759)
>  {code}
>  





[jira] [Commented] (CALCITE-5564) Support 2-argument PERCENTILE_CONT, PERCENTILE_DISC aggregate functions (as in BigQuery)

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723246#comment-17723246
 ] 

Julian Hyde commented on CALCITE-5564:
--

[~tanclary], can you link related bugs? (I was just in CALCITE-4666 but 
couldn't find this case because you hadn't linked it.)

Can you edit the description and add an example in both the BigQuery syntax and 
the regular syntax? It will help me and others understand the mapping between 
the two syntaxes.

> Support 2-argument PERCENTILE_CONT, PERCENTILE_DISC aggregate functions (as 
> in BigQuery)
> 
>
> Key: CALCITE-5564
> URL: https://issues.apache.org/jira/browse/CALCITE-5564
> Project: Calcite
>  Issue Type: Improvement
>Reporter: Tanner Clary
>Assignee: Tanner Clary
>Priority: Major
>
> Calcite currently has implementations for the {{PERCENTILE_CONT}} and 
> {{PERCENTILE_DISC}} functions. Their syntax may be found 
> [here|https://learn.microsoft.com/en-us/sql/t-sql/functions/percentile-cont-transact-sql?view=sql-server-ver16].
>  
> BigQuery offers these functions as well, but the syntax is slightly 
> different, and may be found 
> [here|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#percentile_cont].
>  The main difference is that instead of using a {{WITHIN GROUP}} clause, the 
> array is passed in directly as the first argument to the function.
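
Whichever syntax is used, the two functions compute the same thing. The following is a minimal Java sketch of those semantics, assuming the usual SQL definitions (linear interpolation for PERCENTILE_CONT; the first sorted value whose cumulative distribution reaches the fraction for PERCENTILE_DISC). The class and method names are hypothetical and are not part of Calcite; NULL handling is not modeled.

```java
import java.util.Arrays;

/** Hypothetical helper (not Calcite code): reference semantics of the percentile functions. */
final class PercentileSketch {
  private PercentileSketch() {}

  /** PERCENTILE_CONT: interpolates linearly between the two nearest sorted values. */
  static double percentileCont(double[] values, double fraction) {
    double[] sorted = sortedCopy(values, fraction);
    double position = fraction * (sorted.length - 1); // zero-based fractional row
    int lower = (int) Math.floor(position);
    int upper = (int) Math.ceil(position);
    double weight = position - lower;
    return sorted[lower] + weight * (sorted[upper] - sorted[lower]);
  }

  /** PERCENTILE_DISC: first sorted value whose cumulative distribution is >= fraction. */
  static double percentileDisc(double[] values, double fraction) {
    double[] sorted = sortedCopy(values, fraction);
    int index = Math.max(0, (int) Math.ceil(fraction * sorted.length) - 1);
    return sorted[index];
  }

  /** Validates the arguments and returns the values in ascending order. */
  private static double[] sortedCopy(double[] values, double fraction) {
    if (values.length == 0) {
      throw new IllegalArgumentException("no input values");
    }
    if (!(fraction >= 0.0 && fraction <= 1.0)) {
      throw new IllegalArgumentException("fraction must be between 0 and 1");
    }
    double[] sorted = values.clone();
    Arrays.sort(sorted);
    return sorted;
  }
}
```

In the {{WITHIN GROUP}} form the fraction is the only function argument and the values come from the ordering clause; in the BigQuery form the values are the first argument and the fraction the second. Both map onto the (values, fraction) pair used in the sketch.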





[jira] [Updated] (CALCITE-5701) Implement Apache Spark named_struct

2023-05-16 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CALCITE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guillaume Massé updated CALCITE-5701:
-
Labels:   (was: Spark)

> Implement Apache Spark named_struct
> ---
>
> Key: CALCITE-5701
> URL: https://issues.apache.org/jira/browse/CALCITE-5701
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Guillaume Massé
>Priority: Minor
>
> [https://spark.apache.org/docs/3.4.0/api/sql/index.html#named_struct]
>  
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", 2)""")
> res4: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, 2): struct<a: int, b: int>]
> Calcite:
> SELECT named_struct("a", 1, "b", 2);
> type: row(a int not null, b int not null){code}
>  
> It's also possible to be nested:
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", named_struct("c", 2))""")
> res5: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, named_struct(c, 
> 2)): struct<a: int, b: struct<c: int>>] {code}
> {code:java}
> Calcite:
> SELECT named_struct("a", 1, "b", named_struct("c", 2));
> type: row(a int not null, b row(c int not null) not null){code}
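
The pairing of alternating name and value arguments into a row type can be sketched in plain Java. This is only an illustration under stated assumptions: the class and method names are hypothetical, only integer values and nested structs are handled, and nothing here uses Calcite's actual type system.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/** Hypothetical illustration (not Calcite code) of how named_struct pairs its arguments. */
final class NamedStructSketch {
  private NamedStructSketch() {}

  /** Builds an ordered field map from alternating (name, value) arguments. */
  static Map<String, Object> namedStruct(Object... args) {
    if (args.length == 0 || args.length % 2 != 0) {
      throw new IllegalArgumentException("expected name/value pairs");
    }
    Map<String, Object> row = new LinkedHashMap<>();
    for (int i = 0; i < args.length; i += 2) {
      if (!(args[i] instanceof String)) {
        throw new IllegalArgumentException("field name must be a string: " + args[i]);
      }
      row.put((String) args[i], args[i + 1]);
    }
    return row;
  }

  /** Renders the row type in the notation used in the issue description. */
  static String rowType(Map<String, Object> row) {
    StringJoiner fields = new StringJoiner(", ", "row(", ")");
    for (Map.Entry<String, Object> field : row.entrySet()) {
      fields.add(field.getKey() + " " + fieldType(field.getValue()));
    }
    return fields.toString();
  }

  /** Only integers and nested structs are modeled in this sketch. */
  private static String fieldType(Object value) {
    if (value instanceof Integer) {
      return "int not null";
    }
    if (value instanceof Map) {
      @SuppressWarnings("unchecked")
      Map<String, Object> nested = (Map<String, Object>) value;
      return rowType(nested) + " not null";
    }
    throw new IllegalArgumentException("unsupported value: " + value);
  }
}
```

For example, rowType(namedStruct("a", 1, "b", namedStruct("c", 2))) yields the nested row type quoted in the description.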





[jira] [Updated] (CALCITE-5701) Add NAMED_STRUCT function (enabled in Spark library)

2023-05-16 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CALCITE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guillaume Massé updated CALCITE-5701:
-
Summary: Add NAMED_STRUCT function (enabled in Spark library)  (was: 
Implement Apache Spark named_struct)

> Add NAMED_STRUCT function (enabled in Spark library)
> 
>
> Key: CALCITE-5701
> URL: https://issues.apache.org/jira/browse/CALCITE-5701
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Guillaume Massé
>Priority: Minor
>
> [https://spark.apache.org/docs/3.4.0/api/sql/index.html#named_struct]
>  
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", 2)""")
> res4: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, 2): struct<a: int, b: int>]
> Calcite:
> SELECT named_struct("a", 1, "b", 2);
> type: row(a int not null, b int not null){code}
>  
> It's also possible to be nested:
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", named_struct("c", 2))""")
> res5: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, named_struct(c, 
> 2)): struct<a: int, b: struct<c: int>>] {code}
> {code:java}
> Calcite:
> SELECT named_struct("a", 1, "b", named_struct("c", 2));
> type: row(a int not null, b row(c int not null) not null){code}





[jira] [Commented] (CALCITE-5701) Implement Apache Spark named_struct

2023-05-16 Thread Julian Hyde (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723243#comment-17723243
 ] 

Julian Hyde commented on CALCITE-5701:
--

I saw you added a 'Spark' label. This might make people think that it relates 
to the Spark adapter, which it does not. If you want to make this issue clear 
and searchable, use the same summary format as other similar issues such as 
CALCITE-5657, 'Add NAMED_STRUCT function (enabled in Spark library)'.

> Implement Apache Spark named_struct
> ---
>
> Key: CALCITE-5701
> URL: https://issues.apache.org/jira/browse/CALCITE-5701
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Guillaume Massé
>Priority: Minor
>  Labels: Spark
>
> [https://spark.apache.org/docs/3.4.0/api/sql/index.html#named_struct]
>  
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", 2)""")
> res4: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, 2): struct<a: int, b: int>]
> Calcite:
> SELECT named_struct("a", 1, "b", 2);
> type: row(a int not null, b int not null){code}
>  
> It's also possible to be nested:
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", named_struct("c", 2))""")
> res5: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, named_struct(c, 
> 2)): struct<a: int, b: struct<c: int>>] {code}
> {code:java}
> Calcite:
> SELECT named_struct("a", 1, "b", named_struct("c", 2));
> type: row(a int not null, b row(c int not null) not null){code}





[jira] [Updated] (CALCITE-5701) Implement Apache Spark named_struct

2023-05-16 Thread Jira


 [ 
https://issues.apache.org/jira/browse/CALCITE-5701?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guillaume Massé updated CALCITE-5701:
-
Labels: Spark  (was: )

> Implement Apache Spark named_struct
> ---
>
> Key: CALCITE-5701
> URL: https://issues.apache.org/jira/browse/CALCITE-5701
> Project: Calcite
>  Issue Type: New Feature
>  Components: core
>Reporter: Guillaume Massé
>Priority: Minor
>  Labels: Spark
>
> [https://spark.apache.org/docs/3.4.0/api/sql/index.html#named_struct]
>  
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", 2)""")
> res4: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, 2): struct<a: int, b: int>]
> Calcite:
> SELECT named_struct("a", 1, "b", 2);
> type: row(a int not null, b int not null){code}
>  
> It's also possible to be nested:
> {code:java}
> spark.sql("""select named_struct("a", 1, "b", named_struct("c", 2))""")
> res5: org.apache.spark.sql.DataFrame = [named_struct(a, 1, b, named_struct(c, 
> 2)): struct<a: int, b: struct<c: int>>] {code}
> {code:java}
> Calcite:
> SELECT named_struct("a", 1, "b", named_struct("c", 2));
> type: row(a int not null, b row(c int not null) not null){code}





[jira] [Created] (CALCITE-5705) Generalize RemoveEmptySingleRule to work with arbitrary relations and pruning configurations

2023-05-16 Thread Stamatis Zampetakis (Jira)
Stamatis Zampetakis created CALCITE-5705:


 Summary: Generalize RemoveEmptySingleRule to work with arbitrary 
relations and pruning configurations
 Key: CALCITE-5705
 URL: https://issues.apache.org/jira/browse/CALCITE-5705
 Project: Calcite
  Issue Type: Improvement
  Components: core
Reporter: Stamatis Zampetakis
Assignee: Stamatis Zampetakis
 Fix For: 1.35.0


Currently 
[RemoveEmptySingleRule|https://github.com/apache/calcite/blob/b0b27e8872b33c5ab203e0e2365d267a594c80be/core/src/main/java/org/apache/calcite/rel/rules/PruneEmptyRules.java#LL285C23-L285C44]
 can only transform {{SingleRel}} relations to empty. However, the logic inside 
the {{matches}} method is for the most part capable of handling any kind of 
relation, including those that have multiple children.

By generalizing the {{RemoveEmptySingleRule}} to work with arbitrary relations 
we can refactor other pruning rules such as those created by 
[CorrelateLeftEmptyRuleConfig|https://github.com/apache/calcite/blob/b0b27e8872b33c5ab203e0e2365d267a594c80be/core/src/main/java/org/apache/calcite/rel/rules/PruneEmptyRules.java#L588]
 without the need for creating more classes and duplicating code.

Moreover, by changing the constructor to accept {{PruneEmptyRule.Config}} 
instead of {{RemoveEmptySingleRuleConfig}} we can reduce code duplication 
further since configurations such as {{ZeroMaxRowsRuleConfig}} and 
{{SortFetchZeroRuleConfig}} could be modified to create instances of 
{{RemoveEmptySingleRule}}.

This is mainly a refactoring to simplify pruning rules and remove duplicate 
logic. The change is fully backwards compatible.





[jira] [Updated] (CALCITE-5704) Add ARRAY_EXCEPT, ARRAY_INTERSECT and ARRAY_UNION for Spark dialect

2023-05-16 Thread jackylau (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

jackylau updated CALCITE-5704:
--
Description: 
array_union(array1, array2) - Returns an array of the elements in the union of 
array1 and array2, without duplicates.

array_intersect(array1, array2) - Returns an array of the elements in the 
intersection of array1 and array2, without duplicates.

array_except(array1, array2) - Returns an array of the elements in array1 but 
not in array2, without duplicates.

For more details
[https://spark.apache.org/docs/latest/api/sql/index.html]

> Add ARRAY_EXCEPT, ARRAY_INTERSECT and ARRAY_UNION for Spark dialect
> ---
>
> Key: CALCITE-5704
> URL: https://issues.apache.org/jira/browse/CALCITE-5704
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: jackylau
>Priority: Major
> Fix For: 1.35.0
>
>
> array_union(array1, array2) - Returns an array of the elements in the union 
> of array1 and array2, without duplicates.
> array_intersect(array1, array2) - Returns an array of the elements in the 
> intersection of array1 and array2, without duplicates.
> array_except(array1, array2) - Returns an array of the elements in array1 but 
> not in array2, without duplicates.
> For more details
> [https://spark.apache.org/docs/latest/api/sql/index.html]
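
The three definitions above can be sketched in plain Java with ordered sets. This is an illustration only: the class and method names are hypothetical, keeping elements in order of first occurrence is an assumption, and NULL handling is not modeled.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical illustration (not Calcite or Spark code) of the three array set functions. */
final class ArraySetOpsSketch {
  private ArraySetOpsSketch() {}

  /** Elements of either array, without duplicates. */
  static <T> List<T> arrayUnion(List<T> array1, List<T> array2) {
    Set<T> result = new LinkedHashSet<>(array1);
    result.addAll(array2);
    return new ArrayList<>(result);
  }

  /** Elements present in both arrays, without duplicates. */
  static <T> List<T> arrayIntersect(List<T> array1, List<T> array2) {
    Set<T> result = new LinkedHashSet<>(array1);
    result.retainAll(new HashSet<>(array2));
    return new ArrayList<>(result);
  }

  /** Elements of array1 that are not in array2, without duplicates. */
  static <T> List<T> arrayExcept(List<T> array1, List<T> array2) {
    Set<T> result = new LinkedHashSet<>(array1);
    result.removeAll(new HashSet<>(array2));
    return new ArrayList<>(result);
  }
}
```

A {{LinkedHashSet}} is used so that deduplication does not reorder the elements of the first array; whether a real implementation should preserve that order is left open here.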





[jira] [Created] (CALCITE-5704) Add ARRAY_EXCEPT, ARRAY_INTERSECT and ARRAY_UNION for Spark dialect

2023-05-16 Thread jackylau (Jira)
jackylau created CALCITE-5704:
-

 Summary: Add ARRAY_EXCEPT, ARRAY_INTERSECT and ARRAY_UNION for 
Spark dialect
 Key: CALCITE-5704
 URL: https://issues.apache.org/jira/browse/CALCITE-5704
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.35.0
Reporter: jackylau
 Fix For: 1.35.0








[jira] [Commented] (CALCITE-5703) Reduce amount of generated runtime code

2023-05-16 Thread Dan Zou (Jira)


[ 
https://issues.apache.org/jira/browse/CALCITE-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17723011#comment-17723011
 ] 

Dan Zou commented on CALCITE-5703:
--

This is an interesting topic, and we have met similar problems in Flink (see 
[1] and [2]). Removing redundant code such as redundant casts or null checks may 
work in some cases, and Flink handles most of the cases by splitting methods 
(but this is not a panacea: we have encountered the problem of the method 
splitting itself being time-consuming in some corner cases).
[1] 
https://issues.apache.org/jira/browse/FLINK-19363?jql=project%20%3D%20FLINK%20%20and%20summary%20~%20%2764%20KB%27%20ORDER%20BY%20created%20DESC
[2] https://issues.apache.org/jira/browse/FLINK-23007

> Reduce amount of generated runtime code
> ---
>
> Key: CALCITE-5703
> URL: https://issues.apache.org/jira/browse/CALCITE-5703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Evgeny Stanilovsky
>Assignee: Evgeny Stanilovsky
>Priority: Major
>  Labels: patch-available
>
> In some cases the runtime generates code like:
> {noformat}
> return case_when_value == null ? (String) null : some_operation();
> or
> return input_value == null ? (Long) null : Long.valueOf(...);
> {noformat}
> This redundant casting is probably not harmful, but there is another side:
> the maximum method size (see the JDK [1]). When it is exceeded, Janino [2]
> throws *Code grows beyond 64 KB*. This PR reduces the code generated by the
> Calcite runtime, so larger expressions can be executed.
> [1] 
> https://github.com/openjdk/jdk/blob/d22bcc813eea719b817d3d541a843594675c0ca9/src/jdk.compiler/share/classes/com/sun/tools/javac/jvm/ClassFile.java#L101
> [2] 
> https://github.com/janino-compiler/janino/blob/e69022f5aaabd36edc08a2074360d62514493a19/janino/src/main/java/org/codehaus/janino/CodeContext.java#L699
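
To illustrate why the cast is redundant, here is a minimal Java sketch of the two shapes side by side. The class name, method names, and the String argument are hypothetical (the original snippet elides the argument of Long.valueOf); the point is only that both methods behave identically, while the first may compile to an extra cast instruction per expression.

```java
/** Hypothetical illustration (not generated Calcite code) of the redundant cast. */
final class RedundantCastSketch {
  private RedundantCastSketch() {}

  /** Shape described in the issue: the null branch carries an explicit cast. */
  static Long withCast(String input) {
    return input == null ? (Long) null : Long.valueOf(input);
  }

  /** Same logic without the cast; the conditional expression still has type Long. */
  static Long withoutCast(String input) {
    return input == null ? null : Long.valueOf(input);
  }
}
```

Because the other branch is already a Long, the compiler infers the type of the conditional expression without help, so dropping the cast does not change the result.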





[jira] [Updated] (CALCITE-5703) Reduce amount of generated runtime code

2023-05-16 Thread Evgeny Stanilovsky (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evgeny Stanilovsky updated CALCITE-5703:

Labels: patch-available  (was: )

> Reduce amount of generated runtime code
> ---
>
> Key: CALCITE-5703
> URL: https://issues.apache.org/jira/browse/CALCITE-5703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Evgeny Stanilovsky
>Assignee: Evgeny Stanilovsky
>Priority: Major
>  Labels: patch-available
>
> In some cases the runtime generates code like:
> {noformat}
> return case_when_value == null ? (String) null : some_operation();
> or
> return input_value == null ? (Long) null : Long.valueOf(...);
> {noformat}
> This redundant casting is probably not harmful, but there is another side:
> the maximum method size (see the JDK [1]). When it is exceeded, Janino [2]
> throws *Code grows beyond 64 KB*. This PR reduces the code generated by the
> Calcite runtime, so larger expressions can be executed.
> [1] 
> https://github.com/openjdk/jdk/blob/d22bcc813eea719b817d3d541a843594675c0ca9/src/jdk.compiler/share/classes/com/sun/tools/javac/jvm/ClassFile.java#L101
> [2] 
> https://github.com/janino-compiler/janino/blob/e69022f5aaabd36edc08a2074360d62514493a19/janino/src/main/java/org/codehaus/janino/CodeContext.java#L699





[jira] [Updated] (CALCITE-5703) Reduce amount of generated runtime code

2023-05-16 Thread Evgeny Stanilovsky (Jira)


 [ 
https://issues.apache.org/jira/browse/CALCITE-5703?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Evgeny Stanilovsky updated CALCITE-5703:

Description: 
In some cases the runtime generates code like:
{noformat}
return case_when_value == null ? (String) null : some_operation();
or
return input_value == null ? (Long) null : Long.valueOf(...);
{noformat}
This redundant casting is probably not harmful, but there is another side:
the maximum method size (see the JDK [1]). When it is exceeded, Janino [2]
throws *Code grows beyond 64 KB*. This PR reduces the code generated by the
Calcite runtime, so larger expressions can be executed.

[1] 
https://github.com/openjdk/jdk/blob/d22bcc813eea719b817d3d541a843594675c0ca9/src/jdk.compiler/share/classes/com/sun/tools/javac/jvm/ClassFile.java#L101
[2] 
https://github.com/janino-compiler/janino/blob/e69022f5aaabd36edc08a2074360d62514493a19/janino/src/main/java/org/codehaus/janino/CodeContext.java#L699

  was:
In some cases the runtime generates code like:
{noformat}
return case_when_value == null ? (String) null : some_operation();
or
return input_value == null ? (Long) null : Long.valueOf(...);
{noformat}
This redundant casting is probably not harmful, but there is another side:
the maximum method size (see the JDK [1]). When it is exceeded, Janino [2]
throws *Code grows beyond 64 KB*. This PR reduces the code generated by the
Calcite runtime, so larger expressions could be executed.

[1] 
https://github.com/openjdk/jdk/blob/d22bcc813eea719b817d3d541a843594675c0ca9/src/jdk.compiler/share/classes/com/sun/tools/javac/jvm/ClassFile.java#L101
[2] 
https://github.com/janino-compiler/janino/blob/e69022f5aaabd36edc08a2074360d62514493a19/janino/src/main/java/org/codehaus/janino/CodeContext.java#L699


> Reduce amount of generated runtime code
> ---
>
> Key: CALCITE-5703
> URL: https://issues.apache.org/jira/browse/CALCITE-5703
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Evgeny Stanilovsky
>Assignee: Evgeny Stanilovsky
>Priority: Major
>
> In some cases the runtime generates code like:
> {noformat}
> return case_when_value == null ? (String) null : some_operation();
> or
> return input_value == null ? (Long) null : Long.valueOf(...);
> {noformat}
> This redundant casting is probably not harmful, but there is another side:
> the maximum method size (see the JDK [1]). When it is exceeded, Janino [2]
> throws *Code grows beyond 64 KB*. This PR reduces the code generated by the
> Calcite runtime, so larger expressions can be executed.
> [1] 
> https://github.com/openjdk/jdk/blob/d22bcc813eea719b817d3d541a843594675c0ca9/src/jdk.compiler/share/classes/com/sun/tools/javac/jvm/ClassFile.java#L101
> [2] 
> https://github.com/janino-compiler/janino/blob/e69022f5aaabd36edc08a2074360d62514493a19/janino/src/main/java/org/codehaus/janino/CodeContext.java#L699


