date:20230911

[jira] [Closed] (CALCITE-6000) There should be a SqlParserFixture which parses again after unparsing

2023-09-11 Thread Mihai Budiu (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihai Budiu closed CALCITE-6000.

Resolution: Invalid

> There should be a SqlParserFixture which parses again after unparsing
> -
>
> Key: CALCITE-6000
> URL: https://issues.apache.org/jira/browse/CALCITE-6000
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>
> SqlParserTests parse and unparse queries.
> But the unparsed result is not validated.
> A new fixture should parse the final result again to validate that it's legal 
> SQL.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

[jira] [Commented] (CALCITE-6000) There should be a SqlParserFixture which parses again after unparsing

2023-09-11 Thread Mihai Budiu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764004#comment-17764004
 ] 

Mihai Budiu commented on CALCITE-6000:
--

I will close this issue, because indeed the UnparsingTesterImpl does exactly 
this.
However, in the case of the expression "convert('abc' using utf8)", which is 
unparsed as "translate('abc', 'UTF-8')", the parser accepts the unparsed form, 
although it's not legal SQL. I guess it looks like legal SQL, and only type 
inference can reject this construct.





[jira] [Commented] (CALCITE-6000) There should be a SqlParserFixture which parses again after unparsing

2023-09-11 Thread Mihai Budiu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-6000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17764001#comment-17764001
 ] 

Mihai Budiu commented on CALCITE-6000:
--

In fact, there is such a tester, the UnparsingTesterImpl.
But for some reason it hasn't found [CALCITE-5996], for example, which is 
exactly what it should be testing for.
So perhaps there is something wrong with the SqlUnparserTest.






[jira] [Comment Edited] (CALCITE-5995) add cache to dejsonize function in JsonFunctions

2023-09-11 Thread xiaogang zhou (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763991#comment-17763991
 ] 

xiaogang zhou edited comment on CALCITE-5995 at 9/12/23 3:37 AM:
-

[~julianhyde] yes, I think this is very similar to 
https://issues.apache.org/jira/browse/CALCITE-5914

 

I don't understand how to convert the expression to a constant: the second 
argument, which names the JSON field, differs between calls, and A is 
different in every data row. I think the expression needs to be evaluated at 
runtime. Please correct me if I am wrong.

 

I also tried a few alternatives to solve this issue:
 # Extract the dejsonized object in the projection operator of the generated 
code (performance is not ideal, because there are many conversions for Flink 
strings).
 # Convert multiple json_value calls to a table function using an optimization 
rule (it is too complicated to traverse all the calls and filter parts, and 
there is no significant improvement over the cache solution).

 

If anybody is interested, I can attach some evidence. In brief, using a cache 
turned out to be the most economical solution.



> add cache to dejsonize function in JsonFunctions
> 
>
> Key: CALCITE-5995
> URL: https://issues.apache.org/jira/browse/CALCITE-5995
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: xiaogang zhou
>Priority: Minor
> Fix For: 1.36.0
>
>
> I used the json_value function to parse JSON values and found that Calcite's 
> json_value function does not cache the dejsonized objects. This can hurt 
> performance in queries like the one below, because the dejsonize function is 
> called repeatedly on the same input.
>  
> {code:java}
> select 
> json_value(A, 'xxx'),
> json_value(A, 'yyy'),
> json_value(A, 'zzz'),...
> from some_table;
> {code}
>  
>  
> Because projects like Flink use Calcite's json_value to generate code for 
> their own json_value function, I think this could cause poor performance for 
> users. So I suggest introducing a cache in
>  
> org.apache.calcite.runtime.JsonFunctions#dejsonize
>  
> This solution is very common in projects like Hive:
> [https://github.com/apache/hive/blob/storage-branch-2.3/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFJSONTuple.java]
>  
> Of course, this feature could be turned on only when a certain config option 
> is set. If this is acceptable, I think I can take the ticket. Thanks.
>  
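
To make the caching proposal concrete, here is a minimal sketch of a small 
LRU cache wrapped around an expensive parse step. It is not Calcite code: the 
class name ParseCache, the size limit, and the parser passed in as a Function 
are all assumptions made for illustration, and a real change to 
JsonFunctions#dejsonize might look quite different.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical LRU cache around an expensive parse function (illustration only). */
final class ParseCache<V> {
  private final Function<String, V> parser;
  private final Map<String, V> cache;
  private int misses;

  ParseCache(Function<String, V> parser, final int maxEntries) {
    this.parser = parser;
    // accessOrder = true keeps entries ordered from least to most recently used.
    this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
      @Override protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > maxEntries;
      }
    };
  }

  /** Returns the parsed value, calling the parser only on a cache miss. */
  synchronized V get(String json) {
    V parsed = cache.get(json);
    if (parsed == null) {
      misses++;
      parsed = parser.apply(json);
      cache.put(json, parsed);
    }
    return parsed;
  }

  /** Number of times the underlying parser was actually invoked. */
  synchronized int misses() {
    return misses;
  }

  public static void main(String[] args) {
    // String::length stands in for a real JSON parser.
    ParseCache<Integer> cache = new ParseCache<>(String::length, 2);
    cache.get("{\"a\":1}");
    cache.get("{\"a\":1}");
    cache.get("{\"a\":1}");
    System.out.println("parser calls: " + cache.misses());
  }
}
```

With three json_value calls on the same column value A, the parser in this 
sketch runs once per distinct input rather than once per call. Bounding the 
cache keeps memory predictable when A differs in every row.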




[jira] [Updated] (CALCITE-5996) TRANSLATE operator is incorrectly unparsed

2023-09-11 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5996:

Labels: pull-request-available  (was: )

> TRANSLATE operator is incorrectly unparsed
> --
>
> Key: CALCITE-5996
> URL: https://issues.apache.org/jira/browse/CALCITE-5996
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>  Labels: pull-request-available
>
> This query
> {code}
> select translate(col using utf8)
> from (select 'a' as col
>  from (values(true)))
> {code}
> if converted to SqlNode and back produces:
> {code}
> SELECT TRANSLATE("COL", "UTF8")
> FROM (SELECT 'a' AS "COL"
> FROM (VALUES ROW(TRUE)))
> {code}
> which is no longer correct SQL.




[jira] [Created] (CALCITE-6000) There should be a SqlParserFixture which parses again after unparsing

2023-09-11 Thread Mihai Budiu (Jira)

Mihai Budiu created CALCITE-6000:


 Summary: There should be a SqlParserFixture which parses again 
after unparsing
 Key: CALCITE-6000
 URL: https://issues.apache.org/jira/browse/CALCITE-6000
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.35.0
Reporter: Mihai Budiu


SqlParserTests parse and unparse queries.
But the unparsed result is not validated.
A new fixture should parse the final result again to validate that it's legal 
SQL.
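
The check requested here can be sketched generically as parse, unparse, parse 
again, then compare the two parse results. The sketch below does not use 
Calcite: RoundTrip, toyParse, toyUnparse and lossyUnparse are hypothetical 
stand-ins (a "parser" that only splits and upper-cases tokens), meant only to 
show the shape of the check.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.StringJoiner;
import java.util.function.Function;

/** Hypothetical round-trip check: parse, unparse, parse again, compare. */
final class RoundTrip {
  /** Returns true if the unparsed text parses back to an equal tree. */
  static <T> boolean survives(String sql,
      Function<String, T> parse, Function<T, String> unparse) {
    T first = parse.apply(sql);
    String text = unparse.apply(first);
    // A real parser would throw here if the unparsed text were not legal SQL.
    T second = parse.apply(text);
    return first.equals(second);
  }

  /** Toy parser: the "tree" is just the list of upper-cased tokens. */
  static List<String> toyParse(String sql) {
    return Arrays.asList(sql.trim().toUpperCase(Locale.ROOT).split("\\s+"));
  }

  /** Toy unparser that keeps every token. */
  static String toyUnparse(List<String> tokens) {
    return String.join(" ", tokens);
  }

  /** Toy unparser with a bug: it silently drops the OFFSET token. */
  static String lossyUnparse(List<String> tokens) {
    StringJoiner out = new StringJoiner(" ");
    for (String token : tokens) {
      if (!token.equals("OFFSET")) {
        out.add(token);
      }
    }
    return out.toString();
  }

  public static void main(String[] args) {
    String sql = "select a offset 2";
    System.out.println(survives(sql, RoundTrip::toyParse, RoundTrip::toyUnparse));
    System.out.println(survives(sql, RoundTrip::toyParse, RoundTrip::lossyUnparse));
  }
}
```

Comparing the two parse results, rather than only checking that the second 
parse succeeds, also catches an unparser that emits legal SQL with a 
different meaning.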




[jira] [Updated] (CALCITE-5997) OFFSET operator is incorrectly unparsed

2023-09-11 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5997:

Labels: pull-request-available  (was: )

> OFFSET operator is incorrectly unparsed
> ---
>
> Key: CALCITE-5997
> URL: https://issues.apache.org/jira/browse/CALCITE-5997
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>  Labels: pull-request-available
>
> The following query:
> {code:sql}
> select ARRAY[p2,p1,p0][OFFSET(2)] from (values (6, 4, 2)) as t(p0, p1, p2)
> {code}
> when parsed as a SqlNode and then unparsed produces:
> {code:sql}
> SELECT ARRAY["P2", "P1", "P0"][2]
> FROM (VALUES ROW(6, 4, 2)) AS "T" ("P0", "P1", "P2")
> {code}
> which no longer produces the same result (the OFFSET function call is 
> missing).




[jira] [Comment Edited] (CALCITE-5995) add cache to dejsonize function in JsonFunctions

2023-09-11 Thread xiaogang zhou (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763991#comment-17763991
 ] 

xiaogang zhou edited comment on CALCITE-5995 at 9/12/23 2:55 AM:
-

[~julianhyde] yes, I think this is very similar to 
https://issues.apache.org/jira/browse/CALCITE-5914

I don't understand how to convert the expression to a constant: the second 
argument, which names the JSON field, differs between calls, and A is 
different in every data row. I think the expression needs to be evaluated at 
runtime. Please correct me if I am wrong.

 

I also tried a few alternatives to solve this issue:
 # Extract the dejsonized object in the projection operator of the generated 
code (performance is not ideal, because there are many conversions for Flink 
strings).
 # Convert multiple json_value calls to a table function using an optimization 
rule (it is too complicated to traverse all the calls and filter parts, and 
there is no significant improvement over the cache solution).

 

If anybody is interested, I can attach some evidence. In brief, using a cache 
turned out to be the most economical solution.







[jira] [Commented] (CALCITE-5995) add cache to dejsonize function in JsonFunctions

2023-09-11 Thread xiaogang zhou (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763991#comment-17763991
 ] 

xiaogang zhou commented on CALCITE-5995:


[~julianhyde] yes, I think this is very similar to 
https://issues.apache.org/jira/browse/CALCITE-5914

I don't understand how to convert the expression to a constant: the second 
argument, which names the JSON field, differs between calls, and A is 
different in every data row. I think the expression needs to be evaluated at 
runtime. Please correct me if I am wrong.

 

I also tried a few alternatives to solve this issue:
 # Extract the dejsonized object in the projection operator of the generated 
code (performance is not ideal, because there are many conversions for Flink 
strings).
 # Convert multiple json_value calls to a table function using an optimization 
rule (it is too complicated to traverse all the calls and filter parts, and 
there is no significant improvement over the cache solution).

 

If anybody is interested, I can attach some evidence. Using a cache turned out 
to be the most economical solution.





[jira] [Updated] (CALCITE-5999) DECIMAL literals as sometimes unparsed looking as DOUBLE literals

2023-09-11 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5999:

Labels: pull-request-available  (was: )

> DECIMAL literals as sometimes unparsed looking as DOUBLE literals
> -
>
> Key: CALCITE-5999
> URL: https://issues.apache.org/jira/browse/CALCITE-5999
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>  Labels: pull-request-available
>
> Consider a DECIMAL literal such as "0.00000000000000001".
> When unparsed this will show up as 1E-17, which is interpreted by SQL as a 
> double literal.
> The bug is in the function SqlNumericLiteral.toValue(). The function calls 
> toString() on a BigDecimal value, but it should probably call toPlainString() 
> instead.
>  




[jira] [Commented] (CALCITE-5997) OFFSET operator is incorrectly unparsed

2023-09-11 Thread Mihai Budiu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763988#comment-17763988
 ] 

Mihai Budiu commented on CALCITE-5997:
--

No tests failed for the other operators.
I have a fix; I will submit it as soon as I figure out how to write the unit 
test.





[jira] [Commented] (CALCITE-5998) The SAFE_OFFSET operator can cause an index out of bounds exception

2023-09-11 Thread Mihai Budiu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763987#comment-17763987
 ] 

Mihai Budiu commented on CALCITE-5998:
--

This test is equivalent to the original test

{code:java}
f.checkScalar("ARRAY[2,4,6][SAFE_OFFSET(-1)]", isNullValue(), "INTEGER");
{code}

and it was generated by parsing and unparsing the corresponding SQL.

> The SAFE_OFFSET operator can cause an index out of bounds exception
> ---
>
> Key: CALCITE-5998
> URL: https://issues.apache.org/jira/browse/CALCITE-5998
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>
> The following query, when added as a SqlOperatorTest:
> {code:sql}
> select ARRAY[p3,p2,p1][SAFE_OFFSET(p0)] from (values (-1, 6, 4, 2)) as t(p0, 
> p1, p2, p3)
> {code}
> causes an exception. Here is the top of the stack trace:
> {code:java}
> Array index -1 is out of bounds
> org.apache.calcite.runtime.CalciteException: Array index -1 is out of bounds
>   at 
> java.base@11.0.18/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native
>  Method)
>   at 
> java.base@11.0.18/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
>   at 
> java.base@11.0.18/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
>   at 
> java.base@11.0.18/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
>   at 
> app//org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:507)
>   at 
> app//org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:601)
>   at 
> app//org.apache.calcite.runtime.SqlFunctions.arrayItem(SqlFunctions.java:4742)
>   at 
> app//org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(SqlFunctions.java:4780)
>   at Baz$1$1.current(Unknown Source)
>   at 
> app//org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.next(Linq4j.java:687)
>   at 
> app//org.apache.calcite.avatica.util.IteratorCursor.next(IteratorCursor.java:46)
>   at 
> app//org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:219)
>   at 
> app//org.apache.calcite.sql.test.ResultCheckers.compareResultSet(ResultCheckers.java:128)
>   at 
> app//org.apache.calcite.sql.test.ResultCheckers$RefSetResultChecker.checkResult(ResultCheckers.java:336)
>   at 
> app//org.apache.calcite.test.SqlOperatorTest$TesterImpl.check(SqlOperatorTest.java:12987)
> {code}




[jira] [Updated] (CALCITE-5862) Incorrect semantics of ARRAY function (Spark library) when elements have Numeric and Character types

2023-09-11 Thread Ran Tao (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5862?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ran Tao updated CALCITE-5862:
-
Summary: Incorrect semantics of ARRAY function (Spark library) when 
elements have Numeric and Character types  (was: Incorrect semantics of ARRAY 
function(Spark library) when elements have Numeric and Character types)

> Incorrect semantics of ARRAY function (Spark library) when elements have 
> Numeric and Character types
> 
>
> Key: CALCITE-5862
> URL: https://issues.apache.org/jira/browse/CALCITE-5862
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.34.0
>Reporter: Ran Tao
>Assignee: Ran Tao
>Priority: Major
>  Labels: pull-request-available
>
> Running select array(1, 2, 'Hi') in Spark gives:
> spark-sql (default)> select array(1, 2, 'Hi');
> ["1","2","Hi"] 
> whereas Calcite fails with {*}java.lang.NullPointerException: inferred array 
> element type{*}.
> Spark supports both character and numeric types in the array, and the return 
> type is character.
> In fact, Calcite's operand type checker also allows character and numeric 
> types to be mixed, but the return type inference has no corresponding 
> logic, so an error is reported.
> We should fix this bug.




[jira] [Commented] (CALCITE-5997) OFFSET operator is incorrectly unparsed

2023-09-11 Thread Tanner Clary (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763983#comment-17763983
 ] 

Tanner Clary commented on CALCITE-5997:
---

Is this true of the other index operators?





[jira] [Commented] (CALCITE-5998) The SAFE_OFFSET operator can cause an index out of bounds exception

2023-09-11 Thread Tanner Clary (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763982#comment-17763982
 ] 

Tanner Clary commented on CALCITE-5998:
---

Ah good catch. So what should it be expected to return? Null?





[jira] [Created] (CALCITE-5999) DECIMAL literals as sometimes unparsed looking as DOUBLE literals

2023-09-11 Thread Mihai Budiu (Jira)

Mihai Budiu created CALCITE-5999:


 Summary: DECIMAL literals as sometimes unparsed looking as DOUBLE 
literals
 Key: CALCITE-5999
 URL: https://issues.apache.org/jira/browse/CALCITE-5999
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.35.0
Reporter: Mihai Budiu


Consider a DECIMAL literal such as "0.00000000000000001".

When unparsed this will show up as 1E-17, which is interpreted by SQL as a 
double literal.

The bug is in the function SqlNumericLiteral.toValue(). The function calls 
toString() on a BigDecimal value, but it should probably call toPlainString() 
instead.
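
The difference between the two BigDecimal methods can be seen in isolation 
with the standard library alone. The sketch below is not the Calcite fix: 
DecimalLiteralText and its two helper methods are hypothetical names used 
only to contrast toString() with toPlainString() for the value 10^-17.

```java
import java.math.BigDecimal;

/** Hypothetical helpers contrasting the two ways to render a BigDecimal. */
final class DecimalLiteralText {
  /** Uses toString(), which switches to scientific notation for small values. */
  static String canonical(BigDecimal value) {
    return value.toString();
  }

  /** Uses toPlainString(), which never emits an exponent. */
  static String plain(BigDecimal value) {
    return value.toPlainString();
  }

  public static void main(String[] args) {
    // The value 1 shifted 17 places to the right of the decimal point.
    BigDecimal tiny = BigDecimal.ONE.movePointLeft(17);
    System.out.println(canonical(tiny)); // prints 1E-17
    System.out.println(plain(tiny));     // plain digits, no exponent
  }
}
```

Because SQL reads a numeric literal with an exponent as an approximate 
(DOUBLE) value, the exponent-free form is the one that keeps the DECIMAL type 
when the text is parsed again.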

 




[jira] [Commented] (CALCITE-5996) TRANSLATE operator is incorrectly unparsed

2023-09-11 Thread ZheHu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763981#comment-17763981
 ] 

ZheHu commented on CALCITE-5996:


I think the SqlParserTest is the way to go.





[jira] [Commented] (CALCITE-5998) The SAFE_OFFSET operator can cause an index out of bounds exception

2023-09-11 Thread Mihai Budiu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763980#comment-17763980
 ] 

Mihai Budiu commented on CALCITE-5998:
--

[~tanclary] I think you added this code recently.





[jira] [Created] (CALCITE-5998) The SAFE_OFFSET operator can cause an index out of bounds exception

2023-09-11 Thread Mihai Budiu (Jira)

Mihai Budiu created CALCITE-5998:


 Summary: The SAFE_OFFSET operator can cause an index out of bounds 
exception
 Key: CALCITE-5998
 URL: https://issues.apache.org/jira/browse/CALCITE-5998
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.35.0
Reporter: Mihai Budiu


The following query, when added as a SqlOperatorTest:

{code:sql}
select ARRAY[p3,p2,p1][SAFE_OFFSET(p0)] from (values (-1, 6, 4, 2)) as t(p0, 
p1, p2, p3)
{code}

causes an exception. Here is the top of the stack trace:

{code:java}
Array index -1 is out of bounds
org.apache.calcite.runtime.CalciteException: Array index -1 is out of bounds
    at java.base@11.0.18/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at java.base@11.0.18/jdk.internal.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at java.base@11.0.18/jdk.internal.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.base@11.0.18/java.lang.reflect.Constructor.newInstance(Constructor.java:490)
    at app//org.apache.calcite.runtime.Resources$ExInstWithCause.ex(Resources.java:507)
    at app//org.apache.calcite.runtime.Resources$ExInst.ex(Resources.java:601)
    at app//org.apache.calcite.runtime.SqlFunctions.arrayItem(SqlFunctions.java:4742)
    at app//org.apache.calcite.runtime.SqlFunctions.arrayItemOptional(SqlFunctions.java:4780)
    at Baz$1$1.current(Unknown Source)
    at app//org.apache.calcite.linq4j.Linq4j$EnumeratorIterator.next(Linq4j.java:687)
    at app//org.apache.calcite.avatica.util.IteratorCursor.next(IteratorCursor.java:46)
    at app//org.apache.calcite.avatica.AvaticaResultSet.next(AvaticaResultSet.java:219)
    at app//org.apache.calcite.sql.test.ResultCheckers.compareResultSet(ResultCheckers.java:128)
    at app//org.apache.calcite.sql.test.ResultCheckers$RefSetResultChecker.checkResult(ResultCheckers.java:336)
    at app//org.apache.calcite.test.SqlOperatorTest$TesterImpl.check(SqlOperatorTest.java:12987)
{code}
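
For reference, the behavior one would expect from SAFE_OFFSET can be sketched in plain Java. This is only an illustration under the assumption of BigQuery-style semantics (zero-based index, NULL for any out-of-range index, including negative ones); the class and method names are hypothetical and this is not the code in SqlFunctions.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch; not Calcite's SqlFunctions code. */
final class SafeOffsetSketch {
  private SafeOffsetSketch() {}

  /** Zero-based lookup that yields null instead of throwing when the index is out of range. */
  static <T> T safeOffset(List<T> array, int index) {
    if (index < 0 || index >= array.size()) {
      return null; // covers negative indexes as well as indexes past the end
    }
    return array.get(index);
  }

  public static void main(String[] args) {
    // ARRAY[p3, p2, p1] with (p0, p1, p2, p3) = (-1, 6, 4, 2)
    List<Integer> array = Arrays.asList(2, 4, 6);
    System.out.println(safeOffset(array, -1)); // null
    System.out.println(safeOffset(array, 0));  // 2
  }
}
```

The point of the sketch is that the lower bound needs the same treatment as the upper bound; the query above exercises the negative case.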




[jira] [Commented] (CALCITE-5996) TRANSLATE operator is incorrectly unparsed

2023-09-11 Thread Mihai Budiu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763978#comment-17763978
 ] 

Mihai Budiu commented on CALCITE-5996:
--

I have a fix, but I don't know where unit tests for unparsing are.
Does someone know where this test would belong?

> TRANSLATE operator is incorrectly unparsed
> --
>
> Key: CALCITE-5996
> URL: https://issues.apache.org/jira/browse/CALCITE-5996
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>
> This query
> {code}
> select translate(col using utf8)
> from (select 'a' as col
>  from (values(true)))
> {code}
> if converted to SqlNode and back produces:
> {code}
> SELECT TRANSLATE("COL", "UTF8")
> FROM (SELECT 'a' AS "COL"
> FROM (VALUES ROW(TRUE)))
> {code}
> which is no longer correct SQL.




[jira] [Created] (CALCITE-5997) OFFSET operator is incorrectly unparsed

2023-09-11 Thread Mihai Budiu (Jira)

Mihai Budiu created CALCITE-5997:


 Summary: OFFSET operator is incorrectly unparsed
 Key: CALCITE-5997
 URL: https://issues.apache.org/jira/browse/CALCITE-5997
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.35.0
Reporter: Mihai Budiu


The following query:

{code:sql}
select ARRAY[p2,p1,p0][OFFSET(2)] from (values (6, 4, 2)) as t(p0, p1, p2)
{code}

when parsed as a SqlNode and then unparsed produces:

{code:sql}
SELECT ARRAY["P2", "P1", "P0"][2]
FROM (VALUES ROW(6, 4, 2)) AS "T" ("P0", "P1", "P2")
{code}

which no longer produces the same result (the OFFSET function call is missing).
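
To see why dropping the OFFSET call changes the result, here is a small plain-Java illustration. It assumes that a plain subscript is one-based (as in standard SQL) while OFFSET is zero-based; the helper names are hypothetical and are not Calcite APIs.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration; these helpers are not part of Calcite. */
final class OffsetVsItemSketch {
  private OffsetVsItemSketch() {}

  /** Zero-based access, as in ARRAY[...][OFFSET(i)]. */
  static <T> T offset(List<T> array, int i) {
    return array.get(i);
  }

  /** One-based access, assumed here for a plain ARRAY[...][i] subscript. */
  static <T> T item(List<T> array, int i) {
    return array.get(i - 1);
  }

  public static void main(String[] args) {
    // ARRAY[p2, p1, p0] with (p0, p1, p2) = (6, 4, 2)
    List<Integer> array = Arrays.asList(2, 4, 6);
    System.out.println(offset(array, 2)); // 6
    System.out.println(item(array, 2));   // 4
  }
}
```

Under that assumption the original query selects 6 and the unparsed query selects 4.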




[jira] [Updated] (CALCITE-5996) TRANSLATE operator is incorrectly unparsed

2023-09-11 Thread Mihai Budiu (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5996?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Mihai Budiu updated CALCITE-5996:
-
Description: 
This query

{code}
select translate(col using utf8)
from (select 'a' as col
 from (values(true)))
{code}

if converted to SqlNode and back produces:

{code}
SELECT TRANSLATE("COL", "UTF8")
FROM (SELECT 'a' AS "COL"
FROM (VALUES ROW(TRUE)))
{code}

which is no longer correct SQL.

  was:
This query

{code:sql}
select translate(col using utf8)
from (select 'a' as col
 from (values(true)))
{code}

if converted to SqlNode and back produces

{code;sql}
SELECT TRANSLATE("COL", "UTF8")
FROM (SELECT 'a' AS "COL"
FROM (VALUES ROW(TRUE)))
{code}

which is no longer correct SQL.


> TRANSLATE operator is incorrectly unparsed
> --
>
> Key: CALCITE-5996
> URL: https://issues.apache.org/jira/browse/CALCITE-5996
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>
> This query
> {code}
> select translate(col using utf8)
> from (select 'a' as col
>  from (values(true)))
> {code}
> if converted to SqlNode and back produces:
> {code}
> SELECT TRANSLATE("COL", "UTF8")
> FROM (SELECT 'a' AS "COL"
> FROM (VALUES ROW(TRUE)))
> {code}
> which is no longer correct SQL.




[jira] [Created] (CALCITE-5996) TRANSLATE operator is incorrectly unparsed

2023-09-11 Thread Mihai Budiu (Jira)

Mihai Budiu created CALCITE-5996:


 Summary: TRANSLATE operator is incorrectly unparsed
 Key: CALCITE-5996
 URL: https://issues.apache.org/jira/browse/CALCITE-5996
 Project: Calcite
  Issue Type: Bug
  Components: core
Affects Versions: 1.35.0
Reporter: Mihai Budiu


This query

{code:sql}
select translate(col using utf8)
from (select 'a' as col
 from (values(true)))
{code}

if converted to SqlNode and back produces

{code:sql}
SELECT TRANSLATE("COL", "UTF8")
FROM (SELECT 'a' AS "COL"
FROM (VALUES ROW(TRUE)))
{code}

which is no longer correct SQL.
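
A toy illustration of the difference, in plain Java. The two methods are hypothetical string builders, not Calcite's unparse code, and the USING form is assumed to be the desired output only because it mirrors the input query.

```java
/** Hypothetical sketch of the two renderings; not Calcite's unparse logic. */
final class TranslateUnparseSketch {
  private TranslateUnparseSketch() {}

  /** Keeps the special syntax: the transcoding name stays a bare name after USING. */
  static String unparseUsing(String operandSql, String transcodingName) {
    return "TRANSLATE(" + operandSql + " USING " + transcodingName + ")";
  }

  /** Ordinary function-call rendering: the name ends up quoted like a column reference. */
  static String unparseAsCall(String operandSql, String transcodingName) {
    return "TRANSLATE(" + operandSql + ", \"" + transcodingName + "\")";
  }

  public static void main(String[] args) {
    System.out.println(unparseUsing("\"COL\"", "UTF8"));  // TRANSLATE("COL" USING UTF8)
    System.out.println(unparseAsCall("\"COL\"", "UTF8")); // TRANSLATE("COL", "UTF8")
  }
}
```

The second rendering matches the output reported above, where "UTF8" reads as an identifier rather than a transcoding name.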




[jira] [Updated] (CALCITE-5964) Support additional metadata attributes in GET_TABLES and GET_COLUMNS

2023-09-11 Thread Oliver Lee (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oliver Lee updated CALCITE-5964:

Description: 
The goal is for Avatica to be able to handle additional columns in the standard 
JDBC getTables() and getColumns() responses. 

After testing, it appears that the Avatica client in RemoteMeta is able to 
handle Signatures of different shapes that don't necessarily conform to the 
JDBC standard 
[getTables.|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-)]

I have opened a PR that adds a test to RemoteDriverMockTest, which verifies 
that Avatica is able to handle a Signature that comes in with additional 
columns. 

This ticket is related to CALCITE-5982, and includes some additions to 
connection properties to allow Calcite to create and send Signatures of 
different shapes. 

-The goal is to add to Avatica a mechanism such that additional metadata fields 
pertaining to tables and columns can be transmitted alongside the standard JDBC 
[getTables|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-)]
 and {{getColumns}} calls.-

-The Avatica client needs the response to be extensible, so that revisions to 
the metadata fields sent, and future additions, do not require a new JAR file.- 

-Requirements:-
 # -Avatica users do not need to download new JAR files if the server decides 
to send new metadata in the future-
 # -If the client makes modifications to support additional columns, they 
should always be present in the call and appear with null values, as opposed to 
complete omission (Number of columns in response stays the same)- 
 # -Can handle attributes of varying types i.e. {{numberOne: int}} and 
{{booleanOne: boolean}}-
 # -Allows value retrieval from the {{ResultSet}} through calling 
{{resultSet.getInt(“booleanOne”)}} or {{resultSet.getBoolean(“booleanOne”)}}-

-Current proposal is to modify the {{MetaTable}} and {{MetaColumn}} classes to 
include a map.-
-{{{}HashMap{}}}, such that when instantiating the 
{{CalciteMetaTable}} in the {{{}ResultSet{}}}, new entries could be added in 
the future without changes to Avatica.-

-Once we have a list of additional metadata fields to be emitted in 
{{{}CalciteMetaImpl{}}}, the {{ResultSet}} would be created with the 
appropriate values.-
 -- 
-There are still some challenges identified below and I would love some input:-

-Challenges:-
 * -Currently the {{MetaTable}} class that is instantiated is a 
{{{}CalciteMetaImpl{}}}. For the {{getTables()}} call, the response will be a 
list composed of schema tables of class {{CalciteMetaTable}} and database 
tables which can potentially be overloaded into 1 or more different subclasses. 
From this one heterogeneous list, we must determine the full list of columns to 
be included in the additional metadata hash. My initial plan was to provide a 
function in Calcite’s {{Table}} class such as {{getAdditionalColumns}} and 
allow it to be overloaded, but then I discovered the heterogeneity of the list.-
 * -Modifying the MetaTable class to include the hashmap of values could be 
easily done, but the challenge lies at {{{}RemoteMeta{}}}, to be able to 
serialize this cleanly so that requirement (4) is met and users can retrieve 
the values nicely. {{RemoteMeta}} currently serializes the response using 
reflection by looking at MetaTable.class and its attributes. The addition of 
one map is not immediately compatible with iterating over the keys of the map 
and turning each of those into fields. I’m looking into the idea of processing 
the enumerable in {{CalciteMetaImpl}} before the Frame gets created-

  was:
The goal is for Avatica to be able to handle additional columns in the standard 
JDBC getTables() and getColumns() responses. 

After testing, it appears that the Avatica client in RemoteMeta is able to 
handle Signatures of different shapes that don't necessarily conform to the 
JDBC standard 
[getTables.|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-)]

I have opened a PR that adds a test to RemoteDriverMockTest that verifies that 
if a Signature comes in with additional columns, that Avatica is able to handle 
it fluently. 

 

 

-The goal is to add to Avatica a mechanism such that additional metadata fields 
pertaining to tables and columns can be transmitted alongside the standard JDBC 
[getTables|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-)]
 and {{getColumns}} calls.-

-The Avatica client needs the response to be extensible such that revisions to 
metadata fields

[jira] [Commented] (CALCITE-5964) Support additional metadata attributes in GET_TABLES and GET_COLUMNS

2023-09-11 Thread Oliver Lee (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763972#comment-17763972
 ] 

Oliver Lee commented on CALCITE-5964:
-

Hey [~julianhyde] ,

I have updated the Jira ticket description to reflect the current findings.

I have a PR for this ready for review: 
[https://github.com/apache/calcite-avatica/pull/227]

 

> Support additional metadata attributes in GET_TABLES and GET_COLUMNS
> 
>
> Key: CALCITE-5964
> URL: https://issues.apache.org/jira/browse/CALCITE-5964
> Project: Calcite
>  Issue Type: New Feature
>  Components: avatica
>Reporter: Oliver Lee
>Assignee: Oliver Lee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The goal is for Avatica to be able to handle additional columns in the 
> standard JDBC getTables() and getColumns() responses. 
> After testing, it appears that the Avatica client in RemoteMeta is able to 
> handle Signatures of different shapes that don't necessarily conform to the 
> JDBC standard 
> [getTables.|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-)]
> I have opened a PR that adds a test to RemoteDriverMockTest, which verifies 
> that Avatica is able to handle a Signature that comes in with additional 
> columns. 
>  
> This ticket is related to CALCITE-5982, and includes some additions to 
> connection properties to allow Calcite to create and send Signatures of 
> different shapes. 
>  
> -The goal is to add to Avatica a mechanism such that additional metadata 
> fields pertaining to tables and columns can be transmitted alongside the 
> standard JDBC 
> [getTables|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-)]
>  and {{getColumns}} calls.-
> -The Avatica client needs the response to be extensible, so that revisions 
> to the metadata fields sent, and future additions, do not require a new JAR 
> file.- 
> -Requirements:-
>  # -Avatica users do not need to download new JAR files if the server 
> decides to send new metadata in the future-
>  # -If the client makes modifications to support additional columns, they 
> should always be present in the call and appear with null values, as opposed 
> to complete omission (Number of columns in response stays the same)- 
>  # -Can handle attributes of varying types i.e. {{numberOne: int}} and 
> {{booleanOne: boolean}}-
>  # -Allows value retrieval from the {{ResultSet}} through calling 
> {{resultSet.getInt(“booleanOne”)}} or {{resultSet.getBoolean(“booleanOne”)}}-
> -Current proposal is to modify the {{MetaTable}} and {{MetaColumn}} classes 
> to include a map.-
> -{{{}HashMap{}}}, such that when instantiating the 
> {{CalciteMetaTable}} in the {{{}ResultSet{}}}, new entries could be added in 
> the future without changes to Avatica.-
> -Once we have a list of additional metadata fields to be emitted in 
> {{{}CalciteMetaImpl{}}}, the {{ResultSet}} would be created with the 
> appropriate values.-
>  -- 
> -There are still some challenges identified below and I would love some 
> input:-
> -Challenges:-
>  * -Currently the {{MetaTable}} class that is instantiated is a 
> {{{}CalciteMetaImpl{}}}. For the {{getTables()}} call, the response will be a 
> list composed of schema tables of class {{CalciteMetaTable}} and database 
> tables which can potentially be overloaded into 1 or more different 
> subclasses. From this one heterogeneous list, we must determine the full list 
> of columns to be included in the additional metadata hash. My initial plan 
> was to provide a function in Calcite’s {{Table}} class such as 
> {{getAdditionalColumns}} and allow it to be overloaded, but then I discovered 
> the heterogeneity of the list.-
>  * -Modifying the MetaTable class to include the hashmap of values could be 
> easily done, but the challenge lies at {{{}RemoteMeta{}}}, to be able to 
> serialize this cleanly so that requirement (4) is met and users can retrieve 
> the values nicely. {{RemoteMeta}} currently serializes the response using 
> reflection by looking at MetaTable.class and its attributes. The addition of 
> one map is not immediately compatible with iterating over the keys of the map 
> and turning each of those into fields. I’m looking into the idea of 
> processing the enumerable in {{CalciteMetaImpl}} before the Frame gets 
> created-




[jira] [Updated] (CALCITE-5964) Support additional metadata attributes in GET_TABLES and GET_COLUMNS

2023-09-11 Thread Oliver Lee (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oliver Lee updated CALCITE-5964:

Description: 
The goal is for Avatica to be able to handle additional columns in the standard 
JDBC getTables() and getColumns() responses. 

After testing, it appears that the Avatica client in RemoteMeta is able to 
handle Signatures of different shapes that don't necessarily conform to the 
JDBC standard 
[getTables.|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-)]

I have opened a PR that adds a test to RemoteDriverMockTest, which verifies 
that Avatica is able to handle a Signature that comes in with additional 
columns. 

-The goal is to add to Avatica a mechanism such that additional metadata fields 
pertaining to tables and columns can be transmitted alongside the standard JDBC 
[getTables|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-)]
 and {{getColumns}} calls.-

-The Avatica client needs the response to be extensible, so that revisions to 
the metadata fields sent, and future additions, do not require a new JAR file.- 

-Requirements:-
 # -Avatica users do not need to download new JAR files if the server decides 
to send new metadata in the future-
 # -If the client makes modifications to support additional columns, they 
should always be present in the call and appear with null values, as opposed to 
complete omission (Number of columns in response stays the same)- 
 # -Can handle attributes of varying types i.e. {{numberOne: int}} and 
{{booleanOne: boolean}}-
 # -Allows value retrieval from the {{ResultSet}} through calling 
{{resultSet.getInt(“booleanOne”)}} or {{resultSet.getBoolean(“booleanOne”)}}-

-Current proposal is to modify the {{MetaTable}} and {{MetaColumn}} classes to 
include a map.-
-{{{}HashMap{}}}, such that when instantiating the 
{{CalciteMetaTable}} in the {{{}ResultSet{}}}, new entries could be added in 
the future without changes to Avatica.-

-Once we have a list of additional metadata fields to be emitted in 
{{{}CalciteMetaImpl{}}}, the {{ResultSet}} would be created with the 
appropriate values.-
 -- 
-There are still some challenges identified below and I would love some input:-

-Challenges:-
 * -Currently the {{MetaTable}} class that is instantiated is a 
{{{}CalciteMetaImpl{}}}. For the {{getTables()}} call, the response will be a 
list composed of schema tables of class {{CalciteMetaTable}} and database 
tables which can potentially be overloaded into 1 or more different subclasses. 
From this one heterogeneous list, we must determine the full list of columns to 
be included in the additional metadata hash. My initial plan was to provide a 
function in Calcite’s {{Table}} class such as {{getAdditionalColumns}} and 
allow it to be overloaded, but then I discovered the heterogeneity of the list.-
 * -Modifying the MetaTable class to include the hashmap of values could be 
easily done, but the challenge lies at {{{}RemoteMeta{}}}, to be able to 
serialize this cleanly so that requirement (4) is met and users can retrieve 
the values nicely. {{RemoteMeta}} currently serializes the response using 
reflection by looking at MetaTable.class and its attributes. The addition of 
one map is not immediately compatible with iterating over the keys of the map 
and turning each of those into fields. I’m looking into the idea of processing 
the enumerable in {{CalciteMetaImpl}} before the Frame gets created-

  was:
The goal is to add to Avatica a mechanism such that additional metadata fields 
pertaining to tables and columns can be transmitted alongside the standard JDBC 
[getTables|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-)]
 and {{getColumns}} calls.

The Avatica client needs the response to be extensible such that revisions to 
metadata fields send and future additions does not require a new JAR file. 

Requirements:

 # Avatica user does not need to download new jar files if the server decides 
to send over new metadata data in the future
 # If the client makes modifications to support additional columns, they should 
always be present in the call and appear with null values, as opposed to 
complete omission (Number of columns in response stays the same) 
 # Can handle attributes of varying types i.e. {{numberOne: int}} and 
{{booleanOne: boolean}}
 # Allows value retrieval from the {{ResultSet}} through calling 
{{resultSet.getInt(“booleanOne”)}} or {{resultSet.getBoolean(“booleanOne”)}}

Current proposal is to modify the {{MetaTable}} and {{MetaColumn}} classes to 
include a map.
{{{}HashMap{}}}, such that when instantiating the

[jira] [Commented] (CALCITE-5982) Allow overloading the created enumerable in Calcite when calling getTables() or getColumns()

2023-09-11 Thread Oliver Lee (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763971#comment-17763971
 ] 

Oliver Lee commented on CALCITE-5982:
-

Hey [~julianhyde] ,

I have this PR ready for review:  [https://github.com/apache/calcite/pull/3421] 

> Allow overloading the created enumerable in Calcite when calling getTables() 
> or getColumns() 
> -
>
> Key: CALCITE-5982
> URL: https://issues.apache.org/jira/browse/CALCITE-5982
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Oliver Lee
>Assignee: Oliver Lee
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to introduce a mechanism that allows overloading the enumerable 
> type that is created when {{getTables()}} and {{getColumns()}} are called. If 
> a user provides an overloaded {{{}MetaTable{}}}/ {{MetaColumn}} class, then 
> they can add additional metadata fields that they would like to be 
> transferred. 
>  
> Currently, the {{getTables()}} and {{getColumns()}} calls in 
> {{CalciteMetaImpl}} are hardcoded to reflect on the fields of 
> {{MetaTable.class}} and {{MetaColumn.class}}, matched against the list of 
> column names that are passed in. Reflection is important here, as it creates 
> the proper {{ColumnMetaData}} and {{Signature}} that the client needs to 
> deserialize. 
> See here for {{[getTables()|#L270]]}} and here for 
> [{{getColumns()}}|https://github.com/apache/calcite/blob/164ff0a27e243850d294908dc5cff90760d0a35a/core/src/main/java/org/apache/calcite/jdbc/CalciteMetaImpl.java#L320]
>  
>  
> I would like to introduce fields ( {{metaTableClass}} , {{metaColumnClass}} ) 
> on {{CalciteMetaImpl}} that determine which class to use for each of these. 
> This will be configured as a {{CalciteConnectionProperty}} when making the 
> {{jdbc:calcite}} connection, and default to MetaTable.class and 
> MetaColumn.class if not provided. 
>  
>  
> Requirements:
>  * User can specify in {{Properties}} new {{CalciteConnectionProperty}} ’s to 
> specify which overloaded class of {{CalciteMetaTable}} and {{MetaColumn}} to 
> use
>  * If not specified, it will default to {{CalciteMetaTable.class}} and 
> {{MetaColumn.class}}
>  
>  
> The provided overloaded class will create a subclass of {{CalciteMetaTable}} 
> / {{MetaColumn}} that has the same shape constructor and also provide an 
> override for the function {{{}getColumnNames(){}}}.
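
The idea in the description can be sketched in plain Java as follows. Everything here is a hypothetical stand-in (BaseTable, ExtendedTable, columnNames, value); it is not the Avatica or Calcite code, and it only shows how reflecting on a configurable row class, rather than a hardcoded one, lets a subclass contribute extra columns.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch; the classes below are stand-ins, not Avatica's. */
final class MetaReflectionSketch {
  private MetaReflectionSketch() {}

  /** Stand-in for a base metadata row with the standard columns. */
  static class BaseTable {
    public final String tableSchem;
    public final String tableName;

    BaseTable(String tableSchem, String tableName) {
      this.tableSchem = tableSchem;
      this.tableName = tableName;
    }
  }

  /** User-supplied subclass that adds one extra metadata field. */
  static class ExtendedTable extends BaseTable {
    public final String owner;

    ExtendedTable(String tableSchem, String tableName, String owner) {
      super(tableSchem, tableName);
      this.owner = owner;
    }
  }

  /** Collects the public instance field names of a row class, superclass fields first. */
  static List<String> columnNames(Class<?> rowClass) {
    List<String> names = new ArrayList<>();
    Class<?> parent = rowClass.getSuperclass();
    if (parent != null && parent != Object.class) {
      names.addAll(columnNames(parent));
    }
    for (Field field : rowClass.getDeclaredFields()) {
      int mods = field.getModifiers();
      if (Modifier.isPublic(mods) && !Modifier.isStatic(mods)) {
        names.add(field.getName());
      }
    }
    return names;
  }

  /** Reads one column value from a row by reflection. */
  static Object value(Object row, String column) {
    try {
      return row.getClass().getField(column).get(row);
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("No readable column " + column, e);
    }
  }

  public static void main(String[] args) {
    ExtendedTable row = new ExtendedTable("SALES", "EMP", "alice");
    for (String column : columnNames(ExtendedTable.class)) {
      System.out.println(column + " = " + value(row, column));
    }
  }
}
```

Passing ExtendedTable.class instead of BaseTable.class is the only change needed to expose the extra field, which is roughly what a configurable {{metaTableClass}} would achieve.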




[jira] [Updated] (CALCITE-5982) Allow overloading the created enumerable in Calcite when calling getTables() or getColumns()

2023-09-11 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5982?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5982:

Labels: pull-request-available  (was: )

> Allow overloading the created enumerable in Calcite when calling getTables() 
> or getColumns() 
> -
>
> Key: CALCITE-5982
> URL: https://issues.apache.org/jira/browse/CALCITE-5982
> Project: Calcite
>  Issue Type: New Feature
>Reporter: Oliver Lee
>Assignee: Oliver Lee
>Priority: Major
>  Labels: pull-request-available
>
> The goal is to introduce a mechanism that allows overloading the enumerable 
> type that is created when {{getTables()}} and {{getColumns()}} are called. If 
> a user provides an overloaded {{{}MetaTable{}}}/ {{MetaColumn}} class, then 
> they can add additional metadata fields that they would like to be 
> transferred. 
>  
> Currently, the {{getTables()}} and {{getColumns()}} calls in 
> {{CalciteMetaImpl}} are hardcoded to reflect on the fields of 
> {{MetaTable.class}} and {{MetaColumn.class}}, matched against the list of 
> column names that are passed in. Reflection is important here, as it creates 
> the proper {{ColumnMetaData}} and {{Signature}} that the client needs to 
> deserialize. 
> See here for {{[getTables()|#L270]]}} and here for 
> [{{getColumns()}}|https://github.com/apache/calcite/blob/164ff0a27e243850d294908dc5cff90760d0a35a/core/src/main/java/org/apache/calcite/jdbc/CalciteMetaImpl.java#L320]
>  
>  
> I would like to introduce fields ( {{metaTableClass}} , {{metaColumnClass}} ) 
> on {{CalciteMetaImpl}} that determine which class to use for each of these. 
> This will be configured as a {{CalciteConnectionProperty}} when making the 
> {{jdbc:calcite}} connection, and default to MetaTable.class and 
> MetaColumn.class if not provided. 
>  
>  
> Requirements:
>  * User can specify in {{Properties}} new {{CalciteConnectionProperty}} ’s to 
> specify which overloaded class of {{CalciteMetaTable}} and {{MetaColumn}} to 
> use
>  * If not specified, it will default to {{CalciteMetaTable.class}} and 
> {{MetaColumn.class}}
>  
>  
> The provided overloaded class will create a subclass of {{CalciteMetaTable}} 
> / {{MetaColumn}} that has the same shape constructor and also provide an 
> override for the function {{{}getColumnNames(){}}}.




[jira] [Commented] (CALCITE-5986) The SqlTypeFamily for FP types is incorrect

2023-09-11 Thread Mihai Budiu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763907#comment-17763907
 ] 

Mihai Budiu commented on CALCITE-5986:
--

RexLiteral also has a field typeName which has this comment:
{code:java}
  // TODO jvs 26-May-2006:  Use SqlTypeFamily instead; it exists
  // for exactly this purpose (to avoid the confusion which results
  // from overloading SqlTypeName).
  /**
   * An indication of the broad type of this literal -- even if its type isn't
   * a SQL type. Sometimes this will be different than the SQL type; for
   * example, all exact numbers, including integers have typeName
   * {@link SqlTypeName#DECIMAL}. See {@link #valueMatchesType} for the
   * definitive story.
   */ {code}
Indeed, I found the typeName field of RexLiteral quite confusing.

> The SqlTypeFamily for FP types is incorrect
> ---
>
> Key: CALCITE-5986
> URL: https://issues.apache.org/jira/browse/CALCITE-5986
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>  Labels: pull-request-available
>
> In SqlTypeFamily we have this code:
> {code:java}
> private static final Map<Integer, SqlTypeFamily> JDBC_TYPE_TO_FAMILY =
> ...
>   .put(Types.FLOAT, NUMERIC)
>   .put(Types.REAL, NUMERIC)
>   .put(Types.DOUBLE, NUMERIC)
> {code}
> But it looks to me like the type family should be APPROXIMATE_NUMERIC.
> This impacts the way RelToSqlConverter works, for instance.
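
A self-contained sketch of the mapping suggested in the report. The Family enum and the class are hypothetical stand-ins for SqlTypeFamily, and the sketch shows the reporter's proposal, not necessarily what was eventually committed.

```java
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for the JDBC-type-to-family table; not Calcite's SqlTypeFamily. */
final class FpFamilySketch {
  private FpFamilySketch() {}

  enum Family { NUMERIC, APPROXIMATE_NUMERIC }

  private static final Map<Integer, Family> JDBC_TYPE_TO_FAMILY = new HashMap<>();

  static {
    // Exact numeric types stay in NUMERIC.
    JDBC_TYPE_TO_FAMILY.put(Types.INTEGER, Family.NUMERIC);
    JDBC_TYPE_TO_FAMILY.put(Types.DECIMAL, Family.NUMERIC);
    // Floating-point types go to APPROXIMATE_NUMERIC, as the report proposes.
    JDBC_TYPE_TO_FAMILY.put(Types.FLOAT, Family.APPROXIMATE_NUMERIC);
    JDBC_TYPE_TO_FAMILY.put(Types.REAL, Family.APPROXIMATE_NUMERIC);
    JDBC_TYPE_TO_FAMILY.put(Types.DOUBLE, Family.APPROXIMATE_NUMERIC);
  }

  /** Returns the family for a java.sql.Types code, or null if the code is not mapped here. */
  static Family familyOf(int jdbcType) {
    return JDBC_TYPE_TO_FAMILY.get(jdbcType);
  }

  public static void main(String[] args) {
    System.out.println(familyOf(Types.DOUBLE));  // APPROXIMATE_NUMERIC
    System.out.println(familyOf(Types.DECIMAL)); // NUMERIC
  }
}
```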




[jira] [Commented] (CALCITE-5986) The SqlTypeFamily for FP types is incorrect

2023-09-11 Thread Mihai Budiu (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5986?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763882#comment-17763882
 ] 

Mihai Budiu commented on CALCITE-5986:
--

Another unpleasant thing is that the SqlTypeFamily comment is misleading: 
RelDataType.getFamily() does not even return a SqlTypeFamily, it returns a 
RelDataTypeFamily, which is an interface implemented by SqlTypeFamily.

> The SqlTypeFamily for FP types is incorrect
> ---
>
> Key: CALCITE-5986
> URL: https://issues.apache.org/jira/browse/CALCITE-5986
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>  Labels: pull-request-available
>
> In SqlTypeFamily we have this code:
> {code:java}
> private static final Map<Integer, SqlTypeFamily> JDBC_TYPE_TO_FAMILY =
> ...
>   .put(Types.FLOAT, NUMERIC)
>   .put(Types.REAL, NUMERIC)
>   .put(Types.DOUBLE, NUMERIC)
> {code}
> But it looks to me like the type family should be APPROXIMATE_NUMERIC.
> This impacts the way RelToSqlConverter works, for instance.




[jira] [Updated] (CALCITE-5989) Type inference for RPAD and LPAD functions (BIGQUERY) is incorrect

2023-09-11 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5989?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5989:

Labels: pull-request-available  (was: )

> Type inference for RPAD and LPAD functions (BIGQUERY) is incorrect
> --
>
> Key: CALCITE-5989
> URL: https://issues.apache.org/jira/browse/CALCITE-5989
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>  Labels: pull-request-available
>
> The type inference uses the type `ReturnTypes.ARG0_NULLABLE_VARYING` for the 
> output.
> This means that the output cannot be longer than arg0. This bug surfaces when 
> the query is optimized.
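
To make the length problem concrete, here is a minimal plain-Java RPAD. It is a sketch that assumes BigQuery-style behavior (pad or truncate to exactly the requested length); the class and method are hypothetical and are not Calcite's implementation.

```java
/** Hypothetical sketch of RPAD semantics; not Calcite's implementation. */
final class PadSketch {
  private PadSketch() {}

  /** Pads s on the right with repetitions of pad, or truncates it, to exactly the given length. */
  static String rpad(String s, int length, String pad) {
    if (length < 0 || pad.isEmpty()) {
      throw new IllegalArgumentException("length must be >= 0 and pad must be non-empty");
    }
    if (length <= s.length()) {
      return s.substring(0, length);
    }
    StringBuilder sb = new StringBuilder(s);
    while (sb.length() < length) {
      sb.append(pad);
    }
    sb.setLength(length); // drop any overshoot from the last repetition
    return sb.toString();
  }

  public static void main(String[] args) {
    String padded = rpad("abc", 6, "xy");
    System.out.println(padded);          // abcxyx
    System.out.println(padded.length()); // 6
  }
}
```

The input has 3 characters and the result has 6, so a return type whose precision is copied from arg0 is too narrow for the result.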




[jira] [Updated] (CALCITE-5988) SqlImplementor.toSql cannot emit VARBINARY literals

2023-09-11 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5988:

Labels: pull-request-available  (was: )

> SqlImplementor.toSql cannot emit VARBINARY literals
> ---
>
> Key: CALCITE-5988
> URL: https://issues.apache.org/jira/browse/CALCITE-5988
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Mihai Budiu
>Priority: Minor
>  Labels: pull-request-available
>
> Given a literal with type VARBINARY, the function SqlImplementor.toSql() 
> fails with an assertion error:
> {code}
> X'41424344':VARBINARY: BINARY
> java.lang.AssertionError: X'41424344':VARBINARY: BINARY
>   at org.apache.calcite.rel.rel2sql.SqlImplementor.toSql(SqlImplementor.java:1461)
>   at org.apache.calcite.rel.rel2sql.SqlImplementor.toSql(SqlImplementor.java:1384)
>   at org.apache.calcite.rel.rel2sql.SqlImplementor$Context.toSql(SqlImplementor.java:696)
> {code}
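
For readers unfamiliar with the X'...' notation in the error message, the following plain-Java sketch renders a byte array as a SQL binary string literal. The helper is hypothetical; it only illustrates the literal form and says nothing about how SqlImplementor should emit it.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that renders bytes as a SQL binary string literal. */
final class BinaryLiteralSketch {
  private BinaryLiteralSketch() {}

  /** Formats each byte as two uppercase hex digits inside X'...'. */
  static String toBinaryLiteral(byte[] bytes) {
    StringBuilder sb = new StringBuilder("X'");
    for (byte b : bytes) {
      sb.append(String.format("%02X", b & 0xFF)); // mask to avoid sign extension
    }
    return sb.append('\'').toString();
  }

  public static void main(String[] args) {
    byte[] bytes = "ABCD".getBytes(StandardCharsets.US_ASCII);
    System.out.println(toBinaryLiteral(bytes)); // X'41424344'
  }
}
```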




[jira] [Updated] (CALCITE-5964) Support additional metadata attributes in GET_TABLES and GET_COLUMNS

2023-09-11 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5964?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5964:

Labels: pull-request-available  (was: )

> Support additional metadata attributes in GET_TABLES and GET_COLUMNS
> 
>
> Key: CALCITE-5964
> URL: https://issues.apache.org/jira/browse/CALCITE-5964
> Project: Calcite
>  Issue Type: New Feature
>  Components: avatica
>Reporter: Oliver Lee
>Assignee: Oliver Lee
>Priority: Major
>  Labels: pull-request-available
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> The goal is to add to Avatica a mechanism such that additional metadata 
> fields pertaining to tables and columns can be transmitted alongside the 
> standard JDBC 
> [getTables|https://docs.oracle.com/javase/8/docs/api/java/sql/ResultSet.html#getTables-java.lang.String-java.lang.String-java.lang.String-java.lang.String:A-)]
>  and {{getColumns}} calls.
> The Avatica client needs the response to be extensible, so that revisions to 
> the metadata fields sent, and future additions, do not require a new JAR file. 
> Requirements:
>  # Avatica users do not need to download new JAR files if the server decides 
> to send new metadata in the future
>  # If the client makes modifications to support additional columns, they 
> should always be present in the call and appear with null values, as opposed 
> to complete omission (Number of columns in response stays the same) 
>  # Can handle attributes of varying types i.e. {{numberOne: int}} and 
> {{booleanOne: boolean}}
>  # Allows value retrieval from the {{ResultSet}} through calling 
> {{resultSet.getInt(“booleanOne”)}} or {{resultSet.getBoolean(“booleanOne”)}}
> Current proposal is to modify the {{MetaTable}} and {{MetaColumn}} classes to 
> include a map.
> {{{}HashMap{}}}, such that when instantiating the 
> {{CalciteMetaTable}} in the {{{}ResultSet{}}}, new entries could be added in 
> the future without changes to Avatica.
> Once we have a list of additional metadata fields to be emitted in 
> {{{}CalciteMetaImpl{}}}, the {{ResultSet}} would be created with the 
> appropriate values.
>  
> There are still some challenges identified below and I would love some input:
> Challenges:
>  * Currently the {{MetaTable}} class that is instantiated is a 
> {{CalciteMetaImpl}}. For the {{getTables()}} call, the response will be a 
> list composed of schema tables of class {{CalciteMetaTable}} and database 
> tables, which can potentially be overloaded into one or more different 
> subclasses. From this one heterogeneous list, we must determine the full list 
> of columns to be included in the additional metadata hash. My initial plan 
> was to provide a function in Calcite’s {{Table}} class such as 
> {{getAdditionalColumns}} and allow it to be overloaded, but then I discovered 
> the heterogeneity of the list.
>  * Modifying the {{MetaTable}} class to include the hashmap of values could be 
> easily done, but the challenge lies in {{RemoteMeta}}: serializing this 
> cleanly so that requirement (4) is met and users can retrieve 
> the values nicely. {{RemoteMeta}} currently serializes the response using 
> reflection by looking at {{MetaTable.class}} and its attributes. The addition of 
> one map is not immediately compatible with iterating over the keys of the map 
> and turning each of those into fields. I’m looking into the idea of 
> processing the enumerable in {{CalciteMetaImpl}} before the Frame gets created
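
As a rough illustration of requirement 2 and of the map-flattening problem described above, the following Java sketch collects the union of extra keys over a heterogeneous list of rows and pads missing values with null, so every row ends up with the same number of columns. The names TableRow, ExtraMetadataFlattener and the TABLE_NAME key are hypothetical and chosen for this example only; they are not Avatica or Calcite API.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical stand-in for a MetaTable-like row that carries extra metadata in a map. */
final class TableRow {
  final String tableName;
  final Map<String, Object> extra;

  TableRow(String tableName, Map<String, Object> extra) {
    this.tableName = tableName;
    this.extra = extra;
  }
}

/** Hypothetical helper; it only illustrates the idea and is not part of Avatica. */
final class ExtraMetadataFlattener {
  private ExtraMetadataFlattener() {}

  /** Returns the union of extra keys over all rows, in first-seen order. */
  static List<String> extraColumns(List<TableRow> rows) {
    Set<String> names = new LinkedHashSet<>();
    for (TableRow row : rows) {
      names.addAll(row.extra.keySet());
    }
    return new ArrayList<>(names);
  }

  /** Turns each row into a fixed-width map: the table name, then every extra column. */
  static List<Map<String, Object>> flatten(List<TableRow> rows) {
    List<String> columns = extraColumns(rows);
    List<Map<String, Object>> result = new ArrayList<>();
    for (TableRow row : rows) {
      Map<String, Object> flat = new LinkedHashMap<>();
      flat.put("TABLE_NAME", row.tableName);
      for (String column : columns) {
        // Map.get returns null when this row lacks the key, so the column is kept but empty.
        flat.put(column, row.extra.get(column));
      }
      result.add(flat);
    }
    return result;
  }
}
```

Computing the column list once, before any row is emitted, is what keeps the column count stable across rows of different subclasses.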




[jira] [Commented] (CALCITE-5995) add cache to dejsonize function in JsonFunctions

2023-09-11 Thread Julian Hyde (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763796#comment-17763796
 ] 

Julian Hyde commented on CALCITE-5995:
--

If your example query is typical, could that case be handled by constant 
reduction rather than caching? If the calls to JSON_VALUE have constant 
arguments then the calls themselves can be converted to constants at prepare 
time.

> add cache to dejsonize function in JsonFunctions
> 
>
> Key: CALCITE-5995
> URL: https://issues.apache.org/jira/browse/CALCITE-5995
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Affects Versions: 1.35.0
>Reporter: xiaogang zhou
>Priority: Minor
> Fix For: 1.36.0
>
>
> I used the json_value function to parse JSON values, and found that Calcite's 
> json_value function does not cache the dejsonized objects. This can cause 
> a performance issue in the situation below, because the dejsonize function is 
> called repeatedly and unnecessarily.
>  
> {code:java}
> select 
> json_value(A, 'xxx'),
> json_value(A, 'yyy'),
> json_value(A, 'zzz'),...
> from some_table;
> {code}
>  
>  
> As projects like Flink use the json_value function to codegen their own 
> json_value function, I think this could cause bad performance for users. So I 
> suggest introducing a cache in
>  
> org.apache.calcite.runtime.JsonFunctions#dejsonize
>  
> This solution is very common in projects like Hive:
> [https://github.com/apache/hive/blob/storage-branch-2.3/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFJSONTuple.java]
>  
> Of course, this feature could be turned on only when a certain config is 
> set. If this is acceptable, I think I can take the ticket. Thanks.
>  
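
The caching idea in the description could be sketched roughly as follows. This is a minimal, assumption-laden example rather than Calcite code: ParseCache is a hypothetical class, the parser is passed in as a plain function instead of calling JsonFunctions#dejsonize, and the eviction policy (a small LRU built on LinkedHashMap) is just one possible choice.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;

/** Small LRU cache in front of an expensive parse function (hypothetical; not Calcite code). */
final class ParseCache<V> {
  private final Function<String, V> parser;
  private final Map<String, V> cache;
  private int parseCount;

  ParseCache(Function<String, V> parser, final int maxEntries) {
    this.parser = parser;
    // An access-ordered LinkedHashMap evicts the least recently used entry first.
    this.cache = new LinkedHashMap<String, V>(16, 0.75f, true) {
      @Override protected boolean removeEldestEntry(Map.Entry<String, V> eldest) {
        return size() > maxEntries;
      }
    };
  }

  /** Returns the parsed form of the input, invoking the parser only on a cache miss. */
  synchronized V get(String json) {
    V value = cache.get(json);
    if (value == null) {
      // A parser that returns null is re-invoked on every call; such results are not reused.
      value = parser.apply(json);
      parseCount++;
      cache.put(json, value);
    }
    return value;
  }

  /** Number of times the underlying parser was actually invoked. */
  synchronized int parseCount() {
    return parseCount;
  }
}
```

With a cache like this, the three json_value calls on the same column A in the query above would trigger one parse per distinct input string instead of three.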




[jira] [Commented] (CALCITE-5995) add cache to dejsonize function in JsonFunctions

2023-09-11 Thread Julian Hyde (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763792#comment-17763792
 ] 

Julian Hyde commented on CALCITE-5995:
--

Can this task use the caching that I added recently for other built-in 
functions? 





[jira] [Updated] (CALCITE-5995) add cache to dejsonize function in JsonFunctions

2023-09-11 Thread xiaogang zhou (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaogang zhou updated CALCITE-5995:
---
Description: 
I used the json_value function to parse JSON values, and found that Calcite's 
json_value function does not cache the dejsonized objects. This can cause 
a performance issue in the situation below, because the dejsonize function is 
called repeatedly and unnecessarily.

 
{code:java}
select 
json_value(A, 'xxx'),
json_value(A, 'yyy'),
json_value(A, 'zzz'),...
from some_table;

{code}
 

 

As projects like Flink use the json_value function to codegen their own 
json_value function, I think this could cause bad performance for users. So I 
suggest introducing a cache in

org.apache.calcite.runtime.JsonFunctions#dejsonize

This solution is very common in projects like Hive:

[https://github.com/apache/hive/blob/storage-branch-2.3/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFJSONTuple.java]

Of course, this feature could be turned on only when a certain config is 
set. If this is acceptable, I think I can take the ticket. Thanks.

 

  was:
I used the json_value function to parse JSON values, and found that Calcite's 
json_value function does not cache the dejsonized objects. This can cause 
a performance issue in the situation below.

 
{code:java}
select 
json_value(A, 'xxx'),
json_value(A, 'yyy'),
json_value(A, 'zzz'),...
from some_table;

{code}
 

 

As projects like Flink use the json_value function to codegen their own 
json_value function, I think this could cause bad performance for users. So I 
suggest introducing a cache in

org.apache.calcite.runtime.JsonFunctions#dejsonize

This solution is very common in projects like Hive:

[https://github.com/apache/hive/blob/storage-branch-2.3/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFJSONTuple.java]

Of course, this feature could be turned on only when a certain config is 
set. If this is acceptable, I think I can take the ticket. Thanks.

 






[jira] [Updated] (CALCITE-5995) add cache to dejsonize function in JsonFunctions

2023-09-11 Thread xiaogang zhou (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

xiaogang zhou updated CALCITE-5995:
---
Description: 
I used the json_value function to parse JSON values, and found that Calcite's 
json_value function does not cache the dejsonized objects. This can cause 
a performance issue in the situation below.

 
{code:java}
select 
json_value(A, 'xxx'),
json_value(A, 'yyy'),
json_value(A, 'zzz'),...
from some_table;

{code}
 

 

As projects like Flink use the json_value function to codegen their own 
json_value function, I think this could cause bad performance for users. So I 
suggest introducing a cache in

org.apache.calcite.runtime.JsonFunctions#dejsonize

This solution is very common in projects like Hive:

[https://github.com/apache/hive/blob/storage-branch-2.3/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDTFJSONTuple.java]

Of course, this feature could be turned on only when a certain config is 
set. If this is acceptable, I think I can take the ticket. Thanks.

 





[jira] [Commented] (CALCITE-5957) Valid DATE '1945-2-2' is not accepted due to regression

2023-09-11 Thread Evgeny Stanilovsky (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5957?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763695#comment-17763695
 ] 

Evgeny Stanilovsky commented on CALCITE-5957:
-

[~zabetak] Hello, there seems to be no further review activity here. Can you 
merge it?
Thanks.

> Valid DATE '1945-2-2' is not accepted due to regression
> ---
>
> Key: CALCITE-5957
> URL: https://issues.apache.org/jira/browse/CALCITE-5957
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Runkang He
>Assignee: Evgeny Stanilovsky
>Priority: Blocker
>  Labels: pull-request-available
> Fix For: avatica-1.24.0
>
> Attachments: image-2023-08-27-19-09-33-284.png
>
>  Time Spent: 10m
>  Remaining Estimate: 0h
>
> DATE '1945-2-2' is a valid date. In CALCITE-5923 when we turn on the result 
> check of `testCastStringToDateTime`, we find that Calcite accepted DATE 
> '1945-2-2' before CALCITE-5678 but not afterwards, so this is a regression 
> that we need to fix.




[jira] [Created] (CALCITE-5995) add cache to dejsonize function in JsonFunctions

2023-09-11 Thread xiaogang zhou (Jira)

xiaogang zhou created CALCITE-5995:
--

 Summary: add cache to dejsonize function in JsonFunctions
 Key: CALCITE-5995
 URL: https://issues.apache.org/jira/browse/CALCITE-5995
 Project: Calcite
  Issue Type: Improvement
  Components: core
Affects Versions: 1.35.0
Reporter: xiaogang zhou
 Fix For: 1.36.0







[jira] [Commented] (CALCITE-5994) When a sort's input max row cnt is 1,remove the redundant sort

2023-09-11 Thread LakeShen (Jira)



[ 
https://issues.apache.org/jira/browse/CALCITE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17763690#comment-17763690
 ] 

LakeShen commented on CALCITE-5994:
---

The PR is ready: 
[https://github.com/apache/calcite/pull/3418] 

If others have time, please help review it. Thank you very much :)

> When a sort's input max row cnt is 1,remove the redundant sort
> --
>
> Key: CALCITE-5994
> URL: https://issues.apache.org/jira/browse/CALCITE-5994
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: LakeShen
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> When a Sort's input has a maximum row count of 1, we can remove the 
> redundant Sort, provided it is a pure ordering Sort (offset and fetch are 
> null).
> For example, in the SQL:
> {code:java}
> select * from (select * from tableA limit 1)  order by c ;
> {code}
> the max row count of {{(select * from tableA limit 1)}} is 1, so we can 
> remove {{order by c}}:
> {code:java}
> select * from tableA limit 1;
> {code}
> The SQL:
> {code:java}
> select max(totalprice) from orders order by 1
> {code}
> could be converted to:
> {code:java}
> select max(totalprice) from orders
> {code}
> The above logic is the same as Presto/Trino's 
> [RemoveRedundantSort|https://github.com/prestodb/presto/blob/c21fc28846252cd910d90f046514bf586d7bb5c6/presto-main/src/main/java/com/facebook/presto/sql/planner/iterative/rule/RemoveRedundantSort.java#L27]
>  rule.
>  
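
The rewrite described above can be sketched on a toy plan tree. This is only an illustration under stated assumptions: PlanNode, ScanNode, LimitNode, SortNode and RedundantSortRemover are hypothetical classes invented for the example, not Calcite's RelNode hierarchy or the rule from the pull request, and the max row count is computed directly instead of through planner metadata.

```java
/** Minimal plan nodes for illustration only; these are not Calcite's RelNode classes. */
abstract class PlanNode {
  /** Upper bound on the number of rows this node can produce. */
  abstract long maxRowCount();
}

final class ScanNode extends PlanNode {
  @Override long maxRowCount() {
    return Long.MAX_VALUE; // table size unknown: treat as unbounded
  }
}

final class LimitNode extends PlanNode {
  final PlanNode input;
  final long fetch;

  LimitNode(PlanNode input, long fetch) {
    this.input = input;
    this.fetch = fetch;
  }

  @Override long maxRowCount() {
    return Math.min(fetch, input.maxRowCount());
  }
}

final class SortNode extends PlanNode {
  final PlanNode input;
  final boolean hasOffsetOrFetch;

  SortNode(PlanNode input, boolean hasOffsetOrFetch) {
    this.input = input;
    this.hasOffsetOrFetch = hasOffsetOrFetch;
  }

  @Override long maxRowCount() {
    return input.maxRowCount(); // simplification: ignores any fetch on the sort itself
  }
}

final class RedundantSortRemover {
  private RedundantSortRemover() {}

  /** Drops a pure ordering sort whose input yields at most one row; otherwise returns the node as is. */
  static PlanNode apply(PlanNode node) {
    if (node instanceof SortNode) {
      SortNode sort = (SortNode) node;
      if (!sort.hasOffsetOrFetch && sort.input.maxRowCount() <= 1) {
        return sort.input;
      }
    }
    return node;
  }
}
```

The check on offset and fetch matters because a Sort that also limits or skips rows changes the result even when its input has a single row.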




[jira] [Updated] (CALCITE-5994) When a sort's input max row cnt is 1,remove the redundant sort

2023-09-11 Thread LakeShen (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

LakeShen updated CALCITE-5994:
--
Fix Version/s: 1.36.0





[jira] [Updated] (CALCITE-5994) When a sort's input max row cnt is 1,remove the redundant sort

2023-09-11 Thread ASF GitHub Bot (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

ASF GitHub Bot updated CALCITE-5994:

Labels: pull-request-available  (was: )





[jira] [Updated] (CALCITE-5949) RexExecutable should return unchanged original expressions when it fails

2023-09-11 Thread Ruben Q L (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5949:
---
Fix Version/s: 1.36.0

> RexExecutable should return unchanged original expressions when it fails
> 
>
> Key: CALCITE-5949
> URL: https://issues.apache.org/jira/browse/CALCITE-5949
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Affects Versions: 1.35.0
>Reporter: Claude Brisson
>Priority: Major
>  Labels: pull-request-available
> Fix For: 1.36.0
>
>
> While reducing, when encountering an invalid expression in the list of 
> constant expressions, RexExecutor is meant to return all initial expressions 
> unchanged.
> It fails to do so, because the correct expressions that were already handled 
> have been added to the returned list, which can end up longer than the input list.
> For instance, when given the list \{ LN(2), LN(-2) }, the RexExecutor will 
> output a list of length 3.
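
The intended all-or-nothing behaviour can be sketched as below. AllOrNothingReducer is a hypothetical helper written for this example, not the RexExecutable or RexExecutor code; it assumes each expression is reduced by a caller-supplied function that throws a RuntimeException on invalid input.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical helper illustrating all-or-nothing reduction; not Calcite's implementation. */
final class AllOrNothingReducer {
  private AllOrNothingReducer() {}

  /**
   * Reduces every expression with reduceOne. If any reduction throws, the partial
   * results are discarded and the original list is returned unchanged.
   */
  static <E> List<E> reduce(List<E> expressions, Function<E, E> reduceOne) {
    // Results go into a scratch list so a failure cannot leak partial output.
    List<E> reduced = new ArrayList<>(expressions.size());
    try {
      for (E expression : expressions) {
        reduced.add(reduceOne.apply(expression));
      }
    } catch (RuntimeException e) {
      return expressions;
    }
    return reduced;
  }
}
```

Because results are accumulated separately and only returned on full success, the output always has the same length as the input, which is the property the { LN(2), LN(-2) } example violates.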




[jira] [Updated] (CALCITE-5949) RexExecutable should return unchanged original expressions when it fails

2023-09-11 Thread Ruben Q L (Jira)



 [ 
https://issues.apache.org/jira/browse/CALCITE-5949?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ruben Q L updated CALCITE-5949:
---
Summary: RexExecutable should return unchanged original expressions when it 
fails  (was: RexExecutable correct handling of invalid constant expressions)




