[jira] [Comment Edited] (CALCITE-525) Exception-handling in built-in functions

2018-10-15 Thread Hongze Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651127#comment-16651127
 ] 

Hongze Zhang edited comment on CALCITE-525 at 10/16/18 5:08 AM:


Hi [~julianhyde] and [~vladimirsitnikov],

Thanks for the concern about this issue, and I have read all your comments.

I wonder if we could start from "catch error from a RexCall", because if we 
want to drop a row when error happens, it is not that easy to judge which row 
should be dropped and which row should be reserved in Calcite.

I personally prefer to make Calcite return an empty or specified value when a 
RexCall failed, like what I have described about a "CATCH_ERROR" function. I 
hope you could take a look at [a nother 
work|https://github.com/zhztheplayer/calcite/commits/525-3]. The work contains 
following:

1. A 'CATCH_ERROR(... (EMPTY | ERROR) ON ERROR)' function and implementation;
 2. A 'ERROR_MODE' option for Calcite connection: if the 'ERROR_MODE' is 
'THROW_ERROR', Calcite's converter will automatically wrap every rex call with 
a 'CATCH_ERROR' call.

By having this feature, calcite adaptors can also implement 'CATCH_ERROR' in 
their own rex implementation to use 'ERROR_MODE' feature. The cost is to add a 
non-standard operator.

Also, you can see an [issue|https://issues.apache.org/jira/browse/HIVE-5438] 
from HIVE.


was (Author: zhztheplayer):
Hi [~julianhyde] and [~vladimirsitnikov],

Thanks for the concern about this issue, and I have read all your comments.

I wonder if we could start from "catch error from a RexCall", because if we 
want to drop a row when error happens, it is not that easy to judge which row 
should be dropped and which row should be reserved in Calcite.

I personally prefer to make Calcite return an empty or specified value when a 
RexCall failed, like what I have described about a "CATCH_ERROR" function. I 
hope you could take a look at [a nother 
work|https://github.com/zhztheplayer/calcite/commits/525-3]. The work contains 
following:

1. A 'CATCH_ERROR(... (EMPTY | ERROR) ON ERROR)' function and implementation;
 2. A 'ERROR_MODE' option for Calcite connection: if the 'ERROR_MODE' is 
'THROW_ERROR', Calcite's converter will automatically wrap every rex call with 
a 'CATCH_ERROR' call.

By having this feature, calcite adaptors can also implement 'CATCH_ERROR' in 
their own rex implementation to use 'ERROR_MODE' feature. The cost is to add a 
non-standard operator.

Also, you can see an issue from HIVE.

> Exception-handling in built-in functions
> 
>
> Key: CALCITE-525
> URL: https://issues.apache.org/jira/browse/CALCITE-525
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Hongze Zhang
>Priority: Major
>
> The standard calls for certain built-in functions to throw exceptions.
> Examples:
> * 1 / 0
> * MOD(1, 0)
> * OVERLAY('foo' PLACING 'x' FROM -1)
> * 'x' NOT LIKE 'x' ESCAPE 'x'
> First, these exceptions should occur at run time. They should cause the 
> current value to become null, or the row to be omitted, but should not abort 
> the query. (Actual behavior TBD.)
> Second, EnumerableCalc does constant reduction and generates code like 
> 'static final int X = 0 / 0'. This code blows up when the class is loaded. It 
> should not. The code should give errors for each row, as described above.
> While fixing this bug, see SqlOperatorBaseTest.testArgumentBounds and remove 
> restrictions related to /, MOD and OVERLAY, LIKE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-525) Exception-handling in built-in functions

2018-10-15 Thread Hongze Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651127#comment-16651127
 ] 

Hongze Zhang commented on CALCITE-525:
--

Hi [~julianhyde] and [~vladimirsitnikov],

Thanks for the concern about this issue, and I have read all your comments.

I wonder if we could start from "catch error from a RexCall", because if we 
want to drop a row when error happens, it is not that easy to judge which row 
should be dropped and which row should be reserved in Calcite.

I personally prefer to make Calcite return an empty or specified value when a 
RexCall failed, like what I have described about a "CATCH_ERROR" function. I 
hope you could take a look at [a nother 
work|https://github.com/zhztheplayer/calcite/commits/525-3]. The work contains 
following:

1. A 'CATCH_ERROR(... (EMPTY | ERROR) ON ERROR)' function and implementation;
 2. A 'ERROR_MODE' option for Calcite connection: if the 'ERROR_MODE' is 
'THROW_ERROR', Calcite's converter will automatically wrap every rex call with 
a 'CATCH_ERROR' call.

By having this feature, calcite adaptors can also implement 'CATCH_ERROR' in 
their own rex implementation to use 'ERROR_MODE' feature. The cost is to add a 
non-standard operator.

Also, you can see an issue from HIVE.

> Exception-handling in built-in functions
> 
>
> Key: CALCITE-525
> URL: https://issues.apache.org/jira/browse/CALCITE-525
> Project: Calcite
>  Issue Type: Bug
>Reporter: Julian Hyde
>Assignee: Hongze Zhang
>Priority: Major
>
> The standard calls for certain built-in functions to throw exceptions.
> Examples:
> * 1 / 0
> * MOD(1, 0)
> * OVERLAY('foo' PLACING 'x' FROM -1)
> * 'x' NOT LIKE 'x' ESCAPE 'x'
> First, these exceptions should occur at run time. They should cause the 
> current value to become null, or the row to be omitted, but should not abort 
> the query. (Actual behavior TBD.)
> Second, EnumerableCalc does constant reduction and generates code like 
> 'static final int X = 0 / 0'. This code blows up when the class is loaded. It 
> should not. The code should give errors for each row, as described above.
> While fixing this bug, see SqlOperatorBaseTest.testArgumentBounds and remove 
> restrictions related to /, MOD and OVERLAY, LIKE.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2280) Liberal "babel" parser that accepts all SQL dialects

2018-10-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650969#comment-16650969
 ] 

Julian Hyde commented on CALCITE-2280:
--

We can definitely add HIVE as a conformance level. As for adding Hive QL 
features to Calcite's core SQL parser, it depends on the particular feature:
 * If a feature is in the SQL standard, then we should add it.
 * If a feature is generally useful, and occurs in several databases, then we 
should add it, enabled by a conformance flag. ({{ORDER BY ordinal}} is an 
example of such a feature.)
 * If a feature is peculiar to just one database, and does not offer a 
significant extra functionality, or makes the parser ambiguous or more complex 
for other dialects, then we probably will not add it to Calcite's dialect.

Note that Hive QL is now called Hive SQL, because Hive is trying to move closer 
to standard SQL over time. I don't think we should be adding to Calcite support 
for features that the Hive team no longer recommends that people use in Hive.

For such features, Babel is probably a better home.

> Liberal "babel" parser that accepts all SQL dialects
> 
>
> Key: CALCITE-2280
> URL: https://issues.apache.org/jira/browse/CALCITE-2280
> Project: Calcite
>  Issue Type: Bug
>  Components: babel
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.17.0
>
>
> Create a parser that accepts all SQL dialects.
> It would accept common dialects such as Oracle, MySQL, PostgreSQL, BigQuery. 
> If you have preferred dialects, please let us know in the comments section. 
> (If you're willing to work on a particular dialect, even better!)
> We would do this in a new module, inheriting and extending the parser in the 
> same way that the DDL parser in the "server" module does.
> This would be a messy and difficult project, because we would have to comply 
> with the rules of each parser (and its set of built-in functions) rather than 
> writing the rules as we would like them to be. That's why I would keep it out 
> of the core parser. But it would also have large benefits.
> This would be new territory Calcite: as a tool for manipulating/understanding 
> SQL, not (necessarily) for relational algebra or execution.
> Some possible uses:
> * analyze query lineage (what tables and columns are used in a query);
> * translate from one SQL dialect to another (using the JDBC adapter to 
> generate SQL in the target dialect);
> * a "deep" compatibility mode (much more comprehensive than the current 
> compatibility mode) where Calcite could pretend to be, say, Oracle;
> * SQL parser as a service: a REST call gives a SQL query, and returns a JSON 
> or XML document with the parse tree.
> If you can think of interesting uses, please discuss in the comments.
> There are similarities with Uber's 
> [QueryParser|https://eng.uber.com/queryparser/] tool. Maybe we can 
> collaborate, or make use of their test cases.
> We will need a lot of sample queries. If you are able to contribute sample 
> queries for particular dialects, please discuss in the comments section. It 
> would be good if the sample queries are based on a familiar schema (e.g. 
> scott or foodmart) but we can be flexible about this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2280) Liberal "babel" parser that accepts all SQL dialects

2018-10-15 Thread Xuefu Zhang (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650899#comment-16650899
 ] 

Xuefu Zhang commented on CALCITE-2280:
--

Hi [~julianhyde] With my limited understanding of this change, I assume we have 
put a boiler place for extending Calcite grammar to accommodate any specialties 
in any SQL dialects. This is great!

On the other hand, I'm curious about whether we can provide, or have a plan to 
do so, a conformance level for Hive QL as a whole, similar to MySQL, given the 
popularity of Hive QL in the big data world. 

Could you shed some light on this? Thanks.

> Liberal "babel" parser that accepts all SQL dialects
> 
>
> Key: CALCITE-2280
> URL: https://issues.apache.org/jira/browse/CALCITE-2280
> Project: Calcite
>  Issue Type: Bug
>  Components: babel
>Reporter: Julian Hyde
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.17.0
>
>
> Create a parser that accepts all SQL dialects.
> It would accept common dialects such as Oracle, MySQL, PostgreSQL, BigQuery. 
> If you have preferred dialects, please let us know in the comments section. 
> (If you're willing to work on a particular dialect, even better!)
> We would do this in a new module, inheriting and extending the parser in the 
> same way that the DDL parser in the "server" module does.
> This would be a messy and difficult project, because we would have to comply 
> with the rules of each parser (and its set of built-in functions) rather than 
> writing the rules as we would like them to be. That's why I would keep it out 
> of the core parser. But it would also have large benefits.
> This would be new territory Calcite: as a tool for manipulating/understanding 
> SQL, not (necessarily) for relational algebra or execution.
> Some possible uses:
> * analyze query lineage (what tables and columns are used in a query);
> * translate from one SQL dialect to another (using the JDBC adapter to 
> generate SQL in the target dialect);
> * a "deep" compatibility mode (much more comprehensive than the current 
> compatibility mode) where Calcite could pretend to be, say, Oracle;
> * SQL parser as a service: a REST call gives a SQL query, and returns a JSON 
> or XML document with the parse tree.
> If you can think of interesting uses, please discuss in the comments.
> There are similarities with Uber's 
> [QueryParser|https://eng.uber.com/queryparser/] tool. Maybe we can 
> collaborate, or make use of their test cases.
> We will need a lot of sample queries. If you are able to contribute sample 
> queries for particular dialects, please discuss in the comments section. It 
> would be good if the sample queries are based on a familiar schema (e.g. 
> scott or foodmart) but we can be flexible about this.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-1289) RexUtil.simplifyCase() should account for nullability

2018-10-15 Thread Julian Hyde (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650449#comment-16650449
 ] 

Julian Hyde commented on CALCITE-1289:
--

Given e = "CASE WHEN condition THEN TRUE ELSE FALSE END" then it is valid to 
simplify "e" to "condition" with unknownAs=FALSE and it is also valid to 
simplify "e" to "condition IS TRUE".

With unknownAs=UNKNOWN it is not valid to simplify "e" to "condition".

> RexUtil.simplifyCase() should account for nullability
> -
>
> Key: CALCITE-1289
> URL: https://issues.apache.org/jira/browse/CALCITE-1289
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: MinJi Kim
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.9.0
>
>
> RexUtil.simplify() returns incorrect type information if the RexNode is a 
> CASE(CONDITION, TRUE, FALSE).  In this case, the CASE RexNode is a Boolean 
> with isNullable == false.  But, with the simplify, RexUtil.simplify returns 
> CONDITION.  When CONDITION has isNullable == true, the returned RexNode does 
> not match the row type and fails to transform  (vice versa).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-1289) RexUtil.simplifyCase() should account for nullability

2018-10-15 Thread Zoltan Haindrich (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650267#comment-16650267
 ] 

Zoltan Haindrich commented on CALCITE-1289:
---

I was looking into these casts around case expressions; and I think the 
original simplification was not entirely correct by rewriting that case to just 
{{CONDITION}} - I think the right rewrite would have been {{CONDITION IS TRUE}} 
; which would be already 2 valued :)

> RexUtil.simplifyCase() should account for nullability
> -
>
> Key: CALCITE-1289
> URL: https://issues.apache.org/jira/browse/CALCITE-1289
> Project: Calcite
>  Issue Type: Bug
>  Components: core
>Reporter: MinJi Kim
>Assignee: Julian Hyde
>Priority: Major
> Fix For: 1.9.0
>
>
> RexUtil.simplify() returns incorrect type information if the RexNode is a 
> CASE(CONDITION, TRUE, FALSE).  In this case, the CASE RexNode is a Boolean 
> with isNullable == false.  But, with the simplify, RexUtil.simplify returns 
> CONDITION.  When CONDITION has isNullable == true, the returned RexNode does 
> not match the row type and fails to transform  (vice versa).



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (CALCITE-2619) Reduce string literal creation cost by removing charset check

2018-10-15 Thread Daniel Lemire (JIRA)


[ 
https://issues.apache.org/jira/browse/CALCITE-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650255#comment-16650255
 ] 

Daniel Lemire commented on CALCITE-2619:


I'll do some evaluation of the problem (hopefully this week) and report back.

> Reduce string literal creation cost by removing charset check
> -
>
> Key: CALCITE-2619
> URL: https://issues.apache.org/jira/browse/CALCITE-2619
> Project: Calcite
>  Issue Type: Improvement
>  Components: core
>Reporter: Ted Xu
>Assignee: Julian Hyde
>Priority: Major
>
> The cost of creating NlsString is very high, due to its charset check. In 
> some cases, e.g., expression evaluate because of Partition Prune, the 
> NlsString creation costs 40%+ of total executor's overhead.
>  



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)