[jira] [Comment Edited] (CALCITE-525) Exception-handling in built-in functions
[ https://issues.apache.org/jira/browse/CALCITE-525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651127#comment-16651127 ] Hongze Zhang edited comment on CALCITE-525 at 10/16/18 5:08 AM: Hi [~julianhyde] and [~vladimirsitnikov], Thanks for the concern about this issue, and I have read all your comments. I wonder if we could start from "catch error from a RexCall", because if we want to drop a row when error happens, it is not that easy to judge which row should be dropped and which row should be reserved in Calcite. I personally prefer to make Calcite return an empty or specified value when a RexCall failed, like what I have described about a "CATCH_ERROR" function. I hope you could take a look at [a nother work|https://github.com/zhztheplayer/calcite/commits/525-3]. The work contains following: 1. A 'CATCH_ERROR(... (EMPTY | ERROR) ON ERROR)' function and implementation; 2. A 'ERROR_MODE' option for Calcite connection: if the 'ERROR_MODE' is 'THROW_ERROR', Calcite's converter will automatically wrap every rex call with a 'CATCH_ERROR' call. By having this feature, calcite adaptors can also implement 'CATCH_ERROR' in their own rex implementation to use 'ERROR_MODE' feature. The cost is to add a non-standard operator. Also, you can see an [issue|https://issues.apache.org/jira/browse/HIVE-5438] from HIVE. was (Author: zhztheplayer): Hi [~julianhyde] and [~vladimirsitnikov], Thanks for the concern about this issue, and I have read all your comments. I wonder if we could start from "catch error from a RexCall", because if we want to drop a row when error happens, it is not that easy to judge which row should be dropped and which row should be reserved in Calcite. I personally prefer to make Calcite return an empty or specified value when a RexCall failed, like what I have described about a "CATCH_ERROR" function. I hope you could take a look at [a nother work|https://github.com/zhztheplayer/calcite/commits/525-3]. The work contains following: 1. A 'CATCH_ERROR(... (EMPTY | ERROR) ON ERROR)' function and implementation; 2. A 'ERROR_MODE' option for Calcite connection: if the 'ERROR_MODE' is 'THROW_ERROR', Calcite's converter will automatically wrap every rex call with a 'CATCH_ERROR' call. By having this feature, calcite adaptors can also implement 'CATCH_ERROR' in their own rex implementation to use 'ERROR_MODE' feature. The cost is to add a non-standard operator. Also, you can see an issue from HIVE. > Exception-handling in built-in functions > > > Key: CALCITE-525 > URL: https://issues.apache.org/jira/browse/CALCITE-525 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Hongze Zhang >Priority: Major > > The standard calls for certain built-in functions to throw exceptions. > Examples: > * 1 / 0 > * MOD(1, 0) > * OVERLAY('foo' PLACING 'x' FROM -1) > * 'x' NOT LIKE 'x' ESCAPE 'x' > First, these exceptions should occur at run time. They should cause the > current value to become null, or the row to be omitted, but should not abort > the query. (Actual behavior TBD.) > Second, EnumerableCalc does constant reduction and generates code like > 'static final int X = 0 / 0'. This code blows up when the class is loaded. It > should not. The code should give errors for each row, as described above. > While fixing this bug, see SqlOperatorBaseTest.testArgumentBounds and remove > restrictions related to /, MOD and OVERLAY, LIKE. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-525) Exception-handling in built-in functions
[ https://issues.apache.org/jira/browse/CALCITE-525?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16651127#comment-16651127 ] Hongze Zhang commented on CALCITE-525: -- Hi [~julianhyde] and [~vladimirsitnikov], Thanks for the concern about this issue, and I have read all your comments. I wonder if we could start from "catch error from a RexCall", because if we want to drop a row when error happens, it is not that easy to judge which row should be dropped and which row should be reserved in Calcite. I personally prefer to make Calcite return an empty or specified value when a RexCall failed, like what I have described about a "CATCH_ERROR" function. I hope you could take a look at [a nother work|https://github.com/zhztheplayer/calcite/commits/525-3]. The work contains following: 1. A 'CATCH_ERROR(... (EMPTY | ERROR) ON ERROR)' function and implementation; 2. A 'ERROR_MODE' option for Calcite connection: if the 'ERROR_MODE' is 'THROW_ERROR', Calcite's converter will automatically wrap every rex call with a 'CATCH_ERROR' call. By having this feature, calcite adaptors can also implement 'CATCH_ERROR' in their own rex implementation to use 'ERROR_MODE' feature. The cost is to add a non-standard operator. Also, you can see an issue from HIVE. > Exception-handling in built-in functions > > > Key: CALCITE-525 > URL: https://issues.apache.org/jira/browse/CALCITE-525 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Hongze Zhang >Priority: Major > > The standard calls for certain built-in functions to throw exceptions. > Examples: > * 1 / 0 > * MOD(1, 0) > * OVERLAY('foo' PLACING 'x' FROM -1) > * 'x' NOT LIKE 'x' ESCAPE 'x' > First, these exceptions should occur at run time. They should cause the > current value to become null, or the row to be omitted, but should not abort > the query. (Actual behavior TBD.) > Second, EnumerableCalc does constant reduction and generates code like > 'static final int X = 0 / 0'. This code blows up when the class is loaded. It > should not. The code should give errors for each row, as described above. > While fixing this bug, see SqlOperatorBaseTest.testArgumentBounds and remove > restrictions related to /, MOD and OVERLAY, LIKE. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2280) Liberal "babel" parser that accepts all SQL dialects
[ https://issues.apache.org/jira/browse/CALCITE-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650969#comment-16650969 ] Julian Hyde commented on CALCITE-2280: -- We can definitely add HIVE as a conformance level. As for adding Hive QL features to Calcite's core SQL parser, it depends on the particular feature: * If a feature is in the SQL standard, then we should add it. * If a feature is generally useful, and occurs in several databases, then we should add it, enabled by a conformance flag. ({{ORDER BY ordinal}} is an example of such a feature.) * If a feature is peculiar to just one database, and does not offer a significant extra functionality, or makes the parser ambiguous or more complex for other dialects, then we probably will not add it to Calcite's dialect. Note that Hive QL is now called Hive SQL, because Hive is trying to move closer to standard SQL over time. I don't think we should be adding to Calcite support for features that the Hive team no longer recommends that people use in Hive. For such features, Babel is probably a better home. > Liberal "babel" parser that accepts all SQL dialects > > > Key: CALCITE-2280 > URL: https://issues.apache.org/jira/browse/CALCITE-2280 > Project: Calcite > Issue Type: Bug > Components: babel >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > Fix For: 1.17.0 > > > Create a parser that accepts all SQL dialects. > It would accept common dialects such as Oracle, MySQL, PostgreSQL, BigQuery. > If you have preferred dialects, please let us know in the comments section. > (If you're willing to work on a particular dialect, even better!) > We would do this in a new module, inheriting and extending the parser in the > same way that the DDL parser in the "server" module does. > This would be a messy and difficult project, because we would have to comply > with the rules of each parser (and its set of built-in functions) rather than > writing the rules as we would like them to be. That's why I would keep it out > of the core parser. But it would also have large benefits. > This would be new territory Calcite: as a tool for manipulating/understanding > SQL, not (necessarily) for relational algebra or execution. > Some possible uses: > * analyze query lineage (what tables and columns are used in a query); > * translate from one SQL dialect to another (using the JDBC adapter to > generate SQL in the target dialect); > * a "deep" compatibility mode (much more comprehensive than the current > compatibility mode) where Calcite could pretend to be, say, Oracle; > * SQL parser as a service: a REST call gives a SQL query, and returns a JSON > or XML document with the parse tree. > If you can think of interesting uses, please discuss in the comments. > There are similarities with Uber's > [QueryParser|https://eng.uber.com/queryparser/] tool. Maybe we can > collaborate, or make use of their test cases. > We will need a lot of sample queries. If you are able to contribute sample > queries for particular dialects, please discuss in the comments section. It > would be good if the sample queries are based on a familiar schema (e.g. > scott or foodmart) but we can be flexible about this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2280) Liberal "babel" parser that accepts all SQL dialects
[ https://issues.apache.org/jira/browse/CALCITE-2280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650899#comment-16650899 ] Xuefu Zhang commented on CALCITE-2280: -- Hi [~julianhyde] With my limited understanding of this change, I assume we have put a boiler place for extending Calcite grammar to accommodate any specialties in any SQL dialects. This is great! On the other hand, I'm curious about whether we can provide, or have a plan to do so, a conformance level for Hive QL as a whole, similar to MySQL, given the popularity of Hive QL in the big data world. Could you shed some light on this? Thanks. > Liberal "babel" parser that accepts all SQL dialects > > > Key: CALCITE-2280 > URL: https://issues.apache.org/jira/browse/CALCITE-2280 > Project: Calcite > Issue Type: Bug > Components: babel >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > Fix For: 1.17.0 > > > Create a parser that accepts all SQL dialects. > It would accept common dialects such as Oracle, MySQL, PostgreSQL, BigQuery. > If you have preferred dialects, please let us know in the comments section. > (If you're willing to work on a particular dialect, even better!) > We would do this in a new module, inheriting and extending the parser in the > same way that the DDL parser in the "server" module does. > This would be a messy and difficult project, because we would have to comply > with the rules of each parser (and its set of built-in functions) rather than > writing the rules as we would like them to be. That's why I would keep it out > of the core parser. But it would also have large benefits. > This would be new territory Calcite: as a tool for manipulating/understanding > SQL, not (necessarily) for relational algebra or execution. > Some possible uses: > * analyze query lineage (what tables and columns are used in a query); > * translate from one SQL dialect to another (using the JDBC adapter to > generate SQL in the target dialect); > * a "deep" compatibility mode (much more comprehensive than the current > compatibility mode) where Calcite could pretend to be, say, Oracle; > * SQL parser as a service: a REST call gives a SQL query, and returns a JSON > or XML document with the parse tree. > If you can think of interesting uses, please discuss in the comments. > There are similarities with Uber's > [QueryParser|https://eng.uber.com/queryparser/] tool. Maybe we can > collaborate, or make use of their test cases. > We will need a lot of sample queries. If you are able to contribute sample > queries for particular dialects, please discuss in the comments section. It > would be good if the sample queries are based on a familiar schema (e.g. > scott or foodmart) but we can be flexible about this. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-1289) RexUtil.simplifyCase() should account for nullability
[ https://issues.apache.org/jira/browse/CALCITE-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650449#comment-16650449 ] Julian Hyde commented on CALCITE-1289: -- Given e = "CASE WHEN condition THEN TRUE ELSE FALSE END" then it is valid to simplify "e" to "condition" with unknownAs=FALSE and it is also valid to simplify "e" to "condition IS TRUE". With unknownAs=UNKNOWN it is not valid to simplify "e" to "condition". > RexUtil.simplifyCase() should account for nullability > - > > Key: CALCITE-1289 > URL: https://issues.apache.org/jira/browse/CALCITE-1289 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: MinJi Kim >Assignee: Julian Hyde >Priority: Major > Fix For: 1.9.0 > > > RexUtil.simplify() returns incorrect type information if the RexNode is a > CASE(CONDITION, TRUE, FALSE). In this case, the CASE RexNode is a Boolean > with isNullable == false. But, with the simplify, RexUtil.simplify returns > CONDITION. When CONDITION has isNullable == true, the returned RexNode does > not match the row type and fails to transform (vice versa). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-1289) RexUtil.simplifyCase() should account for nullability
[ https://issues.apache.org/jira/browse/CALCITE-1289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650267#comment-16650267 ] Zoltan Haindrich commented on CALCITE-1289: --- I was looking into these casts around case expressions; and I think the original simplification was not entirely correct by rewriting that case to just {{CONDITION}} - I think the right rewrite would have been {{CONDITION IS TRUE}} ; which would be already 2 valued :) > RexUtil.simplifyCase() should account for nullability > - > > Key: CALCITE-1289 > URL: https://issues.apache.org/jira/browse/CALCITE-1289 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: MinJi Kim >Assignee: Julian Hyde >Priority: Major > Fix For: 1.9.0 > > > RexUtil.simplify() returns incorrect type information if the RexNode is a > CASE(CONDITION, TRUE, FALSE). In this case, the CASE RexNode is a Boolean > with isNullable == false. But, with the simplify, RexUtil.simplify returns > CONDITION. When CONDITION has isNullable == true, the returned RexNode does > not match the row type and fails to transform (vice versa). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (CALCITE-2619) Reduce string literal creation cost by removing charset check
[ https://issues.apache.org/jira/browse/CALCITE-2619?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16650255#comment-16650255 ] Daniel Lemire commented on CALCITE-2619: I'll do some evaluation of the problem (hopefully this week) and report back. > Reduce string literal creation cost by removing charset check > - > > Key: CALCITE-2619 > URL: https://issues.apache.org/jira/browse/CALCITE-2619 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Ted Xu >Assignee: Julian Hyde >Priority: Major > > The cost of creating NlsString is very high, due to its charset check. In > some cases, e.g., expression evaluate because of Partition Prune, the > NlsString creation costs 40%+ of total executor's overhead. > -- This message was sent by Atlassian JIRA (v7.6.3#76005)