[jira] [Created] (CALCITE-6229) Upgrade json-path version to 2.9.0
Will Noble created CALCITE-6229: --- Summary: Upgrade json-path version to 2.9.0 Key: CALCITE-6229 URL: https://issues.apache.org/jira/browse/CALCITE-6229 Project: Calcite Issue Type: Bug Reporter: Will Noble There is a CVE: https://osv.dev/vulnerability/GHSA-pfh2-hfmq-phg5 Probably not critical for most users since it would be strange for unsanitized user input to make it into the JSON parser through Calcite, but should be easy to patch. -- This message was sent by Atlassian Jira (v8.20.10#820010)
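For downstream projects that cannot wait for a patched Calcite release, the vulnerable transitive dependency can usually be pinned directly. A minimal Maven sketch (coordinates taken from the json-path project; adjust for your build tool):

```xml
<dependency>
  <groupId>com.jayway.jsonpath</groupId>
  <artifactId>json-path</artifactId>
  <version>2.9.0</version>
</dependency>
```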
[jira] [Commented] (CALCITE-5164) Planner#parser can't parse TIMESTAMP() function
[ https://issues.apache.org/jira/browse/CALCITE-5164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788914#comment-17788914 ] Will Noble commented on CALCITE-5164: - I believe this ticket is now obsolete since CALCITE-5508 is resolved. > Planner#parser can't parse TIMESTAMP() function > --- > > Key: CALCITE-5164 > URL: https://issues.apache.org/jira/browse/CALCITE-5164 > Project: Calcite > Issue Type: Bug > Components: babel >Affects Versions: 1.30.0 >Reporter: Jiajun Xie >Assignee: Jiajun Xie >Priority: Minor > > Both the core and babel parsers fail to parse it. > {code:java} > FrameworkConfig coreConfig = Frameworks.newConfigBuilder().build(); > Planner corePlanner = Frameworks.getPlanner(coreConfig); > corePlanner.parse("SELECT TIMESTAMP('2022-05-21 08:00:00')"); > // Caused by: org.apache.calcite.sql.parser.babel.ParseException: > Incorrect syntax near the keyword 'TIMESTAMP' at line 1, column 8. > FrameworkConfig babelConfig = Frameworks.newConfigBuilder() > .parserConfig(SqlParser.Config.DEFAULT.withParserFactory( > SqlBabelParserImpl.FACTORY)) > .build(); > Planner babelPlanner = Frameworks.getPlanner(babelConfig); > babelPlanner.parse("SELECT TIMESTAMP('2022-05-21 08:00:00')"); > // Caused by: org.apache.calcite.sql.parser.babel.ParseException: > Incorrect syntax near the keyword 'TIMESTAMP' at line 1, column 8 > {code} > Here are some databases that support the TIMESTAMP function. 
> - MySQL: > [https://dev.mysql.com/doc/refman/8.0/en/date-and-time-functions.html#function_timestamp] > {code:java} > select timestamp('2022-05-21 08:00:00') > // result > timestamp('2022-05-21 08:00:00') > 2022-05-21 08:00:00 > {code} > - Derby: [https://docs.oracle.com/javadb/10.6.2.1/ref/rreftimestampfunc.html] > Also, here are some databases that do not support the TIMESTAMP function: > - Oracle: > [https://docs.oracle.com/cd/B19306_01/server.102/b14225/ch4datetime.htm] > {code:java} > select timestamp('2022-05-21 08:00:00') > // ORA-00923: FROM keyword not found where expected > {code} > - SQL Server: > [https://docs.microsoft.com/en-us/sql/t-sql/functions/date-and-time-data-types-and-functions-transact-sql?view=sql-server-ver15] > {code:java} > select timestamp('2022-05-21 08:00:00') > // Msg 195 Level 15 State 10 Line 1 > // 'timestamp' is not a recognized built-in function name.{code} > Is it necessary for us to support it in the babel module? -- This message was sent by Atlassian Jira (v8.20.10#820010)
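For readers unfamiliar with the one-argument form: MySQL's TIMESTAMP(expr) simply interprets the string as clock-calendar parameters with no time zone. A rough java.time analogue (hypothetical helper, not Calcite code):

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class TimestampFunctionDemo {
    // Hypothetical analogue of MySQL's single-argument TIMESTAMP(expr):
    // parse a date-time string into zone-less clock-calendar parameters.
    static LocalDateTime timestamp(String s) {
        DateTimeFormatter fmt = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        return LocalDateTime.parse(s, fmt);
    }

    public static void main(String[] args) {
        System.out.println(timestamp("2022-05-21 08:00:00")); // 2022-05-21T08:00
    }
}
```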
[jira] [Updated] (CALCITE-6117) Converting SAFE_CAST from RexCall to SqlCall fails to add the type as an argument
[ https://issues.apache.org/jira/browse/CALCITE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-6117: Summary: Converting SAFE_CAST from RexCall to SqlCall fails to add the type as an argument (was: Unparsing SAFE_CAST is broken) > Converting SAFE_CAST from RexCall to SqlCall fails to add the type as an > argument > - > > Key: CALCITE-6117 > URL: https://issues.apache.org/jira/browse/CALCITE-6117 > Project: Calcite > Issue Type: Bug >Reporter: Will Noble >Priority: Trivial > Labels: pull-request-available > > BigQuery's {{SAFE_CAST}} function operates much like {{CAST}}. Currently, the > unparsing logic is broken because it does not use the [special logic for > {{CAST}}|https://github.com/apache/calcite/blob/94dc1673da824174f9271677ead73cfae1aeb29b/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L816] > in {{SqlImplementor.callToSql}} that turns the rex call's return type into > the sql call's second argument. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (CALCITE-6117) Unparsing SAFE_CAST is broken
[ https://issues.apache.org/jira/browse/CALCITE-6117?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17785720#comment-17785720 ] Will Noble commented on CALCITE-6117: - It will trigger [this assertion|https://github.com/apache/calcite/blob/3843ede3d28783235780c0f81f725dc7e64a7828/core/src/main/java/org/apache/calcite/sql/fun/SqlCastFunction.java#L254] because the sql call only has 1 argument. > Unparsing SAFE_CAST is broken > - > > Key: CALCITE-6117 > URL: https://issues.apache.org/jira/browse/CALCITE-6117 > Project: Calcite > Issue Type: Bug >Reporter: Will Noble >Priority: Trivial > Labels: pull-request-available > > BigQuery's {{SAFE_CAST}} function operates much like {{CAST}}. Currently, the > unparsing logic is broken because it does not use the [special logic for > {{CAST}}|https://github.com/apache/calcite/blob/94dc1673da824174f9271677ead73cfae1aeb29b/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L816] > in {{SqlImplementor.callToSql}} that turns the rex call's return type into > the sql call's second argument. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6117) Unparsing SAFE_CAST is broken
Will Noble created CALCITE-6117: --- Summary: Unparsing SAFE_CAST is broken Key: CALCITE-6117 URL: https://issues.apache.org/jira/browse/CALCITE-6117 Project: Calcite Issue Type: Bug Reporter: Will Noble BigQuery's {{SAFE_CAST}} function operates much like {{CAST}}. Currently, the unparsing logic is broken because it does not use the [special logic for {{CAST}}|https://github.com/apache/calcite/blob/94dc1673da824174f9271677ead73cfae1aeb29b/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L816] in {{SqlImplementor.callToSql}} that turns the rex call's return type into the sql call's second argument. -- This message was sent by Atlassian Jira (v8.20.10#820010)
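The missing special case can be illustrated with a toy model (hypothetical classes, not Calcite's actual SqlImplementor API): a cast-like rex call carries its target type only as a return type, so conversion to a sql call must materialize that type as an explicit second operand.

```java
import java.util.List;

public class SafeCastDemo {
    // Toy stand-ins for RexCall/SqlCall; real Calcite types are richer.
    record RexCall(String op, String operand, String returnType) {}
    record SqlCall(String op, List<String> operands) {}

    // Hypothetical analogue of the CAST special case in callToSql:
    // append the rex call's return type as the sql call's second operand.
    static SqlCall toSqlCall(RexCall call) {
        boolean castLike = call.op().equals("CAST") || call.op().equals("SAFE_CAST");
        return castLike
            ? new SqlCall(call.op(), List.of(call.operand(), call.returnType()))
            : new SqlCall(call.op(), List.of(call.operand()));
    }

    public static void main(String[] args) {
        SqlCall c = toSqlCall(new RexCall("SAFE_CAST", "x", "BIGINT"));
        // Two operands, so unparsing can emit SAFE_CAST(x AS BIGINT).
        System.out.println(c.op() + "(" + String.join(" AS ", c.operands()) + ")");
    }
}
```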
[jira] [Updated] (CALCITE-6062) Allow conversion of un-zoned timestamp strings to timestamp with local time zone
[ https://issues.apache.org/jira/browse/CALCITE-6062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-6062: Description: [This function|https://github.com/apache/calcite/blob/ce88348960e95e1c13da850b4422b2f49f022d93/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java#L4333] that implements conversion from strings to {{TIMESTAMP WITH LOCAL TIME ZONE}} uses {{TimestampWithTimeZoneString}}, which seems to have been a subtle mistake. That function is only used to implement {{CAST}} expressions from strings to TS w/ LTZ. Those expressions should allow for strings that do not have a time zone, e.g. {{CAST('2023-10-19 12:00:00' AS TIMESTAMP WITH LOCAL TIME ZONE)}}, but they currently cannot because the {{TimestampWithTimeZoneString}} class always expects a string with a time zone on the end. In the case that no zone is included in the string, we should assume that the string expresses a timestamp in the data context time zone by default. was: [This function|https://github.com/apache/calcite/blob/ce88348960e95e1c13da850b4422b2f49f022d93/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java#L4333] that implements conversion from strings to {{TIMESTAMP WITH LOCAL TIME ZONE}} uses {{TimestampWithTimeZoneString}}, which seems to have been a subtle mistake. That function is only used to implement {{CAST}} expressions from strings to TS w/ LTZ. Those expressions should allow for strings that do not have a time zone, e.g. {{CAST('2023-10-19 12:00:00' AS TIMESTAMP WITH LOCAL TIME ZONE)}}, but they currently cannot because the {{TimestampWithTimeZoneString}} class always expects a time zone string. In the case that no zone is included in the string, we should assume that the string expresses a timestamp in the data context time zone by default. 
> Allow conversion of un-zoned timestamp strings to timestamp with local time > zone > > > Key: CALCITE-6062 > URL: https://issues.apache.org/jira/browse/CALCITE-6062 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Will Noble >Priority: Minor > > [This > function|https://github.com/apache/calcite/blob/ce88348960e95e1c13da850b4422b2f49f022d93/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java#L4333] > that implements conversion from strings to {{TIMESTAMP WITH LOCAL TIME > ZONE}} uses {{TimestampWithTimeZoneString}}, which seems to have been a > subtle mistake. > That function is only used to implement {{CAST}} expressions from strings to > TS w/ LTZ. Those expressions should allow for strings that do not have a time > zone, e.g. {{CAST('2023-10-19 12:00:00' AS TIMESTAMP WITH LOCAL TIME ZONE)}}, > but they currently cannot because the {{TimestampWithTimeZoneString}} class > always expects a string with a time zone on the end. > In the case that no zone is included in the string, we should assume that the > string expresses a timestamp in the data context time zone by default. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-6062) Allow conversion of un-zoned timestamp strings to timestamp with local time zone
Will Noble created CALCITE-6062: --- Summary: Allow conversion of un-zoned timestamp strings to timestamp with local time zone Key: CALCITE-6062 URL: https://issues.apache.org/jira/browse/CALCITE-6062 Project: Calcite Issue Type: Bug Components: core Reporter: Will Noble [This function|https://github.com/apache/calcite/blob/ce88348960e95e1c13da850b4422b2f49f022d93/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java#L4333] that implements conversion from strings to {{TIMESTAMP WITH LOCAL TIME ZONE}} uses {{TimestampWithTimeZoneString}}, which seems to have been a subtle mistake. That function is only used to implement {{CAST}} expressions from strings to TS w/ LTZ. Those expressions should allow for strings that do not have a time zone, e.g. {{CAST('2023-10-19 12:00:00' AS TIMESTAMP WITH LOCAL TIME ZONE)}}, but they currently cannot because the {{TimestampWithTimeZoneString}} class always expects a time zone string. In the case that no zone is included in the string, we should assume that the string expresses a timestamp in the data context time zone by default. -- This message was sent by Atlassian Jira (v8.20.10#820010)
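The proposed fallback behavior can be sketched with java.time (illustrative only; Calcite's actual implementation goes through TimestampWithTimeZoneString and the data context, and the formatter patterns below are assumptions):

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class UnzonedCastDemo {
    // Sketch of CAST(string AS TIMESTAMP WITH LOCAL TIME ZONE):
    // if the string carries a zone, honor it; otherwise assume the
    // data context (session) time zone instead of rejecting the input.
    static Instant castToTimestampLtz(String s, ZoneId sessionZone) {
        DateTimeFormatter zoned = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss VV");
        DateTimeFormatter unzoned = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
        try {
            return ZonedDateTime.parse(s, zoned).toInstant();
        } catch (DateTimeParseException e) {
            // No zone on the end: fall back to the session zone.
            return LocalDateTime.parse(s, unzoned).atZone(sessionZone).toInstant();
        }
    }

    public static void main(String[] args) {
        System.out.println(castToTimestampLtz("2023-10-19 12:00:00", ZoneId.of("UTC")));
        // 2023-10-19T12:00:00Z
    }
}
```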
[jira] [Commented] (CALCITE-6055) Customize handling of DateTimeTypeName based on type system
[ https://issues.apache.org/jira/browse/CALCITE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1438#comment-1438 ] Will Noble commented on CALCITE-6055: - My bad. Added more context to the description. Let me know if anything is still unclear or if you think the title could still use tweaking. > Customize handling of DateTimeTypeName based on type system > --- > > Key: CALCITE-6055 > URL: https://issues.apache.org/jira/browse/CALCITE-6055 > Project: Calcite > Issue Type: Bug >Reporter: Will Noble >Priority: Minor > Labels: pull-request-available > > *Background*: > In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour, > minute, and second, henceforth referred to as clock-calendar parameters. This > does not define an unambiguous instant in time, since timestamps in different > time zones can have different parameters at the same instant. It is analogous > to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}} > type a different definition: “an absolute point in time, independent of any > time zone or convention such as Daylight Savings Time.” > ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}} > simply attaches a time zone to the existing {{TIMESTAMP}} type, making it > _both_ a set of clock-calendar parameters _and_ an unambiguous instant in > time. The final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle. > According to Oracle, it “is normalized to the database time zone, and the > time zone offset is not stored as part of the column data. When users > retrieve the data, Oracle returns it in the users' local session time zone.” > This act of conversion between the local time zone and the database time zone > means that users interacting with this kind of timestamp will observe the > same instant in time even though they see different clock-calendar > parameters. However, the local time zone is not considered “part of the > data”. 
Thus, it has the semantics of an un-zoned, absolute point in time, > analogous to a BigQuery {{TIMESTAMP}}. > This creates the confusing situation where a BigQuery {{DATETIME}} is > actually an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an > ISO {{TIMESTAMP WITH LOCAL TIME ZONE}}. > *Problem*: > This is very similar to CALCITE-5424, which only applied to literals (e.g. > {{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the > {{DateTimeTypeName}} > construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019] > which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS > TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a > {{DATETIME}} type name, and use {{SqlUserDefinedTypeNameSpec}} for both that > and the {{TIMESTAMP}} type names. During validation and sql-to-rel > conversion, we can look up the type names in the UDT map. If no mapping is > found for {{DATETIME}} (which would occur for any dialect besides BigQuery), > an error would occur during validation, because that type does not exist in > that dialect. If no mapping is found for {{TIMESTAMP}}, we would fall back on > the canonical {{TIMESTAMP}} type, preserving the existing behavior for all > other dialects. > *In summary*: > {{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}} should produce a {{TIMESTAMP > WITH LOCAL TIME ZONE}} for BigQuery, but a regular {{TIMESTAMP}} for all > other dialects. > {{CAST('2023-10-17 12:00:00' AS DATETIME)}} should produce a {{TIMESTAMP}} > for BigQuery, but throw a validation error for all other dialects. -- This message was sent by Atlassian Jira (v8.20.10#820010)
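The distinction described above maps neatly onto java.time, which may help readers: LocalDateTime corresponds to an ISO {{TIMESTAMP}} / BigQuery {{DATETIME}} (clock-calendar parameters), while Instant corresponds to an ISO {{TIMESTAMP WITH LOCAL TIME ZONE}} / BigQuery {{TIMESTAMP}} (an absolute point in time).

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class TimestampSemanticsDemo {
    public static void main(String[] args) {
        // One absolute point in time (un-zoned instant)...
        Instant instant = Instant.parse("2023-10-17T12:00:00Z");
        // ...yields different clock-calendar parameters per zone.
        LocalDateTime inUtc = LocalDateTime.ofInstant(instant, ZoneId.of("UTC"));
        LocalDateTime inTokyo = LocalDateTime.ofInstant(instant, ZoneId.of("Asia/Tokyo"));
        System.out.println(inUtc);   // 2023-10-17T12:00
        System.out.println(inTokyo); // 2023-10-17T21:00
    }
}
```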
[jira] [Updated] (CALCITE-6055) Customize handling of DateTimeTypeName based on type system
[ https://issues.apache.org/jira/browse/CALCITE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-6055: Summary: Customize handling of DateTimeTypeName based on type system (was: Customize handling of type name based on type system) > Customize handling of DateTimeTypeName based on type system > --- > > Key: CALCITE-6055 > URL: https://issues.apache.org/jira/browse/CALCITE-6055 > Project: Calcite > Issue Type: Bug >Reporter: Will Noble >Priority: Minor > Labels: pull-request-available > > *Background*: > In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour, > minute, and second, henceforth referred to as clock-calendar parameters. This > does not define an unambiguous instant in time, since timestamps in different > time zones can have different parameters at the same instant. It is analogous > to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}} > type a different definition: “an absolute point in time, independent of any > time zone or convention such as Daylight Savings Time.” > ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}} > simply attaches a time zone to the existing {{TIMESTAMP}} type, making it > _both_ a set of clock-calendar parameters _and_ an unambiguous instant in > time. The final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle. > According to Oracle, it “is normalized to the database time zone, and the > time zone offset is not stored as part of the column data. When users > retrieve the data, Oracle returns it in the users' local session time zone.” > This act of conversion between the local time zone and the database time zone > means that users interacting with this kind of timestamp will observe the > same instant in time even though they see different clock-calendar > parameters. However, the local time zone is not considered “part of the > data”. 
Thus, it has the semantics of an un-zoned, absolute point in time, > analogous to a BigQuery {{TIMESTAMP}}. > This creates the confusing situation where a BigQuery {{DATETIME}} is > actually an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an > ISO {{TIMESTAMP WITH LOCAL TIME ZONE}}. > *Problem*: > This is very similar to CALCITE-5424, which only applied to literals (e.g. > {{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the > {{DateTimeTypeName}} > construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019] > which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS > TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a > {{DATETIME}} type name, and use {{SqlUserDefinedTypeNameSpec}} for both that > and the {{TIMESTAMP}} type names. During validation and sql-to-rel > conversion, we can look up the type names in the UDT map. If no mapping is > found for {{DATETIME}} (which would occur for any dialect besides BigQuery), > an error would occur during validation, because that type does not exist in > that dialect. If no mapping is found for {{TIMESTAMP}}, we would fall back on > the canonical {{TIMESTAMP}} type, preserving the existing behavior for all > other dialects. > *In summary*: > {{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}} should produce a {{TIMESTAMP > WITH LOCAL TIME ZONE}} for BigQuery, but a regular {{TIMESTAMP}} for all > other dialects. > {{CAST('2023-10-17 12:00:00' AS DATETIME)}} should produce a {{TIMESTAMP}} > for BigQuery, but throw a validation error for all other dialects. -- This message was sent by Atlassian Jira (v8.20.10#820010)
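The proposed UDT-map lookup with a canonical fallback for {{TIMESTAMP}} can be sketched as follows (toy code with plain strings, not Calcite's SqlUserDefinedTypeNameSpec machinery):

```java
import java.util.Map;

public class TypeNameLookupDemo {
    // Resolve a parsed type name against the dialect's UDT map:
    // DATETIME must be mapped or validation fails; an unmapped
    // TIMESTAMP falls back to the canonical TIMESTAMP type.
    static String resolve(String typeName, Map<String, String> udtMap) {
        String mapped = udtMap.get(typeName);
        if (mapped != null) return mapped;
        if (typeName.equals("TIMESTAMP")) return "TIMESTAMP"; // canonical fallback
        throw new IllegalArgumentException("Unknown type: " + typeName);
    }

    public static void main(String[] args) {
        Map<String, String> bigQuery = Map.of(
            "DATETIME", "TIMESTAMP",
            "TIMESTAMP", "TIMESTAMP WITH LOCAL TIME ZONE");
        System.out.println(resolve("TIMESTAMP", bigQuery));
        System.out.println(resolve("TIMESTAMP", Map.of())); // other dialects: unchanged
    }
}
```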
[jira] [Updated] (CALCITE-6055) Customize handling of type name based on type system
[ https://issues.apache.org/jira/browse/CALCITE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-6055: Description: *Background*: In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour, minute, and second, henceforth referred to as clock-calendar parameters. This does not define an unambiguous instant in time, since timestamps in different time zones can have different parameters at the same instant. It is analogous to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}} type a different definition: “an absolute point in time, independent of any time zone or convention such as Daylight Savings Time.” ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}} simply attaches a time zone to the existing {{TIMESTAMP}} type, making it _both_ a set of clock-calendar parameters _and_ an unambiguous instant in time. The final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle. According to Oracle, it “is normalized to the database time zone, and the time zone offset is not stored as part of the column data. When users retrieve the data, Oracle returns it in the users' local session time zone.” This act of conversion between the local time zone and the database time zone means that users interacting with this kind of timestamp will observe the same instant in time even though they see different clock-calendar parameters. However, the local time zone is not considered “part of the data”. Thus, it has the semantics of an un-zoned, absolute point in time, analogous to a BigQuery {{TIMESTAMP}}. This creates the confusing situation where a BigQuery {{DATETIME}} is actually an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an ISO {{TIMESTAMP WITH LOCAL TIME ZONE}}. *Problem*: This is very similar to CALCITE-5424, which only applied to literals (e.g. {{TIMESTAMP '2023-10-17 12:00:00'}}). 
We need a similar solution for [the {{DateTimeTypeName}} construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019] which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a {{DATETIME}} type name, and use {{SqlUserDefinedTypeNameSpec}} for both that and the {{TIMESTAMP}} type names. During validation and sql-to-rel conversion, we can look up the type names in the UDT map. If no mapping is found for {{DATETIME}} (which would occur for any dialect besides BigQuery), an error would occur during validation, because that type does not exist in that dialect. If no mapping is found for {{TIMESTAMP}}, we would fall back on the canonical {{TIMESTAMP}} type, preserving the existing behavior for all other dialects. *In summary*: {{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}} should produce a {{TIMESTAMP WITH LOCAL TIME ZONE}} for BigQuery, but a regular {{TIMESTAMP}} for all other dialects. {{CAST('2023-10-17 12:00:00' AS DATETIME)}} should produce a {{TIMESTAMP}} for BigQuery, but throw a validation error for all other dialects. was: *Background*: In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour, minute, and second, henceforth referred to as clock-calendar parameters. This does not define an unambiguous instant in time, since timestamps in different time zones can have different parameters at the same instant. It is analogous to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}} type a different definition: “an absolute point in time, independent of any time zone or convention such as Daylight Savings Time.” ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}} simply attaches a time zone to the existing {{TIMESTAMP}} type, making it both a set of clock-calendar parameters _and_ an unambiguous instant in time. 
The final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle. According to Oracle, it “is normalized to the database time zone, and the time zone offset is not stored as part of the column data. When users retrieve the data, Oracle returns it in the users' local session time zone.” This act of conversion between the local time zone and the database time zone means that users interacting with this kind of timestamp will observe the same instant in time even though they see different clock-calendar parameters. However, the local time zone is not considered “part of the data”. Thus, it has the semantics of an un-zoned, absolute point in time, analogous to a BigQuery {{TIMESTAMP}}. This creates the confusing situation where a BigQuery {{DATETIME}} is actually an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an ISO {{TIMESTAMP WITH LOCAL TIME ZONE}}. *Problem*: This is very similar to CALCITE-5424, which only applied to literals (e.g.
[jira] [Updated] (CALCITE-6055) Customize handling of type name based on type system
[ https://issues.apache.org/jira/browse/CALCITE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-6055: Description: *Background*: In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour, minute, and second, henceforth referred to as clock-calendar parameters. This does not define an unambiguous instant in time, since timestamps in different time zones can have different parameters at the same instant. It is analogous to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}} type a different definition: “an absolute point in time, independent of any time zone or convention such as Daylight Savings Time.” ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}} simply attaches a time zone to the existing {{TIMESTAMP}} type, making it both a set of clock-calendar parameters _and_ an unambiguous instant in time. The final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle. According to Oracle, it “is normalized to the database time zone, and the time zone offset is not stored as part of the column data. When users retrieve the data, Oracle returns it in the users' local session time zone.” This act of conversion between the local time zone and the database time zone means that users interacting with this kind of timestamp will observe the same instant in time even though they see different clock-calendar parameters. However, the local time zone is not considered “part of the data”. Thus, it has the semantics of an un-zoned, absolute point in time, analogous to a BigQuery {{TIMESTAMP}}. This creates the confusing situation where a BigQuery {{DATETIME}} is actually an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an ISO {{TIMESTAMP WITH LOCAL TIME ZONE}}. *Problem*: This is very similar to CALCITE-5424, which only applied to literals (e.g. {{TIMESTAMP '2023-10-17 12:00:00'}}). 
We need a similar solution for [the {{DateTimeTypeName}} construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019] which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a {{DATETIME}} type name, and use {{SqlUserDefinedTypeNameSpec}} for both that and the {{TIMESTAMP}} type names. During validation and sql-to-rel conversion, we can look up the type names in the UDT map. If no mapping is found for {{DATETIME}} (which would occur for any dialect besides BigQuery), an error would occur during validation, because that type does not exist in that dialect. If no mapping is found for {{TIMESTAMP}}, we would fall back on the canonical {{TIMESTAMP}} type, preserving the existing behavior for all other dialects. *In summary*: {{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}} should produce a {{TIMESTAMP WITH LOCAL TIME ZONE}} for BigQuery, but a regular {{TIMESTAMP}} for all other dialects. {{CAST('2023-10-17 12:00:00' AS DATETIME)}} should produce a {{TIMESTAMP}} for BigQuery, but throw a validation error for all other dialects. was: Background: In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour, minute, and second, henceforth referred to as clock-calendar parameters. This does not define an unambiguous instant in time, since timestamps in different time zones can have different parameters at the same instant. It is analogous to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}} type a different definition: “an absolute point in time, independent of any time zone or convention such as Daylight Savings Time.” ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}} simply attaches a time zone to the existing {{TIMESTAMP}} type, making it both a set of clock-calendar parameters and an unambiguous instant in time. 
The final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle. According to Oracle, it “is normalized to the database time zone, and the time zone offset is not stored as part of the column data. When users retrieve the data, Oracle returns it in the users' local session time zone.” This act of conversion between the local time zone and the database time zone means that users interacting with this kind of timestamp will observe the same instant in time even though they see different clock-calendar parameters. However, the local time zone is not considered “part of the data”. Thus, it has the semantics of an un-zoned, absolute point in time, analogous to a BigQuery {{TIMESTAMP}}. This creates the confusing situation where a BigQuery {{DATETIME}} is actually an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an ISO {{TIMESTAMP WITH LOCAL TIME ZONE}}. Problem: This is very similar to CALCITE-5424, which only applied to literals (e.g. {{TIMESTAMP
[jira] [Updated] (CALCITE-6055) Customize handling of type name based on type system
[ https://issues.apache.org/jira/browse/CALCITE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-6055: Description: Background: In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour, minute, and second, henceforth referred to as clock-calendar parameters. This does not define an unambiguous instant in time, since timestamps in different time zones can have different parameters at the same instant. It is analogous to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}} type a different definition: “an absolute point in time, independent of any time zone or convention such as Daylight Savings Time.” ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}} simply attaches a time zone to the existing {{TIMESTAMP}} type, making it both a set of clock-calendar parameters and an unambiguous instant in time. The final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle. According to Oracle, it “is normalized to the database time zone, and the time zone offset is not stored as part of the column data. When users retrieve the data, Oracle returns it in the users' local session time zone.” This act of conversion between the local time zone and the database time zone means that users interacting with this kind of timestamp will observe the same instant in time even though they see different clock-calendar parameters. However, the local time zone is not considered “part of the data”. Thus, it has the semantics of an un-zoned, absolute point in time, analogous to a BigQuery {{TIMESTAMP}}. This creates the confusing situation where a BigQuery {{DATETIME}} is actually an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an ISO {{TIMESTAMP WITH LOCAL TIME ZONE}}. Problem: This is very similar to CALCITE-5424, which only applied to literals (e.g. {{TIMESTAMP '2023-10-17 12:00:00'}}). 
We need a similar solution for [the {{DateTimeTypeName}} construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019], which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}}). It seems we can use a very similar approach: introduce a {{DATETIME}} type name, and use {{SqlUserDefinedTypeNameSpec}} for both that and the {{TIMESTAMP}} type names. During validation and sql-to-rel conversion, we can look up the type names in the UDT map. If no mapping is found for {{DATETIME}} (which would occur for any dialect besides BigQuery), an error would occur during validation, because that type does not exist in that dialect. If no mapping is found for {{TIMESTAMP}}, we would fall back on the canonical {{TIMESTAMP}} type, preserving the existing behavior for all other dialects.

In summary: {{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}} should produce a {{TIMESTAMP WITH LOCAL TIME ZONE}} for BigQuery, but a regular {{TIMESTAMP}} for all other dialects. {{CAST('2023-10-17 12:00:00' AS DATETIME)}} should produce a {{TIMESTAMP}} for BigQuery, but throw a validation error for all other dialects.
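The distinction above between clock-calendar parameters and an absolute instant can be illustrated with {{java.time}} (purely an analogy for the SQL types, not Calcite code): {{LocalDateTime}} plays the role of an ISO {{TIMESTAMP}} / BigQuery {{DATETIME}}, while {{Instant}} plays the role of a BigQuery {{TIMESTAMP}} / ISO {{TIMESTAMP WITH LOCAL TIME ZONE}}.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class TimestampSemantics {
  public static void main(String[] args) {
    // An absolute point in time: no clock-calendar fields of its own
    // (analogous to BigQuery TIMESTAMP / ISO TIMESTAMP WITH LOCAL TIME ZONE).
    Instant instant = Instant.parse("2023-10-17T19:00:00Z");

    // Projecting the same instant into two zones yields different
    // clock-calendar parameters (analogous to ISO TIMESTAMP / BigQuery DATETIME).
    LocalDateTime inUtc = LocalDateTime.ofInstant(instant, ZoneId.of("UTC"));
    LocalDateTime inLa =
        LocalDateTime.ofInstant(instant, ZoneId.of("America/Los_Angeles"));

    System.out.println(inUtc); // 2023-10-17T19:00
    System.out.println(inLa);  // 2023-10-17T12:00 (PDT is UTC-7 in October)
  }
}
```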
[jira] [Updated] (CALCITE-6055) Customize handling of type name based on type system
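A minimal sketch of the type-name resolution proposed for CALCITE-6055, using a plain {{Map}} to stand in for the dialect's UDT map. The class, the {{resolve}} helper, and the string type names are hypothetical illustrations, not Calcite's actual API.

```java
import java.util.Map;

public class TypeNameResolutionSketch {
  // Hypothetical UDT map for the BigQuery dialect: parsed type name
  // -> canonical type name.
  static final Map<String, String> BIGQUERY_UDTS = Map.of(
      "DATETIME", "TIMESTAMP",
      "TIMESTAMP", "TIMESTAMP_WITH_LOCAL_TIME_ZONE");

  /** Resolves a parsed type name against a dialect's UDT map. */
  static String resolve(Map<String, String> udtMap, String typeName) {
    String mapped = udtMap.get(typeName);
    if (mapped != null) {
      return mapped;
    }
    if (typeName.equals("DATETIME")) {
      // No mapping: DATETIME is not a canonical type, so validation fails
      // for any dialect besides BigQuery.
      throw new IllegalArgumentException("Unknown type: DATETIME");
    }
    // TIMESTAMP (and other canonical names) fall back on themselves,
    // preserving existing behavior for all other dialects.
    return typeName;
  }

  public static void main(String[] args) {
    System.out.println(resolve(BIGQUERY_UDTS, "TIMESTAMP")); // TIMESTAMP_WITH_LOCAL_TIME_ZONE
    System.out.println(resolve(Map.of(), "TIMESTAMP"));      // TIMESTAMP
  }
}
```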
[jira] [Commented] (CALCITE-5526) Handle unparsing of literals based on type system
[ https://issues.apache.org/jira/browse/CALCITE-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776924#comment-17776924 ] Will Noble commented on CALCITE-5526:

Let me know if you think the two basic tests in [this draft PR|https://github.com/apache/calcite/pull/3476] fall short of illuminating the issue.

The {{UdtTest}} case passes and shows the happy path of scenario #2 above. The {{SqlUserDefinedTypeNameSpec}} nodes are converted to normal {{RelDataType}} objects during sql-to-rel conversion. Relational expressions are dialect-agnostic by nature, so losing the context that those types were originally UDTs is fine, even desirable.

{{SqlParserTest}} is the base class for a few tests such as {{CoreSqlParserTest}} and {{ServerParserTest}}. These tests involve parsing and then unparsing SQL without converting to rel nodes, so they exemplify the complexity of scenario #1. With the draft implementation as currently written, {{CoreSqlParserTest}} passes, but {{ServerParserTest}} does not, because the former uses backtick identifier quoting and the latter uses double quotes. Since these tests function by simply parsing and then unparsing, perhaps they do not represent a realistic use case for Calcite, and could be revisited as a testing strategy. Otherwise, the implementation could be tweaked in various ways as described in that PR.

> Handle unparsing of literals based on type system
> Key: CALCITE-5526
> URL: https://issues.apache.org/jira/browse/CALCITE-5526
> Project: Calcite
> Issue Type: Task
> Reporter: Will Noble
> Assignee: Will Noble
> Priority: Minor
>
> CALCITE-5424 dealt with parsing date/time literals via a custom type system; however, there is still no way to unparse them via the same type system. This is handled in [SqlDialect.unparseDateTimeLiteral|https://github.com/apache/calcite/blob/a0e119ea42def418957f214f539469f1aba76c18/core/src/main/java/org/apache/calcite/sql/SqlDialect.java#L468], which simply calls the literal object's {{toString()}} method.
> The literal object (a subclass of {{SqlAbstractDateTimeLiteral}}) should not be able to determine its own unparsed representation, since it could be unparsed in any dialect. Since the current system for resolving custom types relies on access to the catalog, it appears we'll need to introduce to {{SqlDialect}} a dependency on {{CalciteCatalogReader}} so it can use the same mappings for unparsing as it does for parsing.

-- This message was sent by Atlassian Jira (v8.20.10#820010)
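The quoted issue notes that {{SqlDialect.unparseDateTimeLiteral}} simply calls {{toString()}}, which yields one fixed rendering regardless of dialect. A toy sketch of what a dialect-aware rendering could look like; the dialect names, the mapping, and the {{unparse}} helper are hypothetical, not Calcite's API.

```java
import java.util.Map;

public class LiteralUnparseSketch {
  // Hypothetical per-dialect keyword for an un-zoned date/time literal.
  static final Map<String, String> DATETIME_KEYWORD = Map.of(
      "CALCITE", "TIMESTAMP",   // canonical rendering
      "BIGQUERY", "DATETIME");  // BigQuery spells the same semantics DATETIME

  static String unparse(String dialect, String value) {
    // Fall back on the canonical keyword for dialects with no mapping,
    // which matches what toString() produces today.
    String keyword = DATETIME_KEYWORD.getOrDefault(dialect, "TIMESTAMP");
    return keyword + " '" + value + "'";
  }

  public static void main(String[] args) {
    System.out.println(unparse("BIGQUERY", "2023-10-17 12:00:00"));   // DATETIME '2023-10-17 12:00:00'
    System.out.println(unparse("POSTGRESQL", "2023-10-17 12:00:00")); // TIMESTAMP '2023-10-17 12:00:00'
  }
}
```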
[jira] [Comment Edited] (CALCITE-5526) Handle unparsing of literals based on type system
[ https://issues.apache.org/jira/browse/CALCITE-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776449#comment-17776449 ] Will Noble edited comment on CALCITE-5526 at 10/18/23 1:50 AM:

There are actually two different scenarios:

1. The program parses a query using a UDT map, and ends up with {{SqlUnknownLiteral}} and {{SqlUserDefinedTypeNameSpec}} nodes in the AST. The program never bothers converting that AST into a relational expression, and simply unparses it in a specific dialect. I'm not sure how relevant this scenario is to anybody, or if it even makes a ton of sense; it's not particularly relevant to us. It would also be the annoying scenario to support: we would have to consult the type mapping for these {{SqlUnknownLiteral}} and {{SqlUserDefinedTypeNameSpec}} nodes during unparsing.

2. The program parses the query, converts to {{RelNode}} form, does whatever, then converts back to {{SqlNode}} form, and *then* unparses in a specific dialect. This is the scenario we really care about. In this scenario, all those "unknown" and "user-defined" nodes have disappeared, replaced by their canonical types. This would be easy to support: we just hardcode the mapping in {{SqlDialect.unparseDateTimeLiteral}} and {{SqlDialect.getCastSpec}} for each particular dialect.

I'm inclined to move forward on scenario #2. They seem to be fairly independent issues: the former only deals with those unknown/user-defined SQL nodes, and the latter only deals with "canonical" nodes like {{SqlBasicTypeNameSpec}}.

The reason I question whether scenario #1 even makes sense is that it implies programs will always want to unparse a query in the same context (schema and possibly dialect) in which it was parsed. This may not be true, and if it's not, it becomes complicated for the program to get it right every time. In scenario #2, all that context has already been lost anyway.
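For scenario #2, the hardcoded mapping runs in the opposite direction from parsing: from a canonical type name back to the dialect's spelling when emitting a {{CAST}}. A hypothetical sketch of that idea; the {{castSpec}} helper and string names are illustrative only, while the real change would live in each dialect's {{SqlDialect.getCastSpec}} override.

```java
public class CastSpecSketch {
  /**
   * Maps a canonical type name to the dialect-specific spelling used when
   * unparsing a CAST. Hypothetical, hardcoded per dialect.
   */
  static String castSpec(String dialect, String canonicalType) {
    if (dialect.equals("BIGQUERY")) {
      switch (canonicalType) {
        case "TIMESTAMP":
          return "DATETIME"; // un-zoned clock-calendar type
        case "TIMESTAMP_WITH_LOCAL_TIME_ZONE":
          return "TIMESTAMP"; // absolute instant
      }
    }
    return canonicalType; // other dialects keep the canonical spelling
  }

  public static void main(String[] args) {
    System.out.println(castSpec("BIGQUERY", "TIMESTAMP"));   // DATETIME
    System.out.println(castSpec("POSTGRESQL", "TIMESTAMP")); // TIMESTAMP
  }
}
```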
[jira] [Comment Edited] (CALCITE-5526) Handle unparsing of literals based on type system
[ https://issues.apache.org/jira/browse/CALCITE-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776449#comment-17776449 ] Will Noble edited comment on CALCITE-5526 at 10/18/23 1:36 AM: --- There are actually two different scenarios: 1. The program parses a query using a UDT map, and ends up with {{SqlUnknownLiteral}} and {{SqlUserDefinedTypeNameSpec}} nodes in the AST. The program never bothers converting that AST into a relational expression, and simply unparses it in a specific dialect. I'm not sure how relevant this scenario is to anybody; it's not particularly relevant to us. This would be the annoying scenario to support: we would have to consult the type mapping for these {{SqlUnknownLiteral}} and {{SqlUserDefinedTypeNameSpec}} nodes during unparsing. 2. The program parses, converts to {{RelNode}} form, does whatever, then converts back to {{SqlNode}} form, and *then* unparses in a specific dialect. This is the scenario we really care about. In this scenario, all those "unknown" and "user-defined" nodes have disappeared, replaced by their canonical types. This would be easy to support: we just hardcode the mapping in {{SqlDialect.unparseDateTimeLiteral}} and {{SqlDialect.getCastSpec}}. I'm inclined to move forward on scenario #2. The two scenarios are actually fairly independent issues: the former deals only with the unknown/user-defined SQL nodes, and the latter deals only with "canonical" nodes like {{SqlBasicTypeNameSpec}}.
> Handle unparsing of literals based on type system > --- > > Key: CALCITE-5526 > URL: https://issues.apache.org/jira/browse/CALCITE-5526 > Project: Calcite > Issue Type: Task >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > > CALCITE-5424 dealt with parsing date/time literals via a custom type system; > however, there is still no way to unparse them via the same type system. This > is handled in > [SqlDialect.unparseDateTimeLiteral|https://github.com/apache/calcite/blob/a0e119ea42def418957f214f539469f1aba76c18/core/src/main/java/org/apache/calcite/sql/SqlDialect.java#L468], > which simply calls the literal object's {{toString()}} method. > The literal object (a subclass of {{SqlAbstractDateTimeLiteral}}) should not > be able to determine its own unparsed representation by itself, since it > could be unparsed in any dialect. Since the current system for resolving > custom types relies on access to the catalog, it appears we'll need to > introduce to {{SqlDialect}} a dependency on {{CalciteCatalogReader}} so it > can use the same mappings for unparsing as it does for parsing. 
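The "hardcode the mapping" idea from scenario 2 can be sketched without any Calcite machinery. This is a minimal illustration only: the class and method names are stand-ins for the dialect hooks mentioned above, and the BigQuery spellings shown are an assumed subset, not a statement of what Calcite's dialect actually emits.

```java
import java.util.Map;

/**
 * Minimal sketch of scenario 2: once the AST has been round-tripped
 * through RelNode form, only canonical type names remain, so a dialect
 * can map them to its own spellings with a hardcoded table.
 * Names here are illustrative, not Calcite's API.
 */
public class DialectTypeSpelling {
  // Canonical type name -> assumed BigQuery spelling (illustrative subset).
  private static final Map<String, String> BIG_QUERY = Map.of(
      "TIMESTAMP", "DATETIME",
      "TIMESTAMP_WITH_LOCAL_TIME_ZONE", "TIMESTAMP");

  /** Returns the dialect-specific spelling used when unparsing a CAST. */
  public static String castSpec(String canonicalTypeName) {
    // Unmapped types keep their canonical spelling.
    return BIG_QUERY.getOrDefault(canonicalTypeName, canonicalTypeName);
  }

  public static void main(String[] args) {
    System.out.println(castSpec("TIMESTAMP_WITH_LOCAL_TIME_ZONE")); // TIMESTAMP
    System.out.println(castSpec("BOOLEAN"));                        // BOOLEAN
  }
}
```

The point of the sketch is that no catalog access is needed in this scenario; the table is a pure function of the canonical type name.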
[jira] [Commented] (CALCITE-6055) Customize handling of type name based on type system
[ https://issues.apache.org/jira/browse/CALCITE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17776433#comment-17776433 ] Will Noble commented on CALCITE-6055: - Actually, this can be handled fine using the existing {{SqlUserDefinedTypeNameSpec}}. We had come up with a fix for this in our local branch, but it was still erroneously using {{SqlBasicTypeNameSpec}} for the original {{TIMESTAMP}} type name, leading to the complex problem described in CALCITE-5636. I should be able to obsolete that ticket in short order as I address this one. > Customize handling of type name based on type system > > > Key: CALCITE-6055 > URL: https://issues.apache.org/jira/browse/CALCITE-6055 > Project: Calcite > Issue Type: Bug >Reporter: Will Noble >Priority: Minor > > This is very similar to CALCITE-5424, which only applied to literals (e.g. > {{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the > {{DateTimeTypeName}} > construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019] > which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS > TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a > {{SqlUnknownTypeNameSpec}} that's resolved during validation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CALCITE-6055) Customize handling of type name based on type system
[ https://issues.apache.org/jira/browse/CALCITE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-6055: Description: This is very similar to CALCITE-5424, which only applied to literals (e.g. {{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the {{DateTimeTypeName}} construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019] which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a {{SqlUnknownTypeNameSpec}} that's resolved during validation. (was: This is very similar to CALCITE-5424, which only applied to literals (e.g. {{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the {{DateTimeTypeName}} construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019] which is used in other contexts (e.g. {{CAST '2023-10-17 12:00:00' AS TIMESTAMP}}). Seems like we can use a very similar approach: introduce a {{SqlUnknownTypeNameSpec}} that's resolved during validation.) > Customize handling of type name based on type system > > > Key: CALCITE-6055 > URL: https://issues.apache.org/jira/browse/CALCITE-6055 > Project: Calcite > Issue Type: Bug >Reporter: Will Noble >Priority: Minor > > This is very similar to CALCITE-5424, which only applied to literals (e.g. > {{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the > {{DateTimeTypeName}} > construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019] > which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS > TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a > {{SqlUnknownTypeNameSpec}} that's resolved during validation. 
[jira] [Created] (CALCITE-6055) Customize handling of type name based on type system
Will Noble created CALCITE-6055: --- Summary: Customize handling of type name based on type system Key: CALCITE-6055 URL: https://issues.apache.org/jira/browse/CALCITE-6055 Project: Calcite Issue Type: Bug Reporter: Will Noble This is very similar to CALCITE-5424, which only applied to literals (e.g. {{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the {{DateTimeTypeName}} construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019] which is used in other contexts (e.g. {{CAST '2023-10-17 12:00:00' AS TIMESTAMP}}). Seems like we can use a very similar approach: introduce a {{SqlUnknownTypeNameSpec}} that's resolved during validation. -- This message was sent by Atlassian Jira (v8.20.10#820010)
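The approach proposed in the description — an unresolved type name that validation resolves through the catalog's type map, mirroring what CALCITE-5424 did for literals — can be sketched as below. The class here is a hypothetical stand-in and shares nothing but its role with any real {{SqlUnknownTypeNameSpec}}.

```java
import java.util.Map;

/**
 * Sketch of the proposal: the parser records an unresolved type name,
 * and validation resolves it through the catalog's UDT map.
 * Hypothetical names; not Calcite code.
 */
public class UnknownTypeNameSpec {
  public final String name;

  public UnknownTypeNameSpec(String name) {
    this.name = name;
  }

  /** Resolve against the catalog type map during validation. */
  public String resolve(Map<String, String> udtMap) {
    // Names absent from the map resolve to themselves.
    return udtMap.getOrDefault(name, name);
  }

  public static void main(String[] args) {
    // BigQuery-style map: TIMESTAMP denotes a timestamp w/ local time zone.
    Map<String, String> bigQuery =
        Map.of("TIMESTAMP", "TIMESTAMP_WITH_LOCAL_TIME_ZONE");
    System.out.println(new UnknownTypeNameSpec("TIMESTAMP").resolve(bigQuery));
  }
}
```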
[jira] [Created] (CALCITE-6054) LAST_DAY should support timestamps with local time zone
Will Noble created CALCITE-6054: --- Summary: LAST_DAY should support timestamps with local time zone Key: CALCITE-6054 URL: https://issues.apache.org/jira/browse/CALCITE-6054 Project: Calcite Issue Type: Bug Components: core Reporter: Will Noble The [implementation for {{LAST_DAY}}|https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/adapter/enumerable/RexImpTable.java#L2497] currently throws an error when it encounters a {{TIMESTAMP WITH LOCAL TIME ZONE}}. Rather, it should adjust the value of such a timestamp based on the data context time zone, then treat it like a regular {{TIMESTAMP}}. -- This message was sent by Atlassian Jira (v8.20.10#820010)
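The proposed behavior — shift the {{TIMESTAMP WITH LOCAL TIME ZONE}} value (millis since epoch UTC) into the data-context time zone, then compute the last day as for a plain {{TIMESTAMP}} — can be sketched with java.time standing in for the runtime's date arithmetic. This is a hypothetical illustration, not the RexImpTable implementation.

```java
import java.time.Instant;
import java.time.LocalDate;
import java.time.ZoneId;
import java.time.temporal.TemporalAdjusters;

/**
 * Sketch of LAST_DAY over TIMESTAMP WITH LOCAL TIME ZONE: apply the
 * data-context time zone first, then take the last day of that month.
 * Illustrative only; not Calcite's implementation.
 */
public class LastDayLtz {
  public static LocalDate lastDay(long utcMillis, ZoneId dataContextZone) {
    return Instant.ofEpochMilli(utcMillis)
        .atZone(dataContextZone)               // adjust by the session zone
        .toLocalDate()                         // then treat like TIMESTAMP
        .with(TemporalAdjusters.lastDayOfMonth());
  }

  public static void main(String[] args) {
    // 2023-10-01T02:00:00Z is still Sep 30 in Los Angeles, so the
    // result depends on the data-context zone.
    long millis = Instant.parse("2023-10-01T02:00:00Z").toEpochMilli();
    System.out.println(lastDay(millis, ZoneId.of("UTC")));                 // 2023-10-31
    System.out.println(lastDay(millis, ZoneId.of("America/Los_Angeles"))); // 2023-09-30
  }
}
```

The example in `main` shows why the adjustment matters: the same instant falls in different months depending on the zone, so the last-day result differs too.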
[jira] [Commented] (CALCITE-6045) CURRENT_TIMESTAMP has incorrect return type
[ https://issues.apache.org/jira/browse/CALCITE-6045?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17775930#comment-17775930 ] Will Noble commented on CALCITE-6045: - This would seem to be a case where the standard SQL function {{CURRENT_TIMESTAMP}} (meant to return a {{TIMESTAMP WITH TIME ZONE}}) has a name collision with the BigQuery-specific function {{CURRENT_TIMESTAMP}} (meant to return a {{TIMESTAMP WITH LOCAL TIME ZONE}}, a.k.a. the BigQuery-specific {{TIMESTAMP}}). Is there a standard procedure for handling function name collisions between standard SQL and particular dialects? > CURRENT_TIMESTAMP has incorrect return type > --- > > Key: CALCITE-6045 > URL: https://issues.apache.org/jira/browse/CALCITE-6045 > Project: Calcite > Issue Type: Bug >Reporter: Tanner Clary >Priority: Major > > When trying to work on CALCITE-6021, I noticed that {{CURRENT_TIMESTAMP}} > currently returns type {{TIMESTAMP}} when it should be > {{TIMESTAMP_WITH_LOCAL_TIME_ZONE}}. > After modifying it, I noticed the function was returning the time from (UTC - > System TZ) hours ago. For example, I am in {{America/Los_Angeles}} and if I > called the function at {{2023-10-10 13:28:00 America/Los_Angeles}}, it would > return {{2023-10-10 06:28:00 America/Los_Angeles}}. > I think this is because the DataContext {{CURRENT_TIMESTAMP}} variable, which > is meant to represent milliseconds since epoch UTC, actually has the timezone > offset applied in {{CalciteConnectionImpl#DataContextImpl}} > [here|https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/jdbc/CalciteConnectionImpl.java#L442]. > To be clear: it is meant to represent millis since epoch UTC, but instead it > is millis since epoch [system tz], as I understand it. 
> Additionally, I believe the {{getString()}} method for timestamps in > AvaticaResultSet should behave similarly to > [{{SqlFunctions#timestampWithLocalTimezoneToString()}}|https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/runtime/SqlFunctions.java#L4021] > when dealing with a {{TIMESTAMP WITH LOCAL TIME ZONE}}. Right now, it does > not take the timezone into consideration so although it represents the > accurate instant in time, it displays differently than > {{CAST(CURRENT_TIMESTAMP AS VARCHAR)}}. > For example, {{SELECT CURRENT_TIMESTAMP, CAST(CURRENT_TIMESTAMP AS > VARCHAR)}}, with the correct return type, returns something like: > {{2023-10-10 13:28:00 | 2023-10-10 06:28:00.000 America/Los_Angeles}} -- This message was sent by Atlassian Jira (v8.20.10#820010)
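The suspected skew described above is easy to reproduce in isolation: adding the system time-zone offset to a value that is supposed to be millis since epoch UTC shifts the rendered wall-clock time by exactly (UTC - system TZ) hours. The sketch below is illustrative and uses no Calcite classes; it only demonstrates the arithmetic.

```java
import java.time.Instant;
import java.util.TimeZone;

/**
 * Demonstrates the reported discrepancy: applying the zone offset to an
 * epoch-UTC millis value, then rendering it in that zone, shifts the
 * wall-clock time by the offset. Illustrative only.
 */
public class OffsetAppliedTwice {
  /** The suspect adjustment: offset added to an already-UTC value. */
  public static long withOffsetApplied(long utcMillis, TimeZone tz) {
    return utcMillis + tz.getOffset(utcMillis);
  }

  public static void main(String[] args) {
    TimeZone la = TimeZone.getTimeZone("America/Los_Angeles");
    // 2023-10-10T20:28:00Z is 13:28 in Los Angeles (PDT, UTC-7).
    long now = Instant.parse("2023-10-10T20:28:00Z").toEpochMilli();
    long skewed = withOffsetApplied(now, la);
    System.out.println(
        Instant.ofEpochMilli(now).atZone(la.toZoneId()).toLocalTime());    // 13:28
    System.out.println(
        Instant.ofEpochMilli(skewed).atZone(la.toZoneId()).toLocalTime()); // 06:28
  }
}
```

The 13:28 vs. 06:28 pair matches the example in the issue description: the skewed value renders seven hours earlier, i.e. the offset has effectively been applied twice.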
[jira] [Comment Edited] (CALCITE-5636) Consult type mappings when coercing types
[ https://issues.apache.org/jira/browse/CALCITE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762525#comment-17762525 ] Will Noble edited comment on CALCITE-5636 at 9/7/23 12:58 AM: -- Finally getting back to this. Hopefully this is a much better description of the problem. It occurs when using BigQuery's catalog type map (where {{TIMESTAMP}} maps to {{TIMESTAMP WITH LOCAL TIME ZONE}}) and {{SqlToRelConverter}} encounters a query like this: {code:sql} SELECT TIMESTAMP_ADD(DATE(1900, 1, 1), INTERVAL (1) DAY) AS `ts` {code} The parsed {{SqlNode}} has validated type {{TIMESTAMP(0) NOT NULL}}, and it ends up converting into the following relational expression: {code:java} LogicalProject(ts=[+(CAST(DATE(1900, 1, 1)):TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL, *(8640:INTERVAL DAY, 1))]) {code} which of course has converted type {{TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL}}, so the type of the {{SqlNode}} does not match the type of the {{RelNode}}, and it triggers [this exception|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L489]. Btw, despite the fact that a BQ {{TIMESTAMP}} is really a timestamp w/ ltz, the {{TIMESTAMP_ADD()}} function actually returns a {{DATETIME}} object (ISO timestamp) when the first argument is a {{DATE}}, so the validated type of the {{SqlNode}} is actually correct, as verified by manually running the query against BQ. My original PR was mistaken in believing that it should be timestamp w/ ltz, as the name seems to imply. Whoops. There may actually be a couple bugs at play here. Anyway, the reason for the discrepancy can be seen in {{SqlValidatorImpl.validateSelectList()}}. 
This invokes {{expandSelectItem()}} [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L4638], which eventually tries to derive the type of the {{TIMESTAMP_ADD()}} call [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6555], which must validate the operands in order to derive the type of the function call, and in so doing coerces the type of the first operand by casting it to an ISO timestamp. All of this results in a value in {{expandedSelectItems}} that looks like this: {code:sql} TIMESTAMP_ADD(CAST(DATE(1900, 1, 1) AS TIMESTAMP(0)), INTERVAL 1 DAY) AS `ts` {code} [A few lines down|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L4659] in {{validateSelectList()}}, we invoke {{validateExpr()}} on that expanded node, and this eventually tries to derive the type of that {{CAST}} expression, which results in visiting the inserted {{TIMESTAMP(0)}} expression (which is a {{SqlDataTypeSpec}}) [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6657], and in so doing switches it from {{TIMESTAMP}} to {{TIMESTAMP WITH LOCAL TIME ZONE}}, because it thinks that the {{CAST}} expression was written by the user, and must therefore be translated according to the catalog reader. I'm still thinking about how to handle this. 
My [original PR|https://github.com/apache/calcite/pull/3144] tried to handle it by consulting the type map when the {{CAST}} is inserted during coercion, but what we probably actually want is for the type map to be consulted during validation IFF the {{SqlDataTypeSpec}} was actually written by the user, as opposed to being inserted during expansion. Perhaps we should introduce a boolean flag to {{SqlDataTypeSpec}} that indicates that the type is final?
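The boolean-flag idea floated above can be sketched as follows: a data type spec inserted by the validator during coercion would be marked final and skip the catalog type map, while a user-written spec would still be translated. All names here are hypothetical; this is not Calcite code.

```java
import java.util.Map;

/**
 * Sketch of the "final type" flag: coercion-inserted specs bypass the
 * catalog type map, user-written specs are translated through it.
 * Hypothetical names; not Calcite code.
 */
public class DataTypeSpec {
  public final String typeName;
  public final boolean typeIsFinal; // true when inserted during expansion

  public DataTypeSpec(String typeName, boolean typeIsFinal) {
    this.typeName = typeName;
    this.typeIsFinal = typeIsFinal;
  }

  public String validatedType(Map<String, String> udtMap) {
    // A final type is taken as-is; only user-written types are remapped.
    return typeIsFinal ? typeName : udtMap.getOrDefault(typeName, typeName);
  }

  public static void main(String[] args) {
    Map<String, String> bigQuery =
        Map.of("TIMESTAMP", "TIMESTAMP_WITH_LOCAL_TIME_ZONE");
    // User-written CAST(... AS TIMESTAMP): translated per the catalog.
    System.out.println(new DataTypeSpec("TIMESTAMP", false).validatedType(bigQuery));
    // Coercion-inserted CAST(... AS TIMESTAMP(0)): left alone.
    System.out.println(new DataTypeSpec("TIMESTAMP", true).validatedType(bigQuery));
  }
}
```

Under this sketch, the {{TIMESTAMP(0)}} spec inserted during expansion in the walkthrough above would carry the flag and keep its ISO-timestamp type, avoiding the {{SqlNode}}/{{RelNode}} type mismatch.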
[jira] [Comment Edited] (CALCITE-5636) Consult type mappings when coercing types
[ https://issues.apache.org/jira/browse/CALCITE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762525#comment-17762525 ] Will Noble edited comment on CALCITE-5636 at 9/7/23 12:54 AM: -- Finally getting back to this. Hopefully this is a much better description of the problem. It occurs when using BigQuery's catalog type map (where {{TIMESTAMP}} maps to {{TIMESTAMP WITH LOCAL TIME ZONE}}) and {{SqlToRelConverter}} encounters a query like this: {code:sql} SELECT TIMESTAMP_ADD(DATE(1900, 1, 1), INTERVAL (1) DAY) AS `ts` {code} The parsed {{SqlNode}} has validated type {{TIMESTAMP(0) NOT NULL}}, and it ends up converting into the following relational expression: {code:java} LogicalProject(ts=[+(CAST(DATE(1900, 1, 1)):TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL, *(8640:INTERVAL DAY, 1))]) {code} which of course has converted type {{TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL}}, so the type of the {{SqlNode}} does not match the type of the {{{}RelNode{}}}, and it triggers [this exception|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L489]. Btw, despite the fact that a BQ {{TIMESTAMP}} is really a timestamp w/ ltz, the {{TIMESTAMP_ADD()}} function actually returns a {{DATETIME}} object (ISO timestamp) when the first argument is a {{DATE}}, so the validated type of the {{SqlNode}} is actually correct, as verified by manually running the query against BQ. My original PR was mistaken in believing that it should be timestamp w/ ltz, as the name seems to imply. Whoops. There may actually be a couple bugs at play here. Anyway, the reason for the discrepancy can be seen in {{SqlValidatorImpl.validateSelectList()}}. 
This invokes {{expandSelectItem()}} [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L4638], which eventually tries to derive the type of the {{TIMESTAMP_ADD()}} call [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6555], which must validate the operands in order to derive the type of the function call, and in so doing coerces the type of the first operand by casting it to an ISO timestamp. All of this results in a value in {{expandedSelectItems}} that looks like this: {code:sql} TIMESTAMP_ADD(CAST(DATE(1900, 1, 1) AS TIMESTAMP(0)), INTERVAL 1 DAY) AS `ts` {code} [A few lines down|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L4659] in {{validateSelectList()}}, we invoke {{validateExpr()}} on that expanded node, and this eventually tries to derive the type of of that {{CAST}} expression, which results in visiting the inserted {{TIMESTAMP(0)}} expression (which is a {{SqlDataTypeSpec}}) [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6657], and in so doing switches it from {{TIMESTAMP}} to {{TIMESTAMP WITH LOCAL TIME ZONE}}, because it thinks that the {{CAST}} expression was written by the user, and must therefore be translated according to the catalog reader. I'm still thinking about how to handle this. was (Author: wnoble): Finally getting back to this. Hopefully this is a much better description of the problem. 
It occurs when using BigQuery's catalog type map (where {{TIMESTAMP}} maps to {{TIMESTAMP WITH LOCAL TIME ZONE}}) and {{SqlToRelConverter}} encounters a query like this: {code:sql} SELECT TIMESTAMP_ADD(DATE(1900, 1, 1), INTERVAL (1) DAY) AS `ts` {code} The parsed {{SqlNode}} has validated type {{TIMESTAMP(0) NOT NULL}}, and it ends up converting into the following relational expression: {code:java} LogicalProject(ts=[+(CAST(DATE(1900, 1, 1)):TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL, *(8640:INTERVAL DAY, 1))]) {code} which of course has converted type {{TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL}}, so the type of the {{SqlNode}} does not match the type of the {{{}RelNode{}}}, and it triggers [this exception|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L489]. Btw, despite the fact that a BQ {{TIMESTAMP}} is really a timestamp w/ ltz, the {{TIMESTAMP_ADD()}} function actually returns a {{DATETIME}} object (ISO timestamp) when the first argument is a {{DATE}}, so the validated type of the {{SqlNode}} is actually correct, as verified by manually running the query against BQ. My original PR was mistaken in believing that it should be timestamp w/ ltz, as the name seems to imply. Whoops. There may actually be a
[jira] [Comment Edited] (CALCITE-5636) Consult type mappings when coercing types
[ https://issues.apache.org/jira/browse/CALCITE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762525#comment-17762525 ] Will Noble edited comment on CALCITE-5636 at 9/7/23 12:52 AM: -- Finally getting back to this. Hopefully this is a much better description of the problem. It occurs when using BigQuery's catalog type map (where {{TIMESTAMP}} maps to {{TIMESTAMP WITH LOCAL TIME ZONE}}) and {{SqlToRelConverter}} encounters a query like this: {code:sql} SELECT TIMESTAMP_ADD(DATE(1900, 1, 1), INTERVAL (1) DAY) AS `ts` {code} The parsed {{SqlNode}} has validated type {{TIMESTAMP(0) NOT NULL}}, and it ends up converting into the following relational expression: {code:java} LogicalProject(ts=[+(CAST(DATE(1900, 1, 1)):TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL, *(8640:INTERVAL DAY, 1))]) {code} which of course has converted type {{TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL}}, so the type of the {{SqlNode}} does not match the type of the {{{}RelNode{}}}, and it triggers [this exception|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L489]. Btw, despite the fact that a BQ {{TIMESTAMP}} is really a timestamp w/ ltz, the {{TIMESTAMP_ADD()}} function actually returns a {{DATETIME}} object (ISO timestamp) when the first argument is a {{DATE}}, so the validated type of the {{SqlNode}} is actually correct, as verified by manually running the query against BQ. My original PR was mistaken in believing that it should be timestamp w/ ltz, as the name seems to imply. Whoops. There may actually be a couple bugs at play here. Anyway, the reason for the discrepancy can be seen in {{SqlValidatorImpl.validateSelectList()}}. 
This invokes {{expandSelectItem()}} [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L4638], which eventually tries to derive the type of the {{TIMESTAMP_ADD()}} call [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6555], which must validate the operands in order to derive the type of the function call, and in so doing coerces the type of the first operand by casting it to an ISO timestamp. All of this results in a value in {{expandedSelectItems}} that looks like this: {code:sql} TIMESTAMP_ADD(CAST(DATE(1900, 1, 1) AS TIMESTAMP(0)), INTERVAL 1 DAY) AS `ts` {code} [A few lines down|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L4659] in {{validateSelectList()}}, we invoke {{validateExpr()}} on that expanded node, and this eventually tries to derive the type of of that {{CAST}} expression, which results in visiting the inserted {{SqlDataTypeSpec}} [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6657], and in so doing switches it from {{TIMESTAMP}} to {{TIMESTAMP WITH LOCAL TIME ZONE}}, because it thinks that the {{CAST}} expression was written by the user, and must therefore be translated according to the catalog reader. I'm still thinking about how to handle this. was (Author: wnoble): Finally getting back to this. Hopefully this is a much better description of the problem. 
It occurs when using BigQuery's catalog type map (where {{TIMESTAMP}} maps to {{TIMESTAMP WITH LOCAL TIME ZONE}}) and {{SqlToRelConverter}} encounters a query like this: {code:sql} SELECT TIMESTAMP_ADD(DATE(1900, 1, 1), INTERVAL (1) DAY) AS `ts` {code} The parsed {{SqlNode}} has validated type {{TIMESTAMP(0) NOT NULL}}, and it ends up converting into the following relational expression: {code:java} LogicalProject(ts=[+(CAST(DATE(1900, 1, 1)):TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL, *(8640:INTERVAL DAY, 1))]) {code} which of course has converted type {{TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL}}, so the type of the {{SqlNode}} does not match the type of the {{{}RelNode{}}}, and it triggers [this exception|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L489]. Btw, despite the fact that a BQ {{TIMESTAMP}} is really a timestamp w/ ltz, the {{TIMESTAMP_ADD()}} function actually returns a {{DATETIME}} object (ISO timestamp) when the first argument is a {{DATE}}, so the validated type of the {{SqlNode}} is actually correct, as verified by manually running the query against BQ. My original PR was mistaken in believing that it should be timestamp w/ ltz, as the name seems to imply. Whoops. There may actually be a couple bugs at play here. The reason for
[jira] [Comment Edited] (CALCITE-5636) Consult type mappings when coercing types
[ https://issues.apache.org/jira/browse/CALCITE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762525#comment-17762525 ] Will Noble edited comment on CALCITE-5636 at 9/7/23 12:48 AM: -- Finally getting back to this. Hopefully this is a much better description of the problem. It occurs when using BigQuery's catalog type map (where {{TIMESTAMP}} maps to {{TIMESTAMP WITH LOCAL TIME ZONE}}) and {{SqlToRelConverter}} encounters a query like this: {code:sql} SELECT TIMESTAMP_ADD(DATE(1900, 1, 1), INTERVAL (1) DAY) AS `ts` {code} The parsed {{SqlNode}} has validated type {{TIMESTAMP(0) NOT NULL}}, and it ends up converting into the following relational expression: {code:java} LogicalProject(ts=[+(CAST(DATE(1900, 1, 1)):TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL, *(8640:INTERVAL DAY, 1))]) {code} which of course has converted type {{TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL}}, so the type of the {{SqlNode}} does not match the type of the {{{}RelNode{}}}, and it triggers [this exception|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L489]. Btw, despite the fact that a BQ {{TIMESTAMP}} is really a timestamp w/ ltz, the {{TIMESTAMP_ADD()}} function actually returns a {{DATETIME}} object (ISO timestamp) when the first argument is a {{DATE}}, so the validated type of the {{SqlNode}} is actually correct, as verified by manually running the query against BQ. My original PR was mistaken in believing that it should be timestamp w/ ltz, as the name seems to imply. Whoops. There may actually be a couple bugs at play here. The reason for the discrepancy can be seen in {{SqlValidatorImpl.validateSelectList()}}. 
This invokes {{expandSelectItem()}} [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L4638], which eventually tries to derive the type of the {{TIMESTAMP_ADD()}} call [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6555], which must validate the operands in order to derive the type of the function call, and in so doing coerces the type of the first operand by casting it to an ISO timestamp. All of this results in a value in {{expandedSelectItems}} that looks like this: {code:sql} TIMESTAMP_ADD(CAST(DATE(1900, 1, 1) AS TIMESTAMP(0)), INTERVAL 1 DAY) AS `ts` {code} [A few lines down|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L4659] in {{validateSelectList()}}, we invoke {{validateExpr()}} on that expanded node, and this eventually tries to derive the type of of that {{CAST}} expression, which results in visiting the inserted {{SqlDataTypeSpec}} [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6657], and in so doing switches it from {{TIMESTAMP}} to {{TIMESTAMP WITH LOCAL TIME ZONE}}, because it thinks that the {{CAST}} expression was written by the user, and must therefore be translated according to the catalog reader. I'm still thinking about how to handle this. was (Author: wnoble): Finally getting back to this. Hopefully this is a much better description of the problem. 
It occurs when using BigQuery's catalog type map (where {{TIMESTAMP}} maps to {{TIMESTAMP WITH LOCAL TIME ZONE}}) and {{SqlToRelConverter}} encounters a query like this: {code:sql} SELECT TIMESTAMP_ADD(DATE(1900, 1, 1), INTERVAL (1) DAY) AS `ts` {code} The parsed {{SqlNode}} has validated type {{TIMESTAMP(0) NOT NULL}}, and it ends up converting into the following relational expression: {code:java} LogicalProject(ts=[+(CAST(DATE(1900, 1, 1)):TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL, *(8640:INTERVAL DAY, 1))]) {code} which of course has converted type {{TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL}}, so the type of the {{SqlNode}} does not match the type of the {{{}RelNode{}}}, and it triggers [this exception|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L489]. Btw, despite the fact that a BQ {{TIMESTAMP}} is really a timestamp w/ ltz, the {{TIMESTAMP_ADD()}} function actually returns a {{DATETIME}} object (ISO timestamp) when the first argument is a {{DATE}}, so the validated type of the {{SqlNode}} is actually correct, as verified by manually running the query against BQ. My original PR was mistaken in believing that it should be timestamp w/ ltz, as the name seems to imply. Whoops. The reason for the discrepancy can be seen in
[jira] [Comment Edited] (CALCITE-5636) Consult type mappings when coercing types
[ https://issues.apache.org/jira/browse/CALCITE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762525#comment-17762525 ] Will Noble edited comment on CALCITE-5636 at 9/7/23 12:45 AM: -- Finally getting back to this. Hopefully this is a much better description of the problem. It occurs when using BigQuery's catalog type map (where {{TIMESTAMP}} maps to {{TIMESTAMP WITH LOCAL TIME ZONE}}) and {{SqlToRelConverter}} encounters a query like this: {code:sql} SELECT TIMESTAMP_ADD(DATE(1900, 1, 1), INTERVAL (1) DAY) AS `ts` {code} The parsed {{SqlNode}} has validated type {{TIMESTAMP(0) NOT NULL}}, and it ends up converting into the following relational expression: {code:java} LogicalProject(ts=[+(CAST(DATE(1900, 1, 1)):TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL, *(8640:INTERVAL DAY, 1))]) {code} which of course has converted type {{TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL}}, so the type of the {{SqlNode}} does not match the type of the {{{}RelNode{}}}, and it triggers [this exception|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L489]. Btw, despite the fact that a BQ {{TIMESTAMP}} is really a timestamp w/ ltz, the {{TIMESTAMP_ADD()}} function actually returns a {{DATETIME}} object (ISO timestamp) when the first argument is a {{DATE}}, so the validated type of the {{SqlNode}} is actually correct, as verified by manually running the query against BQ. My original PR was mistaken in believing that it should be timestamp w/ ltz, as the name seems to imply. Whoops. The reason for the discrepancy can be seen in {{SqlValidatorImpl.validateSelectList()}}. 
This invokes {{expandSelectItem()}} [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L4638], which eventually tries to derive the type of the {{TIMESTAMP_ADD()}} call [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6555], which must validate the operands in order to derive the type of the function call, and in so doing coerces the type of the first operand by casting it to an ISO timestamp. All of this results in a value in {{expandedSelectItems}} that looks like this: {code:sql} TIMESTAMP_ADD(CAST(DATE(1900, 1, 1) AS TIMESTAMP(0)), INTERVAL 1 DAY) AS `ts` {code} [A few lines down|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L4659] in {{validateSelectList()}}, we invoke {{validateExpr()}} on that expanded node, and this eventually tries to derive the type of that {{CAST}} expression, which results in visiting the inserted {{SqlDataTypeSpec}} [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6657], and in so doing switches it from {{TIMESTAMP}} to {{TIMESTAMP WITH LOCAL TIME ZONE}}, because it thinks that the {{CAST}} expression was written by the user, and must therefore be translated according to the catalog reader. I'm still thinking about how to handle this. was (Author: wnoble): Finally getting back to this. Hopefully this is a much better description of the problem. 
It occurs when using BigQuery's catalog type map (where {{TIMESTAMP}} maps to {{TIMESTAMP WITH LOCAL TIME ZONE}}) and {{SqlToRelConverter}} encounters a query like this: {code:sql} SELECT TIMESTAMP_ADD(DATE(1900, 1, 1), INTERVAL (1) DAY) AS `ts` {code} The parsed {{SqlNode}} has validated type {{TIMESTAMP(0) NOT NULL}}, and it ends up converting into the following relational expression: {code:java} LogicalProject(ts=[+(CAST(DATE(1900, 1, 1)):TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL, *(8640:INTERVAL DAY, 1))]) {code} which of course has converted type {{TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL}}, so the type of the {{SqlNode}} does not match the type of the {{{}RelNode{}}}, and it triggers [this exception|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L489]. Btw, despite the fact that a BQ {{TIMESTAMP}} is really a timestamp w/ ltz, the {{TIMESTAMP_ADD()}} function actually returns a {{DATETIME}} object (ISO timestamp) when the first argument is a {{DATE}}, so the validated type of the {{SqlNode}} is actually correct, as verified by manually running the query against BQ. My original PR was mistaken in believing that it should be timestamp w/ ltz, as the name seems to imply. Whoops. The reason for the discrepancy can be seen in {{SqlValidatorImpl.validateSelectList()}}. This invokes
[jira] [Commented] (CALCITE-5636) Consult type mappings when coercing types
[ https://issues.apache.org/jira/browse/CALCITE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17762525#comment-17762525 ] Will Noble commented on CALCITE-5636: - Finally getting back to this. Hopefully this is a much better description of the problem. It occurs when using BigQuery's catalog type map (where {{TIMESTAMP}} maps to {{TIMESTAMP WITH LOCAL TIME ZONE}}) and {{SqlToRelConverter}} encounters a query like this: {code:sql} SELECT TIMESTAMP_ADD(DATE(1900, 1, 1), INTERVAL (1) DAY) AS `ts` {code} The parsed {{SqlNode}} has validated type {{TIMESTAMP(0) NOT NULL}}, and it ends up converting into the following relational expression: {code:java} LogicalProject(ts=[+(CAST(DATE(1900, 1, 1)):TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL, *(8640:INTERVAL DAY, 1))]) {code} which of course has converted type {{TIMESTAMP_WITH_LOCAL_TIME_ZONE(0) NOT NULL}}, so the type of the {{SqlNode}} does not match the type of the {{{}RelNode{}}}, and it triggers [this exception|https://github.com/apache/calcite/blob/50c3edfc3d6630528ab51fe836bd50df82cc7db8/core/src/main/java/org/apache/calcite/sql2rel/SqlToRelConverter.java#L489]. Btw, despite the fact that a BQ {{TIMESTAMP}} is really a timestamp w/ ltz, the {{TIMESTAMP_ADD()}} function actually returns a {{DATETIME}} object (ISO timestamp) when the first argument is a {{DATE}}, so the validated type of the {{SqlNode}} is actually correct, as verified by manually running the query against BQ. My original PR was mistaken in believing that it should be timestamp w/ ltz, as the name seems to imply. Whoops. The reason for the discrepancy can be seen in {{SqlValidatorImpl.validateSelectList()}}. 
This invokes {{expandSelectItem()}} [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L4638], which eventually tries to derive the type of the {{TIMESTAMP_ADD()}} call [here|https://github.com/apache/calcite/blob/bcf6bd8577b25c563b1c597c70704594a18ca1a3/core/src/main/java/org/apache/calcite/sql/validate/SqlValidatorImpl.java#L6555], which must validate the operands in order to derive the type of the function call, and in so doing coerces the type of the first operand by casting it to an ISO timestamp. All of this results in a value in {{expandedSelectItems}} that looks like this: {code:sql} TIMESTAMP_ADD(CAST(DATE(1900, 1, 1) AS TIMESTAMP(0)), INTERVAL 1 DAY) AS `ts` {code} A few lines down, > Consult type mappings when coercing types > - > > Key: CALCITE-5636 > URL: https://issues.apache.org/jira/browse/CALCITE-5636 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Will Noble >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > This is another offshoot of CALCITE-5346. > In > [{{AbstractTypeCoercion.implicitCast}}|https://github.com/apache/calcite/blob/2dba40e7a0a5651eac5a30d9e0a72f178bd9bff2/core/src/main/java/org/apache/calcite/sql/validate/implicit/AbstractTypeCoercion.java#L713], > Calcite may insert {{CAST}} invocations disregarding the root schema type > map. This happens during the validation phase, so it's easy enough to read > the type map directly in the {{AbstractTypeCoercion}} utility, although it's > hard to say if this is an optimal solution. It may make sense to incorporate > the type map into the {{RelDataTypeSystem}} which is more widely available in > non-validation phases of SQL processing pipelines. -- This message was sent by Atlassian Jira (v8.20.10#820010)
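To make the failure mode easier to follow, here is a sketch of the three forms the select item takes during validation; the first two forms are taken from the comment above, while the final form is reconstructed from the description of the re-translation rather than copied from actual validator output:

{code:sql}
-- 1. As written by the user:
SELECT TIMESTAMP_ADD(DATE(1900, 1, 1), INTERVAL (1) DAY) AS `ts`

-- 2. After expandSelectItem(): operand coercion has inserted a CAST to an
--    ISO timestamp, matching the type BQ's TIMESTAMP_ADD() actually returns:
SELECT TIMESTAMP_ADD(CAST(DATE(1900, 1, 1) AS TIMESTAMP(0)), INTERVAL 1 DAY) AS `ts`

-- 3. After validateExpr() revisits the inserted SqlDataTypeSpec, the CAST
--    target has been re-translated through the catalog type map as if the
--    user had written it, producing the mismatched type:
SELECT TIMESTAMP_ADD(CAST(DATE(1900, 1, 1) AS TIMESTAMP WITH LOCAL TIME ZONE), INTERVAL 1 DAY) AS `ts`
{code}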
[jira] [Commented] (CALCITE-5866) Not all dialects support sorting in sub-queries
[ https://issues.apache.org/jira/browse/CALCITE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17748764#comment-17748764 ] Will Noble commented on CALCITE-5866: - The main thing for us to decide is what the behavior should be for a Calcite relational expression, then we can handle each dialect as necessary. If this was already hashed out somewhere linkable I'd be curious to know what was said. # We could decide that any sort or aggregate node effectively de-stabilizes any prior sorts, meaning the ordering is lost but the limit and offset are kept. I believe that would be consistent with standard SQL databases. When translating stable-sorting dialects to rel nodes, we can use a rule that pushes sorts to the outermost layer possible, similar to the example I gave in my last comment. # The canonical behavior of Calcite's rel nodes *could* depart from what a "standard SQL" dialect expects, and keep prior sorts stable. When translating standard SQL dialects into rel nodes, we can use a rule that removes unrestricted inner sorts, so that translating the resulting rel node into a stable-sorting dialect produces the same results. No matter which is chosen, both styles should be accommodable in any dialect. I have a feeling these rules I'm describing may already exist. > Not all dialects support sorting in sub-queries > --- > > Key: CALCITE-5866 > URL: https://issues.apache.org/jira/browse/CALCITE-5866 > Project: Calcite > Issue Type: Bug >Reporter: Will Noble >Priority: Minor > > The rel-to-sql converter inserts subqueries in certain situations, such as > when sorting by ordinal (see CALCITE-5768). Certain dialects, such as MSSQL > (SQL Server) ignore the {{ORDER BY}} clause in a subquery. 
We may need a new > dialect-level setting like {{canSortInSubQuery}} that dictates whether > Calcite can safely insert sub-queries with {{ORDER BY}} clauses in them, or > whether it must do everything in its power to avoid that (such as refusing > to sort by ordinal in cases where not all sort expressions are included in > the {{SELECT}} list). -- This message was sent by Atlassian Jira (v8.20.10#820010)
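To illustrate option 1 above (a sort or aggregate node de-stabilizes prior sorts but keeps limit and offset), here is an untested sketch of the rewrite such a sort-pushing rule might perform; the {{"SAL"}} column is borrowed from the scott schema purely for illustration:

{code:sql}
-- Under option 1, the inner ORDER BY only chooses which 10 rows survive the
-- FETCH; no ordering is promised to the outer query:
SELECT "JOB"
FROM (SELECT "JOB", "SAL" FROM "scott"."EMP"
      ORDER BY "SAL" FETCH NEXT 10 ROWS ONLY) AS "t0"

-- A rule targeting a stable-sorting dialect would push the sort to the
-- outermost layer to restate the ordering explicitly:
SELECT "JOB"
FROM (SELECT "JOB", "SAL" FROM "scott"."EMP"
      ORDER BY "SAL" FETCH NEXT 10 ROWS ONLY) AS "t0"
ORDER BY "SAL"
{code}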
[jira] [Comment Edited] (CALCITE-5866) Not all dialects support sorting in sub-queries
[ https://issues.apache.org/jira/browse/CALCITE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745321#comment-17745321 ] Will Noble edited comment on CALCITE-5866 at 7/21/23 12:40 AM: --- Actually, I think we could come up with a general solution using aliases. Working on top of the simple examples in my last comment, this could work: {code:sql} SELECT "JOB" FROM (SELECT "JOB", COUNT("ENAME") AS "$f1" FROM "scott"."EMP" GROUP BY "JOB") AS "t0" ORDER BY "JOB", "$f1" {code} (I have not actually tested this, but I believe it could work). The trick is that, for queries that do not support ordering in sub-queries, we must try to move the {{ORDER BY}} clause to the outer query when we insert a sub-query, make sure the sub-query has aliases, then sort by alias in the outer query. This may require complex logic to make sure it doesn't interfere with other sorts, but I can't yet see a reason why it wouldn't work. Is there anything in the rel-to-sql converter that already works kinda like this? was (Author: wnoble): Actually, I think we could come up with a general solution using aliases. Working on top of the simple examples in my last comment, this could work: {code:sql} SELECT "JOB" FROM (SELECT "JOB", COUNT("ENAME") AS "$f1" FROM "scott"."EMP" GROUP BY "JOB") AS "t0" ORDER BY "JOB", "$f1" {code} (I have not actually tested this, but I believe it could work). The trick is that, for queries that do not support ordering in sub-queries, we must try to move the {{ORDER BY}} clause to the outer query when we insert a sub-query, make sure the sub-query has aliases, then sort by alias in the outer query. > Not all dialects support sorting in sub-queries > --- > > Key: CALCITE-5866 > URL: https://issues.apache.org/jira/browse/CALCITE-5866 > Project: Calcite > Issue Type: Bug >Reporter: Will Noble >Priority: Minor > > The rel-to-sql converter inserts subqueries in certain situations, such as > when sorting by ordinal (see CALCITE-5768). 
Certain dialects, such as MSSQL > (SQL Server) ignore the {{ORDER BY}} clause in a subquery. We may need a new > dialect-level setting like {{canSortInSubQuery}} that dictates whether > Calcite can safely insert sub-queries with {{ORDER BY}} clauses in them, or > whether it must do everything in its power to avoid that (such as refusing > to sort by ordinal in cases where not all sort expressions are included in > the {{SELECT}} list). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (CALCITE-5866) Not all dialects support sorting in sub-queries
[ https://issues.apache.org/jira/browse/CALCITE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745321#comment-17745321 ] Will Noble commented on CALCITE-5866: - Actually, I think we could come up with a general solution using aliases. Working on top of the simple examples in my last comment, this could work: {code:sql} SELECT "JOB" FROM (SELECT "JOB", COUNT("ENAME") AS "$f1" FROM "scott"."EMP" GROUP BY "JOB") AS "t0" ORDER BY "JOB", "$f1" {code} (I have not actually tested this, but I believe it could work). The trick is that, for queries that do not support ordering in sub-queries, we must try to move the {{ORDER BY}} clause to the outer query when we insert a sub-query, make sure the sub-query has aliases, then sort by alias in the outer query. > Not all dialects support sorting in sub-queries > --- > > Key: CALCITE-5866 > URL: https://issues.apache.org/jira/browse/CALCITE-5866 > Project: Calcite > Issue Type: Bug >Reporter: Will Noble >Priority: Minor > > The rel-to-sql converter inserts subqueries in certain situations, such as > when sorting by ordinal (see CALCITE-5768). Certain dialects, such as MSSQL > (SQL Server) ignore the {{ORDER BY}} clause in a subquery. We may need a new > dialect-level setting like {{canSortInSubQuery}} that dictates whether > Calcite can safely insert sub-queries with {{ORDER BY}} clauses in them, or > whether it must do everything in its power to avoid that (such as refusing > to sort by ordinal in cases where not all sort expressions are included in > the {{SELECT}} list). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (CALCITE-5866) Not all dialects support sorting in sub-queries
[ https://issues.apache.org/jira/browse/CALCITE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745319#comment-17745319 ] Will Noble edited comment on CALCITE-5866 at 7/21/23 12:34 AM: --- > IMO the `ORDER BY` clause in a sub-query is useless, because it not affects > the final result. Please correct it if I am wrong. Most dialects allow sorting in sub-queries. The ordering of the rows in the sub-query will remain stable as they pass through later stages of processing (unless they are re-ordered in a later stage). This is a departure from the relational model, but MSSQL is the only popular SQL dialect I've encountered that does not work this way. Because of this, it is not generally possible to sort on something that is not also selected. Here's the kind of query that motivated this bug in the first place: {code:sql} SELECT "JOB" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", COUNT("ENAME") {code} While exploring solutions for CALCITE-5808 I came up with this query, which leverages MSSQL's sort-by-ordinal ability: {code:sql} SELECT "JOB" FROM (SELECT "JOB", COUNT("ENAME") AS "$f1" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", 2) AS "t0" {code} Of course, this is a problem because it has an {{ORDER BY}} clause in a sub-query, which will be ignored. So this query has no dependable order for the results. Using sort-by-ordinal (or even sort-by-alias), it would actually be impossible to write a single query that sorts on anything that is not a column in the final output. In order to achieve that, you must sort by expression. In this case, the top example works fine. But MSSQL does a bad enough job at matching expressions in the {{ORDER BY}} list to expressions in the {{SELECT}} list that we cannot depend on sort-by-expression in the general case. 
Something like {{SELECT ORDER BY }} will fail in the general case with the error message ["A constant expression was encountered in the ORDER BY list"|https://stackoverflow.com/questions/49377988/a-constant-expression-was-encountered-in-the-order-by-list-position-1] even for expressions that are not constant. I just encountered this error trying to run this query (I know it's complicated but it's hard to say for sure where the problem is so I'm reproducing it almost as-is from my real-world example): {code:sql} SELECT ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) AS `users.created_microsecond` FROM `USERS` AS `users` GROUP BY ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) ORDER BY ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) DESC FETCH NEXT 1000 ROWS ONLY {code} As you can see, the {{ORDER BY}} expression is the same as the one in the {{SELECT}} and {{GROUP BY}} lists, but MSSQL thinks it's a constant expression. Taken together, this means: # We cannot generally rely on sorting by ordinal (or alias) because we need to insert sub-queries for sort expressions that aren't also selected, and we can't sort sub-queries. # We cannot generally rely on sorting by expression because MSSQL does a bad job of recognizing when expressions are constant or not. 
This part is very poorly documented, because clearly sorting by expression works in some circumstances (and is in fact the only way to achieve certain results, like sorting by an un-selected expression), but not others. I'm not sure how we can reconcile these in a reliable way. Perhaps the only way is to file bugs against SQL Server. was (Author: wnoble): > IMO the `ORDER BY` clause in a sub-query is useless, because it not affects > the final result. Please correct it if I am wrong. Most dialects allow sorting in sub-queries. This is a departure from the relational model, but MSSQL is the only popular SQL dialect I've encountered that does not work this way. Because of this, it is not generally possible to sort on something that is not also selected. Here's the kind of
[jira] [Comment Edited] (CALCITE-5866) Not all dialects support sorting in sub-queries
[ https://issues.apache.org/jira/browse/CALCITE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745319#comment-17745319 ] Will Noble edited comment on CALCITE-5866 at 7/21/23 12:21 AM: --- > IMO the `ORDER BY` clause in a sub-query is useless, because it not affects > the final result. Please correct it if I am wrong. Most dialects allow sorting in sub-queries. This is a departure from the relational model, but MSSQL is the only popular SQL dialect I've encountered that does not work this way. Because of this, it is not generally possible to sort on something that is not also selected. Here's the kind of query that motivated this bug in the first place: {code:sql} SELECT "JOB" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", COUNT("ENAME") {code} While exploring solutions for CALCITE-5808 I came up with this that leverages MSSQL's sort-by-ordinal ability: {code:sql} SELECT "JOB" FROM (SELECT "JOB", COUNT("ENAME") AS "$f1" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", 2) AS "t0" {code} Of course, this is a problem because it has an {{ORDER BY}} clause in a sub-query, which will be ignored. So this query has no dependable order for the results. Using sort-by-ordinal (or even sort-by-alias), it would actually be impossible to write a single query that sorts on anything that is not a column in the final output. In order to achieve that, you must sort by expression. In this case, the top example works fine. But MSSQL does a bad enough job at matching expressions in the {{ORDER BY}} list to expressions in the {{SELECT}} list that we cannot depend on sort-by-expression in the general case. Something like {{SELECT ORDER BY }} will fail in the general case with the error message ["A constant expression was encountered in the ORDER BY list"|https://stackoverflow.com/questions/49377988/a-constant-expression-was-encountered-in-the-order-by-list-position-1] even for expressions that are not constant. 
I just encountered this error trying to run this query (I know it's complicated but it's hard to say for sure where the problem is so I'm reproducing it almost as-is from my real-world example): {code:sql} SELECT ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) AS `users.created_microsecond` FROM `USERS` AS `users` GROUP BY ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) ORDER BY ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) DESC FETCH NEXT 1000 ROWS ONLY {code} As you can see, the {{ORDER BY}} expression is the same as the one in the {{SELECT}} and {{GROUP BY}} lists, but MSSQL thinks it's a constant expression. Taken together, this means: # We cannot generally rely on sorting by ordinal (or alias) because we need to insert sub-queries for sort expressions that aren't also selected, and we can't sort sub-queries. # We cannot generally rely on sorting by expression because MSSQL does a bad job of recognizing when expressions are constant or not. This part is very poorly documented, because clearly sorting by expression works in some circumstances (and is in fact the only way to achieve certain results, like sorting by an un-selected expression), but not others. I'm not sure how we can reconcile these in a reliable way. 
Perhaps the only way is to file bugs against SQL Server. was (Author: wnoble): > IMO the `ORDER BY` clause in a sub-query is useless, because it not affects > the final result. Please correct it if I am wrong. Most dialects allow sorting in sub-queries. This is a departure from the relational model, but MSSQL is the only popular SQL dialect I've encountered that does not work this way. Because of this, it is not generally possible to sort on something that is not also selected. Here's the kind of query that motivated this bug in the first place: {code:sql} SELECT "JOB" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", COUNT("ENAME") {code} While
[jira] [Comment Edited] (CALCITE-5866) Not all dialects support sorting in sub-queries
[ https://issues.apache.org/jira/browse/CALCITE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745319#comment-17745319 ] Will Noble edited comment on CALCITE-5866 at 7/21/23 12:18 AM: --- > IMO the `ORDER BY` clause in a sub-query is useless, because it not affects > the final result. Please correct it if I am wrong. Most dialects allow sorting in sub-queries. This is a departure from the relational model, but MSSQL is the only popular SQL dialect I've encountered that does not work this way. Because of this, it is not generally possible to sort on something that is not also selected. Here's the kind of query that motivated this bug in the first place: {code:sql} SELECT "JOB" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", COUNT("ENAME") {code} While exploring solutions for CALCITE-5808 I came up with this that leverages MSSQL's sort-by-ordinal ability: {code:sql} SELECT "JOB" FROM (SELECT "JOB", COUNT("ENAME") AS "$f1" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", 2) AS "t0" {code} Of course, this is a problem because it has an {{ORDER BY}} clause in a sub-query, which will be ignored. So this query has no dependable order for the results. Using sort-by-ordinal (or even sort-by-alias), it would actually be impossible to write a single query that sorts on anything that is not a column in the final output. In order to achieve that, you must sort by expression. In this case, the top example works fine. But MSSQL does a bad enough job at matching expressions in the {{ORDER BY}} list to expressions in the {{SELECT}} list that we cannot depend on sort-by-expression in the general case. Something like {{SELECT ORDER BY }} will fail in the general case with the error message ["A constant expression was encountered in the ORDER BY list"|https://stackoverflow.com/questions/49377988/a-constant-expression-was-encountered-in-the-order-by-list-position-1] even for expressions that are not constant. 
I just encountered this error trying to run this query (I know it's complicated but it's hard to say for sure where the problem is so I'm reproducing it almost as-is from my real-world example): {code:sql} SELECT ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) AS `users.created_microsecond` FROM `USERS` AS `users` GROUP BY ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) ORDER BY ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) DESC FETCH NEXT 1000 ROWS ONLY {code} As you can see, the {{ORDER BY}} expression is the same as the one in the {{SELECT}} and {{GROUP BY}} lists, but MSSQL thinks it's a constant expression. Taken together, this means: # We cannot generally rely on sorting by ordinal (or alias) because we need to insert sub-queries for sort expressions that aren't also selected, and we can't sort sub-queries. # We cannot generally rely on sorting by expression because MSSQL does a bad job of recognizing when expressions are constant or not. This part is very poorly documented, because clearly sorting by expression works in some circumstances (and is in fact the only way to achieve certain results, like sorting by an un-selected expression), but not others. I'm not sure how we can reconcile these in a reliable way. 
Perhaps the only way is to file bugs against SQL Server. was (Author: wnoble): > IMO the `ORDER BY` clause in a sub-query is useless, because it not affects > the final result. Please correct it if I am wrong. Most dialects allow sorting in sub-queries. This is a departure from the relational model, but MSSQL is the only popular SQL dialect I've encountered that does not work this way. Because of this, it is not generally possible to sort on something that is not also selected. Here's the kind of query that motivated this bug in the first place: {code:sql} SELECT "JOB" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", COUNT("ENAME") {code} While
[jira] [Comment Edited] (CALCITE-5866) Not all dialects support sorting in sub-queries
[ https://issues.apache.org/jira/browse/CALCITE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745319#comment-17745319 ] Will Noble edited comment on CALCITE-5866 at 7/21/23 12:17 AM: --- > IMO the `ORDER BY` clause in a sub-query is useless, because it not affects > the final result. Please correct it if I am wrong. Most dialects allow sorting in sub-queries. This is a departure from the relational model, but MSSQL is the only popular SQL dialect I've encountered that does not work this way. Because of this, it is not generally possible to sort on something that is not also selected. Here's the kind of query that motivated this bug in the first place: {code:sql} SELECT "JOB" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", COUNT("ENAME") {code} While exploring solutions for CALCITE-5808 I came up with this that leverages MSSQL's sort-by-ordinal ability: {code:sql} SELECT "JOB" FROM (SELECT "JOB", COUNT("ENAME") AS "$f1" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", 2) AS "t0" {code} Of course, this is a problem because it has an {{ORDER BY}} clause in a sub-query, which will be ignored. So this query has no dependable order for the results. Using sort-by-ordinal, it would actually be impossible to write a single query that sorts on anything that is not a column in the final output. In order to achieve that, you must sort by expression. In this case, the top example works fine. But MSSQL does a bad enough job at matching expressions in the {{ORDER BY}} list to expressions in the {{SELECT}} list that we cannot depend on sort-by-expression in the general case. Something like {{SELECT ORDER BY }} will fail in the general case with the error message ["A constant expression was encountered in the ORDER BY list"|https://stackoverflow.com/questions/49377988/a-constant-expression-was-encountered-in-the-order-by-list-position-1] even for expressions that are not constant. 
I just encountered this error trying to run this query (I know it's complicated but it's hard to say for sure where the problem is so I'm reproducing it almost as-is from my real-world example): {code:sql} SELECT ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) AS `users.created_microsecond` FROM `USERS` AS `users` GROUP BY ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) ORDER BY ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) DESC FETCH NEXT 1000 ROWS ONLY {code} As you can see, the {{ORDER BY}} expression is the same as the one in the {{SELECT}} and {{GROUP BY}} lists, but MSSQL thinks it's a constant expression. Taken together, this means: # We cannot generally rely on sorting by ordinal (or alias) because we need to insert sub-queries for sort expressions that aren't also selected, and we can't sort sub-queries. # We cannot generally rely on sorting by expression because MSSQL does a bad job of recognizing when expressions are constant or not. This part is very poorly documented, because clearly sorting by expression works in some circumstances (and is in fact the only way to achieve certain results, like sorting by an un-selected expression), but not others. I'm not sure how we can reconcile these in a reliable way. 
Perhaps the only way is to file bugs against SQL Server. 
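The sort-by-ordinal mechanics above can be demonstrated with a stand-in dialect. This sketch uses Python's sqlite3 (SQLite, unlike MSSQL, honors {{ORDER BY}} ordinals in plain queries); it only illustrates what {{ORDER BY ..., 2}} is expected to mean, not MSSQL's sub-query behavior:

```python
# Illustration of sort-by-ordinal using SQLite as a stand-in dialect.
# MSSQL's quirk (ignoring ORDER BY inside a sub-query) cannot be
# reproduced here; this only shows what an ordinal in ORDER BY means.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (ename TEXT, job TEXT)")
conn.executemany("INSERT INTO emp VALUES (?, ?)",
                 [("SMITH", "CLERK"), ("ALLEN", "SALESMAN"),
                  ("WARD", "SALESMAN"), ("KING", "PRESIDENT")])

# The ordinal 2 refers to the second SELECT item, COUNT(ename).
rows = conn.execute(
    "SELECT job, COUNT(ename) FROM emp GROUP BY job ORDER BY 2 DESC, job"
).fetchall()
print(rows)  # [('SALESMAN', 2), ('CLERK', 1), ('PRESIDENT', 1)]
```

In MSSQL, the inner {{ORDER BY}} of the generated sub-query would simply be ignored, which is exactly the problem described above.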
[jira] [Commented] (CALCITE-5866) Not all dialects support sorting in sub-queries
[ https://issues.apache.org/jira/browse/CALCITE-5866?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17745319#comment-17745319 ] Will Noble commented on CALCITE-5866: - This is actually a weirder problem than it seems because I've encountered some severely under-documented discrepancies in what SQL Server 2017 will accept, and it seems like MSSQL is actually incompatible with particular edge-cases that other databases can handle. Here's the kind of query that motivated this bug in the first place: {code:sql} SELECT "JOB" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", COUNT("ENAME") {code} While exploring solutions for CALCITE-5808 I came up with this that leverages MSSQL's sort-by-ordinal ability: {code:sql} SELECT "JOB" FROM (SELECT "JOB", COUNT("ENAME") AS "$f1" FROM "scott"."EMP" GROUP BY "JOB" ORDER BY "JOB", 2) AS "t0" {code} Of course, this is a problem because it has an {{ORDER BY}} clause in a sub-query, which will be ignored. So this query has no dependable order for the results. Using sort-by-ordinal, it would actually be impossible to write a single query that sorts on anything that is not a column in the final output. In order to achieve that, you must sort by expression. In this case, the top example works fine. But MSSQL does a bad enough job at matching expressions in the {{ORDER BY}} list to expressions in the {{SELECT}} list that we cannot depend on sort-by-expression in the general case. Something like {{SELECT ORDER BY }} will fail in the general case with the error message ["A constant expression was encountered in the ORDER BY list"|https://stackoverflow.com/questions/49377988/a-constant-expression-was-encountered-in-the-order-by-list-position-1] even for expressions that are not constant. 
I just encountered this error trying to run this query (I know it's complicated but it's hard to say for sure where the problem is so I'm reproducing it almost as-is from my real-world example): {code:sql} SELECT ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) AS `users.created_microsecond` FROM `USERS` AS `users` GROUP BY ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) ORDER BY ((CONVERT(VARCHAR(19), CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) ,120) + '.' + RIGHT('00' + CAST(DATEPART(MCS, CAST( CAST(CASE WHEN 1 > 0 THEN '2016-02-01 00:30:00.567891' ELSE users.name END AS DATETIME2) AS DATETIME2)) AS VARCHAR(6)), 6))) DESC FETCH NEXT 1000 ROWS ONLY {code} As you can see, the {{ORDER BY}} expression is the same as the one in the {{SELECT}} and {{GROUP BY}} lists, but MSSQL thinks it's a constant expression. Taken together, this means: # We cannot generally rely on sorting by ordinal (or alias) because we need to insert sub-queries for sort expressions that aren't also selected, and we can't sort sub-queries. # We cannot generally rely on sorting by expression because MSSQL does a bad job of recognizing when expressions are constant or not. This part is very poorly documented, because clearly sorting by expression works in some circumstances (and is in fact the only way to achieve certain results, like sorting by an un-selected expression), but not others. I'm not sure how we can reconcile these in a reliable way. 
Perhaps the only way is to file bugs against SQL Server. > Not all dialects support sorting in sub-queries > --- > > Key: CALCITE-5866 > URL: https://issues.apache.org/jira/browse/CALCITE-5866 > Project: Calcite > Issue Type: Bug >Reporter: Will Noble >Priority: Minor > > The rel-to-sql converter inserts subqueries in certain situations, such as > when sorting by ordinal (see CALCITE-5768). Certain dialects, such as MSSQL > (SQL Server), ignore the {{ORDER BY}} clause in a subquery. We may need a new > dialect-level setting like {{canSortInSubQuery}} that dictates whether > Calcite can safely insert sub-queries with {{ORDER BY}} clauses in them, or > whether it must do everything in its power to avoid that (such as refusing > to sort by ordinal in
[jira] [Created] (CALCITE-5866) Not all dialects support sorting in sub-queries
Will Noble created CALCITE-5866: --- Summary: Not all dialects support sorting in sub-queries Key: CALCITE-5866 URL: https://issues.apache.org/jira/browse/CALCITE-5866 Project: Calcite Issue Type: Bug Reporter: Will Noble The rel-to-sql converter inserts sub-queries in certain situations, such as when sorting by ordinal (see CALCITE-5768). Certain dialects, such as MSSQL (SQL Server), ignore the {{ORDER BY}} clause in a sub-query. We may need a new dialect-level setting like {{canSortInSubQuery}} that dictates whether Calcite can safely insert sub-queries with {{ORDER BY}} clauses in them, or whether it must do everything in its power to avoid that (such as refusing to sort by ordinal in cases where not all sort expressions are included in the {{SELECT}} list). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CALCITE-5808) Rel-to-Sql conversion should better support grouping or sorting by references or ordinals
[ https://issues.apache.org/jira/browse/CALCITE-5808?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5808: Description: While exploring solutions to CALCITE-5724 I produced [this draft commit|https://github.com/apache/calcite/commit/d644ecee44ffd7927a62be96329cd5456f545de2] to explore what the Rel-to-Sql conversion process would look like if it were heavily biased toward using ordinals in the {{GROUP BY}} and {{ORDER BY}} clauses. It works for most common queries, but I gave up with basically 2 problems yet to be solved: # Grouping with {{ROLLUP}}, {{CUBE}}, or {{GROUPING SETS}} does not allow ordinals (at least [according to this presto ticket|https://github.com/prestodb/presto/issues/9522]; I didn't double-check any standards or dialects), so my draft solution would have to distinguish between these cases and a simple group-by, and only use ordinals in the simple case. # Window functions with an {{ORDER BY}} clause also use ordinals in my draft. It seems likely that ordinals are also disallowed in this context, but I'm not sure. Looking at the results in {{RelToSqlConverterTest.testConvertWindowToSql()}}, it certainly looks like the query is incorrect even if ordinals could hypothetically go in a window's {{ORDER BY}}. So, I've decided to stop working on this for now, but wanted to preserve my draft change and start a discussion on this ticket. Here are some thoughts: # It seems like we can reach a happy middle ground on sorting by using ordinals whenever the dialect allows *and* the expression returned by {{field()}} is anything other than a {{SqlIdentifier}} (i.e. a named column reference). Existing behavior as of writing is to only use ordinals when the expression is a {{SqlCall}}, but we should use it for literals as well. That would solve CALCITE-5724, but still use named references for "simple" collations, which seems to be the case for all window functions. 
# When it comes to grouping, things are more complicated. As of writing, Calcite tends to group by expressions, but some dialects (e.g. BigQuery) can get easily confused by this, even when the expression in the {{SELECT}} list perfectly matches that in the {{GROUP BY}} clause. With our Calcite integration, we need customized cleanup logic to rewrite the {{GROUP BY}} clauses in terms of aliases / ordinals in the {{SELECT}} list whenever possible, as a band-aid for this problem. We want to get rid of this band-aid and upstream a proper solution, and I thought using ordinals could be it, but the problems with {{GROUPING SETS}} et al. still need to be solved, and Calcite seems to have poor support for grouping by ordinals in general (see the changes I had to make to {{RelToSqlConverter.generateGroupList}} in my draft), which should be improved. 
[jira] [Created] (CALCITE-5808) Rel-to-Sql conversion should better support grouping or sorting by references or ordinals
Will Noble created CALCITE-5808: --- Summary: Rel-to-Sql conversion should better support grouping or sorting by references or ordinals Key: CALCITE-5808 URL: https://issues.apache.org/jira/browse/CALCITE-5808 Project: Calcite Issue Type: Improvement Reporter: Will Noble While exploring solutions to CALCITE-5724 I produced [this draft PR|https://github.com/apache/calcite/commit/d644ecee44ffd7927a62be96329cd5456f545de2] to explore what the Rel-to-Sql conversion process would look like if it were heavily biased toward using ordinals in the {{GROUP BY}} and {{ORDER BY}} clauses. It works for most common queries, but I gave up with basically 2 problems yet to be solved: # Grouping with {{ROLLUP}}, {{CUBE}}, or {{GROUPING SETS}} does not allow ordinals (at least [according to this presto ticket|https://github.com/prestodb/presto/issues/9522]; I didn't double-check any standards or dialects), so my draft solution would have to distinguish between these cases and a simple group-by, and only use ordinals in the simple case. # Window functions with an {{ORDER BY}} clause also use ordinals in my draft. It seems likely that ordinals are also disallowed in this context, but I'm not sure. Looking at the results in {{RelToSqlConverterTest.testConvertWindowToSql()}}, it certainly looks like the query is incorrect even if ordinals can hypothetically go in a window's {{ORDER BY}}. So, I've decided to stop working on this for now, but wanted to preserve my draft change and start a discussion on this ticket. Here are some thoughts: # It seems like we can reach a happy middle ground on sorting by using ordinals whenever the dialect allows *and* the expression returned by {{field()}} is anything other than a {{SqlIdentifier}} (i.e. a named column reference). Existing behavior as of writing is to only use ordinals when the expression is a {{SqlCall}}, but we should use it for literals as well. 
That would solve CALCITE-5724, but still use named references for "simple" collations, which seems to be the case for all window functions. # When it comes to grouping, things are more complicated. As of writing, Calcite tends to group by expressions, but some dialects (e.g. BigQuery) can get easily confused by this, even when the expression in the {{SELECT}} list perfectly matches that in the {{GROUP BY}} clause. With our Calcite integration, we need customized cleanup logic to rewrite the {{GROUP BY}} clauses in terms of aliases / ordinals in the {{SELECT}} list whenever possible, as a band-aid for this problem. We want to get rid of this band-aid and upstream a proper solution, and I thought using ordinals could be it, but the problems with {{GROUPING SETS}} et al. still need to be solved, and Calcite seems to have poor support for grouping by ordinals in general (see the changes I had to make to {{RelToSqlConverter.generateGroupList}} in my draft), which should be improved. -- This message was sent by Atlassian Jira (v8.20.10#820010)
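As a point of comparison, plain group-by-ordinal (the simple case the draft would keep) behaves exactly like group-by-expression in permissive dialects. A minimal sqlite3 sketch (SQLite is only a stand-in here; it cannot show the {{ROLLUP}} / {{CUBE}} / {{GROUPING SETS}} failure modes, which it does not support):

```python
# Demo: GROUP BY 1 (ordinal) is equivalent to repeating the expression.
# SQLite stands in for a dialect that accepts ordinals in GROUP BY.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT)")
conn.executemany("INSERT INTO users VALUES (?)",
                 [("ann",), ("Bob",), ("ANN",)])

# GROUP BY 1 refers to the first select item, UPPER(name), so 'ann' and
# 'ANN' fall into one group exactly as if the expression were repeated.
by_ordinal = conn.execute(
    "SELECT UPPER(name), COUNT(*) FROM users GROUP BY 1 ORDER BY 1"
).fetchall()
by_expr = conn.execute(
    "SELECT UPPER(name), COUNT(*) FROM users GROUP BY UPPER(name) ORDER BY 1"
).fetchall()
print(by_ordinal)  # [('ANN', 2), ('BOB', 1)]
```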
[jira] [Comment Edited] (CALCITE-5724) Generated SQL uses literal values in ORDER BY clauses
[ https://issues.apache.org/jira/browse/CALCITE-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737418#comment-17737418 ] Will Noble edited comment on CALCITE-5724 at 6/27/23 2:44 AM: -- I think there's a better way. Between CALCITE-5044 and CALCITE-5510, the behavior of {{SqlImplementor.Context.orderField}} went from simply copying the expression from the select list, to this complex behavior: # If the dialect is sort-by-ordinal and the expression is a literal integer, convert it to a literal string, and sort by that string expression. # If the dialect is sort-by-ordinal and the expression is a {{SqlBasicCall}}, sort by ordinal. # Otherwise, sort by expression. It seems like a much better state of affairs would be this: # If the dialect is sort-by-ordinal, sort by ordinal. # Otherwise, sort by expression. Biasing ourselves more consistently toward sorting by ordinal like this causes the rel-to-sql converter to generate more sub-queries, the logic for which was recently fixed in CALCITE-5768. I have a hunch that bug was stopping someone from implementing this seemingly-more-obvious solution in the first place. 
> Generated SQL uses literal values in ORDER BY clauses > - > > Key: CALCITE-5724 > URL: https://issues.apache.org/jira/browse/CALCITE-5724 > Project: Calcite > Issue Type: Bug >Reporter: Joey Moore >Assignee: Joey Moore >Priority: Major > Labels: pull-request-available > > Current behavior in the SqlImplementor will generate SqlCharStringLiterals in > ORDER BY fields when there is a Literal value in the SELECT clause. This > happens in dialects with isSortByOrdinal(). This leads to errors in dialects > that cannot have literal values in ORDER BY clauses, such as BigQuery. > Proposed fix is to use ordinals in all cases where a literal value is present > in the SELECT clause. > Example of current implementation: > {code:java} > select 3.14159265 as pi > from \"product\" > order by 1; > {code} > Will return > {code:java} > SELECT 3.14159265 AS \"PI\" > FROM \"foodmart\".\"product\" > ORDER BY '3.14159265'{code} > Proposed implementation will return: > {code:java} > SELECT 3.14159265 AS \"PI\" > FROM \"foodmart\".\"product\" > ORDER BY 1{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
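The proposed two-rule behavior for {{SqlImplementor.Context.orderField}} can be sketched in miniature. The names below ({{order_field}}, {{Ordinal}}, {{Expression}}, {{dialect_sorts_by_ordinal}}) are illustrative only, not Calcite's actual API:

```python
# Hypothetical sketch of the proposed orderField rule.  The class and
# function names are invented for illustration; Calcite's real
# SqlImplementor works on SqlNode trees, not strings.
from dataclasses import dataclass

@dataclass
class Ordinal:
    index: int          # 1-based position in the SELECT list

@dataclass
class Expression:
    sql: str            # the select-list expression, copied verbatim

def order_field(select_item_sql, position, dialect_sorts_by_ordinal):
    """Proposed behavior: always use the ordinal when the dialect
    supports sort-by-ordinal; otherwise copy the expression."""
    if dialect_sorts_by_ordinal:
        return Ordinal(position)
    return Expression(select_item_sql)

print(order_field("3.14159265", 1, True))   # Ordinal(index=1)
print(order_field("3.14159265", 1, False))  # Expression(sql='3.14159265')
```

Note there is no longer a literal-integer or {{SqlBasicCall}} special case; the dialect capability alone decides.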
[jira] [Commented] (CALCITE-5724) Generated SQL uses literal values in ORDER BY clauses
[ https://issues.apache.org/jira/browse/CALCITE-5724?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17737418#comment-17737418 ] Will Noble commented on CALCITE-5724: - I think there's a better way. Between CALCITE-5044 and CALCITE-5510, the behavior of {{SqlImplementor.Context.orderField}} went from simply copying the expression from the select list, to this complex behavior: # If the dialect is sort-by-ordinal and the expression is a literal integer, convert it to a literal string. # If the dialect is sort-by-ordinal and the expression is a {{SqlBasicCall}}, sort by ordinal. # Otherwise, sort by expression. It seems like a much better state of affairs would be this: # If the dialect is sort-by-ordinal, sort by ordinal. # Otherwise, sort by expression. Biasing ourselves more consistently toward sorting by ordinal like this causes the rel-to-sql converter to generate more sub-queries, the logic for which was recently fixed in CALCITE-5768. I have a hunch that bug was stopping someone from implementing this seemingly-more-obvious solution in the first place. > Generated SQL uses literal values in ORDER BY clauses > - > > Key: CALCITE-5724 > URL: https://issues.apache.org/jira/browse/CALCITE-5724 > Project: Calcite > Issue Type: Bug >Reporter: Joey Moore >Assignee: Joey Moore >Priority: Major > Labels: pull-request-available > > Current behavior in the SqlImplementor will generate SqlCharStringLiterals in > ORDER BY fields when there is a Literal value in the SELECT clause. This > happens in languages with isSortByOrdinal(). This leads to errors in dialects > in which cannot have literal values in ORDER BY clauses such as BigQuery. > Proposed fix is to use ordinals in all cases where a literal value is present > in the SELECT clause. 
> Example of current implementation: > {code:java} > select 3.14159265 as pi > from \"product\" > order by 1; > {code} > Will return > {code:java} > SELECT 3.14159265 AS \"PI\" > FROM \"foodmart\".\"product\" > ORDER BY '3.14159265'{code} > Proposed implementation will return : > {code:java} > SELECT 3.14159265 AS \"PI\" > FROM \"foodmart\".\"product\" > ORDER BY 1{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
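The two generated forms in the example above can be compared against SQLite, used here as a permissive stand-in (BigQuery rejects the literal form outright, which is the bug; SQLite merely treats the string as a constant sort key):

```python
# Compare ORDER BY 1 (proposed) with ORDER BY '3.14159265' (current).
# SQLite stands in for a permissive dialect; BigQuery would error on
# the string-literal form.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT)")
conn.execute("INSERT INTO product VALUES ('cheese'), ('bread')")

# Proposed form: ORDER BY 1 sorts on the first select item.
ok = conn.execute(
    "SELECT 3.14159265 AS pi FROM product ORDER BY 1").fetchall()

# Current (buggy) form: sorting on a string constant imposes no real
# ordering at all, even where the dialect tolerates it.
const = conn.execute(
    "SELECT 3.14159265 AS pi FROM product ORDER BY '3.14159265'").fetchall()
print(ok, const)
```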
[jira] [Commented] (CALCITE-5775) Null direction emulation broken for complex expressions on some dialects
[ https://issues.apache.org/jira/browse/CALCITE-5775?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17735913#comment-17735913 ] Will Noble commented on CALCITE-5775: - Filed CALCITE-5793. Calcite is already inclined to group and order by ordinals for BQ. It just doesn't work for null direction emulation, since that requires adding an additional {{IS NULL}} operator, and the only operators you can apply to sorting ordinals are {{DESC}} or {{NULLS FIRST}} / {{LAST}}, so it has to expand it into the full expression in that case. This could potentially be mitigated by introducing a subquery with the {{IS NULL}} expression in the select list. This may be worth considering. I believe the previous bug you're talking about is CALCITE-5767. I'm not sure if that's related to the claims that you think are red herrings, but the problems outlined in this bug mean that only a single null direction (the default for whatever dialect is being unparsed) can possibly work. The user does not have the ability to use the non-default null direction if it has to be emulated with an extra {{IS NULL}} sort. > Null direction emulation broken for complex expressions on some dialects > > > Key: CALCITE-5775 > URL: https://issues.apache.org/jira/browse/CALCITE-5775 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Will Noble >Priority: Minor > Labels: pull-request-available > > This is a problem in BigQuery, and may be a problem in other dialects as > well. Consider the following piece of valid BQ SQL: > {code:sql} > SELECT REPEAT(first_name, 2), >COUNT(id) > FROM looker_test.users > GROUP BY REPEAT(first_name, 2) > ORDER BY 1 > {code} > Now consider a version where the {{ORDER BY}} clause is changed to this: > {code:sql} > ORDER BY REPEAT(first_name, 2) > {code} > This is logically the same query, because the expression in the {{ORDER}} > clause is the same as the one in the {{SELECT}} / {{GROUP}} clauses. 
BigQuery > is sophisticated enough to match the select to the group expression in both > queries, but cannot match either with the order expression. It gives this > error: _ORDER BY clause expression references column first_name which is > neither grouped nor aggregated_. > So, when sorting by complex expressions in BQ, Calcite relies on either: > * No null direction emulation required. > * sorting by alias or ordinal, which is a problem with current null direction > emulation because it adds an extra complex sort expression, or > * having a query that just happens to also have the underlying field in the > {{GROUP BY}} clause by itself, which seems to actually happen pretty often in > my testing, but obviously shouldn't be a constraint. > As I wrote that, I realized this may be easily fixable for BQ since it added > support for {{NULLS FIRST}} / {{LAST}} in 2020 and it seems Calcite has not > caught up yet. Consider this rel node: > {code} > LogicalSort(sort0=[$0], dir0=[ASC]) > LogicalAggregate(group=[{0}], cent=[COUNT($1)]) > LogicalProject($f0=[CASE(IS NULL($4), 0, 1)], MGR=[$3]) > JdbcTableScan(table=[[JDBC_SCOTT, EMP]]) > {code} > Calcite would convert it to this in BigQuery due to null direction emulation: > {code:sql} > SELECT CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END AS `$f0`, COUNT(MGR) AS > cent > FROM SCOTT.EMP > GROUP BY CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END > ORDER BY CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END IS NULL, 1 > {code} > Which of course triggers the problem described above. This may be a problem > for MSSQL as well since it doesn't support {{NULLS LAST}}. The fix for BQ, at > least, may be to just support {{NULLS LAST}} and sort by ordinal. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5793) Use NULLS FIRST / LAST when unparsing in BigQuery
Will Noble created CALCITE-5793: --- Summary: Use NULLS FIRST / LAST when unparsing in BigQuery Key: CALCITE-5793 URL: https://issues.apache.org/jira/browse/CALCITE-5793 Project: Calcite Issue Type: Improvement Reporter: Will Noble BigQuery added support for {{NULLS FIRST}} and {{NULLS LAST}} in 2020. Calcite should use it instead of emulating null direction. CALCITE-5775 has context on how this was discovered. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5775) Null direction emulation broken for complex expressions on some dialects
Will Noble created CALCITE-5775: --- Summary: Null direction emulation broken for complex expressions on some dialects Key: CALCITE-5775 URL: https://issues.apache.org/jira/browse/CALCITE-5775 Project: Calcite Issue Type: Bug Components: core Reporter: Will Noble This is a problem in BigQuery, and may be a problem in other dialects as well. Consider the following piece of valid BQ SQL: {code:sql} SELECT REPEAT(first_name, 2), COUNT(id) FROM looker_test.users GROUP BY REPEAT(first_name, 2) ORDER BY 1 {code} Now consider a version where the {{ORDER BY}} clause is changed to this: {code:sql} ORDER BY REPEAT(first_name, 2) {code} This is logically the same query, because the expression in the {{ORDER}} clause is the same as the one in the {{SELECT}} / {{GROUP}} clauses. BigQuery is sophisticated enough to match the select to the group expression in both queries, but cannot match either with the order expression. It gives this error: _ORDER BY clause expression references column first_name which is neither grouped nor aggregated_. So, when sorting by complex expressions in BQ, Calcite relies on either: * No null direction emulation required. * sorting by alias or ordinal, which is a problem with current null direction emulation because it adds an extra complex sort expression, or * having a query that just happens to also have the underlying field in the {{GROUP BY}} clause by itself, which seems to actually happen pretty often in my testing, but obviously shouldn't be a constraint. As I wrote that, I realized this may be easily fixable for BQ since it added support for {{NULLS FIRST}} / {{LAST}} in 2020 and it seems Calcite has not caught up yet. 
Consider this rel node: {code} LogicalSort(sort0=[$0], dir0=[ASC]) LogicalAggregate(group=[{0}], cent=[COUNT($1)]) LogicalProject($f0=[CASE(IS NULL($4), 0, 1)], MGR=[$3]) JdbcTableScan(table=[[JDBC_SCOTT, EMP]]) {code} Calcite would convert it to this in BigQuery due to null direction emulation: {code:sql} SELECT CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END AS `$f0`, COUNT(MGR) AS cent FROM SCOTT.EMP GROUP BY CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END ORDER BY CASE WHEN HIREDATE IS NULL THEN 0 ELSE 1 END IS NULL, 1 {code} Which of course triggers the problem described above. This may be a problem for MSSQL as well since it doesn't support {{NULLS LAST}}. The fix for BQ, at least, may be to just support {{NULLS LAST}} and sort by ordinal. -- This message was sent by Atlassian Jira (v8.20.10#820010)
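The emulated and native null-direction forms can be compared in miniature with sqlite3 (SQLite is only a stand-in; it supports both an {{IS NULL}} sort key analogous to the one Calcite appends and, since 3.30, native {{NULLS LAST}}):

```python
# Demo of null-direction emulation vs. native NULLS LAST, with SQLite
# standing in for BigQuery.  SQLite's default for ASC is NULLS FIRST,
# so a bare "ORDER BY mgr" would put the NULL row first.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (mgr INTEGER)")
conn.executemany("INSERT INTO emp VALUES (?)", [(7839,), (None,), (7566,)])

# Emulation: "mgr IS NULL" evaluates to 0 or 1, so sorting on it first
# pushes NULLs to the end -- the same shape as Calcite's emulation.
emulated = conn.execute(
    "SELECT mgr FROM emp ORDER BY mgr IS NULL, mgr").fetchall()
print(emulated)  # [(7566,), (7839,), (None,)]

# Native syntax (what the ticket proposes relying on for BigQuery);
# SQLite only added it in 3.30, hence the version guard.
if sqlite3.sqlite_version_info >= (3, 30):
    native = conn.execute(
        "SELECT mgr FROM emp ORDER BY mgr NULLS LAST").fetchall()
    assert native == emulated
```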
[jira] [Comment Edited] (CALCITE-5767) Calcite's MSSQL dialect should not give GROUPING special treatment when emulating NULL direction
[ https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730760#comment-17730760 ] Will Noble edited comment on CALCITE-5767 at 6/8/23 10:42 PM: -- I updated the description and summary, but kept the verbose prose description because it's really a complicated problem with seemingly no ideal solution. Really, there are 4 potential "problems" that could be blamed for this. The first 3 are MSSQL's seemingly non-standard design decisions outlined in the ticket description (each of these is not necessarily something that needs changing, but if any one of these were different, there would be no problem). The fourth potential target for blame is the fact that {{RelBuilder}} uses a "default" null direction by default instead of {{NullDirection.UNSPECIFIED}} which, if used, would also eliminate this weird issue, and allow us to return {{null}} from {{emulateNullDirection}} and not have to include an effectively constant expression in the {{ORDER BY}} list. So, in a sense the problem lies with {{RelBuilder}} because it's forcing us to produce a seemingly less efficient and slightly more confusing SQL query. But, changing the default null direction in {{RelBuilder}} from NULLS-high to unspecified would be a major breaking change, so I'm assuming that's a non-starter. 
The fourth potential target for blame is the fact that {{RelBuilder}} uses a "default" null direction by default instead of {{NullDirection.UNSPECIFIED}} which, if used, would also eliminate this weird issue, and allow us to return {{null}} from {{emulateNullDirection}} and not have to include an effectively constant expression in the {{ORDER BY}} list. But, changing the default null direction in {{RelBuilder}} from NULLS-high to unspecified would be a major breaking change, so I'm assuming that's a non-starter. > Calcite's MSSQL dialect should not give GROUPING special treatment when > emulating NULL direction > > > Key: CALCITE-5767 > URL: https://issues.apache.org/jira/browse/CALCITE-5767 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > Labels: pull-request-available > > {{MssqlSqlDialect.emulateNullDirection}} has [special logic for {{GROUPING}} > calls|https://github.com/apache/calcite/blob/814ae6ec09e72544ba010f2591e06020c55b162b/core/src/main/java/org/apache/calcite/sql/dialect/MssqlSqlDialect.java#L95]. > This seems to be an optimization attempt since {{GROUPING}} is known to > never return {{{}NULL{}}}, and therefore needs no null direction emulation. > However, this causes problems because: > 1. MSSQL does not have the same default null direction as Calcite's > {{RelBuilder}} (and most other dialects). > 2. MSSQL does not support the common {{NULLS FIRST}} or {{NULLS LAST}} syntax. > 3. MSSQL does not allow sorting on the same field twice, even though there is > no theoretical issue with this. > Each of these properties must be present for the problem to occur, so it's a > bit niche and specific to MSSQL. Seems like the best solution is to simply > eliminate the special-case treatment for {{GROUPING}} in > {{{}emulateNullDirection{}}}, which is currently creating problems due to > property #3. 
> {*}More in-depth explanation{*}: > In {{{}RelBuilder.collation{}}}, we use the "default null direction" to > insert rex nodes as sorting expressions, but this is only the default null > direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem > because MSSQL has special-case logic for emulating null direction of GROUPING > calls, whereby it effectively duplicates the expression. Really, > {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning > {{null}} instead, signalling to callers that no null-direction emulation is > necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes > another problem due to property #2 above when the null direction is > non-default as is caused simply by using {{RelBuilder.collation}} as > described above (it should be noted that this method takes rex nodes instead > of {{RelFieldCollation}} object, so there is no way to specify null >
[jira] [Commented] (CALCITE-5767) Calcite's MSSQL dialect should not give GROUPING special treatment when emulating NULL direction
[ https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17730760#comment-17730760 ] Will Noble commented on CALCITE-5767: - I updated the description and summary, but kept the verbose prose description because it's really a complicated problem with seemingly no ideal solution. Really, there are 4 potential "problems" that could be blamed for this. The first 3 are MSSQL's seemingly non-standard design decisions outlined in the ticket description (each of these is not necessarily something that needs changing, but if any one of these were different, there would be no problem). The fourth potential target for blame is the fact that {{RelBuilder}} uses a "default" null direction by default instead of {{NullDirection.UNSPECIFIED}} which, if used, would also eliminate this weird issue, and allow us to return {{null}} from {{emulateNullDirection}} and not have to include an effectively constant expression in the {{ORDER BY}} list. But, changing the default null direction in {{RelBuilder}} from NULLS-high to unspecified would be a major breaking change, so I'm assuming that's a non-starter. > Calcite's MSSQL dialect should not give GROUPING special treatment when > emulating NULL direction > > > Key: CALCITE-5767 > URL: https://issues.apache.org/jira/browse/CALCITE-5767 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > Labels: pull-request-available > > {{MssqlSqlDialect.emulateNullDirection}} has [special logic for {{GROUPING}} > calls|https://github.com/apache/calcite/blob/814ae6ec09e72544ba010f2591e06020c55b162b/core/src/main/java/org/apache/calcite/sql/dialect/MssqlSqlDialect.java#L95]. > This seems to be an optimization attempt since {{GROUPING}} is known to > never return {{{}NULL{}}}, and therefore needs no null direction emulation. > However, this causes problems because: > 1. 
MSSQL does not have the same default null direction as Calcite's > {{RelBuilder}} (and most other dialects). > 2. MSSQL does not support the common {{NULLS FIRST}} or {{NULLS LAST}} syntax. > 3. MSSQL does not allow sorting on the same field twice, even though there is > no theoretical issue with this. > Each of these properties must be present for the problem to occur, so it's a > bit niche and specific to MSSQL. Seems like the best solution is to simply > eliminate the special-case treatment for {{GROUPING}} in > {{{}emulateNullDirection{}}}, which is currently creating problems due to > property #3. > {*}More in-depth explanation{*}: > In {{{}RelBuilder.collation{}}}, we use the "default null direction" to > insert rex nodes as sorting expressions, but this is only the default null > direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem > because MSSQL has special-case logic for emulating null direction of GROUPING > calls, whereby it effectively duplicates the expression. Really, > {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning > {{null}} instead, signalling to callers that no null-direction emulation is > necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes > another problem due to property #2 above when the null direction is > non-default as is caused simply by using {{RelBuilder.collation}} as > described above (it should be noted that this method takes rex nodes instead > of {{RelFieldCollation}} object, so there is no way to specify null > direction) because the non-default null direction is not expanded into a > {{CASE}} expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} > syntax). 
> Here's a test illustrating the problem: > Input SQL (default dialect) > {code:sql} > select "product_class_id", "brand_name", GROUPING("brand_name") > from "product" > group by GROUPING SETS (("product_class_id", "brand_name"), > ("product_class_id")) > order by 3, 2, 1 > {code} > Current behavior for unparsing as MSSQL (incorrect because it orders by the > same column twice; {{GROUPING([brand_name])}} and {{{}3{}}}, which will fail > if you try to actually run this against a real MSSQL database, even though it > seems like it shouldn't): > {code:sql} > SELECT [product_class_id], [brand_name], GROUPING([brand_name]) > FROM [foodmart].[product] > GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id]) > ORDER BY > GROUPING([brand_name]), > 3, > CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, > [brand_name], > CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, > [product_class_id] > {code} > Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply
[jira] [Updated] (CALCITE-5767) Calcite's MSSQL dialect should not give GROUPING special treatment when emulating NULL direction
[ https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5767: Description: {{MssqlSqlDialect.emulateNullDirection}} has [special logic for {{GROUPING}} calls|https://github.com/apache/calcite/blob/814ae6ec09e72544ba010f2591e06020c55b162b/core/src/main/java/org/apache/calcite/sql/dialect/MssqlSqlDialect.java#L95]. This seems to be an optimization attempt since {{GROUPING}} is known to never return {{{}NULL{}}}, and therefore needs no null direction emulation. However, this causes problems because: 1. MSSQL does not have the same default null direction as Calcite's {{RelBuilder}} (and most other dialects). 2. MSSQL does not support the common {{NULLS FIRST}} or {{NULLS LAST}} syntax. 3. MSSQL does not allow sorting on the same field twice, even though there is no theoretical issue with this. Each of these properties must be present for the problem to occur, so it's a bit niche and specific to MSSQL. Seems like the best solution is to simply eliminate the special-case treatment for {{GROUPING}} in {{{}emulateNullDirection{}}}, which is currently creating problems due to property #3. {*}More in-depth explanation{*}: In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert rex nodes as sorting expressions, but this is only the default null direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has special-case logic for emulating null direction of GROUPING calls, whereby it effectively duplicates the expression. 
Really, {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning {{null}} instead, signalling to callers that no null-direction emulation is necessary because {{GROUPING}} never returns {{{}NULL{}}}. But this causes another problem, due to property #2 above, when the null direction is non-default, as happens simply by using {{RelBuilder.collation}} as described above (note that this method takes rex nodes instead of {{RelFieldCollation}} objects, so there is no way to specify null direction), because the non-default null direction is not expanded into a {{CASE}} expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} syntax). Here's a test illustrating the problem. Input SQL (default dialect):
{code:sql}
select "product_class_id", "brand_name", GROUPING("brand_name")
from "product"
group by GROUPING SETS (("product_class_id", "brand_name"), ("product_class_id"))
order by 3, 2, 1
{code}
Current behavior for unparsing as MSSQL (incorrect because it orders by the same column twice, as both {{GROUPING([brand_name])}} and {{{}3{}}}; this will fail if you actually run it against a real MSSQL database, even though it seems like it shouldn't):
{code:sql}
SELECT [product_class_id], [brand_name], GROUPING([brand_name])
FROM [foodmart].[product]
GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id])
ORDER BY
GROUPING([brand_name]),
3,
CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
[brand_name],
CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
[product_class_id]
{code}
Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS LAST}} syntax):
{code:sql}
...
ORDER BY
3 NULLS LAST,
CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
[brand_name],
CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
[product_class_id]
{code}
Acceptable behavior (although the first {{{}ORDER BY{}}}-clause is effectively ordering by a constant, this will at least run and produce the correct results):
{code:sql}
...
ORDER BY
CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END,
3,
CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END,
[brand_name],
CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END,
[product_class_id]
{code}
was: In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert rex nodes as sorting expressions, but this is only the default null direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has special-case logic for emulating null direction of GROUPING calls, whereby it effectively duplicates the expression. Really, {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning {{null}} instead, signalling to callers that no null-direction emulation is necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes another problem when the null direction is non-default as is caused simply by using {{RelBuilder.collation}} as described above (it should be noted that this method takes rex nodes instead of {{RelFieldCollation}} object, so there is no way to specify null direction) because the non-default
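The acceptable behavior corresponds to dropping the {{GROUPING}} special case, so that every sort key goes through the same {{CASE}}-based emulation; for a {{GROUPING}} call the wrapper is effectively constant but harmless. A standalone sketch of that uniform treatment (hypothetical names, and 0/1 ranks fixed for the nulls-last case; not the actual {{MssqlSqlDialect}} code):

```java
// Sketch of the proposed fix: no special case for GROUPING, so every sort
// key is wrapped in the same CASE-based null-direction emulation instead of
// being duplicated as a bare expression in the ORDER BY list.
public class GroupingNullEmulation {
  // Emulates NULLS LAST by ranking null keys after non-null keys.
  static String emulateNullsLast(String sortExpr) {
    // For GROUPING(...) this is effectively a constant (GROUPING never
    // returns NULL), but it still parses and runs on MSSQL, and it avoids
    // putting the same bare column in the ORDER BY list twice.
    return "CASE WHEN " + sortExpr + " IS NULL THEN 1 ELSE 0 END";
  }
}
```

The exact 0/1 ranks depend on the requested null direction; the point is that the uniform wrapper never emits a duplicate bare sort key.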
[jira] [Updated] (CALCITE-5767) Calcite's MSSQL dialect should not give GROUPING special treatment when emulating NULL direction
[ https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5767: Summary: Calcite's MSSQL dialect should not give GROUPING special treatment when emulating NULL direction (was: MSSQL fails to unparse properly when sorting by GROUPING expression) > Calcite's MSSQL dialect should not give GROUPING special treatment when > emulating NULL direction > > > Key: CALCITE-5767 > URL: https://issues.apache.org/jira/browse/CALCITE-5767 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > Labels: pull-request-available > > In {{{}RelBuilder.collation{}}}, we use the "default null direction" to > insert rex nodes as sorting expressions, but this is only the default null > direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem > because MSSQL has special-case logic for emulating null direction of GROUPING > calls, whereby it effectively duplicates the expression. Really, > {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning > {{null}} instead, signalling to callers that no null-direction emulation is > necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes > another problem when the null direction is non-default as is caused simply by > using {{RelBuilder.collation}} as described above (it should be noted that > this method takes rex nodes instead of {{RelFieldCollation}} object, so there > is no way to specify null direction) because the non-default null direction > is not expanded into a {{CASE}} expression (MSSQL does not support {{NULLS > FIRST}} or {{LAST}} syntax). 
> Here's a test illustrating the problem: > Input SQL (default dialect) > {code:sql} > select "product_class_id", "brand_name", GROUPING("brand_name") > from "product" > group by GROUPING SETS (("product_class_id", "brand_name"), > ("product_class_id")) > order by 3, 2, 1 > {code} > Current behavior for unparsing as MSSQL (incorrect because it orders by the > same column twice; {{GROUPING([brand_name])}} and {{3}}, which will fail if > you try to actually run this against a real MSSQL database, even though it > seems like it shouldn't): > {code:sql} > SELECT [product_class_id], [brand_name], GROUPING([brand_name]) > FROM [foodmart].[product] > GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id]) > ORDER BY > GROUPING([brand_name]), > 3, > CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, > [brand_name], > CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, > [product_class_id] > {code} > Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns > {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS > LAST}} syntax): > {code:sql} > ... > ORDER BY > 3 NULLS LAST, > CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, > [brand_name], > CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, > [product_class_id] > {code} > Acceptable behavior (although the first {{ORDER BY}}-clause is effectively > ordering by a constant, this will at least run and produce the correct > results): > {code:sql} > ... > ORDER BY > CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END, > 3, > CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, > [brand_name], > CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, > [product_class_id] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CALCITE-5768) Rel-to-sql conversion fails to insert a subquery for some sort-by-ordinal queries
[ https://issues.apache.org/jira/browse/CALCITE-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5768: Description: Currently, when given a rel expression like this:
{code}
LogicalProject(JOB=[$0])
  LogicalSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC])
    LogicalAggregate(group=[{2}], agg#0=[COUNT($1)])
      LogicalTableScan(table=[[scott, EMP]])
{code}
Calcite will convert to this invalid ({{ORDER BY 2}} despite only one {{SELECT}} expression) SQL:
{code:sql}
SELECT "JOB"
FROM "scott"."EMP"
GROUP BY "JOB"
ORDER BY "JOB", 2
{code}
When it should convert to this SQL with a sub-query:
{code:sql}
SELECT "JOB"
FROM (SELECT "JOB", COUNT("ENAME") AS "$f1"
    FROM "scott"."EMP"
    GROUP BY "JOB"
    ORDER BY "JOB", 2) AS "t0"
{code}
{{hasSortByOrdinal}} has an [improper early return|https://github.com/apache/calcite/blob/c4042a34ef054b89cec1c47fefcbc8689bad55be/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1902] that's causing incorrect results. It should only return early if the result is {{true}}. Otherwise, it should continue searching for ordinals. was:{{hasSortByOrdinal}} has an [improper early return|https://github.com/apache/calcite/blob/c4042a34ef054b89cec1c47fefcbc8689bad55be/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1902] that's causing incorrect results. It should only return early if the result is {{true}}. Otherwise, it should continue searching for ordinals.
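The fix described above, returning early only when the result is {{true}}, is the usual pattern for a recursive any-match search. Here is a standalone sketch over a toy sort-key tree (hypothetical {{Node}} record, not Calcite's actual {{SqlImplementor}} code):

```java
import java.util.List;

public class SortByOrdinalCheck {
  // Toy stand-in for a SqlNode: either an ordinal sort key (ORDER BY 2)
  // or an expression with child nodes. Hypothetical, for illustration only.
  record Node(boolean isOrdinal, List<Node> children) {}

  // True if any node in the tree sorts by ordinal. The bug pattern is
  // returning the first recursive result unconditionally; the fix is to
  // short-circuit only on true and keep searching on false.
  static boolean hasSortByOrdinal(Node node) {
    if (node.isOrdinal()) {
      return true;
    }
    for (Node child : node.children()) {
      if (hasSortByOrdinal(child)) {
        return true; // early return is correct only for a positive result
      }
      // on false, continue searching the remaining children
    }
    return false;
  }
}
```

With the buggy unconditional return, a tree whose first child is not an ordinal would report {{false}} even when a later sibling does sort by ordinal.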
> Rel-to-sql conversion fails to insert a subquery for some sort-by-ordinal > queries > - > > Key: CALCITE-5768 > URL: https://issues.apache.org/jira/browse/CALCITE-5768 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Will Noble >Assignee: Will Noble >Priority: Major > Labels: pull-request-available > > Currently, when given a rel expression like this: > {code} > LogicalProject(JOB=[$0]) > LogicalSort(sort0=[$0], sort1=[$1], dir0=[ASC], dir1=[ASC]) > LogicalAggregate(group=[{2}], agg#0=[COUNT($1)]) > LogicalTableScan(table=[[scott, EMP]]) > {code} > Calcite will convert to this invalid ({{ORDER BY 2}} despite only one > {{SELECT}} expression) SQL: > {code:sql} > SELECT "JOB" > FROM "scott"."EMP" > GROUP BY "JOB" > ORDER BY "JOB", 2 > {code} > When it should convert to this SQL with a sub-query: > {code:sql} > SELECT "JOB" > FROM (SELECT "JOB", COUNT("ENAME") AS "$f1" > FROM "scott"."EMP" > GROUP BY "JOB" > ORDER BY "JOB", 2) AS "t0" > {code} > {{hasSortByOrdinal}} has an [improper early > return|https://github.com/apache/calcite/blob/c4042a34ef054b89cec1c47fefcbc8689bad55be/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1902] > that's causing incorrect results. It should only return early if the result > is {{true}}. Otherwise, it should continue searching for ordinals. -- This message was sent by Atlassian Jira (v8.20.10#820010)
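Conceptually, the corrected conversion wraps the sorted query in a derived table whenever an {{ORDER BY}} ordinal points past the outer {{SELECT}} list. A string-level sketch of that decision (hypothetical helper and alias; Calcite operates on {{SqlNode}} trees, not strings):

```java
public class SubqueryWrap {
  // If the largest ORDER BY ordinal exceeds the number of visible columns,
  // keep the sort (and its extra columns) inside a derived table and
  // re-project only the visible columns; otherwise the query stands as-is.
  static String wrapIfNeeded(String sortedQuery, int maxOrdinal,
      String... visibleCols) {
    if (maxOrdinal <= visibleCols.length) {
      return sortedQuery; // every ordinal resolves in the outer SELECT list
    }
    return "SELECT " + String.join(", ", visibleCols)
        + " FROM (" + sortedQuery + ") AS \"t0\"";
  }
}
```

For the ticket's example, a max ordinal of 2 with the single visible column {{"JOB"}} forces the sub-query form.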
[jira] [Updated] (CALCITE-5768) Rel-to-sql conversion fails to insert a subquery for some sort-by-ordinal queries
[ https://issues.apache.org/jira/browse/CALCITE-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5768: Summary: Rel-to-sql conversion fails to insert a subquery for some sort-by-ordinal queries (was: Rel-to-sql conversion fails to insert a subquery) > Rel-to-sql conversion fails to insert a subquery for some sort-by-ordinal > queries > - > > Key: CALCITE-5768 > URL: https://issues.apache.org/jira/browse/CALCITE-5768 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Will Noble >Assignee: Will Noble >Priority: Major > Labels: pull-request-available > > {{hasSortByOrdinal}} has an [improper early > return|https://github.com/apache/calcite/blob/c4042a34ef054b89cec1c47fefcbc8689bad55be/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1902] > that's causing incorrect results. It should only return early if the result > is {{true}}. Otherwise, it should continue searching for ordinals. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CALCITE-5768) Rel-to-sql conversion fails to insert a subquery
[ https://issues.apache.org/jira/browse/CALCITE-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5768: Summary: Rel-to-sql conversion fails to insert a subquery (was: SqlImplementor.hasSortByOrdinal is broken) > Rel-to-sql conversion fails to insert a subquery > > > Key: CALCITE-5768 > URL: https://issues.apache.org/jira/browse/CALCITE-5768 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Will Noble >Assignee: Will Noble >Priority: Major > Labels: pull-request-available > > {{hasSortByOrdinal}} has an [improper early > return|https://github.com/apache/calcite/blob/c4042a34ef054b89cec1c47fefcbc8689bad55be/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1902] > that's causing incorrect results. It should only return early if the result > is {{true}}. Otherwise, it should continue searching for ordinals. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5768) SqlImplementor.hasSortByOrdinal is broken
Will Noble created CALCITE-5768: --- Summary: SqlImplementor.hasSortByOrdinal is broken Key: CALCITE-5768 URL: https://issues.apache.org/jira/browse/CALCITE-5768 Project: Calcite Issue Type: Bug Components: core Reporter: Will Noble {{hasSortByOrdinal}} has an [improper early return|https://github.com/apache/calcite/blob/c4042a34ef054b89cec1c47fefcbc8689bad55be/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1902] that's causing incorrect results. It should only return early if the result is {{true}}. Otherwise, it should continue searching for ordinals. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (CALCITE-5768) SqlImplementor.hasSortByOrdinal is broken
[ https://issues.apache.org/jira/browse/CALCITE-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble reassigned CALCITE-5768: --- Assignee: Will Noble > SqlImplementor.hasSortByOrdinal is broken > - > > Key: CALCITE-5768 > URL: https://issues.apache.org/jira/browse/CALCITE-5768 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Will Noble >Assignee: Will Noble >Priority: Major > > {{hasSortByOrdinal}} has an [improper early > return|https://github.com/apache/calcite/blob/c4042a34ef054b89cec1c47fefcbc8689bad55be/core/src/main/java/org/apache/calcite/rel/rel2sql/SqlImplementor.java#L1902] > that's causing incorrect results. It should only return early if the result > is {{true}}. Otherwise, it should continue searching for ordinals. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CALCITE-5767) MSSQL fails to unparse properly when sorting by GROUPING expression
[ https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5767: Description: In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert rex nodes as sorting expressions, but this is only the default null direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has special-case logic for emulating null direction of GROUPING calls, whereby it effectively duplicates the expression. Really, {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning {{null}} instead, signalling to callers that no null-direction emulation is necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes another problem when the null direction is non-default as is caused simply by using {{RelBuilder.collation}} as described above (it should be noted that this method takes rex nodes instead of {{RelFieldCollation}} object, so there is no way to specify null direction) because the non-default null direction is not expanded into a {{CASE}} expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} syntax). 
Here's a test illustrating the problem: Input SQL (default dialect) {code:xml} select "product_class_id", "brand_name", GROUPING("brand_name") from "product" group by GROUPING SETS (("product_class_id", "brand_name"), ("product_class_id")) order by 3, 2, 1 {code} Current behavior for unparsing as MSSQL (incorrect because it orders by the same column twice; {{GROUPING([brand_name])}} and {{3}}, which will fail if you try to actually run this against a real MSSQL database, even though it seems like it shouldn't): {code:xml} SELECT [product_class_id], [brand_name], GROUPING([brand_name]) FROM [foodmart].[product] GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id]) ORDER BY GROUPING([brand_name]), 3, CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, [brand_name], CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, [product_class_id] {code} Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS LAST}} syntax): {code:xml} ... ORDER BY 3 NULLS LAST, CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, [brand_name], CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, [product_class_id] {code} Acceptable behavior (although the first {{ORDER BY}}-clause is effectively ordering by a constant, this will at least run and produce the correct results): {code:xml} ... ORDER BY CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END, 3, CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, [brand_name], CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, [product_class_id] {code} was: In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert rex nodes as sorting expressions, but this is only the default null direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has special-case logic for emulating null direction of GROUPING calls, whereby it effectively duplicates the expression. 
Really, {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning {{null}} instead, signalling to callers that no null-direction emulation is necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes another problem when the null direction is non-default as is caused simply by using {{RelBuilder.collation}} as described above (it should be noted that this method takes rex nodes instead of {{RelFieldCollation}} object, so there is no way to specify null direction) because the non-default null direction is not expanded into a {{CASE}} expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} syntax). Here's a test illustrating the problem: Input SQL (default dialect) {code:xml} select "product_class_id", "brand_name", GROUPING("brand_name") from "product" group by GROUPING SETS (("product_class_id", "brand_name"), ("product_class_id")) order by 3, 2, 1 {code} Current behavior for unparsing as MSSQL (incorrect because it orders by the same column twice; {{GROUPING([brand_name])}} and {{3}}, which will fail if you try to actually run this against a real MSSQL database, even though it seems like it shouldn't): {code:xml} SELECT [product_class_id], [brand_name], GROUPING([brand_name]) FROM [foodmart].[product] GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id]) ORDER BY GROUPING([brand_name]), 3, CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, [brand_name], CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, [product_class_id] {code} Behavior where
[jira] [Updated] (CALCITE-5767) MSSQL fails to unparse properly when sorting by GROUPING expression
[ https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5767: Description: In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert rex nodes as sorting expressions, but this is only the default null direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has special-case logic for emulating null direction of GROUPING calls, whereby it effectively duplicates the expression. Really, {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning {{null}} instead, signalling to callers that no null-direction emulation is necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes another problem when the null direction is non-default as is caused simply by using {{RelBuilder.collation}} as described above (it should be noted that this method takes rex nodes instead of {{RelFieldCollation}} object, so there is no way to specify null direction) because the non-default null direction is not expanded into a {{CASE}} expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} syntax). 
Here's a test illustrating the problem: Input SQL (default dialect) {code:xml} select "product_class_id", "brand_name", GROUPING("brand_name") from "product" group by GROUPING SETS (("product_class_id", "brand_name"), ("product_class_id")) order by 3, 2, 1 {code} Current behavior for unparsing as MSSQL (incorrect because it orders by the same column twice; {{GROUPING([brand_name])}} and {{3}}, which will fail if you try to actually run this against a real MSSQL database, even though it seems like it shouldn't): {code:xml} SELECT [product_class_id], [brand_name], GROUPING([brand_name]) FROM [foodmart].[product] GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id]) ORDER BY GROUPING([brand_name]), 3, CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, [brand_name], CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, [product_class_id] {code} Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS LAST}} syntax): {code:xml} ... ORDER BY 3 NULLS LAST, 3, CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, [brand_name], CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, [product_class_id] {code} Acceptable behavior (although the first {{ORDER BY}}-clause is effectively ordering by a constant, this will at least run and produce the correct results): {code:xml} ... ORDER BY CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END, 3, CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, [brand_name], CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, [product_class_id] {code} was: In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert rex nodes as sorting expressions, but this is only the default null direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has special-case logic for emulating null direction of GROUPING calls, whereby it effectively duplicates the expression. 
Really, {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning {{null}} instead, signalling to callers that no null-direction emulation is necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes another problem when the null direction is non-default as is caused simply by using {{RelBuilder.collation}} as described above (it should be noted that this method takes rex nodes instead of {{RelFieldCollation}} object, so there is no way to specify null direction) because the non-default null direction is not expanded into a {{CASE}} expression (MSSQL does not support {{NULLS FIRST}} or {{LAST}} syntax). Here's a test illustrating the problem: Input SQL (default dialect) {code:xml} select "product_class_id", "brand_name", GROUPING("brand_name") from "product" group by GROUPING SETS (("product_class_id", "brand_name"), ("product_class_id")) order by 3, 2, 1 {code} Current behavior for unparsing as MSSQL (incorrect because it orders by the same column twice; {{GROUPING([brand_name])}} and {{3}}): {code:xml} SELECT [product_class_id], [brand_name], GROUPING([brand_name]) FROM [foodmart].[product] GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id]) ORDER BY GROUPING([brand_name]), 3, CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, [brand_name], CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, [product_class_id] {code} Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS
[jira] [Created] (CALCITE-5767) MSSQL fails to unparse properly when sorting by GROUPING expression
Will Noble created CALCITE-5767: --- Summary: MSSQL fails to unparse properly when sorting by GROUPING expression Key: CALCITE-5767 URL: https://issues.apache.org/jira/browse/CALCITE-5767 Project: Calcite Issue Type: Bug Components: core Reporter: Will Noble In {{{}RelBuilder.collation{}}}, we use the "default null direction" to insert rex nodes as sorting expressions, but this is only the default null direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem because MSSQL has special-case logic for emulating the null direction of GROUPING calls, whereby it effectively duplicates the expression. Really, {{MssqlSqlDialect.emulateNullDirection}} probably should have been returning {{null}} instead, signalling to callers that no null-direction emulation is necessary because {{GROUPING}} never returns {{{}NULL{}}}. But that causes another problem when the null direction is non-default, which happens simply by using {{RelBuilder.collation}} as described above (note that this method takes rex nodes rather than {{RelFieldCollation}} objects, so there is no way to specify a null direction): the non-default null direction is not expanded into a {{CASE}} expression, and MSSQL does not support {{NULLS FIRST}} or {{NULLS LAST}} syntax. 
Here's a test illustrating the problem: Input SQL (default dialect) {code:xml} select "product_class_id", "brand_name", GROUPING("brand_name") from "product" group by GROUPING SETS (("product_class_id", "brand_name"), ("product_class_id")) order by 3, 2, 1 {code} Current behavior for unparsing as MSSQL (incorrect because it orders by the same column twice; {{GROUPING([brand_name])}} and {{3}}): {code:xml} SELECT [product_class_id], [brand_name], GROUPING([brand_name]) FROM [foodmart].[product] GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id]) ORDER BY GROUPING([brand_name]), 3, CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, [brand_name], CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, [product_class_id] {code} Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS LAST}} syntax): {code:xml} ... ORDER BY 3 NULLS LAST, 3, CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, [brand_name], CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, [product_class_id] {code} Acceptable behavior: {code:xml} ... ORDER BY CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END, 3, CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, [brand_name], CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, [product_class_id] {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
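For reference, the "return {{null}} for {{GROUPING}}" variant discussed in the description could look roughly like the dialect override below. This is only an illustrative sketch against Calcite's {{SqlDialect}} API, not the committed fix, and as the description notes it is insufficient on its own because the non-default null direction still needs a {{CASE}} rewrite:

```java
// Sketch only, not the actual Calcite change. Returning null from
// emulateNullDirection signals that no null-direction emulation is needed.
@Override public @Nullable SqlNode emulateNullDirection(SqlNode node,
    boolean nullsFirst, boolean desc) {
  if (node.getKind() == SqlKind.GROUPING) {
    // GROUPING never returns NULL, so there is nothing to emulate.
    return null;
  }
  // Fall back to the CASE WHEN ... IS NULL rewrite for everything else.
  return emulateNullDirectionWithIsNull(node, nullsFirst, desc);
}
```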
[jira] [Assigned] (CALCITE-5767) MSSQL fails to unparse properly when sorting by GROUPING expression
[ https://issues.apache.org/jira/browse/CALCITE-5767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble reassigned CALCITE-5767: --- Assignee: Will Noble > MSSQL fails to unparse properly when sorting by GROUPING expression > --- > > Key: CALCITE-5767 > URL: https://issues.apache.org/jira/browse/CALCITE-5767 > Project: Calcite > Issue Type: Bug > Components: core >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > > In {{{}RelBuilder.collation{}}}, we use the "default null direction" to > insert rex nodes as sorting expressions, but this is only the default null > direction for NULLS-high dialects, i.e. *not* MSSQL. This is a problem > because MSSQL has special-case logic for emulating null direction of GROUPING > calls, whereby it effectively duplicates the expression. Really, > {{MssqlSqlDialect.emulateNullDirection}} probably should've been returning > {{null}} instead, signalling to callers that no null-direction emulation is > necessary because {{GROUPING}} never returns {{{}NULL{}}}, but this causes > another problem when the null direction is non-default as is caused simply by > using {{RelBuilder.collation}} as described above (it should be noted that > this method takes rex nodes instead of {{RelFieldCollation}} object, so there > is no way to specify null direction) because the non-default null direction > is not expanded into a {{CASE}} expression (MSSQL does not support {{NULLS > FIRST}} or {{LAST}} syntax). 
> Here's a test illustrating the problem: > Input SQL (default dialect) > {code:xml} > select "product_class_id", "brand_name", GROUPING("brand_name") > from "product" > group by GROUPING SETS (("product_class_id", "brand_name"), > ("product_class_id")) > order by 3, 2, 1 > {code} > Current behavior for unparsing as MSSQL (incorrect because it orders by the > same column twice; {{GROUPING([brand_name])}} and {{3}}): > {code:xml} > SELECT [product_class_id], [brand_name], GROUPING([brand_name]) > FROM [foodmart].[product] > GROUP BY GROUPING SETS(([product_class_id], [brand_name]), [product_class_id]) > ORDER BY > GROUPING([brand_name]), > 3, > CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, > [brand_name], > CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, > [product_class_id] > {code} > Behavior where {{MssqlSqlDialect.emulateNullDirection}} simply returns > {{null}} for {{GROUPING}} expressions (incorrect because it uses {{NULLS > LAST}} syntax): > {code:xml} > ... > ORDER BY > 3 NULLS LAST, > 3, > CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, > [brand_name], > CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, > [product_class_id] > {code} > Acceptable behavior: > {code:xml} > ... > ORDER BY > CASE WHEN GROUPING([brand_name]) IS NULL THEN 0 ELSE 1 END, > 3, > CASE WHEN [brand_name] IS NULL THEN 1 ELSE 0 END, > [brand_name], > CASE WHEN [product_class_id] IS NULL THEN 1 ELSE 0 END, > [product_class_id] > {code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CALCITE-5636) Consult type mappings when coercing types
[ https://issues.apache.org/jira/browse/CALCITE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5636: Summary: Consult type mappings when coercing types (was: Consult type mappings when coercing types.) > Consult type mappings when coercing types > - > > Key: CALCITE-5636 > URL: https://issues.apache.org/jira/browse/CALCITE-5636 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Will Noble >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > This is another offshoot of CALCITE-5346. > In > [{{AbstractTypeCoercion.implicitCast}}|https://github.com/apache/calcite/blob/2dba40e7a0a5651eac5a30d9e0a72f178bd9bff2/core/src/main/java/org/apache/calcite/sql/validate/implicit/AbstractTypeCoercion.java#L713], > Calcite may insert {{CAST}} invocations disregarding the root schema type > map. This happens during the validation phase, so it's easy enough to read > the type map directly in the {{AbstractTypeCoercion}} utility, although it's > hard to say if this is an optimal solution. It may make sense to incorporate > the type map into the {{RelDataTypeSystem}} which is more widely available in > non-validation phases of SQL processing pipelines. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (CALCITE-3959) Implement INSTR function
[ https://issues.apache.org/jira/browse/CALCITE-3959?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17710342#comment-17710342 ] Will Noble commented on CALCITE-3959: - - BigQuery and Oracle have identical functions called {{INSTR}} (takes string, substring, position, occurrence) - MySQL has a function called {{INSTR}} that's similar to BigQuery's {{STRPOS}} or Postgres' {{POSITION}} (takes only string and substring; equivalent to the special case of BQ's {{INSTR}} where position=1 and occurrence=1). I'm not totally clear on how to look up "standard" SQL functions (i.e. to divide functions between {{SqlStdOperatorTable}} and {{SqlLibraryOperators}}) so I'll just assume that Postgres is basically standard SQL. So, here we have a situation where there's a standard SQL function (called {{POSITION}} in Postgres) that's equivalent to a special case of a more general library-specific function (called {{INSTR}} in BigQuery). Does it generally make sense to de-sugar the standard function into the library-specific function in this case? It seems like this would add a little complexity to the [{{unparseCall}}|https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql/dialect/BigQuerySqlDialect.java#L152] method of virtually every {{SqlDialect}}, introducing the possibility of a runtime exception if {{occurrence}} is ever anything besides a literal {{1}} (although this runtime exception may be unavoidable if parsing in BQ and unparsing in Postgres, for example). Seems like it should work fine, just curious how desirable it is for code health. 
> Implement INSTR function > > > Key: CALCITE-3959 > URL: https://issues.apache.org/jira/browse/CALCITE-3959 > Project: Calcite > Issue Type: Wish >Reporter: xzh_dz >Assignee: Joey Moore >Priority: Major > > [BiqQuery|https://cloud.google.com/bigquery/docs/reference/standard-sql/string_functions#instr] > and > [Oracle|https://docs.oracle.com/cd/B13789_01/server.101/b10759/functions058.htm] > both support functionally identical INSTR(source_value, search_value[, > position[, occurrence]]) functions which accepts 2 (character strings or > binary strings), 1 optional int representing position, and 1 optional int > representing occurrence. > Occurrence and position are assigned a default value of 1 if not specified. > The function returns the 1-based position of the nth occurrence of the 2nd > operand in the 1st operand where n is defined by the 4th operand. The > function begins searching at the 1-based position specified in the 3rd > operand. > The function also supports negative position values, with -1 indicating the > last character, and will search backwards from the position specified in that > case. > Returns 0 if: > * No match is found. > * If occurrence is greater than the number of matches found. > * If position is greater than the length of source_value. > Returns NULL if: > * Any input argument is NULL. > Returns an error if: > * position is 0. > * occurrence is 0 or negative. > EXAMPLE: {{INSTR("abc", "bc")}} would return 2. > EXAMPLE: {{INSTR("abcabc", "bc", 3)}} would return 5. > EXAMPLE: {{INSTR("abcabc", "bc", -1, 1)}} would return 5. > EXAMPLE: {{INSTR("abcabc", "bc", -1, 2)}} would return 2. > > MySQL also has an [INSTR|https://www.w3schools.com/sql/func_mysql_instr.asp] > function, the functionality of which is a subset of the INSTR present in BQ > and Oracle. MySQL INSTR only takes 2 parameters and returns the first > occurrence of the search value in the source value. -- This message was sent by Atlassian Jira (v8.20.10#820010)
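To make the relationship between the variants concrete, here is a small self-contained sketch of the four-argument semantics described above, restricted to positive positions (the helper name is hypothetical; negative-position backward search is omitted). MySQL's two-argument {{INSTR}} and Postgres's {{POSITION}} correspond to the position=1, occurrence=1 case:

```java
public class InstrSketch {
    // Hypothetical helper: BigQuery/Oracle-style INSTR(source, search,
    // position, occurrence) for positive positions. Returns the 1-based
    // index of the nth occurrence at or after position, or 0 if absent.
    static int instr(String source, String search, int position, int occurrence) {
        int from = position - 1; // convert 1-based SQL position to 0-based index
        int found = -1;
        for (int i = 0; i < occurrence; i++) {
            found = source.indexOf(search, from);
            if (found < 0) {
                return 0; // no nth occurrence
            }
            from = found + 1;
        }
        return found + 1; // back to 1-based
    }

    public static void main(String[] args) {
        System.out.println(instr("abc", "bc", 1, 1));    // 2 (MySQL INSTR / POSITION case)
        System.out.println(instr("abcabc", "bc", 3, 1)); // 5
        System.out.println(instr("abcabc", "bc", 1, 2)); // 5
    }
}
```

The first call is exactly the special case a dialect's {{unparseCall}} would have to detect before rewriting {{POSITION}} as {{INSTR}}.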
[jira] [Commented] (CALCITE-5636) Consult type mappings when coercing types.
[ https://issues.apache.org/jira/browse/CALCITE-5636?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17709784#comment-17709784 ] Will Noble commented on CALCITE-5636: - Whoops closed the other as a duplicate. Think that was some race condition. > Consult type mappings when coercing types. > -- > > Key: CALCITE-5636 > URL: https://issues.apache.org/jira/browse/CALCITE-5636 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Will Noble >Priority: Major > > This is another offshoot of CALCITE-5346. > In > [{{AbstractTypeCoercion.implicitCast}}|https://github.com/apache/calcite/blob/2dba40e7a0a5651eac5a30d9e0a72f178bd9bff2/core/src/main/java/org/apache/calcite/sql/validate/implicit/AbstractTypeCoercion.java#L713], > Calcite may insert {{CAST}} invocations disregarding the root schema type > map. This happens during the validation phase, so it's easy enough to read > the type map directly in the {{AbstractTypeCoercion}} utility, although it's > hard to say if this is an optimal solution. It may make sense to incorporate > the type map into the {{RelDataTypeSystem}} which is more widely available in > non-validation phases of SQL processing pipelines. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Closed] (CALCITE-5635) Consult type mappings when coercing types.
[ https://issues.apache.org/jira/browse/CALCITE-5635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble closed CALCITE-5635. --- Resolution: Duplicate > Consult type mappings when coercing types. > -- > > Key: CALCITE-5635 > URL: https://issues.apache.org/jira/browse/CALCITE-5635 > Project: Calcite > Issue Type: Improvement > Components: core >Reporter: Will Noble >Priority: Major > > This is another offshoot of CALCITE-5346. > In > [{{AbstractTypeCoercion.implicitCast}}|https://github.com/apache/calcite/blob/2dba40e7a0a5651eac5a30d9e0a72f178bd9bff2/core/src/main/java/org/apache/calcite/sql/validate/implicit/AbstractTypeCoercion.java#L713], > Calcite may insert {{CAST}} invocations disregarding the root schema type > map. This happens during the validation phase, so it's easy enough to read > the type map directly in the {{AbstractTypeCoercion}} utility, although it's > hard to say if this is an optimal solution. It may make sense to incorporate > the type map into the {{RelDataTypeSystem}} which is more widely available in > non-validation phases of SQL processing pipelines. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5636) Consult type mappings when coercing types.
Will Noble created CALCITE-5636: --- Summary: Consult type mappings when coercing types. Key: CALCITE-5636 URL: https://issues.apache.org/jira/browse/CALCITE-5636 Project: Calcite Issue Type: Improvement Components: core Reporter: Will Noble This is another offshoot of CALCITE-5346. In [{{AbstractTypeCoercion.implicitCast}}|https://github.com/apache/calcite/blob/2dba40e7a0a5651eac5a30d9e0a72f178bd9bff2/core/src/main/java/org/apache/calcite/sql/validate/implicit/AbstractTypeCoercion.java#L713], Calcite may insert {{CAST}} invocations disregarding the root schema type map. This happens during the validation phase, so it's easy enough to read the type map directly in the {{AbstractTypeCoercion}} utility, although it's hard to say if this is an optimal solution. It may make sense to incorporate the type map into the {{RelDataTypeSystem}} which is more widely available in non-validation phases of SQL processing pipelines. -- This message was sent by Atlassian Jira (v8.20.10#820010)
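As a rough illustration of the intended behavior (all names are hypothetical stand-ins, not Calcite API): the coercion step would consult the schema's type map before choosing a {{CAST}} target, falling back to the inferred type only when the map has no entry:

```java
import java.util.Map;

public class TypeMapLookup {
    // Hypothetical stand-in for consulting the root schema type map before
    // inserting a CAST: if the map redefines a type name, cast to the mapped
    // type; otherwise keep the type the validator inferred.
    static String castTarget(String inferredType, Map<String, String> typeMap) {
        return typeMap.getOrDefault(inferredType, inferredType);
    }

    public static void main(String[] args) {
        Map<String, String> typeMap =
            Map.of("TIMESTAMP", "TIMESTAMP_WITH_LOCAL_TIME_ZONE");
        System.out.println(castTarget("TIMESTAMP", typeMap)); // mapped type wins
        System.out.println(castTarget("INTEGER", typeMap));   // unchanged
    }
}
```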
[jira] [Created] (CALCITE-5635) Consult type mappings when coercing types.
Will Noble created CALCITE-5635: --- Summary: Consult type mappings when coercing types. Key: CALCITE-5635 URL: https://issues.apache.org/jira/browse/CALCITE-5635 Project: Calcite Issue Type: Improvement Components: core Reporter: Will Noble This is another offshoot of CALCITE-5346. In [{{AbstractTypeCoercion.implicitCast}}|https://github.com/apache/calcite/blob/2dba40e7a0a5651eac5a30d9e0a72f178bd9bff2/core/src/main/java/org/apache/calcite/sql/validate/implicit/AbstractTypeCoercion.java#L713], Calcite may insert {{CAST}} invocations disregarding the root schema type map. This happens during the validation phase, so it's easy enough to read the type map directly in the {{AbstractTypeCoercion}} utility, although it's hard to say if this is an optimal solution. It may make sense to incorporate the type map into the {{RelDataTypeSystem}} which is more widely available in non-validation phases of SQL processing pipelines. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (CALCITE-5552) Returned timestamp is incorrect after the 100th row
[ https://issues.apache.org/jira/browse/CALCITE-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17697659#comment-17697659 ] Will Noble commented on CALCITE-5552: - Sounds like this is similar in nature to, though probably distinct from, CALCITE-5488 (just my initial hunch) > Returned timestamp is incorrect after the 100th row > --- > > Key: CALCITE-5552 > URL: https://issues.apache.org/jira/browse/CALCITE-5552 > Project: Calcite > Issue Type: Bug > Components: avatica >Affects Versions: avatica-1.23.0 >Reporter: Magnus Mogren >Priority: Critical > Attachments: TSTAMPS.csv > > > When fetching data that contains timestamps the returned timestamp after row > 100 is incorrect. > This can be reproduced using the CSV adapter (calcite-csv) using the > TSTAMPS.csv file attached. It contains 101 timestamps with the value > 1900-01-01 00:00:00. The first 100 are returned correctly, but number 101 in > the result has the value 1899-12-31 23:00:00 instead. > Marking this bug as critical since not being able to trust the values > returned by calcite is as bad as it gets in my opinion. > I do not know if the bug is in calcite or avatica. > I have created a project that reproduces the issue. You can find that here: > [nytro77/calcite-timestamp-bug: Showcase bug in Calcite or Avatica that > causes faulty timestamps to be returned > (github.com)|https://github.com/nytro77/calcite-timestamp-bug] > It starts an AvaticaServer that serves the attached file as a table using > calcite-csv and a unit test that connects to the server and fetches the data. > Run it with > {code:java} > ./mvnw test{code} -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (CALCITE-5446) Support TIMESTAMP WITH LOCAL TIME ZONE type in JDBC driver
[ https://issues.apache.org/jira/browse/CALCITE-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble reassigned CALCITE-5446: --- Assignee: Will Noble (was: Oliver Lee) > Support TIMESTAMP WITH LOCAL TIME ZONE type in JDBC driver > -- > > Key: CALCITE-5446 > URL: https://issues.apache.org/jira/browse/CALCITE-5446 > Project: Calcite > Issue Type: Bug > Components: avatica >Reporter: Julian Hyde >Assignee: Will Noble >Priority: Major > Labels: pull-request-available > Time Spent: 5.5h > Remaining Estimate: 0h > > Support TIMESTAMP WITH LOCAL TIME ZONE type in JDBC driver. > Currently, Calcite types with {{SqlTypeName.TIMESTAMP_WITH_LOCAL_TIME_ZONE}} > are mapped to JDBC type (OTHER). As a result, the ResultSet.toString() > method just converts the (signed) {{long}} value to a string. > Here is what > [Progress|https://docs.progress.com/en-US/bundle/datadirect-connect-jdbc-51/page/TIMESTAMP-WITH-LOCAL-TIME-ZONE-Data-Type.html] > says: > {quote} > The Oracle {{TIMESTAMP WITH LOCAL TIME ZONE}} data type is mapped to the JDBC > {{TIMESTAMP}} data type. > When retrieving {{TIMESTAMP WITH LOCAL TIME ZONE}} columns, the value > returned to the user is converted to the time zone specified by the > {{TIME_ZONE}} session parameter. > When setting {{TIMESTAMP WITH LOCAL TIME ZONE}} columns: > * Using a timestamp (using {{PreparedStatement.setTimestamp}}, for example), > the value set is converted to the time zone specified by the {{TIME_ZONE}} > session parameter. > * Using a string (using {{PreparedStatement.setString()}}, for example), the > string is passed as-is to the server. The supplied string must be in the > format specified by the {{NLS_TIMESTAMP_TZ_FORMAT}} session parameter. If > not, the Oracle server generates an error when it attempts to convert the > string to the {{TIMESTAMP WITH LOCAL TIME ZONE}} type. > {quote} > We should definitely map to {{java.sql.Types.TIMESTAMP}}. 
> I'm not sure whether Calcite has a {{TIME_ZONE}} session parameter. But > values should be shifted from/to the desired time zone on the way in/out. > And maybe there should also be a Calcite server time zone that is the time > zone in which {{TIMESTAMP WITH LOCAL TIME ZONE}} values are represented. By > default this should be UTC, and using a time zone with a variable offset from > UTC (e.g. "Pacific/Los_Angeles") would be a bad idea. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5536) Clean up some of the magic numbers in AvaticaResultSetConversionsTest
Will Noble created CALCITE-5536: --- Summary: Clean up some of the magic numbers in AvaticaResultSetConversionsTest Key: CALCITE-5536 URL: https://issues.apache.org/jira/browse/CALCITE-5536 Project: Calcite Issue Type: Improvement Components: avatica Reporter: Will Noble Retroactively filing this bug for some [cleanup I did|https://github.com/apache/calcite-avatica/pull/208] in preparation for the fix to CALCITE-5446. This cleanup makes the fix and accompanied test adjustments easier to review. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5535) Time values are not properly truncated
Will Noble created CALCITE-5535: --- Summary: Time values are not properly truncated Key: CALCITE-5535 URL: https://issues.apache.org/jira/browse/CALCITE-5535 Project: Calcite Issue Type: Bug Components: avatica Reporter: Will Noble Avatica [uses the binary modulo operator|https://github.com/apache/calcite-avatica/blob/810acf80771310431d7ef576f3404299ebb8eaf2/core/src/main/java/org/apache/calcite/avatica/util/AbstractCursor.java#L1164] to truncate time values by the number of milliseconds in a day, but this operator can result in negative numbers. We should use {{Math.floorMod()}} instead. There are also places where numbers are not truncated at all, such as when [converting to string|https://github.com/apache/calcite-avatica/blob/810acf80771310431d7ef576f3404299ebb8eaf2/core/src/main/java/org/apache/calcite/avatica/util/AbstractCursor.java#L262]. -- This message was sent by Atlassian Jira (v8.20.10#820010)
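The difference is easy to demonstrate in isolation: for a millisecond value just before the epoch, the binary {{%}} operator yields a negative remainder (its sign follows the dividend), while {{Math.floorMod()}} always lands in {{[0, MILLIS_PER_DAY)}}:

```java
public class TimeTruncation {
    static final long MILLIS_PER_DAY = 86_400_000L;

    public static void main(String[] args) {
        long preEpoch = -1L; // one millisecond before 1970-01-01T00:00:00Z
        // Binary remainder: sign follows the dividend, so the result is negative.
        System.out.println(preEpoch % MILLIS_PER_DAY);               // -1
        // floorMod: always non-negative, i.e. the time-of-day 23:59:59.999.
        System.out.println(Math.floorMod(preEpoch, MILLIS_PER_DAY)); // 86399999
    }
}
```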
[jira] [Updated] (CALCITE-5526) Handle unparsing of literals based on type system
[ https://issues.apache.org/jira/browse/CALCITE-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5526: Description: CALCITE-5424 dealt with parsing date/time literals via a custom type system; however, there is still no way to unparse them via the same type system. This is handled in [SqlDialect.unparseDateTimeLiteral|https://github.com/apache/calcite/blob/a0e119ea42def418957f214f539469f1aba76c18/core/src/main/java/org/apache/calcite/sql/SqlDialect.java#L468] which simply calls the literal object's {{toString()}} method. The literal object (a subclass of {{SqlAbstractDateTimeLiteral}}) should not be able to determine its own unparsed representation, since it could be unparsed in any dialect. Since the current system for resolving custom types relies on access to the catalog, it appears we'll need to introduce to {{SqlDialect}} a dependency on {{CalciteCatalogReader}} so it can use the same mappings for unparsing as it does for parsing. was: [CALCITE-5424|https://issues.apache.org/jira/browse/CALCITE-5424] dealt with parsing date/time literals via a custom type system; however, there is still no way to unparse them via the same type system. This is handled in [SqlDialect.unparseDateTimeLiteral|https://github.com/apache/calcite/blob/a0e119ea42def418957f214f539469f1aba76c18/core/src/main/java/org/apache/calcite/sql/SqlDialect.java#L468] which simply calls the literal object's {{toString()}} method. The literal object (a subclass of {{SqlAbstractDateTimeLiteral}}) should not be able to determine its own unparsed representation, since it could be unparsed in any dialect. Since the current system for resolving custom types relies on access to the catalog, it appears we'll need to introduce to {{SqlDialect}} a dependency on {{CalciteCatalogReader}} so it can use the same mappings for unparsing as it does for parsing. 
> Handle unparsing of literals based on type system > - > > Key: CALCITE-5526 > URL: https://issues.apache.org/jira/browse/CALCITE-5526 > Project: Calcite > Issue Type: Task >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > > CALCITE-5424 dealt with parsing date/time literals via a custom type system, > however there is still no way to unparse them via the same type system. This > is handled in > [SqlDialect.unparseDateTimeLiteral|https://github.com/apache/calcite/blob/a0e119ea42def418957f214f539469f1aba76c18/core/src/main/java/org/apache/calcite/sql/SqlDialect.java#L468] > which simply calls the literal object's {{toString()}} method. > The literal object (a subclass of {{SqlAbstractDateTimeLiteral}}) should not > be able to determine it's own unparsed representation by itself since it > could be unparsed in any dialect. Since the current system for resolving > custom types relies on access to the catalog, it appears we'll need to > introduce to {{SqlDialect}} a dependency on {{CalciteCatalogReader}} so it > can use the same mappings for unparsing as it does for parsing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (CALCITE-5526) Handle unparsing of literals based on type system
[ https://issues.apache.org/jira/browse/CALCITE-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble reassigned CALCITE-5526: --- Assignee: Will Noble > Handle unparsing of literals based on type system > - > > Key: CALCITE-5526 > URL: https://issues.apache.org/jira/browse/CALCITE-5526 > Project: Calcite > Issue Type: Task >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > > [CALCITE-5424|https://issues.apache.org/jira/browse/CALCITE-5424] dealt with > parsing date/time literals via a custom type system, however there is still > no way to unparse them via the same type system. This is handled in > [SqlDialect.unparseDateTimeLiteral|https://github.com/apache/calcite/blob/a0e119ea42def418957f214f539469f1aba76c18/core/src/main/java/org/apache/calcite/sql/SqlDialect.java#L468] > which simply calls the literal object's {{toString()}} method. > The literal object (a subclass of {{SqlAbstractDateTimeLiteral}}) should not > be able to determine it's own unparsed representation by itself since it > could be unparsed in any dialect. Since the current system for resolving > custom types relies on access to the catalog, it appears we'll need to > introduce to {{SqlDialect}} a dependency on {{CalciteCatalogReader}} so it > can use the same mappings for unparsing as it does for parsing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (CALCITE-5526) Handle unparsing of literals based on type system
[ https://issues.apache.org/jira/browse/CALCITE-5526?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17688638#comment-17688638 ] Will Noble commented on CALCITE-5526: - I'm having trouble visualizing your supposition. My understanding is that {{SqlUnknownLiteral}} was used because of the separate parsing and validation steps, but I'm not aware of a second step during unparsing that would allow us to invert the parsing logic directly. Where do you think this intermediate "unknown" literal should be created? And are you suggesting I introduce the dependency on {{CalciteCatalogReader}} in some other place besides the {{{}SqlDialect{}}}? > Handle unparsing of literals based on type system > - > > Key: CALCITE-5526 > URL: https://issues.apache.org/jira/browse/CALCITE-5526 > Project: Calcite > Issue Type: Task >Reporter: Will Noble >Priority: Minor > > [CALCITE-5424|https://issues.apache.org/jira/browse/CALCITE-5424] dealt with > parsing date/time literals via a custom type system, however there is still > no way to unparse them via the same type system. This is handled in > [SqlDialect.unparseDateTimeLiteral|https://github.com/apache/calcite/blob/a0e119ea42def418957f214f539469f1aba76c18/core/src/main/java/org/apache/calcite/sql/SqlDialect.java#L468] > which simply calls the literal object's {{toString()}} method. > The literal object (a subclass of {{SqlAbstractDateTimeLiteral}}) should not > be able to determine it's own unparsed representation by itself since it > could be unparsed in any dialect. Since the current system for resolving > custom types relies on access to the catalog, it appears we'll need to > introduce to {{SqlDialect}} a dependency on {{CalciteCatalogReader}} so it > can use the same mappings for unparsing as it does for parsing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (CALCITE-5446) Support TIMESTAMP WITH LOCAL TIME ZONE type in JDBC driver
[ https://issues.apache.org/jira/browse/CALCITE-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17688627#comment-17688627 ] Will Noble commented on CALCITE-5446: - Adding my thoughts on the proper semantics for these timestamp types from CALCITE-5500 which was marked as a duplicate of this: {{TIMESTAMP}}: * {{getString()}} -> Always returns the same string no matter the client's time zone * {{getTimestamp()}} -> Subtracts the zone offset from the instant, so rendering the instant to a string in the client's time zone produces the same result as {{getString()}} {{TIMESTAMP WITH LOCAL TIME ZONE}}: * {{getTimestamp()}} -> Always returns the same instant no matter the time zone * {{getString()}} -> Adds the zone offset to the instant before rendering it to a string, so rendering the result of {{getTimestamp()}} in the client's time zone produces the same result as this. > Support TIMESTAMP WITH LOCAL TIME ZONE type in JDBC driver > -- > > Key: CALCITE-5446 > URL: https://issues.apache.org/jira/browse/CALCITE-5446 > Project: Calcite > Issue Type: Bug > Components: avatica >Reporter: Julian Hyde >Assignee: Oliver Lee >Priority: Major > Labels: pull-request-available > Time Spent: 3h 10m > Remaining Estimate: 0h > > Support TIMESTAMP WITH LOCAL TIME ZONE type in JDBC driver. > Currently, Calcite types with {{SqlTypeName.TIMESTAMP_WITH_LOCAL_TIME_ZONE}} > are mapped to JDBC type (OTHER). As a result, the ResultSet.toString() > method just converts the (signed) {{long}} value to a string. > Here is what > [Progress|https://docs.progress.com/en-US/bundle/datadirect-connect-jdbc-51/page/TIMESTAMP-WITH-LOCAL-TIME-ZONE-Data-Type.html] > says: > {quote} > The Oracle {{TIMESTAMP WITH LOCAL TIME ZONE}} data type is mapped to the JDBC > {{TIMESTAMP}} data type. > When retrieving {{TIMESTAMP WITH LOCAL TIME ZONE}} columns, the value > returned to the user is converted to the time zone specified by the > {{TIME_ZONE}} session parameter. 
> When setting {{TIMESTAMP WITH LOCAL TIME ZONE}} columns: > * Using a timestamp (using {{PreparedStatement.setTimestamp}}, for example), > the value set is converted to the time zone specified by the {{TIME_ZONE}} > session parameter. > * Using a string (using {{PreparedStatement.setString()}}, for example), the > string is passed as-is to the server. The supplied string must be in the > format specified by the {{NLS_TIMESTAMP_TZ_FORMAT}} session parameter. If > not, the Oracle server generates an error when it attempts to convert the > string to the {{TIMESTAMP WITH LOCAL TIME ZONE}} type. > {quote} > We should definitely map to {{java.sql.Types.TIMESTAMP}}. > I'm not sure whether Calcite has a {{TIME_ZONE}} session parameter. But > values should be shifted from/to the desired time zone on the way in/out. > And maybe there should also be a Calcite server time zone that is the time > zone in which {{TIMESTAMP WITH LOCAL TIME ZONE}} values are represented. By > default this should be UTC, and using a time zone with a variable offset from > UTC (e.g. "Pacific/Los_Angeles") would be a bad idea. -- This message was sent by Atlassian Jira (v8.20.10#820010)
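The {{TIMESTAMP WITH LOCAL TIME ZONE}} semantics proposed in the comment above can be sketched with {{java.time}} (illustration only, not Avatica's internal representation): the stored instant is fixed, and only its string rendering varies with the client's zone:

```java
import java.time.Instant;
import java.time.ZoneId;

public class LocalTimeZoneRendering {
    public static void main(String[] args) {
        // One fixed instant, as a TIMESTAMP WITH LOCAL TIME ZONE value would be.
        Instant instant = Instant.parse("2023-01-01T00:00:00Z");

        // getTimestamp() analogue: the same instant regardless of client zone.
        System.out.println(instant);

        // getString() analogue: rendered in the client's zone, so the text varies.
        System.out.println(instant.atZone(ZoneId.of("UTC")).toLocalDateTime());
        System.out.println(instant.atZone(ZoneId.of("America/Los_Angeles")).toLocalDateTime());
    }
}
```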
[jira] [Created] (CALCITE-5526) Handle unparsing of literals based on type system
Will Noble created CALCITE-5526: --- Summary: Handle unparsing of literals based on type system Key: CALCITE-5526 URL: https://issues.apache.org/jira/browse/CALCITE-5526 Project: Calcite Issue Type: Task Reporter: Will Noble [CALCITE-5424|https://issues.apache.org/jira/browse/CALCITE-5424] dealt with parsing date/time literals via a custom type system; however, there is still no way to unparse them via the same type system. This is handled in [SqlDialect.unparseDateTimeLiteral|https://github.com/apache/calcite/blob/a0e119ea42def418957f214f539469f1aba76c18/core/src/main/java/org/apache/calcite/sql/SqlDialect.java#L468] which simply calls the literal object's {{toString()}} method. The literal object (a subclass of {{SqlAbstractDateTimeLiteral}}) should not be able to determine its own unparsed representation, since it could be unparsed in any dialect. Since the current system for resolving custom types relies on access to the catalog, it appears we'll need to introduce to {{SqlDialect}} a dependency on {{CalciteCatalogReader}} so it can use the same mappings for unparsing as it does for parsing. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Comment Edited] (CALCITE-5488) Avatica double-adjusts timestamps when calling Array.getResultSet()
[ https://issues.apache.org/jira/browse/CALCITE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17680016#comment-17680016 ] Will Noble edited comment on CALCITE-5488 at 2/13/23 7:26 PM: -- This is actually a pretty narrow bug. I'll step through the problem at a higher level than in the ticket description to illuminate what I see as the solution: Problem: 1. We get a {{ResultSet}} that contains an array that contains timestamps. 2. We properly adjust all the timestamps in the array when we call {{{}getArray(){}}}. 3. We decide to convert that array to a "sub-{{{}ResultSet{}}}" by calling {{java.sql.Array.getResultSet()}} on the array, causing the timestamps to be double-adjusted. It's not so much that any of your 4 example constraints were violated. Rather, it's that the process of getting an array and then converting it to a result set involves an incorrect series of operations on the primitive ({{{}long{}}}) internal representation of the result set before it's abstracted into a {{java.sql.Timestamp}} object. My original proposal was to set the local calendar of any {{ResultSet}} created by {{java.sql.Array.getResultSet()}} to UTC. By static analysis of the code, I can verify that this will *only* affect this particular narrow case: getting time/date/timestamp values out of an array result set, as well as the return value of {{{}AvaticaResultSet.getLocalCalendar(){}}}, which is a public method but is not part of the JDBC spec and is not used anywhere in the Avatica codebase. However, this would create a problematic edge case: if an explicit calendar were provided to get the individual timestamps out of the array's result set, they would be incorrect. 
The only real solution I see now is to "de-adjust" the data in the result set during {{ArrayFactoryImpl.create()}} (in between steps 2 and 3 of the problem above), effectively undoing the first adjustment and making it so that the second adjustment (whether implicit or explicit) is correct. was (Author: wnoble): This is actually a pretty narrow bug. I'll step through the problem at a higher level than in the ticket description to illuminate what I see as the two possible solutions: Problem: 1. We get a {{ResultSet}} that contains an array that contains timestamps. 2. We properly adjust all the timestamps in the array when we call {{getArray()}}. 3. We decide to convert that array to a "sub-{{ResultSet}}" by calling {{java.sql.Array.getResultSet()}} on the array, causing the timestamps to be double-adjusted. It's not so much that any of your 4 example constraints were violated. Rather, it's that the process of getting an array and then converting it to a result set involves an incorrect series of operations on the primitive ({{long}}) internal representation of the result set before it's abstracted into a {{java.sql.Timestamp}} object. My original proposal was to set the local calendar of any {{ResultSet}} created by {{java.sql.Array.getResultSet()}} to UTC. By static analysis of the code, I can verify that this will *only* affect this particular narrow case: getting time/date/timestamp values out of an array result set, as well as the return value of {{AvaticaResultSet.getLocalCalendar()}}, which is a public method but is not part of the JDBC spec and is not used anywhere in the Avatica codebase. However, this would create a problematic edge case: if an explicit calendar were provided to get the individual timestamps out of the array's result set, they would be incorrect. 
The only real solution I see now is to "de-adjust" the data in the result set during {{ArrayFactoryImpl.create()}} (in between steps 2 and 3 of the problem above), effectively undoing the first adjustment and making it so that the second adjustment (whether implicit or explicit) is correct. > Avatica double-adjusts timestamps when calling Array.getResultSet() > --- > > Key: CALCITE-5488 > URL: https://issues.apache.org/jira/browse/CALCITE-5488 > Project: Calcite > Issue Type: Bug > Components: avatica >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > > If you go into {{AvaticaResultSetConversionTest}} and delete the > [timeZone|https://github.com/apache/calcite-avatica/blob/dbe9b1d8c2e53474eb40cfaf5721aceca3bdb57f/core/src/test/java/org/apache/calcite/avatica/AvaticaResultSetConversionsTest.java#L1121] > property for the connection, which is currently set to "GMT", several > {{getString()}} and {{getArray()}} tests start to fail. The {{getString()}} > failures are self-explanatory -- the expected value has become incorrect > because of the time zone change -- but the {{getArray()}} failures are > tricky. They're double-adjusting the timestamps: > * Once when
[jira] [Updated] (CALCITE-5511) Substitute type names in error messages when type aliases are present
[ https://issues.apache.org/jira/browse/CALCITE-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5511: Description: The type aliases covered in CALCITE-5346 may not be included in error messages properly. There should be a setting to do type unmapping when rendering error messages. Type unmapping will require access to a catalog. For example, in [SqlCastFunction.checkOperandTypes|https://github.com/apache/calcite/blob/d9a81b88ad561e7e4cedae93e805e0d7a53a7f1a/core/src/main/java/org/apache/calcite/sql/fun/SqlCastFunction.java#L161] it throws an error based on the canonical name of the {{RelDataType}} which may be confusing to end-users. was:The type aliases covered in CALCITE-5346 may not be included in error messages properly. Perhaps there should be a setting to do type unmapping when rendering error messages. > Substitute type names in error messages when type aliases are present > - > > Key: CALCITE-5511 > URL: https://issues.apache.org/jira/browse/CALCITE-5511 > Project: Calcite > Issue Type: Improvement >Reporter: Will Noble >Assignee: Oliver Lee >Priority: Major > > The type aliases covered in CALCITE-5346 may not be included in error > messages properly. There should be a setting to do type unmapping when > rendering error messages. Type unmapping will require access to a catalog. > For example, in > [SqlCastFunction.checkOperandTypes|https://github.com/apache/calcite/blob/d9a81b88ad561e7e4cedae93e805e0d7a53a7f1a/core/src/main/java/org/apache/calcite/sql/fun/SqlCastFunction.java#L161] > it throws an error based on the canonical name of the {{RelDataType}} which > may be confusing to end-users. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CALCITE-5511) Substitute type names in error messages when type aliases are present
[ https://issues.apache.org/jira/browse/CALCITE-5511?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5511: Summary: Substitute type names in error messages when type aliases are present (was: Think about type aliases in error messages) > Substitute type names in error messages when type aliases are present > - > > Key: CALCITE-5511 > URL: https://issues.apache.org/jira/browse/CALCITE-5511 > Project: Calcite > Issue Type: Improvement >Reporter: Will Noble >Priority: Major > > The type aliases covered in CALCITE-5346 may not be included in error > messages properly. Perhaps there should be a setting to do type unmapping > when rendering error messages. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5511) Think about type aliases in error messages
Will Noble created CALCITE-5511: --- Summary: Think about type aliases in error messages Key: CALCITE-5511 URL: https://issues.apache.org/jira/browse/CALCITE-5511 Project: Calcite Issue Type: Improvement Reporter: Will Noble The type aliases covered in CALCITE-5346 may not be included in error messages properly. Perhaps there should be a setting to do type unmapping when rendering error messages. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CALCITE-5509) Allow automatic type coercion for string literals in BigQuery functions
[ https://issues.apache.org/jira/browse/CALCITE-5509?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5509: Summary: Allow automatic type coercion for string literals in BigQuery functions (was: Allow automatic type coercion for strings in BigQuery functions) > Allow automatic type coercion for string literals in BigQuery functions > --- > > Key: CALCITE-5509 > URL: https://issues.apache.org/jira/browse/CALCITE-5509 > Project: Calcite > Issue Type: Improvement >Reporter: Will Noble >Priority: Minor > > BigQuery allows string literals to be automatically coerced into > {{{}TIMESTAMP{}}}, {{{}DATETIME{}}}, or {{DATE}} literals. See in the > [docs|https://cloud.google.com/bigquery/docs/reference/standard-sql/lexical]: > {quote}String literals in canonical date format also implicitly coerce to > DATE type when used where a DATE-type expression is expected... > String literals with the canonical datetime format implicitly coerce to a > datetime literal when used where a datetime expression is expected... > String literals with the canonical timestamp format, including those with > time zone names, implicitly coerce to a timestamp literal when used where a > timestamp expression is expected. > {quote} > We should make sure that the BQ library functions respect this type coercion > in general. This only applies to literals; not string values in general. For > example, {{'2023-01-30'}} can be coerced to a timestamp, datetime, or date, > but {{SUBSTRING('x2023-01-30', 2)}} cannot. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5509) Allow automatic type coercion for strings in BigQuery functions
Will Noble created CALCITE-5509: --- Summary: Allow automatic type coercion for strings in BigQuery functions Key: CALCITE-5509 URL: https://issues.apache.org/jira/browse/CALCITE-5509 Project: Calcite Issue Type: Improvement Reporter: Will Noble BigQuery allows string literals to be automatically coerced into {{{}TIMESTAMP{}}}, {{{}DATETIME{}}}, or {{DATE}} literals. See in the [docs|https://cloud.google.com/bigquery/docs/reference/standard-sql/lexical]: {quote}String literals in canonical date format also implicitly coerce to DATE type when used where a DATE-type expression is expected... String literals with the canonical datetime format implicitly coerce to a datetime literal when used where a datetime expression is expected... String literals with the canonical timestamp format, including those with time zone names, implicitly coerce to a timestamp literal when used where a timestamp expression is expected. {quote} We should make sure that the BQ library functions respect this type coercion in general. This only applies to literals; not string values in general. For example, {{'2023-01-30'}} can be coerced to a timestamp, datetime, or date, but {{SUBSTRING('x2023-01-30', 2)}} cannot. -- This message was sent by Atlassian Jira (v8.20.10#820010)
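As a rough illustration of the literal-only rule above (a sketch, not BigQuery's or Calcite's actual coercion logic; the class and method names here are made up), a coercion check might accept only strings that already parse in the canonical format, while computed string values would never reach such a check:

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;

// Sketch: coerce a string *literal* to DATE only when it is in the
// canonical yyyy-MM-dd format; anything else is not coercible.
class DateLiteralCoercion {
  static LocalDate coerceOrNull(String literal) {
    try {
      return LocalDate.parse(literal); // canonical ISO date format
    } catch (DateTimeParseException e) {
      // Not coercible; a real validator would raise a type error instead.
      return null;
    }
  }
}
```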
[jira] [Commented] (CALCITE-5446) Support TIMESTAMP WITH LOCAL TIME ZONE type in JDBC driver
[ https://issues.apache.org/jira/browse/CALCITE-5446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17680852#comment-17680852 ] Will Noble commented on CALCITE-5446: - {quote}values should be shifted from/to the desired time zone on the way in/out {quote} This must be done in Avatica. See CALCITE-5500. I'd like to do a bit of testing against my local Avatica branch with a draft fix before signing off on that PR. > Support TIMESTAMP WITH LOCAL TIME ZONE type in JDBC driver > -- > > Key: CALCITE-5446 > URL: https://issues.apache.org/jira/browse/CALCITE-5446 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Oliver Lee >Priority: Major > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Support TIMESTAMP WITH LOCAL TIME ZONE type in JDBC driver. > Currently, Calcite types with {{SqlTypeName.TIMESTAMP_WITH_LOCAL_TIME_ZONE}} > are mapped to JDBC type (OTHER). As a result, the ResultSet.toString() > method just converts the (signed) {{long}} value to a string. > Here is what > [Progress|https://docs.progress.com/en-US/bundle/datadirect-connect-jdbc-51/page/TIMESTAMP-WITH-LOCAL-TIME-ZONE-Data-Type.html] > says: > {quote} > The Oracle {{TIMESTAMP WITH LOCAL TIME ZONE}} data type is mapped to the JDBC > {{TIMESTAMP}} data type. > When retrieving {{TIMESTAMP WITH LOCAL TIME ZONE}} columns, the value > returned to the user is converted to the time zone specified by the > {{TIME_ZONE}} session parameter. > When setting {{TIMESTAMP WITH LOCAL TIME ZONE}} columns: > * Using a timestamp (using {{PreparedStatement.setTimestamp}}, for example), > the value set is converted to the time zone specified by the {{TIME_ZONE}} > session parameter. > * Using a string (using {{PreparedStatement.setString()}}, for example), the > string is passed as-is to the server. The supplied string must be in the > format specified by the {{NLS_TIMESTAMP_TZ_FORMAT}} session parameter. 
If > not, the Oracle server generates an error when it attempts to convert the > string to the {{TIMESTAMP WITH LOCAL TIME ZONE}} type. > {quote} > We should definitely map to {{java.sql.Types.TIMESTAMP}}. > I'm not sure whether Calcite has a {{TIME_ZONE}} session parameter. But > values should be shifted from/to the desired time zone on the way in/out. > And maybe there should also be a Calcite server time zone that is the time > zone in which {{TIMESTAMP WITH LOCAL TIME ZONE}} values are represented. By > default this should be UTC, and using a time zone with a variable offset from > UTC (e.g. "Pacific/Los_Angeles") would be a bad idea. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CALCITE-5308) Add support to Avatica for microsecond-precision timestamps
[ https://issues.apache.org/jira/browse/CALCITE-5308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5308: Description: Avatica [serializes timestamps|https://github.com/apache/calcite-avatica/blob/9dfc5cad3efe72e9b14513b15a988ba86ab5a4ea/core/src/main/java/org/apache/calcite/avatica/remote/TypedValue.java#L447] as the integer number of milliseconds since epoch. We would like support for microsecond precision. (was: Avatica [serializes timestamps|https://github.com/apache/calcite-avatica/blob/9dfc5cad3efe72e9b14513b15a988ba86ab5a4ea/core/src/main/java/org/apache/calcite/avatica/remote/TypedValue.java#L447] as the integer number of milliseconds since epoch. We would like support for microsecond precision, although we only need it for {{TIMESTAMP_WITH_LOCAL_TIME_ZONE}}, which is currently lacking proper support (The recommended corresponding Java type as per [the spec|https://download.oracle.com/otn-pub/jcp/jdbc-4_2-mrel2-spec/jdbc4.2-fr-spec.pdf?AuthParam=1664936014_d36468601dd3a115d7d95c975043ff3a] is {{java.time.OffsetDateTime}}).) > Add support to Avatica for microsecond-precision timestamps > --- > > Key: CALCITE-5308 > URL: https://issues.apache.org/jira/browse/CALCITE-5308 > Project: Calcite > Issue Type: Improvement > Components: avatica >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Avatica [serializes > timestamps|https://github.com/apache/calcite-avatica/blob/9dfc5cad3efe72e9b14513b15a988ba86ab5a4ea/core/src/main/java/org/apache/calcite/avatica/remote/TypedValue.java#L447] > as the integer number of milliseconds since epoch. We would like support for > microsecond precision. -- This message was sent by Atlassian Jira (v8.20.10#820010)
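The precision gap is plain integer arithmetic: a milliseconds-since-epoch wire format truncates the microsecond digits, so a microsecond-precision value cannot round-trip. This is a sketch of the loss only, not Avatica's {{TypedValue}} code:

```java
// Sketch: why a whole-milliseconds wire format cannot carry
// microsecond-precision timestamps.
class MicrosecondLoss {
  // Truncating division drops the last three digits of a microsecond count.
  static long toMillis(long microsSinceEpoch) {
    return microsSinceEpoch / 1_000L;
  }

  // What survives a round trip through the millisecond wire format.
  static long roundTrip(long microsSinceEpoch) {
    return toMillis(microsSinceEpoch) * 1_000L;
  }
}
```

For example, 1_123_456 microseconds comes back as 1_123_000, so any scheme that keeps microseconds has to widen or rescale the serialized value.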
[jira] [Updated] (CALCITE-5308) Add support to Avatica for microsecond-precision timestamps
[ https://issues.apache.org/jira/browse/CALCITE-5308?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5308: Summary: Add support to Avatica for microsecond-precision timestamps (was: Add support to Avatica for microsecond-precision timestamps (with timezone)) > Add support to Avatica for microsecond-precision timestamps > --- > > Key: CALCITE-5308 > URL: https://issues.apache.org/jira/browse/CALCITE-5308 > Project: Calcite > Issue Type: Improvement > Components: avatica >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Avatica [serializes > timestamps|https://github.com/apache/calcite-avatica/blob/9dfc5cad3efe72e9b14513b15a988ba86ab5a4ea/core/src/main/java/org/apache/calcite/avatica/remote/TypedValue.java#L447] > as the integer number of milliseconds since epoch. We would like support for > microsecond precision, although we only need it for > {{TIMESTAMP_WITH_LOCAL_TIME_ZONE}}, which is currently lacking proper support > (The recommended corresponding Java type as per [the > spec|https://download.oracle.com/otn-pub/jcp/jdbc-4_2-mrel2-spec/jdbc4.2-fr-spec.pdf?AuthParam=1664936014_d36468601dd3a115d7d95c975043ff3a] > is {{java.time.OffsetDateTime}}). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (CALCITE-5308) Add support to Avatica for microsecond-precision timestamps (with timezone)
[ https://issues.apache.org/jira/browse/CALCITE-5308?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17680851#comment-17680851 ] Will Noble commented on CALCITE-5308: - I split this into a separate bug for {{TIMESTAMP WITH LOCAL TIME ZONE}}: CALCITE-5500 Now, this bug will just track extended precision for both {{TIMESTAMP}} and {{TIMESTAMP WITH LOCAL TIME ZONE}}. > Add support to Avatica for microsecond-precision timestamps (with timezone) > --- > > Key: CALCITE-5308 > URL: https://issues.apache.org/jira/browse/CALCITE-5308 > Project: Calcite > Issue Type: Improvement > Components: avatica >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Avatica [serializes > timestamps|https://github.com/apache/calcite-avatica/blob/9dfc5cad3efe72e9b14513b15a988ba86ab5a4ea/core/src/main/java/org/apache/calcite/avatica/remote/TypedValue.java#L447] > as the integer number of milliseconds since epoch. We would like support for > microsecond precision, although we only need it for > {{TIMESTAMP_WITH_LOCAL_TIME_ZONE}}, which is currently lacking proper support > (The recommended corresponding Java type as per [the > spec|https://download.oracle.com/otn-pub/jcp/jdbc-4_2-mrel2-spec/jdbc4.2-fr-spec.pdf?AuthParam=1664936014_d36468601dd3a115d7d95c975043ff3a] > is {{java.time.OffsetDateTime}}). -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Created] (CALCITE-5500) Add support in Avatica for TIMESTAMP WITH LOCAL TIME ZONE
Will Noble created CALCITE-5500: --- Summary: Add support in Avatica for TIMESTAMP WITH LOCAL TIME ZONE Key: CALCITE-5500 URL: https://issues.apache.org/jira/browse/CALCITE-5500 Project: Calcite Issue Type: Improvement Components: avatica Reporter: Will Noble This is closely related to CALCITE-5446 but tracks the changes required in Avatica. This is where time zone conversion happens when the client invokes a getter method on a {{ResultSet}}. Here are the intended semantics for {{TIMESTAMP}} and {{TIMESTAMP WITH LOCAL TIME ZONE}}: {{TIMESTAMP}}: * {{getString()}} -> Always returns the same string no matter the client's time zone * {{getTimestamp()}} -> Subtracts the zone offset from the instant, so rendering the instant to a string in the client's time zone produces the same result as {{getString()}} {{TIMESTAMP WITH LOCAL TIME ZONE}}: * {{getTimestamp()}} -> Always returns the same instant no matter the time zone * {{getString()}} -> Adds the zone offset to the instant before rendering it to a string, so rendering the result of {{getTimestamp()}} in the client's time zone produces the same result as this. -- This message was sent by Atlassian Jira (v8.20.10#820010)
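With a fixed-offset client zone, those intended semantics reduce to simple offset arithmetic. This is a sketch of the invariants only, not Avatica code, and the method names are made up:

```java
import java.util.TimeZone;

// Sketch of the intended getter semantics using raw epoch-millisecond math.
class TimestampSemantics {
  // TIMESTAMP: getTimestamp() subtracts the client zone offset, so rendering
  // the returned instant in the client zone reproduces the stored wall clock.
  static long timestampGetTimestamp(long storedWallClockMillis, TimeZone clientZone) {
    return storedWallClockMillis - clientZone.getOffset(storedWallClockMillis);
  }

  // TIMESTAMP WITH LOCAL TIME ZONE: the instant itself never changes;
  // getString() adds the client zone offset just before rendering.
  static long twltzMillisToRender(long instantMillis, TimeZone clientZone) {
    return instantMillis + clientZone.getOffset(instantMillis);
  }
}
```

In both cases there is exactly one shift by the zone offset, applied on opposite sides of the instant/string boundary.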
[jira] [Created] (CALCITE-5492) Adjusting dates based on time zone doesn't always make sense
Will Noble created CALCITE-5492: --- Summary: Adjusting dates based on time zone doesn't always make sense Key: CALCITE-5492 URL: https://issues.apache.org/jira/browse/CALCITE-5492 Project: Calcite Issue Type: Improvement Components: avatica Reporter: Will Noble The JDBC API allows you to provide a {{Calendar}} argument to [ResultSet.getDate()|https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getDate(int,%20java.util.Calendar)] "to use in constructing the date" and Avatica currently uses it to adjust the millisecond value of the {{java.sql.Date}} object that comes out of a result set according to the time zone of the calendar, similar to how it adjusts timestamps and time-of-day values. This might make sense if Calcite always represented date objects with millisecond precision, but since it often represents them as an integer number of days since 1970-01-01, adjusting based on time zone does not make sense in general. This has become an issue in the fix for [CALCITE-5488|https://issues.apache.org/jira/browse/CALCITE-5488] because date values may have to be "un-adjusted", but information may have already been lost due to truncating division to convert milliseconds to days. I'm not yet sure what the best way to proceed might be, but I'm wondering what the intended semantics here are. Is a date meant to have millisecond resolution (in which case representing it as an integer number of days does not make sense), or is it meant to have only day resolution (in which case applying a time zone does not make sense)? -- This message was sent by Atlassian Jira (v8.20.10#820010)
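The information loss is easy to see with a fixed-offset zone. This is a sketch of the arithmetic only, not Avatica's code, and the direction of the shift is illustrative: shifting a day-count date by a zone offset and truncating back to whole days cannot be undone afterwards.

```java
import java.util.TimeZone;

// Sketch: adjusting a days-since-epoch date by a zone offset and then
// truncating back to whole days loses the sub-day remainder, so
// "un-adjusting" later does not recover the original value.
class DateAdjustmentLoss {
  static final long MILLIS_PER_DAY = 86_400_000L;

  static long roundTrip(long daysSinceEpoch, TimeZone zone) {
    long offset = zone.getOffset(0); // fixed-offset zone: constant offset
    // Adjust (direction illustrative), then truncate back to days.
    long adjustedDays =
        Math.floorDiv(daysSinceEpoch * MILLIS_PER_DAY - offset, MILLIS_PER_DAY);
    // Un-adjust and truncate again; the sub-day part is already gone.
    return Math.floorDiv(adjustedDays * MILLIS_PER_DAY + offset, MILLIS_PER_DAY);
  }
}
```

With a GMT+8 client zone, day 0 (1970-01-01) fails to round-trip, which is exactly the "un-adjust" problem described above.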
[jira] [Updated] (CALCITE-5492) Adjusting dates based on time zone doesn't always make sense
[ https://issues.apache.org/jira/browse/CALCITE-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5492: Priority: Minor (was: Major) > Adjusting dates based on time zone doesn't always make sense > > > Key: CALCITE-5492 > URL: https://issues.apache.org/jira/browse/CALCITE-5492 > Project: Calcite > Issue Type: Improvement > Components: avatica >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > > The JDBC API allows you to provide a {{Calendar}} argument to > [ResultSet.getDate()|https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getDate(int,%20java.util.Calendar)] > "to use in constructing the date" and Avatica currently uses it to adjust > the millisecond value of the {{java.sql.Date}} object that comes out of a > result set according to the time zone of the calendar, similar to how it > adjusts timestamps and time-of-day values. > This might make sense if Calcite always represented date objects with > millisecond precision, but since it often represents them as an integer > number of days since 1970-01-01, adjusting based on time zone does not make > sense in general. This has become an issue in the fix for > [CALCITE-5488|https://issues.apache.org/jira/browse/CALCITE-5488] because > date values may have to be "un-adjusted", but information may have already > been lost due to truncating division to convert milliseconds to days. > I'm not yet sure what the best way to proceed might be, but I'm wondering > what the intended semantics here are. Is a date meant to have millisecond > resolution (in which case representing it as an integer number of days does > not make sense), or is it meant to have only day resolution (in which case > applying a time zone does not make sense)? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Assigned] (CALCITE-5492) Adjusting dates based on time zone doesn't always make sense
[ https://issues.apache.org/jira/browse/CALCITE-5492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble reassigned CALCITE-5492: --- Assignee: Will Noble > Adjusting dates based on time zone doesn't always make sense > > > Key: CALCITE-5492 > URL: https://issues.apache.org/jira/browse/CALCITE-5492 > Project: Calcite > Issue Type: Improvement > Components: avatica >Reporter: Will Noble >Assignee: Will Noble >Priority: Major > > The JDBC API allows you to provide a {{Calendar}} argument to > [ResultSet.getDate()|https://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getDate(int,%20java.util.Calendar)] > "to use in constructing the date" and Avatica currently uses it to adjust > the millisecond value of the {{java.sql.Date}} object that comes out of a > result set according to the time zone of the calendar, similar to how it > adjusts timestamps and time-of-day values. > This might make sense if Calcite always represented date objects with > millisecond precision, but since it often represents them as an integer > number of days since 1970-01-01, adjusting based on time zone does not make > sense in general. This has become an issue in the fix for > [CALCITE-5488|https://issues.apache.org/jira/browse/CALCITE-5488] because > date values may have to be "un-adjusted", but information may have already > been lost due to truncating division to convert milliseconds to days. > I'm not yet sure what the best way to proceed might be, but I'm wondering > what the intended semantics here are. Is a date meant to have millisecond > resolution (in which case representing it as an integer number of days does > not make sense), or is it meant to have only day resolution (in which case > applying a time zone does not make sense)? -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Commented] (CALCITE-5488) Avatica double-adjusts timestamps when calling Array.getResultSet()
[ https://issues.apache.org/jira/browse/CALCITE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17680016#comment-17680016 ] Will Noble commented on CALCITE-5488: - This is actually a pretty narrow bug. I'll step through the problem at a higher level than in the ticket description to illuminate what I see as the two possible solutions: Problem: 1. We get a {{ResultSet}} that contains an array that contains timestamps. 2. We properly adjust all the timestamps in the array when we call {{getArray()}}. 3. We decide to convert that array to a "sub-{{ResultSet}}" by calling {{java.sql.Array.getResultSet()}} on the array, causing the timestamps to be double-adjusted. It's not so much that any of your 4 example constraints were violated. Rather, it's that the process of getting an array and then converting it to a result set involves an incorrect series of operations on the primitive ({{long}}) internal representation of the result set before it's abstracted into a {{java.sql.Timestamp}} object. My original proposal was to set the local calendar of any {{ResultSet}} created by {{java.sql.Array.getResultSet()}} to UTC. By static analysis of the code, I can verify that this will *only* affect this particular narrow case: getting time/date/timestamp values out of an array result set, as well as the return value of {{AvaticaResultSet.getLocalCalendar()}}, which is a public method but is not part of the JDBC spec and is not used anywhere in the Avatica codebase. However, this would create a problematic edge case: if an explicit calendar were provided to get the individual timestamps out of the array's result set, they would be incorrect. The only real solution I see now is to "de-adjust" the data in the result set during {{ArrayFactoryImpl.create()}} (in between steps 2 and 3 of the problem above), effectively undoing the first adjustment and making it so that the second adjustment (whether implicit or explicit) is correct. 
> Avatica double-adjusts timestamps when calling Array.getResultSet() > --- > > Key: CALCITE-5488 > URL: https://issues.apache.org/jira/browse/CALCITE-5488 > Project: Calcite > Issue Type: Bug > Components: avatica >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > > If you go into {{AvaticaResultSetConversionTest}} and delete the > [timeZone|https://github.com/apache/calcite-avatica/blob/dbe9b1d8c2e53474eb40cfaf5721aceca3bdb57f/core/src/test/java/org/apache/calcite/avatica/AvaticaResultSetConversionsTest.java#L1121] > property for the connection, which is currently set to "GMT", several > {{getString()}} and {{getArray()}} tests start to fail. The {{getString()}} > failures are self-explanatory -- the expected value has become incorrect > because of the time zone change -- but the {{getArray()}} failures are > tricky. They're double-adjusting the timestamps: > * Once when they call > [getObject()|https://github.com/apache/calcite-avatica/blob/810acf80771310431d7ef576f3404299ebb8eaf2/core/src/main/java/org/apache/calcite/avatica/util/AbstractCursor.java#L1432] > (which invokes {{getTimestamp()}} with the connection's default calendar, > which was previously GMT but is now the system default). > * Then, in > [ArrayImpl.equalContents()|https://github.com/apache/calcite-avatica/blob/820edf6f653607afb5a2a280a32f315aff1f64cb/core/src/main/java/org/apache/calcite/avatica/util/ArrayImpl.java#L233] > the arrays are converted to result sets for traversal via > [Array.getResultSet()|https://docs.oracle.com/javase/8/docs/api/java/sql/Array.html#getResultSet--] > , which uses the same time zone as the "factory" result set and eventually > invokes {{getObject()}} again, thus applying the time zone offset twice. > This is a bug in the implementation of {{Array.getResultSet()}} that only > manifests when the array contains timestamps and when the connection default > time zone is anything besides GMT, and this is not currently covered by > tests. 
It's easy to cause the failure, however, by changing the connection > default time zone for {{AvaticaResultSetConversionTest}} to anything besides > GMT or UTC. -- This message was sent by Atlassian Jira (v8.20.10#820010)
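With a fixed-offset client zone, the double adjustment and the proposed "de-adjust" fix reduce to offset arithmetic. This is a sketch of the mechanism only, not the Avatica implementation; the method names are made up:

```java
import java.util.TimeZone;

// Sketch: one zone adjustment is correct; the array's sub-ResultSet applies
// a second one. De-adjusting between the two restores the correct value.
class DoubleAdjustment {
  static long adjust(long millis, TimeZone zone) {
    return millis - zone.getOffset(millis);
  }

  // Buggy path: getArray() adjusts once, Array.getResultSet() adjusts again.
  static long buggyArrayResultSet(long storedMillis, TimeZone zone) {
    return adjust(adjust(storedMillis, zone), zone);
  }

  // Proposed fix: undo the first adjustment before building the array's
  // result set, so the second adjustment lands on the correct value.
  static long fixedArrayResultSet(long storedMillis, TimeZone zone) {
    long deAdjusted = adjust(storedMillis, zone) + zone.getOffset(storedMillis);
    return adjust(deAdjusted, zone);
  }
}
```

For a GMT+2 client the buggy path drifts by a full extra two hours, while the fixed path matches a single adjustment, which is why the bug only manifests when the connection time zone is something other than GMT/UTC.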
[jira] [Updated] (CALCITE-5488) Avatica double-adjusts timestamps when calling Array.getResultSet()
[ https://issues.apache.org/jira/browse/CALCITE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5488: Summary: Avatica double-adjusts timestamps when calling Array.getResultSet() (was: Calcite double-adjusts timestamps when calling Array.getResultSet()) > Avatica double-adjusts timestamps when calling Array.getResultSet() > --- > > Key: CALCITE-5488 > URL: https://issues.apache.org/jira/browse/CALCITE-5488 > Project: Calcite > Issue Type: Bug > Components: avatica >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor > > If you go into {{AvaticaResultSetConversionTest}} and delete the > [timeZone|https://github.com/apache/calcite-avatica/blob/dbe9b1d8c2e53474eb40cfaf5721aceca3bdb57f/core/src/test/java/org/apache/calcite/avatica/AvaticaResultSetConversionsTest.java#L1121] > property for the connection, which is currently set to "GMT", several > {{getString()}} and {{getArray()}} tests start to fail. The {{getString()}} > failures are self-explanatory -- the expected value has become incorrect > because of the time zone change -- but the {{getArray()}} failures are > tricky. They're double-adjusting the timestamps: > * Once when they call > [getObject()|https://github.com/apache/calcite-avatica/blob/810acf80771310431d7ef576f3404299ebb8eaf2/core/src/main/java/org/apache/calcite/avatica/util/AbstractCursor.java#L1432] > (which invokes {{getTimestamp()}} with the connection's default calendar, > which was previously GMT but is now the system default). 
> * Then, in > [ArrayImpl.equalContents()|https://github.com/apache/calcite-avatica/blob/820edf6f653607afb5a2a280a32f315aff1f64cb/core/src/main/java/org/apache/calcite/avatica/util/ArrayImpl.java#L233] > the arrays are converted to result sets for traversal via > [Array.getResultSet()|https://docs.oracle.com/javase/8/docs/api/java/sql/Array.html#getResultSet--] > , which uses the same time zone as the "factory" result set and eventually > invokes {{getObject()}} again, thus applying the time zone offset twice. > This is a bug in the implementation of {{Array.getResultSet()}} that only > manifests when the array contains timestamps and when the connection default > time zone is anything besides GMT, and this is not currently covered by > tests. It's easy to cause the failure, however, by changing the connection > default time zone for {{AvaticaResultSetConversionTest}} to anything besides > GMT or UTC. -- This message was sent by Atlassian Jira (v8.20.10#820010)
[jira] [Updated] (CALCITE-5488) Calcite double-adjusts timestamps when calling Array.getResultSet()
[ https://issues.apache.org/jira/browse/CALCITE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble updated CALCITE-5488: Component/s: avatica > Calcite double-adjusts timestamps when calling Array.getResultSet() > --- > > Key: CALCITE-5488 > URL: https://issues.apache.org/jira/browse/CALCITE-5488 > Project: Calcite > Issue Type: Bug > Components: avatica >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor
[jira] [Created] (CALCITE-5488) Calcite double-adjusts timestamps when calling Array.getResultSet()
Will Noble created CALCITE-5488: --- Summary: Calcite double-adjusts timestamps when calling Array.getResultSet() Key: CALCITE-5488 URL: https://issues.apache.org/jira/browse/CALCITE-5488 Project: Calcite Issue Type: Bug Reporter: Will Noble If you go into {{AvaticaResultSetConversionsTest}} and delete the [timeZone|https://github.com/apache/calcite-avatica/blob/dbe9b1d8c2e53474eb40cfaf5721aceca3bdb57f/core/src/test/java/org/apache/calcite/avatica/AvaticaResultSetConversionsTest.java#L1121] property for the connection, which is currently set to "GMT", several {{getString()}} and {{getArray()}} tests start to fail. The {{getString()}} failures are self-explanatory -- the expected value has become incorrect because of the time zone change -- but the {{getArray()}} failures are tricky. They're double-adjusting the timestamps: * Once when they call [getObject()|https://github.com/apache/calcite-avatica/blob/810acf80771310431d7ef576f3404299ebb8eaf2/core/src/main/java/org/apache/calcite/avatica/util/AbstractCursor.java#L1432] (which invokes {{getTimestamp()}} with the connection's default calendar, which was previously GMT but is now the system default). * Then, in [ArrayImpl.equalContents()|https://github.com/apache/calcite-avatica/blob/820edf6f653607afb5a2a280a32f315aff1f64cb/core/src/main/java/org/apache/calcite/avatica/util/ArrayImpl.java#L233] the arrays are converted to result sets for traversal via [Array.getResultSet()|https://docs.oracle.com/javase/8/docs/api/java/sql/Array.html#getResultSet--], which uses the same time zone as the "factory" result set and eventually invokes {{getObject()}} again, thus applying the time zone offset twice. This is a bug in the implementation of {{Array.getResultSet()}} that only manifests when the array contains timestamps and when the connection default time zone is anything besides GMT, and this is not currently covered by tests. 
It's easy to cause the failure, however, by changing the connection default time zone for {{AvaticaResultSetConversionsTest}} to anything besides GMT or UTC.
[jira] [Assigned] (CALCITE-5488) Calcite double-adjusts timestamps when calling Array.getResultSet()
[ https://issues.apache.org/jira/browse/CALCITE-5488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Will Noble reassigned CALCITE-5488: --- Assignee: Will Noble > Calcite double-adjusts timestamps when calling Array.getResultSet() > --- > > Key: CALCITE-5488 > URL: https://issues.apache.org/jira/browse/CALCITE-5488 > Project: Calcite > Issue Type: Bug >Reporter: Will Noble >Assignee: Will Noble >Priority: Minor
[jira] [Created] (CALCITE-5487) Avatica Result Sets do not properly handle TIMESTAMP WITH LOCAL TIME ZONE
Will Noble created CALCITE-5487: --- Summary: Avatica Result Sets do not properly handle TIMESTAMP WITH LOCAL TIME ZONE Key: CALCITE-5487 URL: https://issues.apache.org/jira/browse/CALCITE-5487 Project: Calcite Issue Type: Bug Components: avatica Reporter: Will Noble With proper support for {{TIMESTAMP WITH LOCAL TIME ZONE}} being added to Calcite as part of CALCITE-5346, Avatica needs to support the proper semantics for this type. Whereas a regular {{TIMESTAMP}} always has the same string representation but can have different instants depending on the calendar (time zone) passed to the result set getter, a {{TIMESTAMP WITH LOCAL TIME ZONE}} does the inverse: it always represents the same instant, but could have a different string representation depending on the calendar passed to the result set getter.
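The inverse semantics of the two types can be shown with {{java.time}} (a standalone sketch, not Avatica code; the zone choices are arbitrary):

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class TimestampSemanticsDemo {
    public static void main(String[] args) {
        // TIMESTAMP: the string is fixed; the instant depends on the reader's zone.
        LocalDateTime wallClock = LocalDateTime.parse("2023-01-15T08:00:00");
        Instant inUtc = wallClock.atZone(ZoneId.of("UTC")).toInstant();
        Instant inTokyo = wallClock.atZone(ZoneId.of("Asia/Tokyo")).toInstant();
        System.out.println(inUtc.equals(inTokyo)); // false: same string, different instants

        // TIMESTAMP WITH LOCAL TIME ZONE: the instant is fixed; the string depends on the zone.
        Instant instant = Instant.parse("2023-01-15T08:00:00Z");
        String inUtcText = instant.atZone(ZoneId.of("UTC")).toLocalDateTime().toString();
        String inTokyoText = instant.atZone(ZoneId.of("Asia/Tokyo")).toLocalDateTime().toString();
        System.out.println(inUtcText.equals(inTokyoText)); // false: same instant, different strings
    }
}
```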
[jira] [Commented] (CALCITE-5424) Customize handling of literals based on type system
[ https://issues.apache.org/jira/browse/CALCITE-5424?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17678434#comment-17678434 ] Will Noble commented on CALCITE-5424: - Offline, Julian expressed concern about moving the BigQuery Quidem tests to the core parser, citing merge conflicts and confusion. I think it's worth it, because we need the core parser to properly support the BQ dialect. To my understanding, success-condition testing in {{CoreQuidemTest}} implies that the more-permissive Babel parser will also succeed in those cases, but success-condition testing in {{BabelQuidemTest}} does not imply that it will work with the core parser. On the other hand, failure-condition testing has inverted implications. So, I think it makes most sense to have all necessary success conditions tested in the core parser to ensure they're successful in both parsers for posterity. > Customize handling of literals based on type system > --- > > Key: CALCITE-5424 > URL: https://issues.apache.org/jira/browse/CALCITE-5424 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Julian Hyde >Priority: Major > > Currently if you write {{TIMESTAMP '1969-07-20 22:56:00'}} it will be > converted to a value of type {{TIMESTAMP}}. But if someone has set a > custom type mapping where the name {{TIMESTAMP}} corresponds to the > {{TIMESTAMP WITH LOCAL TIME ZONE}} type then the literal value should be of > that type. > Custom type mappings are specified by defining user-defined types in the > schema; see the method > [Schema.getType|https://calcite.apache.org/javadocAggregate/org/apache/calcite/schema/Schema.html#getType(java.lang.String)]. > The contents of the string are currently handled by the parser. But in order > to implement this change, the parsing of that string will need to be deferred > to validation time (when the actual type is known, not just its name). This > applies to literals {{TIMESTAMP}}, {{DATE}}, {{TIME}}, {{ARRAY}}, > {{INTERVAL}}. 
The parser currently emits subclasses of {{SqlLiteral}}, namely > {{SqlTimestampLiteral}}, {{SqlDateLiteral}}, {{SqlTimeLiteral}}, > {{SqlArrayLiteral}}, {{SqlIntervalLiteral}}. But after this change, the > parser will emit a new subclass, {{SqlTaggedLiteral}}, instead of all of these. > Also as part of this change, add BigQuery-compatible literals: > * {{NUMERIC '0'}} > * {{BIGNUMERIC '0'}} > * {{DATETIME '2014-09-27 12:30:00.45'}} > * {{JSON 'json_formatted_data'}} > > And a {{TIMESTAMP WITH LOCAL TIME ZONE}} literal: > * {{TIMESTAMP WITH LOCAL TIME ZONE '1969-07-21 02:56:00'}}
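Deferring literal parsing from parse time to validation time can be sketched roughly as follows (hypothetical class and method names, not the actual {{SqlTaggedLiteral}} API; the {{Map}} stands in for the type system's name mapping):

```java
import java.sql.Timestamp;
import java.util.Map;

public class TaggedLiteralDemo {
    // Parse-time artifact: only the tag and the raw string, no type resolution.
    record TaggedLiteral(String tag, String value) {}

    // Validation-time resolution: the type system may map the name to a
    // different concrete type before the string's content is parsed.
    static Object resolve(TaggedLiteral lit, Map<String, String> typeSystem) {
        String concreteType = typeSystem.getOrDefault(lit.tag(), lit.tag());
        return switch (concreteType) {
            case "TIMESTAMP" -> Timestamp.valueOf(lit.value());
            default -> lit.value(); // other types elided in this sketch
        };
    }

    public static void main(String[] args) {
        // The parser sees only "TIMESTAMP" + the string; the validator decides
        // what that name means for this schema's type system.
        TaggedLiteral lit = new TaggedLiteral("TIMESTAMP", "1969-07-20 22:56:00");
        System.out.println(resolve(lit, Map.of()));
    }
}
```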
[jira] [Created] (CALCITE-5445) Avatica should display Measures as the proper DATA_TYPE
Will Noble created CALCITE-5445: --- Summary: Avatica should display Measures as the proper DATA_TYPE Key: CALCITE-5445 URL: https://issues.apache.org/jira/browse/CALCITE-5445 Project: Calcite Issue Type: Bug Components: avatica Reporter: Will Noble Assignee: Julian Hyde Filing this for tracking purposes. According to Julian: For a measure of type {{FLOAT}} we would return {{DATA_TYPE = java.sql.Types.FLOAT}} and {{TYPE_NAME = "Measure"}}. We should not be returning {{DATA_TYPE = java.sql.Types.OTHER}}.
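The desired metadata shape can be sketched as below ({{ColumnInfo}} and {{describeMeasure}} are hypothetical stand-ins for one row of {{DatabaseMetaData.getColumns()}}, not Avatica API):

```java
import java.sql.Types;

public class MeasureMetadataDemo {
    // Hypothetical stand-in for one row of DatabaseMetaData.getColumns().
    record ColumnInfo(int dataType, String typeName) {}

    // A measure over FLOAT should report the underlying JDBC type as
    // DATA_TYPE and reserve "Measure" for TYPE_NAME, not Types.OTHER.
    static ColumnInfo describeMeasure(int underlyingType) {
        return new ColumnInfo(underlyingType, "Measure");
    }

    public static void main(String[] args) {
        ColumnInfo c = describeMeasure(Types.FLOAT);
        System.out.println(c.dataType() == Types.FLOAT); // true
        System.out.println(c.typeName()); // Measure
    }
}
```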
[jira] [Commented] (CALCITE-5180) Implement BigQuery functions for DATE, TIME, TIMESTAMP, DATETIME
[ https://issues.apache.org/jira/browse/CALCITE-5180?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637515#comment-17637515 ] Will Noble commented on CALCITE-5180: - This is the first I'm hearing of extreme BQ compatibility. Grepping case-insensitively for {{EBQC}} and {{'extreme.*compatibility'}} both turned up nothing in Calcite. Is this an existing setting? I'm assuming you're going with Oracle's definitions of these types, in which case they have a handy guide to [picking which type to use|https://docs.oracle.com/cd/E11882_01/server.112/e10729/ch4datetime.htm#NLSPG244]. Here's my take: * {{TIMESTAMP WITH TIME ZONE}} is clearly not appropriate because BigQuery timestamps have no time zone. * {{TIMESTAMP WITH LOCAL TIME ZONE}} "is appropriate when the original time zone is of no interest, but the relative times of events are important and daylight saving adjustment does not have to be accurate. The time zone conversion that this data type performs to and from the database time zone is asymmetrical, due to the daylight saving adjustment. Because this data type does not preserve the time zone information, it does not distinguish values near the adjustment in fall, whether they are daylight saving time or standard time. This confusion between distinct instants can cause an application to behave unexpectedly, especially if the adjustment takes place during the normal working hours of a user." -- this also doesn't seem appropriate to me. * {{TIMESTAMP}} seems most appropriate to me. In Oracle, this is "the time of an event without the time zone", and according to the [BigQuery docs|https://cloud.google.com/bigquery/docs/reference/standard-sql/timestamp_functions#timezone_definitions], a {{TIMESTAMP}} "represents an absolute point in time, independent of any time zone". Seems like a perfect fit. It's also very straightforward. I believe we should add a new datatype to represent {{DATETIME}}. 
Neither it, nor BQ's {{TIMESTAMP}}, maps naturally to any of the time zone-related datetime types in the Oracle docs. > Implement BigQuery functions for DATE, TIME, TIMESTAMP, DATETIME > > > Key: CALCITE-5180 > URL: https://issues.apache.org/jira/browse/CALCITE-5180 > Project: Calcite > Issue Type: Bug >Reporter: Julian Hyde >Assignee: Will Noble >Priority: Major > Labels: pull-request-available > Time Spent: 3h 20m > Remaining Estimate: 0h > > Implement missing BigQuery functions for DATE, TIME, TIMESTAMP, DATETIME. > Functions include: > * CURRENT_DATE, CURRENT_DATETIME, CURRENT_TIME with timezone; > * EXTRACT with time unit values WEEK(weekday), DATE, TIME; > * various DATE, TIME, TIMESTAMP, DATETIME constructor functions; > * DATE_ADD, DATETIME_ADD, TIME_ADD, TIMESTAMP_ADD; > * DATE_SUB, DATETIME_SUB, TIME_SUB, TIMESTAMP_SUB; > * DATE_DIFF, DATETIME_DIFF, TIME_DIFF, TIMESTAMP_DIFF; > * DATE_TRUNC, DATETIME_TRUNC, TIME_TRUNC, TIMESTAMP_TRUNC; > * LAST_DAY; > * FORMAT_DATE, FORMAT_DATETIME, FORMAT_TIME, FORMAT_TIMESTAMP; > * PARSE_DATE, PARSE_DATETIME, PARSE_TIME, PARSE_TIMESTAMP. > The following functions are already implemented: CURRENT_DATETIME, > DATE(string), DATE_FROM_UNIX_DATE, TIMESTAMP_MICROS, TIMESTAMP_MILLIS, > TIMESTAMP_SECONDS, UNIX_MICROS, UNIX_MILLIS, UNIX_SECONDS, UNIX_DATE.
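The fall-back ambiguity that makes {{TIMESTAMP WITH LOCAL TIME ZONE}} lossy, and that an absolute instant avoids, can be demonstrated with {{java.time}} (a standalone sketch, not Calcite code):

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class DstAmbiguityDemo {
    public static void main(String[] args) {
        ZoneId eastern = ZoneId.of("America/New_York");
        // US Eastern clocks fell back at 02:00 on 2022-11-06, so the local
        // reading 01:30 occurred twice: once in EDT, once in EST.
        LocalDateTime wallClock = LocalDateTime.parse("2022-11-06T01:30:00");
        ZonedDateTime earlier = wallClock.atZone(eastern);        // 01:30 EDT
        ZonedDateTime later = earlier.withLaterOffsetAtOverlap(); // 01:30 EST
        // Two real instants an hour apart collapse to one local value:
        System.out.println(later.toInstant().getEpochSecond()
            - earlier.toInstant().getEpochSecond()); // 3600
        // A BigQuery-style absolute TIMESTAMP (an instant) keeps them distinct:
        System.out.println(earlier.toInstant().equals(later.toInstant())); // false
    }
}
```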