[ 
https://issues.apache.org/jira/browse/CALCITE-6055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Will Noble updated CALCITE-6055:
--------------------------------
    Description: 
Background:

In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour, 
minute, and second, henceforth referred to as clock-calendar parameters. This 
does not define an unambiguous instant in time, since timestamps in different 
time zones can have different parameters at the same instant. It is analogous 
to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}} 
type a different definition: “an absolute point in time, independent of any 
time zone or convention such as Daylight Savings Time.”

ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}} 
simply attaches a time zone to the existing {{TIMESTAMP}} type, making it both 
a set of clock-calendar parameters and an unambiguous instant in time. The 
final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle. According 
to Oracle, it “is normalized to the database time zone, and the time zone 
offset is not stored as part of the column data. When users retrieve the data, 
Oracle returns it in the users' local session time zone.” This act of 
conversion between the local time zone and the database time zone means that 
users interacting with this kind of timestamp will observe the same instant in 
time even though they see different clock-calendar parameters. However, the 
local time zone is not considered “part of the data”. Thus, it has the 
semantics of an un-zoned, absolute point in time, analogous to a BigQuery 
{{TIMESTAMP}}.

This creates the confusing situation where a BigQuery {{DATETIME}} is actually 
an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an ISO 
{{TIMESTAMP WITH LOCAL TIME ZONE}}.

Problem:

This is very similar to CALCITE-5424, which only applied to literals (e.g. 
{{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the 
{{DateTimeTypeName}} 
construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019]
 which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS 
TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a 
{{DATETIME}} type name, and use {{SqlUserDefinedTypeNameSpec}} for both that 
and the {{TIMESTAMP}} type names. During validation and sql-to-rel conversion, 
we can look up the type names in the UDT map. If no mapping is found for 
{{DATETIME}} (which would occur for any dialect besides BigQuery), an error 
would occur during validation, because that type does not exist in that 
dialect. If no mapping is found for {{TIMESTAMP}}, we would fall back on the 
canonical {{TIMESTAMP}} type, preserving the existing behavior for all other 
dialects.

In summary:

{{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}} should produce a {{TIMESTAMP WITH 
LOCAL TIME ZONE}} for BigQuery, but a regular {{TIMESTAMP}} for all other 
dialects.
{{CAST('2023-10-17 12:00:00' AS DATETIME)}} should produce a {{TIMESTAMP}} for 
BigQuery, but throw a validation error for all other dialects.

  was:
Background:

In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour, 
minute, and second, henceforth referred to as clock-calendar parameters. This 
does not define an unambiguous instant in time, since timestamps in different 
time zones can have different parameters at the same instant. It is analogous 
to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}} 
type a different definition: “an absolute point in time, independent of any 
time zone or convention such as Daylight Savings Time.”

ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}} 
simply attaches a time zone to the existing {{TIMESTAMP}} type, making it both 
a set of clock-calendar parameters and an unambiguous instant in time. The 
final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle. According 
to Oracle, it “is normalized to the database time zone, and the time zone 
offset is not stored as part of the column data. When users retrieve the data, 
Oracle returns it in the users' local session time zone.” This act of 
conversion between the local time zone and the database time zone means that 
users interacting with this kind of timestamp will observe the same instant in 
time even though they see different clock-calendar parameters. However, the 
local time zone is not considered “part of the data”. Thus, it has the 
semantics of an un-zoned, absolute point in time, analogous to a BigQuery 
{{TIMESTAMP}}.

This creates the confusing situation where a BigQuery {{DATETIME}} is actually 
an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an ISO 
{{TIMESTAMP WITH LOCAL TIME ZONE}}.

Problem:

This is very similar to CALCITE-5424, which only applied to literals (e.g. 
{{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the 
{{DateTimeTypeName}} 
construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019]
 which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS 
TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a 
{{DATETIME}} type name, and use {{SqlUserDefinedTypeNameSpec}} for both that 
and the {{TIMESTAMP}} type names. During validation and sql-to-rel conversion, 
we can look up the type names in the UDT map. If no mapping is found for 
{{DATETIME}} (which would occur for any dialect besides BigQuery), an error 
would occur during validation, because that type does not exist in that 
dialect. If no mapping is found for {{TIMESTAMP}}, we would fall back on the 
canonical {{TIMESTAMP}} type, preserving the existing behavior for all other 
dialects.


> Customize handling of type name based on type system
> ----------------------------------------------------
>
>                 Key: CALCITE-6055
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6055
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Will Noble
>            Priority: Minor
>              Labels: pull-request-available
>
> Background:
> In ISO SQL, a {{TIMESTAMP}} comprises the parameters year, month, day, hour, 
> minute, and second, henceforth referred to as clock-calendar parameters. This 
> does not define an unambiguous instant in time, since timestamps in different 
> time zones can have different parameters at the same instant. It is analogous 
> to a BigQuery {{DATETIME}}. BigQuery chose to give its native {{TIMESTAMP}} 
> type a different definition: “an absolute point in time, independent of any 
> time zone or convention such as Daylight Savings Time.”
> ISO SQL has 2 other {{TIMESTAMP}} types. The {{TIMESTAMP WITH TIME ZONE}} 
> simply attaches a time zone to the existing {{TIMESTAMP}} type, making it 
> both a set of clock-calendar parameters and an unambiguous instant in time. 
> The final type -- {{TIMESTAMP WITH LOCAL TIME ZONE}} -- is more subtle. 
> According to Oracle, it “is normalized to the database time zone, and the 
> time zone offset is not stored as part of the column data. When users 
> retrieve the data, Oracle returns it in the users' local session time zone.” 
> This act of conversion between the local time zone and the database time zone 
> means that users interacting with this kind of timestamp will observe the 
> same instant in time even though they see different clock-calendar 
> parameters. However, the local time zone is not considered “part of the 
> data”. Thus, it has the semantics of an un-zoned, absolute point in time, 
> analogous to a BigQuery {{TIMESTAMP}}.
> This creates the confusing situation where a BigQuery {{DATETIME}} is 
> actually an ISO {{TIMESTAMP}}, and a BigQuery {{TIMESTAMP}} is actually an 
> ISO {{TIMESTAMP WITH LOCAL TIME ZONE}}.
> Problem:
> This is very similar to CALCITE-5424, which only applied to literals (e.g. 
> {{TIMESTAMP '2023-10-17 12:00:00'}}). We need a similar solution for [the 
> {{DateTimeTypeName}} 
> construction|https://github.com/apache/calcite/blob/6f79436c178beec639e559d9152c237bbf8ec3e8/core/src/main/codegen/templates/Parser.jj#L6019]
>  which is used in other contexts (e.g. {{CAST('2023-10-17 12:00:00' AS 
> TIMESTAMP)}}). Seems like we can use a very similar approach: introduce a 
> {{DATETIME}} type name, and use {{SqlUserDefinedTypeNameSpec}} for both that 
> and the {{TIMESTAMP}} type names. During validation and sql-to-rel 
> conversion, we can look up the type names in the UDT map. If no mapping is 
> found for {{DATETIME}} (which would occur for any dialect besides BigQuery), 
> an error would occur during validation, because that type does not exist in 
> that dialect. If no mapping is found for {{TIMESTAMP}}, we would fall back on 
> the canonical {{TIMESTAMP}} type, preserving the existing behavior for all 
> other dialects.
> In summary:
> {{CAST('2023-10-17 12:00:00' AS TIMESTAMP)}} should produce a {{TIMESTAMP 
> WITH LOCAL TIME ZONE}} for BigQuery, but a regular {{TIMESTAMP}} for all 
> other dialects.
> {{CAST('2023-10-17 12:00:00' AS DATETIME)}} should produce a {{TIMESTAMP}} 
> for BigQuery, but throw a validation error for all other dialects.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to