[ https://issues.apache.org/jira/browse/SPARK-35662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Gengliang Wang resolved SPARK-35662.
------------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

> Support Timestamp without time zone data type
> ---------------------------------------------
>
>                 Key: SPARK-35662
>                 URL: https://issues.apache.org/jira/browse/SPARK-35662
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Gengliang Wang
>            Assignee: Apache Spark
>            Priority: Major
>             Fix For: 3.4.0
>
>
> Spark SQL today supports the TIMESTAMP data type. However, the semantics
> it provides actually match TIMESTAMP WITH LOCAL TIME ZONE as defined by
> Oracle. Timestamps embedded in a SQL query or passed through JDBC are
> presumed to be in the session local time zone and are converted to UTC
> before being processed.
> These are desirable semantics in many cases, such as when dealing with
> calendars.
> In many (more) other cases, such as when dealing with log files, it is
> desirable that the provided timestamps not be altered.
> SQL users expect that they can model either behavior, and they do so by
> using TIMESTAMP WITHOUT TIME ZONE for time zone insensitive data and
> TIMESTAMP WITH LOCAL TIME ZONE for time zone sensitive data.
> Most traditional RDBMS map TIMESTAMP to TIMESTAMP WITHOUT TIME ZONE, so
> their users will be surprised to see TIMESTAMP WITH LOCAL TIME ZONE, a
> feature that does not exist in the standard.
> In this new feature, we will introduce TIMESTAMP WITH LOCAL TIME ZONE to
> describe the existing timestamp type and add TIMESTAMP WITHOUT TIME ZONE
> for the standard semantics.
> Using these two types will provide clarity.
> We will also allow users to set the default behavior for TIMESTAMP to either
> use TIMESTAMP WITH LOCAL TIME ZONE or TIMESTAMP WITHOUT TIME ZONE.
> h3. Milestone 1 – Spark Timestamp equivalency (the new Timestamp type
> TimestampWithoutTZ meets or exceeds all functionality of the existing SQL
> Timestamp):
> * Add a new DataType implementation for TimestampWithoutTZ.
> * Support TimestampWithoutTZ in Dataset/UDF.
> * TimestampWithoutTZ literals
> * TimestampWithoutTZ arithmetic (e.g. TimestampWithoutTZ -
> TimestampWithoutTZ, TimestampWithoutTZ - Date)
> * Datetime functions/operators: dayofweek, weekofyear, year, etc.
> * Cast to and from TimestampWithoutTZ: cast String/Timestamp to
> TimestampWithoutTZ, cast TimestampWithoutTZ to String (pretty
> printing)/Timestamp, with the SQL syntax to specify the types
> * Support sorting TimestampWithoutTZ.
> h3. Milestone 2 – Persistence:
> * Ability to create tables of type TimestampWithoutTZ
> * Ability to write to common file formats such as Parquet and JSON.
> * INSERT, SELECT, UPDATE, MERGE
> * Discovery
> h3. Milestone 3 – Client support
> * JDBC support
> * Hive Thrift server
> h3. Milestone 4 – PySpark and SparkR integration
> * Python UDF can take and return TimestampWithoutTZ
> * DataFrame support
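
The difference between the two semantics in the description can be sketched in plain Python, using only the standard library. This is an illustration of the behavior, not Spark code: the helper names and the two fixed-offset "session time zones" are hypothetical and chosen only for the example.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical session time zones; fixed offsets keep the sketch self-contained.
SESSION_A = timezone(timedelta(hours=-8))
SESSION_B = timezone(timedelta(hours=9))
FMT = "%Y-%m-%d %H:%M:%S"


def parse_with_local_time_zone(text: str, session_tz: timezone) -> datetime:
    """Read the text as wall-clock time in the session zone, then normalize to UTC."""
    wall_clock = datetime.fromisoformat(text)
    return wall_clock.replace(tzinfo=session_tz).astimezone(timezone.utc)


def render_with_local_time_zone(instant: datetime, session_tz: timezone) -> str:
    """Show a stored instant as wall-clock time in the reader's session zone."""
    return instant.astimezone(session_tz).strftime(FMT)


def parse_without_time_zone(text: str) -> datetime:
    """Keep the wall-clock fields exactly as written; no zone is attached."""
    return datetime.fromisoformat(text)


text = "2021-06-07 12:00:00"

# WITH LOCAL TIME ZONE: the stored value is an instant, so what a reader
# sees depends on the session time zone in effect when reading.
ltz = parse_with_local_time_zone(text, SESSION_A)
ltz_in_a = render_with_local_time_zone(ltz, SESSION_A)
ltz_in_b = render_with_local_time_zone(ltz, SESSION_B)

# WITHOUT TIME ZONE: the value is the same for every reader.
ntz = parse_without_time_zone(text)
ntz_rendered = ntz.strftime(FMT)

print(ltz_in_a)      # 2021-06-07 12:00:00
print(ltz_in_b)      # 2021-06-08 05:00:00
print(ntz_rendered)  # 2021-06-07 12:00:00
```

A log line written as 12:00:00 stays 12:00:00 under the second semantics, which is the "not be altered" case from the description; under the first, the same text moves to the next day for a reader in the +09:00 session.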