[ 
https://issues.apache.org/jira/browse/SPARK-35662?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Gengliang Wang resolved SPARK-35662.
------------------------------------
    Fix Version/s: 3.4.0
       Resolution: Fixed

> Support Timestamp without time zone data type
> ---------------------------------------------
>
>                 Key: SPARK-35662
>                 URL: https://issues.apache.org/jira/browse/SPARK-35662
>             Project: Spark
>          Issue Type: New Feature
>          Components: SQL
>    Affects Versions: 3.4.0
>            Reporter: Gengliang Wang
>            Assignee: Apache Spark
>            Priority: Major
>             Fix For: 3.4.0
>
>
> Spark SQL today supports the TIMESTAMP data type. However, the semantics 
> provided actually match TIMESTAMP WITH LOCAL TIME ZONE as defined by Oracle. 
> Timestamps embedded in a SQL query or passed through JDBC are presumed to be 
> in the session-local time zone and are converted to UTC before being processed.
>  These are desirable semantics in many cases, such as when dealing with 
> calendars.
>  In many (more) other cases, such as when dealing with log files, it is 
> desirable that the provided timestamps not be altered.
>  SQL users expect that they can model either behavior and do so by using 
> TIMESTAMP WITHOUT TIME ZONE for time zone insensitive data and TIMESTAMP WITH 
> LOCAL TIME ZONE for time zone sensitive data.
>  Most traditional RDBMSs map TIMESTAMP to TIMESTAMP WITHOUT TIME ZONE, and 
> their users will be surprised to see TIMESTAMP WITH LOCAL TIME ZONE, a feature 
> that does not exist in the standard.
> In this new feature, we will introduce TIMESTAMP WITH LOCAL TIME ZONE to 
> describe the existing timestamp type and add TIMESTAMP WITHOUT TIME ZONE for 
> standard semantics.
>  Using these two types will provide clarity.
>  We will also allow users to set the default behavior for TIMESTAMP to either 
> use TIMESTAMP WITH LOCAL TIME ZONE or TIMESTAMP WITHOUT TIME ZONE.
> h3. Milestone 1 – Spark Timestamp equivalency (the new Timestamp type 
> TimestampWithoutTZ meets or exceeds all functionality of the existing SQL 
> Timestamp):
>  * Add a new DataType implementation for TimestampWithoutTZ.
>  * Support TimestampWithoutTZ in Dataset/UDF.
>  * TimestampWithoutTZ literals
>  * TimestampWithoutTZ arithmetic (e.g. TimestampWithoutTZ - 
> TimestampWithoutTZ, TimestampWithoutTZ - Date)
>  * Datetime functions/operators: dayofweek, weekofyear, year, etc.
>  * Cast to and from TimestampWithoutTZ, cast String/Timestamp to 
> TimestampWithoutTZ, cast TimestampWithoutTZ to string (pretty 
> printing)/Timestamp, with the SQL syntax to specify the types
>  * Support sorting TimestampWithoutTZ.
> h3. Milestone 2 – Persistence:
>  * Ability to create tables with columns of type TimestampWithoutTZ
>  * Ability to write to common file formats such as Parquet and JSON.
>  * INSERT, SELECT, UPDATE, MERGE
>  * Discovery
> h3. Milestone 3 – Client support
>  * JDBC support
>  * Hive Thrift server
> h3. Milestone 4 – PySpark and Spark R integration
>  * Python UDF can take and return TimestampWithoutTZ
>  * DataFrame support
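
The distinction the description draws between the two timestamp semantics can be sketched outside Spark. The following is a minimal illustration using only the Python standard library, not Spark's implementation: the function names, the sample literal, and the fixed UTC offsets standing in for a session time zone are all assumptions made for this example.

```python
from datetime import datetime, timedelta, timezone

FORMAT = "%Y-%m-%d %H:%M:%S"
LITERAL = "2021-06-07 12:00:00"  # hypothetical value, e.g. taken from a log line


def parse_with_local_time_zone(text: str, session_offset_hours: int) -> datetime:
    """Model TIMESTAMP WITH LOCAL TIME ZONE semantics (illustrative only).

    The text is interpreted in the session time zone (approximated here
    by a fixed UTC offset) and then normalized to a UTC instant.
    """
    session_tz = timezone(timedelta(hours=session_offset_hours))
    local = datetime.strptime(text, FORMAT).replace(tzinfo=session_tz)
    return local.astimezone(timezone.utc)


def parse_without_time_zone(text: str) -> datetime:
    """Model TIMESTAMP WITHOUT TIME ZONE semantics (illustrative only).

    The wall-clock fields are kept exactly as written; no zone is attached.
    """
    return datetime.strptime(text, FORMAT)


# The same literal read under two different session offsets yields
# two different instants under the "local time zone" model.
ltz_west = parse_with_local_time_zone(LITERAL, -7)
ltz_east = parse_with_local_time_zone(LITERAL, 9)

# Under the "without time zone" model the value does not depend on the session.
ntz = parse_without_time_zone(LITERAL)

print(ltz_west.isoformat())
print(ltz_east.isoformat())
print(ntz.isoformat())
```

In this sketch the two session-dependent results differ by the gap between the assumed offsets, while the zone-free value keeps the original wall-clock fields, which is the behavior the description calls desirable for log data.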



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
