[PR] [SPARK-56876][SQL][4.X] Add TimestampNTZNanosType and TimestampLTZNanosType [spark]

via GitHub Mon, 25 May 2026 07:22:56 -0700


MaxGekk opened a new pull request, #56099:
URL: https://github.com/apache/spark/pull/56099


   ### What changes were proposed in this pull request?
   
   This is a backport of https://github.com/apache/spark/pull/55952 to 
branch-4.x.
   
   In this PR, I propose extending the Spark SQL type system by adding two new 
classes to the Scala/Java APIs:
   
   * TimestampNTZNanosType(p) represents the SQL data type TIMESTAMP_NTZ(p)
   * TimestampLTZNanosType(p) represents the SQL data type TIMESTAMP_LTZ(p)
   
   They are public API entry points only, and have no SQL/DDL/datasource 
integration in this PR.
   
   The classes align with the SQL standard’s direction for optional feature 
F555, “Enhanced seconds precision”: datetime types can carry fractional seconds 
in the SECOND field with precision p beyond the traditional six decimal places 
(microseconds). Here p is restricted to 7, 8, or 9, i.e. the nanosecond-capable 
band of up to nine fractional digits.
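   
   As a rough illustration of what a precision p in the 7 to 9 band means for 
the fractional seconds, the sketch below validates p and reduces a 
nanosecond-of-second value to p digits. The class and method names are 
hypothetical and are not part of this PR or of Spark, and the choice of 
truncation (rather than rounding) is an assumption made only for the example.
   
   ```java
   // Hypothetical helper for illustration only; not a Spark class.
   final class FractionalPrecisionSketch {
       static final int MIN_PRECISION = 7;
       static final int MAX_PRECISION = 9;

       // Rejects any precision outside the 7..9 band described above.
       static int checkPrecision(int p) {
           if (p < MIN_PRECISION || p > MAX_PRECISION) {
               throw new IllegalArgumentException(
                   "precision must be in [7, 9], got " + p);
           }
           return p;
       }

       // Keeps only the first p fractional digits of a nanosecond-of-second
       // value (0..999_999_999). Truncation is an assumption of this sketch.
       static int truncateNanos(int nanosOfSecond, int p) {
           checkPrecision(p);
           int unit = 1;
           for (int i = p; i < MAX_PRECISION; i++) {
               unit *= 10; // unit ends up as 10^(9 - p)
           }
           return nanosOfSecond - (nanosOfSecond % unit);
       }

       public static void main(String[] args) {
           System.out.println(truncateNanos(123_456_789, 7)); // 123456700
           System.out.println(truncateNanos(123_456_789, 8)); // 123456780
           System.out.println(truncateNanos(123_456_789, 9)); // 123456789
       }
   }
   ```
   
   With p = 7 the smallest representable step is 100 nanoseconds, with p = 8 it 
is 10 nanoseconds, and with p = 9 it is a single nanosecond.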
   
   The logical layout documented on the classes follows from this precision: 
epoch microseconds plus the nanoseconds within that microsecond, with a default 
estimated width of 10 bytes for planning (8 + 2).
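   
   The layout above can be sketched as a split of an epoch-nanosecond count 
into two parts. This is only an illustration: the class and method names are 
hypothetical, and reading "8 + 2" as a long plus a short is an assumption, not 
something this description confirms about the actual representation.
   
   ```java
   // Hypothetical illustration of the "epoch micros + nanos within the micro"
   // layout; not a Spark class.
   final class NanosLayoutSketch {
       static final long NANOS_PER_MICRO = 1_000L;

       // Assumed reading of "8 + 2": a long (8 bytes) plus a short (2 bytes).
       static final int ESTIMATED_WIDTH_BYTES = Long.BYTES + Short.BYTES;

       // Floor division keeps pre-epoch (negative) instants consistent.
       static long epochMicros(long epochNanos) {
           return Math.floorDiv(epochNanos, NANOS_PER_MICRO);
       }

       // Always in 0..999, so it fits comfortably in a short.
       static short nanosWithinMicro(long epochNanos) {
           return (short) Math.floorMod(epochNanos, NANOS_PER_MICRO);
       }

       // Recombines the two parts; throws ArithmeticException on overflow.
       static long toEpochNanos(long epochMicros, short nanosWithinMicro) {
           return Math.addExact(
               Math.multiplyExact(epochMicros, NANOS_PER_MICRO),
               (long) nanosWithinMicro);
       }

       public static void main(String[] args) {
           long nanos = -1L; // one nanosecond before the epoch
           System.out.println(epochMicros(nanos));      // -1
           System.out.println(nanosWithinMicro(nanos)); // 999
           System.out.println(ESTIMATED_WIDTH_BYTES);   // 10
       }
   }
   ```
   
   Using floor division rather than plain `/` and `%` keeps the 
nanosecond part non-negative for instants before the epoch, so the pair 
recombines to the original value.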
   
   Parameterless timestamp_ntz / timestamp_ltz are unchanged and remain the 
existing microsecond-oriented types.
   
   ### Why are the changes needed?
   
   The new timestamp types are useful for Spark SQL users because they make it 
possible to:
   
   1. Represent timestamp without time zone and timestamp with local time zone 
with fractional-second precision 7–9, in line with SQL optional feature F555 
(Enhanced seconds precision).  
   2. Describe schemas from other systems that already use nanosecond-capable 
timestamps, without overloading the microsecond timestamp_ntz / timestamp_ltz 
types.  
   3. Migrate SQL and metadata that distinguish NTZ and LTZ at sub-microsecond 
precision toward Spark in small, reviewable steps.  
   4. Prepare later work to read and write such columns from datasources and 
JDBC, and to apply optimizations that depend on precise timestamp types.
   
   ### Does this PR introduce *any* user-facing change?
   
   Yes. The public API gains two new types in org.apache.spark.sql.types. They 
cannot yet be used in DataFrames, in schemas read from datasources, or in SQL 
DDL.
   
   ### How was this patch tested?
   
   By extending DataTypeSuite (round-trip and precision bounds for the new 
types, including invalid precisions).
   ```
   $ build/sbt "test:testOnly *DataTypeSuite"
   ```
   In addition, SparkThrowableSuite / error-json validation applies if 
error-conditions.json is updated.
   
   ### Was this patch authored or co-authored using generative AI tooling?
   Generated-by: Claude Opus 4.7
   
   Authored-by: Maxim Gekk <[email protected]>
   (cherry picked from commit 1e59b7b49b14f85f7409911e7b70169c1c085dda)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

