[ 
https://issues.apache.org/jira/browse/SPARK-57163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Max Gekk reassigned SPARK-57163:
--------------------------------

    Assignee: Max Gekk

> Map TIMESTAMP_LTZ(6) and TIMESTAMP_NTZ(6) to TimestampType and TimestampNTZType
> -------------------------------------------------------------------------------
>
>                 Key: SPARK-57163
>                 URL: https://issues.apache.org/jira/browse/SPARK-57163
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>    Affects Versions: 4.3.0
>            Reporter: Max Gekk
>            Assignee: Max Gekk
>            Priority: Minor
>              Labels: pull-request-available, starter
>
> h2. What
> Map the microsecond fractional precision {{6}} of the parameterized timestamp
> spellings to the existing GA timestamp types:
> * {{TIMESTAMP_NTZ(6)}} -> {{TimestampNTZType}}
> * {{TIMESTAMP_LTZ(6)}} -> {{TimestampType}}
> * {{TIMESTAMP(6) WITHOUT TIME ZONE}} -> {{TimestampNTZType}}
> * {{TIMESTAMP(6) WITH LOCAL TIME ZONE}} -> {{TimestampType}}
> * {{TIMESTAMP(6)}} -> the session default type ({{spark.sql.timestampType}})
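>
> The mapping above can be sketched as a small, self-contained model. This is
> an illustration only: {{Zone}}, {{Resolved}} and {{resolve}} are hypothetical
> stand-ins, not Spark classes and not the actual parser code. How bare
> {{TIMESTAMP(p)}} behaves for {{p}} in [7, 9] is an assumption here.
> {code:scala}
> object TimestampPrecisionSketch {
>   // How the type was spelled in SQL (hypothetical names).
>   sealed trait Zone
>   case object Ltz extends Zone   // TIMESTAMP_LTZ(p) / TIMESTAMP(p) WITH LOCAL TIME ZONE
>   case object Ntz extends Zone   // TIMESTAMP_NTZ(p) / TIMESTAMP(p) WITHOUT TIME ZONE
>   case object Bare extends Zone  // TIMESTAMP(p), follows spark.sql.timestampType
>
>   // Stand-ins for the types and the error named in this ticket.
>   sealed trait Resolved
>   case object MicrosLtz extends Resolved               // models TimestampType
>   case object MicrosNtz extends Resolved               // models TimestampNTZType
>   final case class NanosLtz(p: Int) extends Resolved   // models TimestampLTZNanosType
>   final case class NanosNtz(p: Int) extends Resolved   // models TimestampNTZNanosType
>   final case class Invalid(p: Int) extends Resolved    // models INVALID_TIMESTAMP_PRECISION
>
>   def resolve(p: Int, zone: Zone, sessionDefaultIsNtz: Boolean): Resolved = {
>     // Pick the NTZ or LTZ family first; the bare spelling uses the session default.
>     val ntz = zone match {
>       case Ntz  => true
>       case Ltz  => false
>       case Bare => sessionDefaultIsNtz
>     }
>     p match {
>       case 6                     => if (ntz) MicrosNtz else MicrosLtz      // this ticket
>       case n if n >= 7 && n <= 9 => if (ntz) NanosNtz(n) else NanosLtz(n)  // unchanged
>       case n                     => Invalid(n)                             // still rejected
>     }
>   }
> }
> {code}
> For example, {{resolve(6, Bare, sessionDefaultIsNtz = true)}} yields
> {{MicrosNtz}} in this model, while {{resolve(10, Ltz, sessionDefaultIsNtz =
> false)}} yields {{Invalid(10)}}.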
> h2. Why
> This is a sub-task of SPARK-56822 (SPIP: Timestamps with nanosecond
> precision). The SPIP introduces nanosecond-capable types
> {{TimestampNTZNanosType}} and {{TimestampLTZNanosType}} for fractional
> precision {{p}} in [7, 9]. Microsecond precision ({{p}} = 6) is exactly what
> the existing {{TimestampType}} and {{TimestampNTZType}} already model. Today,
> however, {{TIMESTAMP_NTZ(6)}} / {{TIMESTAMP_LTZ(6)}} are rejected with
> {{INVALID_TIMESTAMP_PRECISION}}, which is surprising: an explicit {{(6)}}
> should be accepted and resolve to the equivalent microsecond type, giving
> users a consistent precision model where {{p}} = 6 means microseconds.
> h2. Current behavior
> Parsing {{TIMESTAMP_NTZ(6)}} or {{TIMESTAMP_LTZ(6)}} throws:
> {code}
> [INVALID_TIMESTAMP_PRECISION] ... precision 6 ...
> {code}
> because {{TimestampNTZNanosType}} / {{TimestampLTZNanosType}} only allow
> precision in [7, 9].
> h2. Scope / where to change
> Two parsing surfaces route the precision and both must map 6 to the
> microsecond types:
> # {{sql/api/.../catalyst/parser/DataTypeAstBuilder.scala}} - the SQL DDL
>   parser. Methods {{parseTimestampLtzNanosPrecision}} and
>   {{parseTimestampNtzNanosPrecision}} (these also back the bare and zoned
>   {{TIMESTAMP(p)}} cases).
> # {{sql/api/.../sql/types/DataType.scala}} - {{nameToType}}, the
>   {{typeName}}/JSON-string parser ({{TIMESTAMP_LTZ_NANOS_TYPE}} and
>   {{TIMESTAMP_NTZ_NANOS_TYPE}} branches), so that string round-trips such as
>   {{timestamp_ntz(6)}} resolve consistently.
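>
> For the second surface, the intended string handling might look roughly like
> the sketch below. It is a standalone model, not the real {{nameToType}}
> code: {{TypeNameSketch}} and {{canonicalName}} are hypothetical, and only
> lowercase names such as {{timestamp_ntz(6)}} are considered. The returned
> strings are the {{typeName}} values of {{TimestampNTZType}} and
> {{TimestampType}}.
> {code:scala}
> object TypeNameSketch {
>   // Matches the whole string, e.g. "timestamp_ntz(6)" or "timestamp_ltz(6)".
>   private val Parameterized = """timestamp_(ntz|ltz)\((\d+)\)""".r
>
>   // Some(typeName of the microsecond type) for p = 6, None for anything
>   // this sketch does not cover (nanos precisions, invalid precisions, ...).
>   def canonicalName(name: String): Option[String] = name match {
>     case Parameterized("ntz", "6") => Some("timestamp_ntz")
>     case Parameterized("ltz", "6") => Some("timestamp")
>     case _                         => None
>   }
> }
> {code}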
> h2. Out of scope
> * Precision {{p}} in [0, 5]. These imply rounding/truncation semantics that
>   are not modeled yet and are left for a separate follow-up. Keep this task
>   to {{p}} = 6 only.
> * Any change to the nanosecond-capable types ({{p}} in [7, 9]).
> h2. Open design decisions (please confirm with reviewers)
> * *Preview-flag gating*: nanos parsing is gated behind
>   {{spark.sql.timestampNanosTypes.enabled}}. Since {{p}} = 6 resolves to a GA
>   type, it should arguably be accepted regardless of the flag. Decide and
>   document whether {{TIMESTAMP_*(6)}} requires the preview flag (preference:
>   do NOT require it).
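>
> The stated preference can be summarized by the following sketch, which
> assumes the flag is consulted only for nanosecond precisions.
> {{PreviewFlagSketch}} and {{accepted}} are hypothetical names, and the sketch
> deliberately does not model which error is raised when a spelling is
> rejected.
> {code:scala}
> object PreviewFlagSketch {
>   // nanosEnabled models spark.sql.timestampNanosTypes.enabled.
>   // Returns true when TIMESTAMP_*(p) would be accepted by the parser.
>   def accepted(p: Int, nanosEnabled: Boolean): Boolean = p match {
>     case 6                     => true          // GA type: flag is not consulted
>     case n if n >= 7 && n <= 9 => nanosEnabled  // preview types stay gated
>     case _                     => false         // other precisions stay invalid
>   }
> }
> {code}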
> h2. Acceptance criteria
> * {{CatalystSqlParser.parseDataType("TIMESTAMP_NTZ(6)")}} returns
>   {{TimestampNTZType}}; {{"TIMESTAMP_LTZ(6)"}} returns {{TimestampType}}.
> * The zoned spellings {{TIMESTAMP(6) WITHOUT TIME ZONE}} /
>   {{TIMESTAMP(6) WITH LOCAL TIME ZONE}} and bare {{TIMESTAMP(6)}} resolve as
>   listed under "What".
> * {{DataType.fromDDL}} / {{typeName}} round-trip is consistent for these
>   spellings.
> * Existing assertions in {{DataTypeParserSuite}} that expect
>   {{INVALID_TIMESTAMP_PRECISION}} for precision 6 are updated to expect the
>   microsecond types; {{p}} = 10 (and other out-of-[7,9] non-6 values) still
>   throw {{INVALID_TIMESTAMP_PRECISION}}.
> h2. Tests
> * {{sql/catalyst/.../parser/DataTypeParserSuite.scala}} - update the
>   "TIMESTAMP(6) WITHOUT TIME ZONE" / "timestamp(6)" cases (currently asserting
>   the error) and add positive cases for {{TIMESTAMP_NTZ(6)}} /
>   {{TIMESTAMP_LTZ(6)}}.
> * Add a JSON/typeName round-trip case (e.g. {{timestamp_ntz(6)}}) in the
>   relevant {{DataTypeSuite}}.
> h2. Notes for first-time contributors
> This is a good-first-issue. Build/test a single module with SBT:
> {code}
> build/sbt 'catalyst/testOnly *DataTypeParserSuite'
> {code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)