[ https://issues.apache.org/jira/browse/SPARK-56969?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Max Gekk updated SPARK-56969:
-----------------------------
Description:
h2. Summary
Introduce a preview SQL configuration that gates use of
{{TimestampNTZNanosType(p)}} and {{TimestampLTZNanosType(p)}} ({{p}} in [7, 9])
added in [SPARK-56876|https://issues.apache.org/jira/browse/SPARK-56876]. The
flag is *off by default* in production (default {{true}} under
{{Utils.isTesting}}, following {{spark.sql.timeType.enabled}} /
{{TIME_TYPE_ENABLED}}).
h2. Background
Logical types and parser support exist
([SPARK-56876|https://issues.apache.org/jira/browse/SPARK-56876],
[SPARK-56965|https://issues.apache.org/jira/browse/SPARK-56965]). Physical row
storage lands in
[SPARK-56981|https://issues.apache.org/jira/browse/SPARK-56981]. Without
gating, partially implemented nanos types could surface in schemas, casts, or
connectors before the SPIP preview milestone is ready.
{{TimeType}} precedent: {{spark.sql.timeType.enabled}} with
{{TypeUtils.failUnsupportedDataType}} and {{UNSUPPORTED_TIME_TYPE}}.
h2. Scope
h3. 1. SQLConf implementation
* New entry (name TBD; suggest {{spark.sql.timestampNanos.preview.enabled}}):
** {{.internal()}} — preview / unstable
** {{.booleanConf}} with {{.createWithDefault(Utils.isTesting)}} (mirror
{{TIME_TYPE_ENABLED}})
** {{.version("4.2.0")}}
** {{.doc(...)}} — see *Documentation* below
* Accessor on {{SQLConf}} (e.g. {{isTimestampNanosPreviewEnabled}})
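The entry and accessor described above could look roughly like the following. This is a self-contained sketch, not the real {{SQLConf}} builder: {{ConfEntry}}, {{TimestampNanosConf}}, and {{Conf}} are hypothetical stand-ins, the key is only the name suggested in this ticket, and {{isTesting}} is passed in as a parameter in place of {{Utils.isTesting}}.

```scala
// Hypothetical stand-in for a config entry; this is not Spark's real SQLConf builder API.
final case class ConfEntry[T](
    key: String,
    isInternal: Boolean,
    version: String,
    doc: String,
    default: T)

object TimestampNanosConf {
  // Key name suggested in the ticket; the final name is still TBD.
  val Key = "spark.sql.timestampNanos.preview.enabled"

  // `isTesting` stands in for Utils.isTesting so both defaults can be exercised.
  def entry(isTesting: Boolean): ConfEntry[Boolean] =
    ConfEntry(
      key = Key,
      isInternal = true, // preview / unstable
      version = "4.2.0",
      doc = "When true, allows TIMESTAMP_NTZ(p) and TIMESTAMP_LTZ(p) with p in 7 to 9 " +
        "in schemas and analyzed plans. Preview feature; unstable.",
      default = isTesting)
}

// Accessor sketch: an explicit session setting wins over the default.
final class Conf(settings: Map[String, String], isTesting: Boolean) {
  def isTimestampNanosPreviewEnabled: Boolean =
    settings.get(TimestampNanosConf.Key)
      .map(_.trim.toBoolean)
      .getOrElse(TimestampNanosConf.entry(isTesting).default)
}
```

Taking the testing flag as a parameter keeps the sketch independent of how {{Utils.isTesting}} is actually computed.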
h3. 2. Analysis gating
* Extend {{TypeUtils.failUnsupportedDataType}} to reject schemas/plans that
*recursively* contain {{TimestampNTZNanosType}} or {{TimestampLTZNanosType}}
when the flag is disabled
* Error class {{UNSUPPORTED_TIMESTAMP_NANOS_TYPE}} (or single message covering
both NTZ/LTZ) in {{error-conditions.json}} and {{QueryCompilationErrors}},
analogous to {{UNSUPPORTED_TIME_TYPE}}
* Error message must name the conf key and how to enable preview (supports 12b
discoverability)
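The recursive check above can be illustrated with a small model. The type hierarchy below is a simplified stand-in for Spark's {{DataType}} classes, and {{NanosGate}}, its method signatures, and the exception class are hypothetical; only the error condition name and the conf key come from this ticket.

```scala
// Simplified stand-ins for Spark's DataType hierarchy; only the shapes needed here.
sealed trait DataType
case object TimestampType extends DataType
case object TimestampNTZType extends DataType
final case class TimestampNTZNanosType(precision: Int) extends DataType
final case class TimestampLTZNanosType(precision: Int) extends DataType
final case class ArrayType(elementType: DataType) extends DataType
final case class MapType(keyType: DataType, valueType: DataType) extends DataType
final case class StructField(name: String, dataType: DataType)
final case class StructType(fields: Seq[StructField]) extends DataType

// Hypothetical exception; the ticket proposes the condition UNSUPPORTED_TIMESTAMP_NANOS_TYPE.
final class UnsupportedTimestampNanosTypeException(confKey: String)
  extends RuntimeException(
    "[UNSUPPORTED_TIMESTAMP_NANOS_TYPE] Nanosecond-capable timestamp types are in preview. " +
      s"Set $confKey to true to enable them.")

object NanosGate {
  // Key name suggested in the ticket; the final name is still TBD.
  val ConfKey = "spark.sql.timestampNanos.preview.enabled"

  // True if the type is, or nests anywhere inside it, a nanos timestamp type.
  def containsNanosTimestamp(dt: DataType): Boolean = dt match {
    case _: TimestampNTZNanosType | _: TimestampLTZNanosType => true
    case ArrayType(element) => containsNanosTimestamp(element)
    case MapType(key, value) => containsNanosTimestamp(key) || containsNanosTimestamp(value)
    case StructType(fields) => fields.exists(f => containsNanosTimestamp(f.dataType))
    case _ => false
  }

  // Rejects nanos types only while the preview flag is off; the message names the conf key.
  def failUnsupportedDataType(dt: DataType, previewEnabled: Boolean): Unit =
    if (!previewEnabled && containsNanosTimestamp(dt)) {
      throw new UnsupportedTimestampNanosTypeException(ConfKey)
    }
}
```

Walking arrays, maps, and structs is what makes the check recursive: a nanos type nested inside a struct field of an array is rejected the same way as a top-level column.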
h3. 3. Binding policy
* Register the conf in {{configs-without-binding-policy-exceptions}} if
required (see {{spark.sql.timeType.enabled}} precedent)
h3. 4. Documentation (sub-task 12b — same PR)
*Audience:* operators and advanced users.
* *{{SQLConf}} {{.doc()}} text* must cover:
** What enabling allows (nanos-capable {{TIMESTAMP_NTZ(p)}} /
{{TIMESTAMP_LTZ(p)}}, {{p}} in 7–9, in schemas and analyzed plans)
** Preview / unstable status under
[SPARK-56822|https://issues.apache.org/jira/browse/SPARK-56822]
** Default value and testing default via {{Utils.isTesting}}
** That unparameterized {{TIMESTAMP}} / {{TIMESTAMP_NTZ}} / {{TIMESTAMP_LTZ}}
remain microsecond types
** What may still fail when enabled (casts, Parquet read, literals, etc. until
their JIRAs land)
* *{{docs/configuration.md}}* — add a row/section for the new key (follow
existing {{spark.sql.*}} entries):
** Key name (exact string as implemented)
** Default, version (4.2.0), scope, preview disclaimer
** Cross-link to SPIP JIRA or future sql-ref when available
* *Short enablement note* (minimal user discovery before full sql-ref pass in
12c):
** One paragraph in {{docs/sql-ref-datatypes.md}} *or* a note in
{{docs/configuration.md}} intro pointing to the conf
** Example: {{SET spark.sql.timestampNanos.preview.enabled=true;}}
** Do *not* document capabilities that are not yet merged; list only what works
at ship time
h2. Tests
* Unit: {{SQLConf}} entry exists; default matches {{Utils.isTesting}} pattern;
readable via {{SQLConf.get}}
* Unit: {{TypeUtils.failUnsupportedDataType}} throws when flag off and schema
contains {{TimestampNTZNanosType(9)}} / {{TimestampLTZNanosType(9)}}; passes
when flag on
* Unit: error message references the conf key (documentation/discoverability)
* Regression: existing {{DataTypeSuite}} / types without nanos types unaffected
* Optional: {{AnalysisSuite}} or {{DDLParsingSuite}} — {{CREATE TABLE t (c
TIMESTAMP_NTZ(9))}} fails with new error when preview off
([SPARK-56965|https://issues.apache.org/jira/browse/SPARK-56965] parser merged)
h2. Acceptance criteria
* With preview *disabled* (default outside tests), any analyzed schema or plan
that recursively contains {{TimestampNTZNanosType}} or
{{TimestampLTZNanosType}} fails with a clear error naming
{{spark.sql.timestampNanos.preview.enabled}} (exact key as implemented)
* With preview *enabled* (or in tests), {{failUnsupportedDataType}} does not
block those types (downstream JIRAs may still fail for other reasons)
* No change to {{TimestampType}}, {{TimestampNTZType}}, or zero-arg
{{TIMESTAMP}} / {{TIMESTAMP_NTZ}} DDL semantics
* {{SQLConf}} {{.doc()}} text is sufficient for generated configuration
reference
* {{docs/configuration.md}} documents the key with correct default, version,
and enablement instructions
* At least one public doc location (configuration.md and/or sql-ref-datatypes)
lets users find how to turn on preview without reading source
* Conf discoverable via {{SET spark.sql.timestampNanos.preview.enabled}}
h2. Dependencies
* Logical types: [SPARK-56876|https://issues.apache.org/jira/browse/SPARK-56876]
* Parser: [SPARK-56965|https://issues.apache.org/jira/browse/SPARK-56965]
(merged)
* Ordering: this sub-task should land before the first user-visible Parquet read or cast PRs
* Physical rows:
[SPARK-56981|https://issues.apache.org/jira/browse/SPARK-56981] (in progress —
gating can land independently)
was:
h3. Summary
Introduce a preview SQL configuration that gates use of
{{TimestampNTZNanosType(p)}} and {{TimestampLTZNanosType(p)}} (p ∈ [7, 9])
added in SPARK-56876. The flag is *off by default* (except in tests, following
{{spark.sql.timeType.enabled}}).
h3. What to do
* Add {{SQLConf}} entry (name TBD; suggest
{{spark.sql.timestampNanos.preview.enabled}}, {{.internal()}}, default
{{false}}, default {{true}} under {{Utils.isTesting}}; mirror
{{TIME_TYPE_ENABLED}}).
* Add accessor on {{SQLConf}} (e.g. {{isTimestampNanosPreviewEnabled}}).
* Extend {{TypeUtils.failUnsupportedDataType}} to reject schemas/plans that
contain {{TimestampNTZNanosType}} or {{TimestampLTZNanosType}} recursively when
the flag is disabled.
* Add error class {{UNSUPPORTED_TIMESTAMP_NANOS_TYPE}} (or reuse a single
message for both NTZ/LTZ) in {{error-conditions.json}} and
{{QueryCompilationErrors}}, analogous to {{UNSUPPORTED_TIME_TYPE}}.
* Document the conf in {{SQLConf.scala}} (when true, nanosecond-capable
timestamp types may appear in table schemas, casts, etc.).
* Register conf in binding-policy exceptions list if required (see
{{configs-without-binding-policy-exceptions}} for {{timeType.enabled}}
precedent).
h3. Tests
* Unit: {{SQLConf}} entry exists, default false in non-test mode, readable via
{{SQLConf.get}}.
* Unit: {{TypeUtils.failUnsupportedDataType}} throws when flag off and schema
contains {{TimestampNTZNanosType(9)}} / {{TimestampLTZNanosType(9)}};
passes when flag on.
* Regression: existing {{DataTypeSuite}} / types without nanos types
unaffected.
* Optional: small {{AnalysisSuite}} or {{DDLParsingSuite}} — {{CREATE TABLE t
(c TIMESTAMP_NTZ(9))}} fails with new error when preview off once parser JIRA
exists (can be follow-up test in parser JIRA if parser not merged yet).
h3. Acceptance criteria
* With preview *disabled* (default), any analyzed schema or cast target that
recursively contains {{TimestampNTZNanosType}} or {{TimestampLTZNanosType}}
fails analysis with a clear, documented error (sqlState consistent with other
unsupported-type errors).
* With preview *enabled*, {{failUnsupportedDataType}} does not block those
types (feature JIRAs may still fail for other reasons, e.g. missing UnsafeRow).
* No change to behavior of {{TimestampType}}, {{TimestampNTZType}}, or
zero-arg {{TIMESTAMP}} / {{TIMESTAMP_NTZ}} DDL.
* Conf documented and discoverable via {{SET
spark.sql.timestampNanos.preview.enabled}} (exact key as implemented).
> Add SQLConf preview flag for nanosecond-capable timestamp types
> ---------------------------------------------------------------
>
> Key: SPARK-56969
> URL: https://issues.apache.org/jira/browse/SPARK-56969
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 4.2.0
> Reporter: Max Gekk
> Priority: Major