doyong365 opened a new issue, #10685: URL: https://github.com/apache/seatunnel/issues/10685
### Search before asking

- [x] I had searched in the [issues](https://github.com/apache/seatunnel/issues?q=is%3Aissue+label%3A%22bug%22) and found no similar issues.

### What happened

NTZ (timestamp without time zone) columns from source databases are stored as LTZ (timestamp with local time zone) in the target without any timezone conversion. SeaTunnel's JDBC connectors do not distinguish NTZ from LTZ: both are mapped to the same internal type (`SqlType.TIMESTAMP` / `LOCAL_DATE_TIME_TYPE`) and unconditionally written as LTZ to the target (e.g. Iceberg `TimestampType.withZone()`). When reading, `rs.getTimestamp()` → `.toLocalDateTime()` applies the JVM timezone, and `Timestamp.valueOf(LocalDateTime).toInstant()` in the Spark layer reverses it; the two offsets cancel out, leaving the wall-clock value stored directly as UTC regardless of the source timezone.

**Affected source types per database:**

| Source DB | Type | NTZ/LTZ | Result |
|---|---|---|---|
| MySQL | `DATETIME` | NTZ | ⛔ Stored as LTZ |
| MySQL | `TIMESTAMP` | LTZ | ✅ Correct |
| MariaDB | `DATETIME` | NTZ | ⛔ Stored as LTZ |
| PostgreSQL | `timestamp` | NTZ | ⛔ Stored as LTZ |
| PostgreSQL | `timestamptz` | LTZ | ✅ Correct |
| Oracle | `DATE` | NTZ | ⛔ Stored as LTZ |
| Oracle | `TIMESTAMP` | NTZ | ⛔ Stored as LTZ |
| MSSQL | `DATETIME` | NTZ | ⛔ Stored as LTZ |
| DB2 | `TIMESTAMP` | NTZ | ⛔ Stored as LTZ |

**Verified with the following test cases (Spark engine, JVM=KST+9):**

| Source TZ | Source type | Source value | Result |
|---|---|---|---|
| KST(+9) | TIMESTAMP_NTZ | 2026-01-01 00:00:00 | 2026-01-01 00:00:00 UTC ❌ |
| KST(+9) | TIMESTAMP_LTZ | 2026-01-01 00:00:00 KST | 2025-12-31 15:00:00 UTC ✅ |
| UTC | TIMESTAMP_NTZ | 2026-01-01 00:00:00 | 2026-01-01 00:00:00 UTC ➖ |
| CST(-6) | TIMESTAMP_NTZ | 2026-01-01 00:00:00 | 2026-01-01 00:00:00 UTC ❌ |
| CST(-6) | TIMESTAMP_LTZ | 2026-01-01 00:00:00 CST | 2026-01-01 06:00:00 UTC ✅ |

The NTZ case at UTC appears correct only by coincidence (zero offset), not because of correct handling.
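The offset cancellation described above can be reproduced with plain `java.sql`/`java.time`. This is a minimal sketch, not SeaTunnel code: the class and method names are illustrative, and the driver-supplied epoch (NTZ wall clock materialized as UTC) is an assumption based on the report.

```java
import java.sql.Timestamp;
import java.time.Instant;
import java.time.LocalDateTime;
import java.util.TimeZone;

public class OffsetCancellation {
    // Models the two pipeline steps described in the report:
    // Timestamp.toLocalDateTime() on read (AbstractJdbcRowConverter) and
    // Timestamp.valueOf(...).toInstant() on the Spark write path
    // (InternalRowConverter). Both apply the JVM default timezone.
    static Instant roundTrip(Instant fromDriver) {
        LocalDateTime internal = Timestamp.from(fromDriver).toLocalDateTime(); // +offset
        return Timestamp.valueOf(internal).toInstant();                        // -offset
    }

    public static void main(String[] args) {
        // JVM default = KST(+9), matching the reported test environment.
        TimeZone.setDefault(TimeZone.getTimeZone("Asia/Seoul"));

        // Assumed: the driver materializes the NTZ wall clock
        // "2026-01-01 00:00:00" as that wall clock at UTC.
        Instant fromDriver = Instant.parse("2026-01-01T00:00:00Z");

        // The +9h and -9h shifts cancel: the epoch passes through unchanged,
        // so the wall-clock digits land in UTC with no KST conversion
        // (the LTZ-correct result would be 2025-12-31T15:00:00Z).
        System.out.println(roundTrip(fromDriver)); // 2026-01-01T00:00:00Z
    }
}
```

Because the two conversions are exact inverses, the round trip is the identity on the epoch value in any JVM timezone, which is why the bug produces no visible error.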
### Root cause

The core issue is that SeaTunnel's internal type system has no way to distinguish NTZ from LTZ timestamp types. All DB-specific TypeConverters collapse both into the same internal type (`SqlType.TIMESTAMP` / `LOCAL_DATE_TIME_TYPE`), and the entire pipeline unconditionally writes every timestamp as LTZ to the target.

**1. TypeConverters: NTZ and LTZ both mapped to `LOCAL_DATE_TIME_TYPE`**

All DB-specific TypeConverters map NTZ and LTZ timestamp types to the identical SeaTunnel internal type, making it structurally impossible to distinguish them downstream.

- `MySqlTypeConverter.java` `convert(BasicTypeDefine)`: `DATETIME` (NTZ) and `TIMESTAMP` (LTZ) both map to `LOCAL_DATE_TIME_TYPE`
- `PostgresTypeConverter.java` `convert(BasicTypeDefine)`: `timestamp` (NTZ) and `timestamptz` (LTZ) both map to `LOCAL_DATE_TIME_TYPE`
- `OracleTypeConverter.java` `convert(BasicTypeDefine)`: `DATE`, `TIMESTAMP` (NTZ) and `TIMESTAMP WITH TIME ZONE`, `TIMESTAMP WITH LOCAL TIME ZONE` (LTZ) all map to `LOCAL_DATE_TIME_TYPE`
- `SqlServerTypeConverter.java` `convert(BasicTypeDefine)`: `DATETIME`, `DATETIME2` (NTZ) and `DATETIMEOFFSET` (LTZ) all map to `LOCAL_DATE_TIME_TYPE`
- `GenericTypeConverter.java` `convert(BasicTypeDefine)`: `Types.TIMESTAMP` (covers both NTZ and LTZ as reported by the JDBC driver) maps to `LOCAL_DATE_TIME_TYPE`

**2. `AbstractJdbcRowConverter.java` `toInternal()`**

NTZ and LTZ columns are read through the same code path. `rs.getTimestamp()` applies the JVM default timezone, which is correct for LTZ but corrupts NTZ wall-clock values. There is no separate `TIMESTAMP_TZ` case.

**3. `AbstractJdbcRowConverter.java` `setValueToStatementByDataType()`**

On write, `Timestamp.valueOf(LocalDateTime)` re-applies the JVM default timezone, shifting the epoch value regardless of whether the original source type was NTZ or LTZ.

**4. `InternalRowConverter.java` `convert()` / `reconvert()`**

In the Spark engine translation layer, `toLocalDateTime()` in `AbstractJdbcRowConverter` first maps the NTZ wall-clock value to a `LocalDateTime` while applying the JVM timezone offset (e.g. epoch `2026-01-01T00:00:00Z` → KST+9 → `2026-01-01T09:00:00`), and `Timestamp.valueOf()` in `InternalRowConverter` then reverses it (KST-9 → epoch `2026-01-01T00:00:00Z`). The two JVM timezone offsets cancel each other out, so the NTZ wall-clock value is preserved as-is in UTC: a value that represents `2026-01-01 00:00:00 KST` is incorrectly stored as `2026-01-01 00:00:00 UTC` instead of the correct `2025-12-31 15:00:00 UTC`.

**5. `TypeConverterUtils.java` `convert(SeaTunnelDataType)`**

SeaTunnel's `TIMESTAMP` (NTZ) is mapped to Spark's `TimestampType`, which is always UTC-based (LTZ). There is no NTZ-aware handling anywhere in the codebase.

**6. `IcebergTypeMapper.java` `toIcebergType()`**

SeaTunnel's `TIMESTAMP` is unconditionally mapped to `Types.TimestampType.withZone()` (LTZ); `withoutZone()` (NTZ) is never called.

### Expected behavior

- NTZ source columns should be stored as NTZ in the target (e.g. Iceberg `TimestampType.withoutZone()`)
- LTZ source columns should be stored as LTZ in the target (e.g. Iceberg `TimestampType.withZone()`)
- The distinction should be preserved end-to-end through the SeaTunnel type system, using `SqlType.TIMESTAMP` (NTZ) vs `SqlType.TIMESTAMP_TZ` (LTZ)

### Actual behavior

All NTZ timestamp columns are stored as LTZ in the target. The wall-clock value is written directly as UTC without timezone conversion, because SeaTunnel has no mechanism to carry the NTZ/LTZ distinction through its internal type system.
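The expected semantics from the test-case table can be sketched in plain `java.time`. This is an illustration of the intended behavior, not SeaTunnel code; the class and method names are hypothetical, and the Iceberg `withoutZone()`/`withZone()` mapping is only referenced in comments.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;

public class ExpectedSemantics {
    // LTZ write path: the value names an instant in time, so the source
    // wall clock must be shifted from the source zone to UTC on write.
    static Instant ltzToUtc(LocalDateTime wallClock, ZoneId sourceZone) {
        return wallClock.atZone(sourceZone).toInstant();
    }

    public static void main(String[] args) {
        LocalDateTime wallClock = LocalDateTime.parse("2026-01-01T00:00:00");
        ZoneId sourceZone = ZoneId.of("Asia/Seoul"); // KST(+9)

        // NTZ: the wall clock itself is the value; no zone math should occur.
        // An NTZ-aware target (e.g. Iceberg withoutZone()) stores exactly this.
        System.out.println(wallClock); // 2026-01-01T00:00

        // LTZ: the value must be converted to UTC using the source zone,
        // matching the "2025-12-31 15:00:00 UTC" row in the table above.
        System.out.println(ltzToUtc(wallClock, sourceZone)); // 2025-12-31T15:00:00Z
    }
}
```

Today the pipeline applies neither branch correctly for NTZ: it performs zone math on read and then undoes it on write, which is why the wall-clock digits survive as a UTC instant.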
### SeaTunnel Version

2.3.10

### SeaTunnel Config

```conf
env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  Jdbc {
    url = "jdbc:mysql://localhost:3306/testdb"
    driver = "com.mysql.cj.jdbc.Driver"
    user = "root"
    password = "password"
    query = "SELECT datetime_col, timestamp_col FROM test_table"
  }
}

sink {
  Iceberg {
    catalog_name = "local"
    namespace = "default"
    table = "test_iceberg"
  }
}
```

### Running Command

```shell
./bin/start-seatunnel-spark-3-connector-v2.sh --config config/test.conf
```

### Error Exception

```log
No explicit exception is thrown. The job completes successfully but NTZ timestamp values are stored with incorrect timezone semantics in the target.
```

### Zeta or Flink or Spark Version

- zeta 2.3.10
- flink 1.17.2
- spark 3.3.3

### Java or Scala Version

Java 11

### Screenshots

_No response_

### Are you willing to submit PR?

- [x] Yes, I am willing to submit a PR!

### Code of Conduct

- [x] I agree to follow this project's [Code of Conduct](https://www.apache.org/foundation/policies/conduct)

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
