talatuyarer opened a new pull request, #15475:
URL: https://github.com/apache/iceberg/pull/15475
This PR fixes the issue where nanosecond precision timestamps were being
truncated to microsecond precision when using Apache Flink with Apache Iceberg
V3 tables. Now you can actually use those fancy TIMESTAMP(9) and
TIMESTAMP_LTZ(9) types without losing precision!
**The Problem**
When inserting data with nanosecond precision using Flink SQL like:
```sql
INSERT INTO my_table VALUES (TIMESTAMP '2025-01-15 10:30:45.123456789');
```
The data would mysteriously lose precision and come back as:
```
2025-01-15 10:30:45.123456
```
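The effect can be reproduced outside Flink with plain `java.time`. The sketch below is only an illustration of what a microsecond-only code path does to the value above; the class name `TruncationDemo` and the helper `toMicros` are hypothetical and are not part of Iceberg or Flink.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

final class TruncationDemo {
    // Mimics a microsecond-only code path by dropping the sub-microsecond digits.
    static LocalDateTime toMicros(LocalDateTime ts) {
        return ts.truncatedTo(ChronoUnit.MICROS);
    }

    public static void main(String[] args) {
        LocalDateTime original = LocalDateTime.parse("2025-01-15T10:30:45.123456789");
        System.out.println(original);           // 2025-01-15T10:30:45.123456789
        System.out.println(toMicros(original)); // 2025-01-15T10:30:45.123456
    }
}
```

The last three digits (`789`) are exactly what was lost on the write path before this change.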
**Core Issues:**
`Iceberg Flink Core`: RowDataWrapper was converting all timestamps to
microseconds regardless of precision. StructRowData and RowDataUtil were also
truncating precision (and struggling with pre-1970 negative nanosecond
timestamps) when reading data back.
`Type Mapping`: FlinkTypeToType was always mapping Flink's TIMESTAMP(9) to
microsecond Iceberg types, actively causing Flink to cast the data down to
TIMESTAMP(6).
`Avro Conversion`: Flink's native Avro converters (AvroSchemaConverter,
AvroToRowDataConverters, and RowDataToAvroConverters) dropped nanosecond
precision during schema conversion and row mapping, requiring us to override
and patch them within Iceberg to explicitly support timestamp-nanos.
`ORC & Parquet`: Writers and readers (FlinkOrcWriters, FlinkOrcReader, etc.)
were hardcoded to lose nanosecond precision when interacting with the
underlying files.
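The pre-1970 case mentioned above is easy to get wrong because Java's `/` and `%` round toward zero. Flink's `TimestampData` holds epoch milliseconds plus a non-negative nanos-of-millisecond field, so a negative epoch-nanosecond value has to be split with floor semantics. The following is a minimal sketch of that arithmetic only, not the code from this PR; the class `EpochNanos` and its method names are assumptions made for illustration.

```java
final class EpochNanos {
    static final long NANOS_PER_MILLI = 1_000_000L;

    // Floor division keeps pre-1970 values on the correct millisecond.
    static long millis(long epochNanos) {
        return Math.floorDiv(epochNanos, NANOS_PER_MILLI);
    }

    // Floor modulus always yields a value in [0, 999_999].
    static int nanosOfMilli(long epochNanos) {
        return (int) Math.floorMod(epochNanos, NANOS_PER_MILLI);
    }

    // Inverse of the split above; throws ArithmeticException on long overflow.
    static long toEpochNanos(long millis, int nanosOfMilli) {
        return Math.addExact(Math.multiplyExact(millis, NANOS_PER_MILLI), nanosOfMilli);
    }

    public static void main(String[] args) {
        long oneNanoBeforeEpoch = -1L;
        // Naive split: remainder is negative, which is not a valid nanos-of-millisecond.
        System.out.println(oneNanoBeforeEpoch / NANOS_PER_MILLI); // 0
        System.out.println(oneNanoBeforeEpoch % NANOS_PER_MILLI); // -1
        // Floor split: valid fields that round-trip to the original value.
        System.out.println(millis(oneNanoBeforeEpoch));           // -1
        System.out.println(nanosOfMilli(oneNanoBeforeEpoch));     // 999999
    }
}
```

With the floor-based split, `toEpochNanos(millis(n), nanosOfMilli(n))` returns `n` for both positive and negative inputs, which is the property a reader and writer pair needs to preserve.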
**Data Format Support:**
✅ Parquet: Fixed writers and readers to preserve nanosecond precision
✅ Avro: Overrode Flink's converters to support timestamp-nanos (verified with tests)
✅ ORC: Fixed writers and readers (FlinkOrcWriters, FlinkOrcReader) to preserve nanosecond precision
To test this yourself:
```sql
-- Create table with nanosecond precision
CREATE TABLE test_table (
id BIGINT,
ts TIMESTAMP(9),
ts_tz TIMESTAMP_LTZ(9)
) WITH (
'connector' = 'iceberg',
'catalog-name' = 'hadoop_catalog',
'warehouse' = 'gs://my-bucket/warehouse',
'format-version' = '3'
);
-- Insert with nanosecond precision
INSERT INTO test_table VALUES
(1, TIMESTAMP '2025-01-15 10:30:45.123456789',
TIMESTAMP '2025-01-15 10:30:45.123456789');
-- Query and verify precision is preserved
SELECT * FROM test_table;
-- Should show: 2025-01-15 10:30:45.123456789 (not truncated!)
```
I am reopening this PR because the previous one was closed:
https://github.com/apache/iceberg/pull/14245