talatuyarer opened a new pull request, #15475:
URL: https://github.com/apache/iceberg/pull/15475

   This PR fixes an issue where nanosecond-precision timestamps were being 
truncated to microsecond precision when using Apache Flink with Apache Iceberg 
V3 tables. Now you can use the TIMESTAMP(9) and TIMESTAMP_LTZ(9) types 
without losing precision!
   
   **The Problem**
   When inserting data with nanosecond precision using Flink SQL like:
   ```sql
   INSERT INTO my_table VALUES (TIMESTAMP '2025-01-15 10:30:45.123456789');
   ```
   The data would mysteriously lose precision and come back as:
   ```
   2025-01-15 10:30:45.123456
   ```
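
The loss of the last three digits can be reproduced outside Flink. The following is a minimal sketch using only `java.time`, not code from this PR; the class and method names are hypothetical. It round-trips the same value through a microsecond count and through a nanosecond count to show where the digits disappear.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical illustration class; not part of Iceberg or Flink.
final class TruncationDemo {
    // Reference point for both epoch-based counts.
    private static final LocalDateTime EPOCH = LocalDateTime.of(1970, 1, 1, 0, 0);

    // Stores the value as whole microseconds since epoch, then reads it back.
    static LocalDateTime roundTripViaMicros(LocalDateTime ts) {
        long micros = ChronoUnit.MICROS.between(EPOCH, ts);
        return EPOCH.plus(micros, ChronoUnit.MICROS);
    }

    // Stores the value as whole nanoseconds since epoch, then reads it back.
    static LocalDateTime roundTripViaNanos(LocalDateTime ts) {
        long nanos = ChronoUnit.NANOS.between(EPOCH, ts);
        return EPOCH.plusNanos(nanos);
    }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.parse("2025-01-15T10:30:45.123456789");
        System.out.println(roundTripViaMicros(ts)); // 2025-01-15T10:30:45.123456
        System.out.println(roundTripViaNanos(ts));  // 2025-01-15T10:30:45.123456789
    }
}
```

Any code path that stores the value as a microsecond count behaves like `roundTripViaMicros`, regardless of the precision declared on the column.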
   
   **Core Issues:**
   
   - `Iceberg Flink Core`: RowDataWrapper was converting all timestamps to 
     microseconds regardless of precision. StructRowData and RowDataUtil were 
     also truncating precision (and struggling with pre-1970 negative 
     nanosecond timestamps) when reading data back.
   - `Type Mapping`: FlinkTypeToType was always mapping Flink's TIMESTAMP(9) to 
     microsecond Iceberg types, actively causing Flink to cast the data down to 
     TIMESTAMP(6).
   - `Avro Conversion`: Flink's native Avro converters (AvroSchemaConverter, 
     AvroToRowDataConverters, and RowDataToAvroConverters) dropped nanosecond 
     precision during schema conversion and row mapping, requiring us to 
     override and patch them within Iceberg to explicitly support 
     timestamp-nanos.
   - `ORC & Parquet`: Writers and readers (FlinkOrcWriters, FlinkOrcReader, 
     etc.) were hardcoded to lose nanosecond precision when interacting with 
     the underlying files.
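
The pre-1970 case mentioned above is easy to get wrong, so here is a minimal sketch of the arithmetic involved. It assumes the Flink-side value is held as epoch milliseconds plus a non-negative nanosecond-of-millisecond (the shape exposed by Flink's `TimestampData`) and the Iceberg-side value is a single `long` count of nanoseconds since epoch. The class and method names are hypothetical and this is not the PR's implementation.

```java
// Hypothetical helper; not part of Iceberg or Flink.
final class EpochNanos {
    private static final long NANOS_PER_MILLI = 1_000_000L;

    // Combines (epoch millis, nano-of-milli) into nanoseconds since epoch.
    // The exact-arithmetic calls throw instead of silently overflowing.
    static long toEpochNanos(long epochMillis, int nanoOfMilli) {
        return Math.addExact(Math.multiplyExact(epochMillis, NANOS_PER_MILLI), nanoOfMilli);
    }

    // Floor division keeps the millisecond part correct for negative values.
    static long millisPart(long epochNanos) {
        return Math.floorDiv(epochNanos, NANOS_PER_MILLI);
    }

    // Floor modulus always yields a value in [0, 999_999], even before 1970.
    static int nanoOfMilliPart(long epochNanos) {
        return (int) Math.floorMod(epochNanos, NANOS_PER_MILLI);
    }

    public static void main(String[] args) {
        long oneNanoBeforeEpoch = -1L;
        System.out.println(millisPart(oneNanoBeforeEpoch));      // -1
        System.out.println(nanoOfMilliPart(oneNanoBeforeEpoch)); // 999999
        // Plain '/' and '%' would give 0 and -1 here, an invalid nano-of-milli.
        System.out.println(oneNanoBeforeEpoch / NANOS_PER_MILLI);
        System.out.println(oneNanoBeforeEpoch % NANOS_PER_MILLI);
    }
}
```

Using `Math.floorDiv` and `Math.floorMod` rather than `/` and `%` is the design choice that matters: Java's built-in operators truncate toward zero, which produces a negative remainder for timestamps before 1970.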
   
   **Data Format Support:**
   ✅ Parquet: Fixed writers and readers to preserve nanosecond precision
   ✅ Avro: Nanosecond precision preserved via the overridden converters (verified with tests)
   ✅ ORC: Fixed writers and readers to preserve nanosecond precision
   
   To test it yourself:
   ```sql
   -- Create table with nanosecond precision
   CREATE TABLE test_table (
       id BIGINT,
       ts TIMESTAMP(9),
       ts_tz TIMESTAMP_LTZ(9)
   ) WITH (
       'connector' = 'iceberg',
       'catalog-name' = 'hadoop_catalog',
       'warehouse' = 'gs://my-bucket/warehouse',
       'format-version' = '3'
   );
   
   -- Insert with nanosecond precision
   INSERT INTO test_table VALUES 
   (1, TIMESTAMP '2025-01-15 10:30:45.123456789', TIMESTAMP '2025-01-15 10:30:45.123456789');
   
   -- Query and verify precision is preserved
   SELECT * FROM test_table;
   -- Should show: 2025-01-15 10:30:45.123456789 (not truncated!)
   ```
   
   I am reopening this PR because the previous one was closed:
   https://github.com/apache/iceberg/pull/14245

