kevinjqliu commented on issue #2167: URL: https://github.com/apache/iceberg-rust/issues/2167#issuecomment-3953093967
> So the issue is likely in Snowflake's Arrow schema reader: specifically, how it deserializes the embedded Arrow schema from the Parquet footer and maps the timezone annotation to a Snowflake type. Snowflake is probably reading the Arrow metadata to get richer type information (like timezone), and its Arrow schema parser doesn't recognize "+00:00" as a valid timezone.
>
> The native Parquet schema itself doesn't store a timezone string at all; it just has isAdjustedToUTC as a boolean. So the "+00:00" string can only be coming from the Arrow schema embedded in the footer, which means the failure point is in how Snowflake processes that Arrow metadata.
>
> This also explains why Spark works fine: Spark writes "UTC" in its embedded Arrow schema, so Snowflake's Arrow reader handles it without issue.

The above is from Claude, and it makes sense. The issue is in reading the schema annotation, not in reading the timestamptz data from the Parquet file.

> I wonder if we can switch to "UTC" constant or make it configurable.

I think this is reasonable.
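
For illustration only, here is a minimal sketch of what "switch to a UTC constant or make it configurable" could look like. The helper names `is_utc_equivalent` and `normalize_timezone` are hypothetical and not part of iceberg-rust or arrow-rs, and the list of spellings treated as UTC is an assumption rather than something taken from either project.

```rust
/// Hypothetical helper (not an iceberg-rust or arrow-rs API): returns true for
/// timezone strings that this sketch treats as a zero offset from UTC.
/// The set of accepted spellings is an assumption made for illustration.
fn is_utc_equivalent(tz: &str) -> bool {
    matches!(tz, "UTC" | "utc" | "Z" | "+00:00" | "-00:00" | "Etc/UTC")
}

/// Rewrites any UTC-equivalent timezone string to the configured label
/// (for example "UTC") and leaves every other zone untouched.
fn normalize_timezone(tz: &str, utc_label: &str) -> String {
    if is_utc_equivalent(tz) {
        utc_label.to_string()
    } else {
        tz.to_string()
    }
}
```

A writer could apply something like `normalize_timezone(tz, "UTC")` to the timezone of each timestamp field before the Arrow schema is embedded in the Parquet footer, with the label exposed as a writer option if configurability is wanted.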
