joellubi opened a new issue, #39466:
URL: https://github.com/apache/arrow/issues/39466

   ### Describe the bug, including details regarding any error messages, version, and platform.
   
   The Arrow [documentation](https://docs.rs/arrow/latest/arrow/datatypes/enum.DataType.html#variant.Timestamp)
indicates that timestamps are always measured from the UNIX epoch, and that the 
presence of timezone information determines whether a value has "instant" 
semantics or "wall clock" semantics. Timestamps with _any_ timezone have 
"instant" semantics, and their epoch is always in UTC. Applications may use the 
timezone field to display a localized time on read.
   
   Parquet also has a notion of "instant" vs "local" semantics which is 
specified with `isAdjustedToUTC=true` or `isAdjustedToUTC=false`, respectively 
([source](https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#timestamp)).
Parquet does not store timezone information, expecting physical 
representations to already be in UTC when using "instant" semantics.
   
   This means that Arrow timestamps in _any_ timezone are already "instants" 
physically represented in UTC and map directly to Parquet "instants" with 
`isAdjustedToUTC=true`. Arrow timestamps without a timezone have "local" 
semantics and should map to Parquet with `isAdjustedToUTC=false`.
   
   In other words, `isAdjustedToUTC` in Parquet should be set based on whether 
an Arrow timezone is present, which is how other implementations handle the 
conversion 
([C++](https://github.com/apache/arrow/blob/bec03856799a69bf0e6d4419ab7bc565afd070fe/cpp/src/parquet/arrow/schema.cc#L142),
[Rust](https://github.com/apache/arrow-rs/blob/2f383e764aa2b79e52d562e24eb0d1dce41f5ce7/parquet/src/arrow/schema/mod.rs#L401)).
The current [Go implementation](https://github.com/apache/arrow/blob/bec03856799a69bf0e6d4419ab7bc565afd070fe/go/parquet/pqarrow/schema.go#L128)
instead produces Parquet "instant" timestamps for Arrow timestamps that have 
_no_ timezone or whose timezone is specifically UTC. This does not align with 
the documentation or with the other implementations.
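
The mismatch can be sketched in Go as two predicates over the Arrow timezone string. Both function names are hypothetical and neither is taken from the pqarrow source: `describedGoRule` models the behavior as described in this issue (assuming every other timezone falls through to non-"instant"), and `expectedRule` models the mapping argued for above.

```go
package main

import "fmt"

// describedGoRule models the Go behavior as this issue describes it
// (hypothetical name, not the pqarrow source): "instant" when the Arrow
// timezone is absent or is exactly "UTC".
func describedGoRule(tz string) bool {
	return tz == "" || tz == "UTC"
}

// expectedRule models the mapping the issue argues for (also hypothetical):
// isAdjustedToUTC is true exactly when any timezone is present.
func expectedRule(tz string) bool {
	return tz != ""
}

func main() {
	for _, tz := range []string{"", "UTC", "America/New_York"} {
		got, want := describedGoRule(tz), expectedRule(tz)
		fmt.Printf("timezone=%q described=%t expected=%t agree=%t\n", tz, got, want, got == want)
	}
}
```

Under these assumptions the two rules agree only for `"UTC"`; they disagree for a missing timezone and for any non-UTC timezone.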
   
   ### Component(s)
   
   Go, Parquet


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

