Hi all,
I recently opened a pull request[1] in the apache/arrow project to extend the
Arrow Parquet writer to support writing TIME data with isAdjustedToUTC=false.
For context, the Arrow Parquet writer currently exports Arrow time data as
Parquet TIME data with isAdjustedToUTC=true, even though Arrow time types are
timezone-agnostic. It does so to follow the Parquet Compatibility
Guidelines[2] regarding the deprecation of the ConvertedTypes TIME_MILLIS and
TIME_MICROS.
However, progress on this pull request has stalled because the Arrow community
is unsure about the purpose of the isAdjustedToUTC parameter[3], and some
members think the best strategy going forward would be to deprecate it[4].
Could the Parquet community provide some background about why the TIME type is
parameterized on isAdjustedToUTC? Are there plans to deprecate this parameter
in the future?
Any information the community can provide would be much appreciated, because it
will help the Arrow community decide whether it makes sense to add a flag for
controlling this parameter to the Arrow Parquet writer.
I appreciate the community's time and thoughts on this issue.
Thank you!
Best,
Sarah Gilmore
[1] https://github.com/apache/arrow/pull/47316
[2] https://github.com/apache/parquet-format/blob/master/LogicalTypes.md#deprecated-time-convertedtype
[3] https://github.com/apache/arrow/pull/43268#discussion_r1686692052
[4] https://github.com/apache/arrow/pull/47316#issuecomment-3190380012