Florian Jetter created ARROW-5888:
-------------------------------------

             Summary: [Python][C++] Parquet write metadata not roundtrip safe for timezone timestamps
                 Key: ARROW-5888
                 URL: https://issues.apache.org/jira/browse/ARROW-5888
             Project: Apache Arrow
          Issue Type: Bug
            Reporter: Florian Jetter

When storing to Parquet, the timezone of a timestamp field does not survive a roundtrip unless it is UTC. The expected behavior is that the original timezone is reconstructed on read.

{code:python}
import pyarrow as pa
import pyarrow.parquet as pq

schema = pa.schema(
    [
        pa.field("no_tz", pa.timestamp('us')),
        pa.field("utc", pa.timestamp('us', tz="UTC")),
        pa.field("europe", pa.timestamp('us', tz="Europe/Berlin")),
    ]
)

# Write only the schema (no data) as Parquet metadata into an in-memory buffer
buf = pa.BufferOutputStream()
pq.write_metadata(
    schema,
    buf,
    coerce_timestamps="us"
)

# Read the metadata back and convert it to an Arrow schema
pq_bytes = buf.getvalue().to_pybytes()
reader = pa.BufferReader(pq_bytes)
parquet_file = pq.ParquetFile(reader)
parquet_file.schema.to_arrow_schema()
# Output:
# no_tz: timestamp[us]
# utc: timestamp[us, tz=UTC]
# europe: timestamp[us, tz=UTC]
{code}
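
To make the mismatch explicit, a small helper along these lines could compare the timezones that were written with the ones that were read back. This is only a sketch: `tz_mismatches` and the `(name, tz)` pair representation are hypothetical and not part of pyarrow. With pyarrow, such pairs could presumably be collected from `field.name` and `field.type.tz` for each timestamp field. The values below are copied from the output reported above, and the helper assumes both schemas contain the same field names.

```python
from typing import List, Optional, Tuple

# (field name, timezone string, or None for a timestamp without timezone)
TzField = Tuple[str, Optional[str]]


def tz_mismatches(expected: List[TzField], actual: List[TzField]):
    """Return (name, expected_tz, actual_tz) for every field whose timezone changed.

    Hypothetical helper; assumes both lists contain the same field names.
    """
    actual_by_name = dict(actual)
    return [
        (name, tz, actual_by_name.get(name))
        for name, tz in expected
        if actual_by_name.get(name) != tz
    ]


# Timezones as written vs. as read back, taken from the report.
written = [("no_tz", None), ("utc", "UTC"), ("europe", "Europe/Berlin")]
read_back = [("no_tz", None), ("utc", "UTC"), ("europe", "UTC")]

mismatches = tz_mismatches(written, read_back)
print(mismatches)
```

For a roundtrip-safe writer the list would be empty; for the behavior reported here it contains only the `europe` field.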