[ https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Markovtsev Vadim updated ARROW-8066:
------------------------------------
    Description: The original description is at [https://github.com/pandas-dev/pandas/issues/32587]

  was:
The original description is at [https://github.com/pandas-dev/pandas/issues/32587]

|h4. Code Sample, a copy-pastable example if possible

import pandas as pd
from datetime import datetime, timezone

df = pd.DataFrame.from_records([
    (1, datetime.now().replace(tzinfo=timezone.utc)),
    (2, datetime.now().replace(tzinfo=timezone.min))],
    columns=["1", "2"])
print(df["2"])
print()
df.to_feather("/tmp/1")
df2 = pd.read_feather("/tmp/1")
print(df2["2"])

This code will output:

{{0    2020-03-10 18:13:49.405598+00:00
1    2020-03-10 18:13:49.405626-23:59
Name: 2, dtype: object

0    2020-03-10 18:13:49.405598
1    2020-03-10 18:13:49.405626
Name: 2, dtype: datetime64[ns]}}

h4. Problem description

The round-trip dtype changed from the correct {{object}} to incorrect {{datetime64}}. Thus the timezones were discarded in Arrow and the timestamps became invalid.

h4. Expected Output (identical)

{{0    2020-03-10 18:13:49.405598+00:00
1    2020-03-10 18:13:49.405626-23:59
Name: 2, dtype: object

0    2020-03-10 18:13:49.405598+00:00
1    2020-03-10 18:13:49.405626-23:59
Name: 2, dtype: object}}

h4. Output of {{pd.show_versions()}}|


> PyArrow discards timezones
> --------------------------
>
>                 Key: ARROW-8066
>                 URL: https://issues.apache.org/jira/browse/ARROW-8066
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.16.0
>            Reporter: Markovtsev Vadim
>            Priority: Major
>
> The original description is at
> [https://github.com/pandas-dev/pandas/issues/32587]

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
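
The problem description says the timestamps "became invalid" once the timezones were discarded. The sketch below illustrates why, using only the Python standard library. It does not call pandas or PyArrow and says nothing about their internals. The fixed wall-clock value is copied from the sample output in the report; the variable names are illustrative assumptions.

```python
from datetime import datetime, timedelta, timezone

# One wall-clock reading, tagged with the two offsets used in the report.
naive = datetime(2020, 3, 10, 18, 13, 49, 405626)
aware_utc = naive.replace(tzinfo=timezone.utc)
aware_min = naive.replace(tzinfo=timezone.min)  # UTC-23:59

# As aware values these are different instants, 23 h 59 min apart.
gap = aware_min - aware_utc

# Dropping the offset, which is what the reported output shows after the
# round trip, makes the two values compare equal: the distinction is lost.
stripped_equal = aware_utc.replace(tzinfo=None) == aware_min.replace(tzinfo=None)

# Converting to a common zone first keeps the instants distinct.
normalized_utc = aware_utc.astimezone(timezone.utc)
normalized_min = aware_min.astimezone(timezone.utc)

print(gap)
print(stripped_equal)
print(normalized_min.isoformat())
```

Normalizing every value to one zone before writing is a possible workaround only if the original per-row offsets do not need to survive the round trip; whether that holds for a given dataset is an assumption.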