[ 
https://issues.apache.org/jira/browse/ARROW-8066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Markovtsev Vadim updated ARROW-8066:
------------------------------------
    Description: 
The original description is at 
[https://github.com/pandas-dev/pandas/issues/32587]



  was:
The original description is at 
[https://github.com/pandas-dev/pandas/issues/32587]

 
|h4. Code Sample, a copy-pastable example if possible
import pandas as pd
from datetime import datetime, timezone

df = pd.DataFrame.from_records([
    (1, datetime.now().replace(tzinfo=timezone.utc)),
    (2, datetime.now().replace(tzinfo=timezone.min))], columns=["1", "2"])
print(df["2"])
print()
df.to_feather("/tmp/1")
df2 = pd.read_feather("/tmp/1")
print(df2["2"])

This code will output:
{{0    2020-03-10 18:13:49.405598+00:00
1    2020-03-10 18:13:49.405626-23:59
Name: 2, dtype: object

0   2020-03-10 18:13:49.405598
1   2020-03-10 18:13:49.405626
Name: 2, dtype: datetime64[ns]}}

h4. Problem description
The round-trip dtype changed from the correct {{object}} to the incorrect 
{{datetime64}}. Thus the timezones were discarded in Arrow and the timestamps 
became invalid.

h4. Expected Output
(identical)
{{0    2020-03-10 18:13:49.405598+00:00
1    2020-03-10 18:13:49.405626-23:59
Name: 2, dtype: object

0    2020-03-10 18:13:49.405598+00:00
1    2020-03-10 18:13:49.405626-23:59
Name: 2, dtype: object}}

h4. Output of {{pd.show_versions()}}|
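
The report concerns a column whose rows carry different UTC offsets. A column-level timestamp type (such as pandas `datetime64[ns, tz]` or an Arrow timestamp) holds at most one timezone for the whole column, so per-row offsets would have to be stored separately to survive a round trip. The following is a minimal stdlib-only sketch of that idea, not a pandas or Arrow API: the helpers `split_offset` and `join_offset` are hypothetical names introduced here for illustration, and the sample timestamps are taken from the output shown in the report.

```python
from datetime import datetime, timedelta, timezone


def split_offset(ts):
    """Return (UTC instant, offset in seconds) for a timezone-aware datetime.

    Hypothetical helper: stores the instant and the offset as two values.
    """
    offset = ts.utcoffset()
    if offset is None:
        raise ValueError("expected a timezone-aware datetime")
    return ts.astimezone(timezone.utc), int(offset.total_seconds())


def join_offset(utc_ts, offset_seconds):
    """Rebuild the original aware datetime from the two stored values."""
    return utc_ts.astimezone(timezone(timedelta(seconds=offset_seconds)))


# Same two offsets as in the report: UTC and timezone.min (-23:59).
values = [
    datetime(2020, 3, 10, 18, 13, 49, 405598, tzinfo=timezone.utc),
    datetime(2020, 3, 10, 18, 13, 49, 405626, tzinfo=timezone.min),
]

# "Write" side: one UTC column plus one integer offset column.
encoded = [split_offset(v) for v in values]

# "Read" side: both the instants and the per-row offsets are recovered.
decoded = [join_offset(utc_ts, off) for utc_ts, off in encoded]
```

In this sketch the UTC instants could be written as an ordinary timestamp column and the offsets as an integer column. Whether that is an acceptable workaround for the issue described here is an assumption, not something the report states.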
 
 
 
 


> PyArrow discards timezones
> --------------------------
>
>                 Key: ARROW-8066
>                 URL: https://issues.apache.org/jira/browse/ARROW-8066
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.16.0
>            Reporter: Markovtsev Vadim
>            Priority: Major
>
> The original description is at 
> [https://github.com/pandas-dev/pandas/issues/32587]



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
