[ https://issues.apache.org/jira/browse/ARROW-5359?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
ASF GitHub Bot updated ARROW-5359:
----------------------------------
    Labels: pull-request-available  (was: )

> [Python] timestamp_as_object support for pa.Table.to_pandas in pyarrow
> ----------------------------------------------------------------------
>
>                 Key: ARROW-5359
>                 URL: https://issues.apache.org/jira/browse/ARROW-5359
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>   Affects Versions: 0.13.0
>         Environment: Ubuntu
>            Reporter: Joe Muruganandam
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Creating a ticket for the issue reported on
> GitHub ([https://github.com/apache/arrow/issues/4284])
> h2. pyarrow (Issue with timestamp conversion from Arrow to pandas)
> pyarrow Table.to_pandas has a date_as_object option but no equivalent option
> for timestamps. When a timestamp column in an Arrow table is converted to
> pandas, the target data type is pd.Timestamp, which cannot represent times
> later than 2262-04-11 23:47:16.854775807. As a result, in the scenario below
> the dates are converted to incorrect values. Adding a timestamp_as_object
> option to pa.Table.to_pandas would help in this scenario.
> # Python 3.6.8
> import pandas as pd
> import pyarrow as pa
> pd.__version__
> '0.24.1'
> pa.__version__
> '0.13.0'
> import datetime
> df = pd.DataFrame({"test_date": [datetime.datetime(3000,12,31,12,0),datetime.datetime(3100,12,31,12,0)]})
> df
>             test_date
> 0 3000-12-31 12:00:00
> 1 3100-12-31 12:00:00
> pa_table = pa.Table.from_pandas(df)
> pa_table[0]
> Column name='test_date' type=TimestampType(timestamp[us])
> [
>   [
>     32535172800000000,
>     35690846400000000
>   ]
> ]
> pa_table.to_pandas()
>                       test_date
> 0 1831-11-22 12:50:52.580896768
> 1 1931-11-22 12:50:52.580896768



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
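
The incorrect dates in the report above are consistent with a signed 64-bit overflow when the microsecond values are scaled to nanoseconds. The following is a minimal sketch of that arithmetic using only the Python standard library. It is not the actual conversion code of pyarrow or pandas, and the helper names wrap_int64 and us_to_wrapped_datetime are hypothetical, introduced only for illustration.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MIN = -2**63


def wrap_int64(value):
    """Reduce an arbitrary Python int to the signed 64-bit range (two's complement)."""
    return (value - INT64_MIN) % 2**64 + INT64_MIN


def us_to_wrapped_datetime(us):
    """Scale microseconds to nanoseconds with int64 wraparound, then decode.

    Hypothetical helper: it only imitates the overflow, it is not library code.
    """
    ns = wrap_int64(us * 1000)
    # datetime has microsecond resolution, so the last three digits are dropped.
    return EPOCH + timedelta(microseconds=ns // 1000)


# Largest instant representable as int64 nanoseconds since the epoch,
# truncated to microseconds.
MAX_NS_DATETIME = EPOCH + timedelta(microseconds=(2**63 - 1) // 1000)

if __name__ == "__main__":
    print("int64 nanosecond limit:", MAX_NS_DATETIME)
    # The two microsecond values shown in pa_table[0] in the report.
    for us in (32535172800000000, 35690846400000000):
        print(us, "->", us_to_wrapped_datetime(us))
```

Under these assumptions, the two stored values decode to 1831-11-22 12:50:52.580896 and 1931-11-22 12:50:52.580896, matching the to_pandas() output in the report up to microsecond truncation. Returning plain datetime.datetime objects, as the requested timestamp_as_object option would presumably do, avoids the nanosecond scaling and therefore the wraparound.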