[ https://issues.apache.org/jira/browse/ARROW-17192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17621174#comment-17621174 ]

Alenka Frim commented on ARROW-17192:
-------------------------------------

This is a known issue: pandas, for now, supports the {{datetime64}} data type only in nanosecond resolution. When you write to a feather file, the pandas dataframe is converted to an Arrow table and the conversion infers microsecond resolution for the datetime column. Reading the file back then tries to convert those values to {{datetime64[ns]}}, which cannot represent dates before 1677 or after 2262.

As a workaround you can use {{feather.read_table}} to read the feather file into an Arrow table and then use {{to_pandas}} to convert it into a pandas dataframe. You have to add the {{timestamp_as_object=True}} keyword so that PyArrow doesn't try to convert the timestamp to {{datetime64[ns]}}:
{code:python}
>>> from pyarrow import feather
>>> feather.read_table("to_trash.feather").to_pandas(timestamp_as_object=True)
                  date
0  1654-01-01 00:00:00
1  1920-01-01 00:00:00
{code}
I still think {{read_feather}} should pass {{**kwargs}} through to {{to_pandas()}} so that {{timestamp_as_object=True}} can be specified there as well. I am keeping the Jira open and will try to make a PR for it next week. Contributions are also welcome; I can help if needed.

> [Python] .to_pandas can't read_feather if a date column contains dates before 1677 and after 2262
> --------------------------------------------------------------------------------------------------
>
>                 Key: ARROW-17192
>                 URL: https://issues.apache.org/jira/browse/ARROW-17192
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>         Environment: Any environment
>            Reporter: Adrien Pacifico
>            Priority: Major
>
> A feather file with a column containing dates earlier than 1677 or later than
> 2262 cannot be read with pandas, due to the `.to_pandas` method.
> To reproduce the issue:
> {code:python}
> ### create feather file
> import pandas as pd
> from datetime import datetime
> df = pd.DataFrame({"date": [
>     datetime.fromisoformat("1654-01-01"),
>     datetime.fromisoformat("1920-01-01"),
> ],})
> df.to_feather("to_trash.feather")
> ### read feather file
> from pyarrow.feather import read_feather
> read_feather("to_trash.feather")
> {code}
>
> I think that the expected behavior would be to have an object column
> containing datetime objects.
> I think that the problem comes from the `_array_like_to_pandas` method:
> [https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L1584]
> or from `_to_pandas()`:
> [https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L2742]
> or from `to_pandas`:
> [https://github.com/apache/arrow/blob/76f45a6892b13391fdede4c72934f75f6d56143c/python/pyarrow/array.pxi#L673]

--
This message was sent by Atlassian Jira
(v8.20.10#820010)
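
For reference, the 1677 and 2262 limits in the issue title can be derived by hand. The sketch below assumes that {{datetime64[ns]}} stores a signed 64-bit count of nanoseconds since 1970-01-01 and that the lowest int64 value is reserved for NaT. It uses only the standard library, so the bounds are truncated to microseconds, and {{fits_in_datetime64_ns}} is a hypothetical helper name, not part of pandas or PyArrow.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)
INT64_MAX = 2**63 - 1  # largest signed 64-bit integer

# Assumption: the lowest int64 value is reserved for NaT, so the usable
# range is symmetric around the epoch. Python's datetime only has
# microsecond resolution, so the nanosecond count is truncated here.
MAX_US = INT64_MAX // 1000
NS_MIN = EPOCH - timedelta(microseconds=MAX_US)
NS_MAX = EPOCH + timedelta(microseconds=MAX_US)


def fits_in_datetime64_ns(value: datetime) -> bool:
    """Hypothetical helper: True if a naive datetime lies within the
    (microsecond-truncated) nanosecond-resolution range."""
    return NS_MIN <= value <= NS_MAX


print(NS_MIN)  # 1677-09-21 00:12:43.145225
print(NS_MAX)  # 2262-04-11 23:47:16.854775

# The two dates from the reproduction script in the issue:
for text in ("1654-01-01", "1920-01-01"):
    print(text, fits_in_datetime64_ns(datetime.fromisoformat(text)))
```

A check like this could be used to decide up front whether a column needs {{timestamp_as_object=True}}; here 1654-01-01 falls outside the range and 1920-01-01 falls inside it.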