[ https://issues.apache.org/jira/browse/ARROW-5125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Rok Mihevc updated ARROW-5125:
------------------------------
    External issue URL: https://github.com/apache/arrow/issues/16705

> [Python] Cannot roundtrip extreme dates through pyarrow
> -------------------------------------------------------
>
>                 Key: ARROW-5125
>                 URL: https://issues.apache.org/jira/browse/ARROW-5125
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.13.0
>         Environment: Windows 10, Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05)
>            Reporter: Max Bolingbroke
>            Assignee: Micah Kornfield
>            Priority: Major
>              Labels: pull-request-available, windows
>             Fix For: 0.15.0
>
>          Time Spent: 1h 10m
>  Remaining Estimate: 0h
>
> You can roundtrip many dates through a pyarrow array:
>
> {noformat}
> >>> pa.array([datetime.date(1980, 1, 1)], type=pa.date32())[0]
> datetime.date(1980, 1, 1){noformat}
>
> But (on Windows at least), not extreme ones:
>
> {noformat}
> >>> pa.array([datetime.date(1960, 1, 1)], type=pa.date32())[0]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "pyarrow\scalar.pxi", line 74, in pyarrow.lib.ArrayValue.__repr__
>   File "pyarrow\scalar.pxi", line 226, in pyarrow.lib.Date32Value.as_py
> OSError: [Errno 22] Invalid argument
> >>> pa.array([datetime.date(3200, 1, 1)], type=pa.date32())[0]
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "pyarrow\scalar.pxi", line 74, in pyarrow.lib.ArrayValue.__repr__
>   File "pyarrow\scalar.pxi", line 226, in pyarrow.lib.Date32Value.as_py
> {noformat}
> This is because datetime.utcfromtimestamp and datetime.timestamp fail on
> these dates, but it seems we should be able to avoid invoking these
> functions entirely when deserializing dates. Ideally we would be able to
> roundtrip these as datetimes too, of course, but it is less clear that this
> will be easy. For some context on this see [https://bugs.python.org/issue29097].
> This may be related to ARROW-3176 and ARROW-4746
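
To illustrate the suggestion in the description, here is a minimal sketch of converting a date32 value (a signed count of days since 1970-01-01) to and from datetime.date using only date arithmetic, so that datetime.utcfromtimestamp and datetime.timestamp are never called. The helper names date32_to_py and py_to_date32 are hypothetical and chosen for this example only; this is not necessarily how the fix was implemented in pyarrow.

```python
import datetime

# date32 stores the number of days since the UNIX epoch (1970-01-01).
_EPOCH = datetime.date(1970, 1, 1)


def date32_to_py(days):
    """Hypothetical helper: days since epoch -> datetime.date.

    Uses timedelta arithmetic only, so no platform time_t function
    (the source of the OSError on Windows) is involved.
    """
    return _EPOCH + datetime.timedelta(days=days)


def py_to_date32(value):
    """Hypothetical helper: datetime.date -> days since epoch."""
    return (value - _EPOCH).days


# Roundtrip the dates from the report, including the two that failed.
for d in (datetime.date(1960, 1, 1),
          datetime.date(1980, 1, 1),
          datetime.date(3200, 1, 1)):
    assert date32_to_py(py_to_date32(d)) == d
```

Values outside the range that datetime.date supports (years 1 through 9999) would still raise OverflowError with this approach, so it only addresses dates that Python itself can represent.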