[ https://issues.apache.org/jira/browse/ARROW-1718?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bryan Cutler updated ARROW-1718:
--------------------------------
    Description:
When calling {{Array.from_pandas}} with a pandas.Series of dates and specifying the desired pyarrow type, an error occurs. If the type is not specified then {{from_pandas}} will interpret the data as a timestamp type.

{code}
import pandas as pd
import pyarrow as pa
import datetime

arr = pa.array([datetime.date(2017, 10, 23)])
c = pa.Column.from_array("d", arr)
s = c.to_pandas()
print(s)
# 0   2017-10-23
# Name: d, dtype: datetime64[ns]

result = pa.Array.from_pandas(s, type=pa.date32())
print(result)
"""
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/array.pxi", line 295, in pyarrow.lib.Array.__repr__ (/home/bryan/git/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:26221)
  File "/home/bryan/.local/lib/python2.7/site-packages/pyarrow-0.7.2.dev21+ng028f2cd-py2.7-linux-x86_64.egg/pyarrow/formatting.py", line 28, in array_format
    values.append(value_format(x, 0))
  File "/home/bryan/.local/lib/python2.7/site-packages/pyarrow-0.7.2.dev21+ng028f2cd-py2.7-linux-x86_64.egg/pyarrow/formatting.py", line 49, in value_format
    return repr(x)
  File "pyarrow/scalar.pxi", line 63, in pyarrow.lib.ArrayValue.__repr__ (/home/bryan/git/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:19535)
  File "pyarrow/scalar.pxi", line 137, in pyarrow.lib.Date32Value.as_py (/home/bryan/git/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:20368)
ValueError: year is out of range
"""
{code}

  was:
When calling {Array.from_pandas} with a pandas.Series of dates and specifying the desired pyarrow type, an error occurs. If the type is not specified then {from_pandas} will interpret the data as a timestamp type.

{code}
import pandas as pd
import pyarrow as pa
import datetime

arr = pa.array([datetime.date(2017, 10, 23)])
c = pa.Column.from_array("d", arr)
s = c.to_pandas()
print(s)
# 0   2017-10-23
# Name: d, dtype: datetime64[ns]

result = pa.Array.from_pandas(s, type=pa.date32())
print(result)
"""
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "pyarrow/array.pxi", line 295, in pyarrow.lib.Array.__repr__ (/home/bryan/git/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:26221)
  File "/home/bryan/.local/lib/python2.7/site-packages/pyarrow-0.7.2.dev21+ng028f2cd-py2.7-linux-x86_64.egg/pyarrow/formatting.py", line 28, in array_format
    values.append(value_format(x, 0))
  File "/home/bryan/.local/lib/python2.7/site-packages/pyarrow-0.7.2.dev21+ng028f2cd-py2.7-linux-x86_64.egg/pyarrow/formatting.py", line 49, in value_format
    return repr(x)
  File "pyarrow/scalar.pxi", line 63, in pyarrow.lib.ArrayValue.__repr__ (/home/bryan/git/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:19535)
  File "pyarrow/scalar.pxi", line 137, in pyarrow.lib.Date32Value.as_py (/home/bryan/git/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:20368)
ValueError: year is out of range
"""
{code}


> [Python] Creating a pyarrow.Array of date type from pandas causes error
> -----------------------------------------------------------------------
>
>                 Key: ARROW-1718
>                 URL: https://issues.apache.org/jira/browse/ARROW-1718
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>            Reporter: Bryan Cutler
>            Assignee: Wes McKinney
>             Fix For: 0.8.0
>
>
> When calling {{Array.from_pandas}} with a pandas.Series of dates and
> specifying the desired pyarrow type, an error occurs. If the type is not
> specified then {{from_pandas}} will interpret the data as a timestamp type.
> {code}
> import pandas as pd
> import pyarrow as pa
> import datetime
> arr = pa.array([datetime.date(2017, 10, 23)])
> c = pa.Column.from_array("d", arr)
> s = c.to_pandas()
> print(s)
> # 0   2017-10-23
> # Name: d, dtype: datetime64[ns]
> result = pa.Array.from_pandas(s, type=pa.date32())
> print(result)
> """
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
>   File "pyarrow/array.pxi", line 295, in pyarrow.lib.Array.__repr__
> (/home/bryan/git/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:26221)
>   File
> "/home/bryan/.local/lib/python2.7/site-packages/pyarrow-0.7.2.dev21+ng028f2cd-py2.7-linux-x86_64.egg/pyarrow/formatting.py",
> line 28, in array_format
>     values.append(value_format(x, 0))
>   File
> "/home/bryan/.local/lib/python2.7/site-packages/pyarrow-0.7.2.dev21+ng028f2cd-py2.7-linux-x86_64.egg/pyarrow/formatting.py",
> line 49, in value_format
>     return repr(x)
>   File "pyarrow/scalar.pxi", line 63, in pyarrow.lib.ArrayValue.__repr__
> (/home/bryan/git/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:19535)
>   File "pyarrow/scalar.pxi", line 137, in pyarrow.lib.Date32Value.as_py
> (/home/bryan/git/arrow/python/build/temp.linux-x86_64-2.7/lib.cxx:20368)
> ValueError: year is out of range
> """
> {code}

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
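
For readers following the report, it may help to see the units of the two types involved. A date32 value counts days since 1970-01-01, while a pandas datetime64[ns] value counts nanoseconds since the same epoch. The sketch below uses only the standard library and shows how far apart those two numbers are for the date in the report. The helper names (`days_since_epoch`, `nanos_since_epoch`, `date_from_days`) are hypothetical and not part of pyarrow. The report does not state the root cause, so reading the error as a missing unit conversion is an assumption.

```python
import datetime

EPOCH = datetime.date(1970, 1, 1)
NANOS_PER_DAY = 24 * 60 * 60 * 10**9


def days_since_epoch(d):
    """Day count since the UNIX epoch (the unit a date32 value uses)."""
    return (d - EPOCH).days


def nanos_since_epoch(d):
    """Nanosecond count at midnight of d (the unit datetime64[ns] uses)."""
    return days_since_epoch(d) * NANOS_PER_DAY


def date_from_days(days):
    """Turn a day count back into a datetime.date."""
    return EPOCH + datetime.timedelta(days=days)


d = datetime.date(2017, 10, 23)
days = days_since_epoch(d)
nanos = nanos_since_epoch(d)

# Round-tripping through the day count recovers the original date.
round_trip = date_from_days(days)

# Assumption for illustration: if a nanosecond count were read as a day
# count without rescaling, the result would be far outside the range
# that datetime.date can represent.
try:
    date_from_days(nanos)
    misread_fails = False
except OverflowError:
    misread_fails = True

print(days, nanos, round_trip, misread_fails)
```

Dividing the nanosecond value by `NANOS_PER_DAY` gives back the day count, which is the rescaling a timestamp-to-date conversion would need to perform.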