[ https://issues.apache.org/jira/browse/ARROW-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Carlo Mazzaferro updated ARROW-11445:
-------------------------------------
    Description:
I have not yet dug into the Arrow codebase, but this appears to be caused by the new numpy release (1.20.0): [https://github.com/numpy/numpy/releases]

The issue below is not observed when using numpy 1.19.*

{{>>> import numpy, pandas}}
{{>>> import pyarrow as pa}}
{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame({'a': [1,2,3]})}}
{{>>> pa.Table.from_pandas(df)}}
{{Traceback (most recent call last):}}
{{  File "<input>", line 1, in <module>}}
{{    pa.Table.from_pandas(df)}}
{{  File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas}}
{{  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in dataframe_to_arrays}}
{{    for c, f in zip(columns_to_convert, convert_fields)]}}
{{  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in <listcomp>}}
{{    for c, f in zip(columns_to_convert, convert_fields)]}}
{{  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 574, in convert_column}}
{{    raise e}}
{{  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 568, in convert_column}}
{{    result = pa.array(col, type=type_, from_pandas=True, safe=safe)}}
{{  File "pyarrow/array.pxi", line 292, in pyarrow.lib.array}}
{{  File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array}}
{{  File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type}}
{{  File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status}}
{{pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column a with type int64')}}

> Type conversion failure on numpy 1.20.0
> ---------------------------------------
>
>                 Key: ARROW-11445
>                 URL: https://issues.apache.org/jira/browse/ARROW-11445
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 2.0.0
>         Environment: Python 3.7.4
>                      Mac OS
>            Reporter: Carlo Mazzaferro
>            Priority: Major

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
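For anyone triaging a similar failure, a minimal sketch of a version guard based only on the combination described in the report above (pyarrow 2.0.0 failing with numpy 1.20.0, working with numpy 1.19.*). The helper names `parse_version` and `matches_reported_combination` are hypothetical, and treating every numpy release from 1.20 onward as affected is an assumption; the report only confirms 1.20.0.

```python
import re


def parse_version(text):
    """Return the leading numeric components of a version string as a tuple of ints.

    Parsing stops at the first component that does not start with a digit,
    so "1.20.0rc1" yields (1, 20, 0).
    """
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def matches_reported_combination(numpy_version, pyarrow_version):
    """Return True for the combination described in the report.

    Assumption (not confirmed by the report): any numpy >= 1.20 paired with
    pyarrow 2.0.0 is treated as affected, while numpy 1.19.* is known good.
    """
    is_pyarrow_2_0_0 = parse_version(pyarrow_version)[:3] == (2, 0, 0)
    is_numpy_1_20_or_later = parse_version(numpy_version)[:2] >= (1, 20)
    return is_pyarrow_2_0_0 and is_numpy_1_20_or_later
```

In an interactive session this could be called as `matches_reported_combination(numpy.__version__, pa.__version__)` before attempting `pa.Table.from_pandas(df)`. If it returns True, pinning numpy to the 1.19 series is the workaround the report itself suggests.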