[jira] [Updated] (ARROW-11445) Type conversion failure on numpy 0.1.20

2021-01-31 Thread Carlo Mazzaferro (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Mazzaferro updated ARROW-11445:
-
Description: 
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

 

 

 
{code:java}
>>> pandas.__version__, pa.__version__, numpy.__version__
('1.2.1', '2.0.0', '1.20.0')
>>> df = pandas.DataFrame({'a': numpy.random.randn(10), 'b': numpy.random.randn(7).tolist() + [None, pandas.NA, numpy.nan], 'c': list(range(9)) + [numpy.nan]})
>>> pa.Table.from_pandas(df)
Traceback (most recent call last):
  File "", line 1, in 
    pa.Table.from_pandas(df)
  File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas
  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in dataframe_to_arrays
    for c, f in zip(columns_to_convert, convert_fields)]
  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in 
    for c, f in zip(columns_to_convert, convert_fields)]
  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 574, in convert_column
    raise e
  File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 568, in convert_column
    result = pa.array(col, type=type_, from_pandas=True, safe=safe)
  File "pyarrow/array.pxi", line 292, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array
  File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type
  File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column a with type float64')
{code}
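
For anyone triaging this, a minimal sketch of a check for the version combination described above. The helper names `parse_version` and `is_reported_combination` are hypothetical and not part of pyarrow or numpy. It also assumes that every numpy release from 1.20 onward behaves like 1.20.0, which this report does not confirm: only 1.20.0 (fails) and 1.19.* (works) were observed.

```python
def parse_version(text):
    """Turn a version string such as '1.20.0' into a tuple of ints, e.g. (1, 20, 0)."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if ch not in "0123456789":
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def is_reported_combination(pyarrow_version, numpy_version):
    """Return True for pyarrow 2.0.0 paired with numpy 1.20 or newer.

    Assumption: numpy releases after 1.20.0 are treated as affected too;
    the report itself only covers 1.20.0 (fails) and 1.19.* (works).
    """
    return (parse_version(pyarrow_version)[:3] == (2, 0, 0)
            and parse_version(numpy_version) >= (1, 20))


# The versions from the session above, then a 1.19 release for comparison.
print(is_reported_combination("2.0.0", "1.20.0"))
print(is_reported_combination("2.0.0", "1.19.5"))
```

In a live session the two arguments would come from `pa.__version__` and `numpy.__version__`. The check only flags the combination; it does not attempt the conversion itself.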

  was:
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

 

 

{{ >>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': numpy.random.randn(10), 'b': 
numpy.random.randn(7).tolist() + [None, pandas.NA, numpy.nan], 'c': 
list(range(9)) + [numpy.nan]})}}
{{>>> pa.Table.from_pandas(df)}}
{{Traceback (most recent call last):}}
{{ File "", line 1, in }}
{{ pa.Table.from_pandas(df)}}
{{ File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in dataframe_to_arrays}}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in }}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 574, in convert_column}}
{{ raise e}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 568, in convert_column}}
{{ result = pa.array(col, type=type_, from_pandas=True, safe=safe)}}
{{ File "pyarrow/array.pxi", line 292, in pyarrow.lib.array}}
{{ File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array}}
{{ File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type}}
{{ File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status}}
{{pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion 
failed for column a with type float64')}}


> Type conversion failure on numpy 0.1.20
> ---
>
> Key: ARROW-11445
> URL: https://issues.apache.org/jira/browse/ARROW-11445
> Project: Apache Arrow
>  Issue Type: Bug
>  Components: Python
>Affects Versions: 2.0.0
> Environment: Python 3.7.4
> Mac OS
>Reporter: Carlo Mazzaferro
>Priority: Major
>
> While I have not dug deep enough in the Arrow codebase, it seems to me that 
> this is caused by the new numpy release: 
> [https://github.com/numpy/numpy/releases] 
> The issue below is in fact not observed when using numpy 1.19.*
>  
>  
>  
> {code:java}
> >>> pandas.__version__, pa.__version__, numpy.__version__
> ('1.2.1', '2.0.0', '1.20.0')
> >>> df = pandas.DataFrame({'a': numpy.random.randn(10), 'b': numpy.random.randn(7).tolist() + [None, pandas.NA, numpy.nan], 'c': list(range(9)) + [numpy.nan]})
> >>> pa.Table.from_pandas(df)
> Traceback (most recent call last):
>   File "", line 1, in 
> pa.Table.from_pandas(df)
>   File "pyarrow/table.pxi", line 1394, in 

[jira] [Updated] (ARROW-11445) Type conversion failure on numpy 0.1.20

2021-01-31 Thread Carlo Mazzaferro (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Mazzaferro updated ARROW-11445:
-
Description: 
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

 

 

{{ >>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': numpy.random.randn(10), 'b': 
numpy.random.randn(7).tolist() + [None, pandas.NA, numpy.nan], 'c': 
list(range(9)) + [numpy.nan]})}}
{{>>> pa.Table.from_pandas(df)}}
{{Traceback (most recent call last):}}
{{ File "", line 1, in }}
{{ pa.Table.from_pandas(df)}}
{{ File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in dataframe_to_arrays}}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in }}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 574, in convert_column}}
{{ raise e}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 568, in convert_column}}
{{ result = pa.array(col, type=type_, from_pandas=True, safe=safe)}}
{{ File "pyarrow/array.pxi", line 292, in pyarrow.lib.array}}
{{ File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array}}
{{ File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type}}
{{ File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status}}
{{pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion 
failed for column a with type float64')}}

  was:
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

 

 

>>> pandas.__version__, pa.__version__, numpy.__version__
('1.2.1', '2.0.0', '1.20.0')
>>> df = pandas.DataFrame(\{'a': numpy.random.randn(10), 'b': numpy.random.randn(7).tolist() + [None, pandas.NA, numpy.nan], 'c': list(range(9)) + [numpy.nan]})
>>> pa.Table.from_pandas(df)
Traceback (most recent call last):
 File "", line 1, in 
 pa.Table.from_pandas(df)
 File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas
 File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in dataframe_to_arrays
 for c, f in zip(columns_to_convert, convert_fields)]
 File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in 
 for c, f in zip(columns_to_convert, convert_fields)]
 File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 574, in convert_column
 raise e
 File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 568, in convert_column
 result = pa.array(col, type=type_, from_pandas=True, safe=safe)
 File "pyarrow/array.pxi", line 292, in pyarrow.lib.array
 File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array
 File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type
 File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column a with type float64')



[jira] [Updated] (ARROW-11445) Type conversion failure on numpy 0.1.20

2021-01-31 Thread Carlo Mazzaferro (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Mazzaferro updated ARROW-11445:
-
Description: 
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

 

 

>>> pandas.__version__, pa.__version__, numpy.__version__
('1.2.1', '2.0.0', '1.20.0')
>>> df = pandas.DataFrame(\{'a': numpy.random.randn(10), 'b': numpy.random.randn(7).tolist() + [None, pandas.NA, numpy.nan], 'c': list(range(9)) + [numpy.nan]})
>>> pa.Table.from_pandas(df)
Traceback (most recent call last):
 File "", line 1, in 
 pa.Table.from_pandas(df)
 File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas
 File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in dataframe_to_arrays
 for c, f in zip(columns_to_convert, convert_fields)]
 File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 588, in 
 for c, f in zip(columns_to_convert, convert_fields)]
 File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 574, in convert_column
 raise e
 File "/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py", line 568, in convert_column
 result = pa.array(col, type=type_, from_pandas=True, safe=safe)
 File "pyarrow/array.pxi", line 292, in pyarrow.lib.array
 File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array
 File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type
 File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion failed for column a with type float64')

  was:
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

 

 

 
{quote}{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': numpy.random.randn(10), 'b': 
numpy.random.randn(7).tolist() + [None, pandas.NA, numpy.nan], 'c': 
list(range(9)) + [numpy.nan]})}}
{{>>> pa.Table.from_pandas(df)}}
{{Traceback (most recent call last):}}
{{ File "", line 1, in }}
{{ pa.Table.from_pandas(df)}}
{{ File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in dataframe_to_arrays}}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in }}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 574, in convert_column}}
{{ raise e}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 568, in convert_column}}
{{ result = pa.array(col, type=type_, from_pandas=True, safe=safe)}}
{{ File "pyarrow/array.pxi", line 292, in pyarrow.lib.array}}
{{ File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array}}
{{ File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type}}
{{ File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status}}
{{pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion 
failed for column a with type float64')}}
{quote}



[jira] [Updated] (ARROW-11445) Type conversion failure on numpy 0.1.20

2021-01-31 Thread Carlo Mazzaferro (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Mazzaferro updated ARROW-11445:
-
Description: 
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

 

 

 
{quote}{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': numpy.random.randn(10), 'b': 
numpy.random.randn(7).tolist() + [None, pandas.NA, numpy.nan], 'c': 
list(range(9)) + [numpy.nan]})}}
{{>>> pa.Table.from_pandas(df)}}
{{Traceback (most recent call last):}}
{{ File "", line 1, in }}
{{ pa.Table.from_pandas(df)}}
{{ File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in dataframe_to_arrays}}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in }}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 574, in convert_column}}
{{ raise e}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 568, in convert_column}}
{{ result = pa.array(col, type=type_, from_pandas=True, safe=safe)}}
{{ File "pyarrow/array.pxi", line 292, in pyarrow.lib.array}}
{{ File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array}}
{{ File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type}}
{{ File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status}}
{{pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion 
failed for column a with type float64')}}
{quote}

  was:
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': [1,2,3]})}}
{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': [1,2,3]})}}
{{>>> pa.Table.from_pandas(df)}}
{{Traceback (most recent call last):}}
{{ File "", line 1, in }}
{{ pa.Table.from_pandas(df)}}
{{ File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in dataframe_to_arrays}}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in }}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 574, in convert_column}}
{{ raise e}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 568, in convert_column}}
{{ result = pa.array(col, type=type_, from_pandas=True, safe=safe)}}
{{ File "pyarrow/array.pxi", line 292, in pyarrow.lib.array}}
{{ File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array}}
{{ File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type}}
{{ File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status}}
{{pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion 
failed for column a with type int64')}}{{ }}

 



[jira] [Updated] (ARROW-11445) Type conversion failure on numpy 0.1.20

2021-01-31 Thread Carlo Mazzaferro (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Mazzaferro updated ARROW-11445:
-
Description: 
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': [1,2,3]})}}
{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': [1,2,3]})}}
{{>>> pa.Table.from_pandas(df)}}
{{Traceback (most recent call last):}}
{{ File "", line 1, in }}
{{ pa.Table.from_pandas(df)}}
{{ File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in dataframe_to_arrays}}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in }}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 574, in convert_column}}
{{ raise e}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 568, in convert_column}}
{{ result = pa.array(col, type=type_, from_pandas=True, safe=safe)}}
{{ File "pyarrow/array.pxi", line 292, in pyarrow.lib.array}}
{{ File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array}}
{{ File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type}}
{{ File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status}}
{{pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion 
failed for column a with type int64')}}{{ }}

 

  was:
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

 

{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': [1,2,3]})}}
{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': [1,2,3]})}}
{{>>> pa.Table.from_pandas(df)}}
{{Traceback (most recent call last):}}
{{ File "", line 1, in }}
{{ pa.Table.from_pandas(df)}}
{{ File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in dataframe_to_arrays}}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in }}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 574, in convert_column}}
{{ raise e}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 568, in convert_column}}
{{ result = pa.array(col, type=type_, from_pandas=True, safe=safe)}}
{{ File "pyarrow/array.pxi", line 292, in pyarrow.lib.array}}
{{ File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array}}
{{ File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type}}
{{ File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status}}
{{pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion 
failed for column a with type int64')}}



[jira] [Updated] (ARROW-11445) Type conversion failure on numpy 0.1.20

2021-01-31 Thread Carlo Mazzaferro (Jira)


 [ 
https://issues.apache.org/jira/browse/ARROW-11445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Carlo Mazzaferro updated ARROW-11445:
-
Description: 
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

 

{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': [1,2,3]})}}
{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': [1,2,3]})}}
{{>>> pa.Table.from_pandas(df)}}
{{Traceback (most recent call last):}}
{{ File "", line 1, in }}
{{ pa.Table.from_pandas(df)}}
{{ File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in dataframe_to_arrays}}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in }}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 574, in convert_column}}
{{ raise e}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 568, in convert_column}}
{{ result = pa.array(col, type=type_, from_pandas=True, safe=safe)}}
{{ File "pyarrow/array.pxi", line 292, in pyarrow.lib.array}}
{{ File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array}}
{{ File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type}}
{{ File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status}}
{{pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion 
failed for column a with type int64')}}

  was:
While I have not dug deep enough in the Arrow codebase, it seems to me that 
this is caused by the new numpy release: 
[https://github.com/numpy/numpy/releases] 

The issue below is in fact not observed when using numpy 1.19.*

 

{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': [1,2,3]})}}
{{>>> pandas.__version__, pa.__version__, numpy.__version__}}
{{('1.2.1', '2.0.0', '1.20.0')}}
{{>>> df = pandas.DataFrame(\{'a': [1,2,3]})}}
{{>>> pa.Table.from_pandas(df)}}
{{Traceback (most recent call last):}}
{{ File "", line 1, in }}
{{ pa.Table.from_pandas(df)}}
{{ File "pyarrow/table.pxi", line 1394, in pyarrow.lib.Table.from_pandas}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in dataframe_to_arrays}}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 588, in }}
{{ for c, f in zip(columns_to_convert, convert_fields)]}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 574, in convert_column}}
{{ raise e}}
{{ File 
"/Users/carlomazzafero/.virtualenvs/arr/lib/python3.7/site-packages/pyarrow/pandas_compat.py",
 line 568, in convert_column}}
{{ result = pa.array(col, type=type_, from_pandas=True, safe=safe)}}
{{ File "pyarrow/array.pxi", line 292, in pyarrow.lib.array}}
{{ File "pyarrow/array.pxi", line 79, in pyarrow.lib._ndarray_to_array}}
{{ File "pyarrow/array.pxi", line 67, in pyarrow.lib._ndarray_to_type}}
{{ File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status}}
{{pyarrow.lib.ArrowTypeError: ('Did not pass numpy.dtype object', 'Conversion 
failed for column a with type int64')}}

