[
https://issues.apache.org/jira/browse/ARROW-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Rok Mihevc updated ARROW-2124:
------------------------------
External issue URL: https://github.com/apache/arrow/issues/18094
> [Python] ArrowInvalid raised if the first item of a nested list of numpy
> arrays is empty
> ----------------------------------------------------------------------------------------
>
> Key: ARROW-2124
> URL: https://issues.apache.org/jira/browse/ARROW-2124
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 0.8.0
> Reporter: George Sakkis
> Assignee: Uwe Korn
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.9.0
>
>
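> pa.array() raises ArrowInvalid when the first element of a pandas Series of
> NumPy arrays is an empty array. Empty arrays at any other position convert
> without error. See example below: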
> {noformat}
> In [1]: import numpy as np
> In [2]: import pandas as pd
> In [3]: import pyarrow as pa
> In [4]: num_lists = [[2,3,4], [3,6,7,8], [], [2]]
> In [5]: series = pd.Series([np.array(s, dtype=float) for s in num_lists])
> In [6]: pa.array(series)
> Out[6]:
> <pyarrow.lib.ListArray object at 0x7f0db8ad1688>
> [
> [2.0,
> 3.0,
> 4.0],
> [3.0,
> 6.0,
> 7.0,
> 8.0],
> [],
> [2.0]
> ]
> In [7]: num_lists.append([])
> In [8]: series = pd.Series([np.array(s, dtype=float) for s in num_lists])
> In [9]: pa.array(series)
> Out[9]:
> <pyarrow.lib.ListArray object at 0x7f0db8ad1e58>
> [
> [2.0,
> 3.0,
> 4.0],
> [3.0,
> 6.0,
> 7.0,
> 8.0],
> [],
> [2.0],
> []
> ]
> In [10]: num_lists.insert(0, [])
> In [11]: series = pd.Series([np.array(s, dtype=float) for s in num_lists])
> In [12]: pa.array(series)
> ---------------------------------------------------------------------------
> ArrowInvalid Traceback (most recent call last)
> <ipython-input-99-fc3a903278e6> in <module>()
> ----> 1 pa.array(series)
> array.pxi in pyarrow.lib.array()
> array.pxi in pyarrow.lib._ndarray_to_array()
> error.pxi in pyarrow.lib.check_status()
> ArrowInvalid: trying to convert NumPy type object but got float64
> {noformat}