Joris Van den Bossche created ARROW-5287:
--------------------------------------------
Summary: [Python] automatic type inference for arrays of tuples
Key: ARROW-5287
URL: https://issues.apache.org/jira/browse/ARROW-5287
Project: Apache Arrow
Issue Type: Improvement
Components: Python
Reporter: Joris Van den Bossche
Arrays of tuples can be converted to either a ListArray or a
StructArray if you specify the type explicitly:
{code}
In [6]: pa.array([(1, 2), (3, 4, 5)], type=pa.list_(pa.int64()))
Out[6]:
<pyarrow.lib.ListArray object at 0x7f1b01a4d408>
[
[
1,
2
],
[
3,
4,
5
]
]
In [7]: pa.array([(1, 2), (3, 4)], type=pa.struct([('a', pa.int64()), ('b', pa.int64())]))
Out[7]:
<pyarrow.lib.StructArray object at 0x7f1b01a51b88>
-- is_valid: all not null
-- child 0 type: int64
[
1,
3
]
-- child 1 type: int64
[
2,
4
]
{code}
But the conversion fails when no type is specified:
{code}
In [8]: pa.array([(1, 2), (3, 4)])
---------------------------------------------------------------------------
ArrowInvalid Traceback (most recent call last)
<ipython-input-8-ab2d80c7486d> in <module>
----> 1 pa.array([(1, 2), (3, 4)])
~/scipy/repos/arrow/python/pyarrow/array.pxi in pyarrow.lib.array()
~/scipy/repos/arrow/python/pyarrow/array.pxi in pyarrow.lib._sequence_to_array()
~/scipy/repos/arrow/python/pyarrow/error.pxi in pyarrow.lib.check_status()
ArrowInvalid: Could not convert (1, 2) with type tuple: did not recognize Python value type when inferring an Arrow data type
{code}
Do we want to support automatic type inference for tuples as well (defaulting
to the ListArray case, just as arrays of Python lists are already supported)?
Or was there a specific reason not to support this by default?
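
In the meantime, one possible workaround is to turn the tuples into lists before calling pa.array, so that the existing list inference applies. The sketch below is only an illustration: tuples_to_lists is a hypothetical helper (not part of pyarrow), and it assumes the ListArray interpretation is the one you want. The pyarrow call is guarded so the snippet still runs where pyarrow is not installed.

```python
def tuples_to_lists(values):
    """Recursively replace tuples with lists; other values pass through.

    Hypothetical helper for illustration only, not a pyarrow API.
    """
    if isinstance(values, (tuple, list)):
        return [tuples_to_lists(v) for v in values]
    return values


data = [(1, 2), (3, 4, 5)]
converted = tuples_to_lists(data)  # nested lists instead of tuples

try:
    import pyarrow as pa
except ImportError:  # pyarrow is optional for this sketch
    pa = None

if pa is not None:
    # Nested Python lists go through the existing list type inference.
    arr = pa.array(converted)
    print(arr.type)
```

This sidesteps the question of whether a tuple should map to a list or a struct: the caller makes the choice explicit by converting to lists, at the cost of an extra copy of the data.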