Marco Neumann created ARROW-6872:
------------------------------------
Summary: [C++][Python] Empty table with dictionary-columns raises
ArrowNotImplementedError
Key: ARROW-6872
URL: https://issues.apache.org/jira/browse/ARROW-6872
Project: Apache Arrow
Issue Type: Bug
Components: C++, Python
Affects Versions: 0.15.0
Reporter: Marco Neumann
h2. Abstract
As a pyarrow user, I expect to be able to create an empty table from any
schema that was derived from a pandas DataFrame. This currently fails for
schemas that contain dictionary types (e.g. columns with the {{"category"}} dtype).
h2. Test Case
This code:
{code:python}
import pandas as pd
import pyarrow as pa
df = pd.DataFrame({"x": pd.Series(["x", "y"], dtype="category")})
table = pa.Table.from_pandas(df)
schema = table.schema
table_empty = schema.empty_table()  # raises ArrowNotImplementedError
{code}
produces this exception:
{noformat}
Traceback (most recent call last):
  File "arrow_bug.py", line 8, in <module>
    table_empty = schema.empty_table()
  File "pyarrow/types.pxi", line 860, in __iter__
  File "pyarrow/array.pxi", line 211, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 36, in pyarrow.lib._sequence_to_array
  File "pyarrow/error.pxi", line 86, in pyarrow.lib.check_status
pyarrow.lib.ArrowNotImplementedError: Sequence converter for type dictionary<values=string, indices=int8, ordered=0> not implemented
{noformat}