martin-liu opened a new pull request, #13402:
URL: https://github.com/apache/arrow/pull/13402
When calling `Table.from_pandas(df)`, the current code does not ensure that
the index name is a str, so the call fails if df has a **non-str index name**.
Code to reproduce:
```python
import pandas as pd
import pyarrow as pa
df = pd.DataFrame({0: [1, 2, 3], 1: [4, 5, 6]})
df = df.set_index(0)
pa.Table.from_pandas(df)
```
Error:
```python
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Input In [3], in <module>
      4 df = pd.DataFrame({0: [1, 2, 3], 1: [4, 5, 6]})
      5 df = df.set_index(0)
----> 6 pa.Table.from_pandas(df)

File ~/src/mlpsandboxrt/venv/lib/python3.8/site-packages/pyarrow/table.pxi:1394, in pyarrow.lib.Table.from_pandas()

File ~/src/mlpsandboxrt/venv/lib/python3.8/site-packages/pyarrow/pandas_compat.py:610, in dataframe_to_arrays(df, schema, preserve_index, nthreads, columns, safe)
    608 for name, type_ in zip(all_names, types):
    609     name = name if name is not None else 'None'
--> 610     fields.append(pa.field(name, type_))
    611 schema = pa.schema(fields)
    613 pandas_metadata = construct_metadata(df, column_names, index_columns,
    614                                      index_descriptors, preserve_index,
    615                                      types)

File ~/src/mlpsandboxrt/venv/lib/python3.8/site-packages/pyarrow/types.pxi:1698, in pyarrow.lib.field()

File stringsource:15, in string.from_py.__pyx_convert_string_from_py_std__in_string()

TypeError: expected bytes, int found
```
This PR uses `_column_name_to_strings` to convert the index name to a str
before using it.
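
To illustrate the idea, here is a minimal standalone sketch of normalizing names to str before they reach the field-building loop shown in the traceback. The helpers `index_name_to_str` and `build_field_names` are hypothetical stand-ins written for this example; they are not pyarrow's `_column_name_to_strings` and may differ from it in detail.

```python
def index_name_to_str(name):
    """Hypothetical stand-in for the conversion step (not pyarrow's helper).

    Leaves None untouched, decodes bytes, and stringifies everything else.
    """
    if name is None:
        return None
    if isinstance(name, bytes):
        return name.decode("utf8")
    return str(name)


def build_field_names(all_names):
    """Mimic the loop from the traceback, but on already-converted names."""
    names = []
    for name in all_names:
        name = index_name_to_str(name)
        # Same fallback as line 609 in the traceback: None becomes 'None'.
        name = name if name is not None else "None"
        names.append(name)
    return names


# Column 1 plus the integer index name 0 from the reproduction above.
field_names = build_field_names([1, 0])
```

With the names converted up front, every value handed to the field constructor is a str, which is the condition that the failing `pa.field(name, type_)` call requires.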