[ https://issues.apache.org/jira/browse/ARROW-2711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Antoine Pitrou resolved ARROW-2711.
-----------------------------------
    Resolution: Fixed

Issue resolved by pull request 2309
[https://github.com/apache/arrow/pull/2309]

> [Python/C++] Pandas-Arrow doesn't roundtrip when column of lists has empty first element
> ----------------------------------------------------------------------------------------
>
>                 Key: ARROW-2711
>                 URL: https://issues.apache.org/jira/browse/ARROW-2711
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python
>    Affects Versions: 0.9.0
>            Reporter: Thomas Buhrmann
>            Assignee: Antoine Pitrou
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>          Time Spent: 0.5h
>  Remaining Estimate: 0h
>
> Hi, I thought this had been fixed in the past, but this simple use case still breaks:
>
> {code:java}
> df = pd.DataFrame(dict(x=[[], ["a"]]))
> tbl = pyarrow.Table.from_pandas(df)
> print(tbl.schema)
> {code}
> results in a wrong inferred type of "list<item: null>":
>
> {noformat}
> x: list<item: null>
>   child 0, item: null
> __index_level_0__: int64
> metadata
> --------
> {b'pandas': b'{"index_columns": ["__index_level_0__"], "column_indexes": [{"na'
>             b'me": null, "field_name": null, "pandas_type": "unicode", "numpy_'
>             b'type": "object", "metadata": {"encoding": "UTF-8"}}], "columns":'
>             b' [{"name": "x", "field_name": "x", "pandas_type": "list[empty]",'
>             b' "numpy_type": "object", "metadata": null}, {"name": null, "fiel'
>             b'd_name": "__index_level_0__", "pandas_type": "int64", "numpy_typ'
>             b'e": "int64", "metadata": null}], "pandas_version": "0.22.0"}'}
> {noformat}
> When converting the Table back to pandas all elements are now None too:
>
> {code:java}
> df2 = tbl.to_pandas()
> print(df2)
>         x
> 0      []
> 1  [None]
> {code}
>

--
This message was sent by Atlassian JIRA (v7.6.3#76005)