Karthik created ARROW-15142:
-------------------------------

             Summary: Cannot mix struct and non-struct, non-null values error when saving nested types with PyArrow
                 Key: ARROW-15142
                 URL: https://issues.apache.org/jira/browse/ARROW-15142
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 6.0.1
            Reporter: Karthik

When saving a pandas DataFrame that contains a nested type (list within list, list within dict) using the pyarrow engine, the following error is raised:

{color:#e75c58}ArrowInvalid{color}: ('cannot mix list and non-list, non-null values', 'Conversion failed for column A with type object')

Repro:
{code:python}
import pandas as pd

x = pd.DataFrame({"A": [[24, 27, [1, 1]]]})
x.to_parquet('/tmp/a.pqt', engine="pyarrow")
{code}

From a bit of googling, this appears to be a known Arrow limitation. However, this is a commonly encountered data structure, and 'fastparquet' handles it seamlessly. Is there a proposed timeline or plan for fixing this?

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
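
For context, here is a rough sketch of why the cell in the repro is rejected, plus one possible stopgap. As far as I understand, an Arrow list column needs a single element type, and `[24, 27, [1, 1]]` mixes integers with a nested list. The helper names `mixes_list_and_scalar` and `to_json_if_mixed` are hypothetical and not part of pandas or PyArrow. The JSON workaround assumes that storing the cell as a string is acceptable.

```python
import json


def mixes_list_and_scalar(cell):
    """Return True if a list cell holds both nested lists and non-null scalars."""
    if not isinstance(cell, list):
        return False
    has_list = any(isinstance(v, list) for v in cell)
    has_scalar = any(v is not None and not isinstance(v, list) for v in cell)
    return has_list and has_scalar


def to_json_if_mixed(cell):
    """Hypothetical workaround: encode a mixed cell as a JSON string."""
    return json.dumps(cell) if mixes_list_and_scalar(cell) else cell


# The single cell of column "A" from the repro above.
column = [[24, 27, [1, 1]]]

print([mixes_list_and_scalar(c) for c in column])
print([to_json_if_mixed(c) for c in column])
```

Applying something like `to_json_if_mixed` to the column before calling `to_parquet` would store the value as a plain string, so the reader has to call `json.loads` to recover the nested structure. This only sidesteps the type-inference error; it does not preserve a nested Parquet schema.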