Farzad Abdolhosseini created ARROW-8868:
-------------------------------------------

             Summary: [Python] Feather format cannot store/retrieve lists correctly?
                 Key: ARROW-8868
                 URL: https://issues.apache.org/jira/browse/ARROW-8868
             Project: Apache Arrow
          Issue Type: Bug
          Components: Python
    Affects Versions: 0.17.1
         Environment: Python 3.8.2
                      PyArrow 0.17.1
                      Pandas 1.0.3
                      Linux (Manjaro)
            Reporter: Farzad Abdolhosseini


I'm seeing very weird behavior when I try to store and retrieve a Pandas DataFrame using the Feather format. Simplified example:

{code:python}
>>> import pandas as pd
>>> df = pd.DataFrame(data={"scalar": [1, 2], "array": [[1], [7]]})
>>> df
   scalar array
0       1   [1]
1       2   [7]
>>> df.to_feather("test.ft")
>>> pd.read_feather("test.ft")
   scalar                  array
0       1                   [16]
1       2  [1045468844972122628]
{code}

As you can see, the retrieved data is incorrect: the values in the "array" column come back as garbage. I was originally using the `feather-format` package directly (without going through Pandas), and that didn't work either.

By playing around with the DataFrame that is stored, I can also get different but still incorrect behavior, e.g. a larger list than the one written, an error that says the file size is incorrect, or simply a segmentation fault.

This is my first time using Feather/Arrow BTW.