Benjamin created ARROW-7885:
-------------------------------

Summary: Serialize dask dataframe
Key: ARROW-7885
URL: https://issues.apache.org/jira/browse/ARROW-7885
Project: Apache Arrow
Issue Type: Wish
Reporter: Benjamin

Currently, pyarrow knows how to serialize pandas dataframes but not dask dataframes:

{code:java}
SerializationCallbackError: pyarrow does not know how to serialize objects of type <class 'dask.dataframe.core.DataFrame'>.
{code}

Pickling the dask dataframe forgoes the benefits of using pyarrow for the sub-dataframes.

Pyarrow support for serializing dask dataframes would allow storing them efficiently in a database instead of on a file system (e.g. as parquet).