Re: [I] [Python] read_row_group fails with Nested data conversions not implemented for chunked array outputs [arrow]

via GitHub Fri, 13 Sep 2024 17:35:05 -0700


yuxi-liu-wired commented on issue #21526:
URL: https://github.com/apache/arrow/issues/21526#issuecomment-2350741670


   > In case someone wants to load a pandas dataframe I want to share my 
workaround.
   > 
   > For me installing `fastparquet` and specifying the `engine='fastparquet'` 
argument in the `read_parquet` function worked.
   
   Concurring. If the parquet file contains a dictionary/list/struct, then the 
following
   
   ```python
   import pandas as pd
   df = pd.read_parquet(parquet_path)
   ```
   
   throws an error "ArrowNotImplementedError: Nested data conversions not 
implemented for chunked array outputs"
   
   This happens only *if* the parquet file is over 1 GB. A file under 1 GB 
loads with no problems.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

