[ 
https://issues.apache.org/jira/browse/ARROW-18171?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Miles Granger updated ARROW-18171:
----------------------------------
    Summary: [Python] Feature to append row groups to existing parquet file  
(was: Feature to append row groups to existing parquet file)

> [Python] Feature to append row groups to existing parquet file
> --------------------------------------------------------------
>
>                 Key: ARROW-18171
>                 URL: https://issues.apache.org/jira/browse/ARROW-18171
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Parquet, Python
>            Reporter: Nischith
>            Priority: Minor
>
> This is related to pyarrow.
> Right now, it is possible to append row groups to a Parquet file only while 
> the writer is open. Once the writer is closed, no new row group can be 
> appended to that file. 
> The only options in that situation are to recreate the file or to write 
> multiple files to the dataset.
>  
> This is possible with fastparquet using the _append=True_ parameter: 
> [API — fastparquet 0.7.1 documentation|https://fastparquet.readthedocs.io/en/latest/api.html#fastparquet.write]
> A feature to append row groups to an existing file would be beneficial in 
> pyarrow as well.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
