[ https://issues.apache.org/jira/browse/ARROW-18171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17627736#comment-17627736 ]

Miles Granger commented on ARROW-18171:
---------------------------------------

[Relevant SO discussion|https://stackoverflow.com/questions/47113813/using-pyarrow-how-do-you-append-to-parquet-file]
 

> Feature to append row groups to existing parquet file
> -----------------------------------------------------
>
>                 Key: ARROW-18171
>                 URL: https://issues.apache.org/jira/browse/ARROW-18171
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Parquet, Python
>            Reporter: Nischith
>            Priority: Minor
>
> This is related to pyarrow.
> Right now, it is possible to append row groups to a Parquet file only while 
> the writer is open. Once the writer is closed, it is not possible to append 
> a new row group to that file. 
> The only options in that situation are to recreate the file or to write 
> multiple files to the dataset.
>  
> fastparquet supports this through the _append=True_ parameter of its write 
> function: [API — fastparquet 0.7.1 documentation|https://fastparquet.readthedocs.io/en/latest/api.html#fastparquet.write]
> A feature to append row groups to an existing file would be beneficial in 
> pyarrow as well.
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
