[ https://issues.apache.org/jira/browse/ARROW-18171?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17627736#comment-17627736 ]
Miles Granger commented on ARROW-18171:
---------------------------------------

[Relevant SO discussion|https://stackoverflow.com/questions/47113813/using-pyarrow-how-do-you-append-to-parquet-file]

> Feature to append row groups to existing parquet file
> -----------------------------------------------------
>
>                 Key: ARROW-18171
>                 URL: https://issues.apache.org/jira/browse/ARROW-18171
>             Project: Apache Arrow
>          Issue Type: New Feature
>          Components: Parquet, Python
>            Reporter: Nischith
>            Priority: Minor
>
> This is related to pyarrow.
> Right now, it's possible to append row groups to a Parquet file as long as the
> writer is open. Once the writer is closed, it's not possible to append a new
> row group to that file.
> The only options in that situation are to recreate the file or to write
> multiple files to the dataset.
>
> This is possible with fastparquet using the _append=True_ parameter. - [API —
> fastparquet 0.7.1 documentation
> |https://fastparquet.readthedocs.io/en/latest/api.html#fastparquet.write]
> A feature to append row groups to an existing file would be beneficial in
> pyarrow as well.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)