What do you mean by “wipe out all existing parquet files before a write 
operation”? Do you mean all the files that already exist in the same output 
directory? Could you purge this directory before the run, or just use a new 
output directory for every pipeline run?

To write Parquet files you need to use ParquetIO.sink() with FileIO.write(), and 
I don’t think it will clean up the output directory before writing. Though, if 
there are name collisions between existing and new output files (this depends 
on the file naming strategy used), then I think the old files will be 
overwritten by the new ones.
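
If purging the directory before the run is an option, a minimal sketch in plain 
Java (no Beam dependency) could look like the one below. It assumes the output 
directory is on the local filesystem and that the files end in ".parquet". The 
class name OutputDirCleaner, the helpers purgeParquetFiles() and runDirectory(), 
and the default "output" path are made up for illustration and are not part of 
ParquetIO or FileIO.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class OutputDirCleaner {

    // Hypothetical helper: deletes every regular file ending in ".parquet"
    // directly inside outputDir (non-recursive) and returns how many were removed.
    // Assumption: outputDir is a local path, not gs://, s3:// or hdfs://.
    static int purgeParquetFiles(Path outputDir) throws IOException {
        if (!Files.isDirectory(outputDir)) {
            return 0; // nothing to purge, e.g. on the very first run
        }
        List<Path> stale;
        try (Stream<Path> entries = Files.list(outputDir)) {
            stale = entries
                .filter(Files::isRegularFile)
                .filter(p -> p.getFileName().toString().endsWith(".parquet"))
                .collect(Collectors.toList());
        }
        for (Path p : stale) {
            Files.delete(p);
        }
        return stale.size();
    }

    // Hypothetical alternative: derive a separate output directory per run,
    // so old files are never touched. runId is whatever identifies the run.
    static Path runDirectory(Path baseDir, String runId) {
        return baseDir.resolve("run-" + runId);
    }

    public static void main(String[] args) throws IOException {
        Path outputDir = Paths.get(args.length > 0 ? args[0] : "output");
        int removed = purgeParquetFiles(outputDir);
        System.out.println("Removed " + removed + " stale parquet file(s) from " + outputDir);
        // ...then build and run the pipeline that writes to outputDir.
    }
}
```

The purge has to happen before the pipeline is started, since the write itself 
will not do it. With a per-run directory no cleanup is needed, but downstream 
readers have to be pointed at the latest run.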



> On 25 Jan 2021, at 19:10, Tao Li <t...@zillow.com> wrote:
> 
> Hi Beam community,
>  
> Does ParquetIO support an overwrite behavior when saving files? More 
> specifically, I would like to wipe out all existing parquet files before a 
> write operation. Is there a ParquetIO API to support that? Thanks!
