What do you mean by “wipe out all existing parquet files before a write operation”? Are these all the files that already exist in the same output directory? Could you purge this directory beforehand, or simply use a new output directory for every pipeline run?
To write Parquet files you need to use ParquetIO.sink() with FileIO.write(), and I don’t think it will clean up the output directory before a write. However, if there are name collisions between existing and new output files (this depends on the naming strategy used), then I think the old files will be overwritten by the new ones.

> On 25 Jan 2021, at 19:10, Tao Li <t...@zillow.com> wrote:
>
> Hi Beam community,
>
> Does ParquetIO support an overwrite behavior when saving files? More specifically, I would like to wipe out all existing parquet files before a write operation. Is there a ParquetIO API to support that? Thanks!
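As a sketch of the purge-before-run option: since ParquetIO/FileIO won’t clean the directory for you, you could delete any leftover *.parquet files yourself before launching the pipeline. This is a minimal, hypothetical example using only java.nio (the class and method names here are illustrative, not part of any Beam API), and it assumes the output directory is on a local filesystem; for GCS/S3/HDFS you would use the corresponding filesystem client instead.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.stream.Stream;

public class PurgeParquet {

    // Delete all *.parquet files directly under dir; returns how many were removed.
    // Intended to run before launching the Beam pipeline that writes into dir.
    static int purgeParquetFiles(Path dir) throws IOException {
        if (!Files.isDirectory(dir)) {
            return 0; // nothing to purge
        }
        int deleted = 0;
        try (Stream<Path> files = Files.list(dir)) {
            for (Path p : (Iterable<Path>) files::iterator) {
                if (p.getFileName().toString().endsWith(".parquet")) {
                    Files.delete(p);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        // Simulate stale output from a previous run (paths are hypothetical).
        Path out = Files.createTempDirectory("beam-out");
        Files.createFile(out.resolve("part-00000.parquet"));
        Files.createFile(out.resolve("part-00001.parquet"));
        Files.createFile(out.resolve("_SUCCESS"));

        System.out.println(purgeParquetFiles(out)); // prints 2

        // ...then run the Beam pipeline writing via
        // FileIO.write().via(ParquetIO.sink(schema)).to(out.toString())
    }
}
```

Note this only sidesteps the collision question for local runs; writing each pipeline run into a fresh, timestamped output directory avoids the problem entirely.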