If you do a GroupByKey followed by a fan-out right before your write steps, you'll prevent the write steps from starting until all the data has been grouped.
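
To make the barrier idea concrete, here is a rough plain-Java model of it. This is not Beam code: `BarrierSketch`, `stageThenWrite`, and the in-memory "sinks" are hypothetical names standing in for the grouping step and the four write steps. It only illustrates why nothing gets written if any record fails before the grouping finishes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

public class BarrierSketch {

    // Stage every transformed record first (the "grouping" barrier), then
    // fan out to all sinks. If transform throws, no sink has been touched.
    static <T> int stageThenWrite(List<T> input,
                                  Function<T, String> transform,
                                  List<List<String>> sinks) {
        List<String> staged = new ArrayList<>();
        for (T record : input) {
            staged.add(transform.apply(record)); // may throw; nothing written yet
        }
        for (List<String> sink : sinks) {
            sink.addAll(staged); // fan-out: each sink receives every staged record
        }
        return staged.size();
    }

    public static void main(String[] args) {
        // Four in-memory lists stand in for the four write steps (assumption).
        List<List<String>> sinks = new ArrayList<>();
        for (int i = 0; i < 4; i++) {
            sinks.add(new ArrayList<>());
        }
        try {
            stageThenWrite(Arrays.asList("1", "oops", "3"),
                    s -> String.valueOf(Integer.parseInt(s) * 2), sinks);
        } catch (NumberFormatException e) {
            System.out.println("failed before any write; sink 0 size = "
                    + sinks.get(0).size());
        }
    }
}
```

In a real pipeline the runner does the staging for you at the grouping step; the sketch just shows the ordering you get from it.
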
I'd recommend reading up about fusion: https://cloud.google.com/dataflow/service/dataflow-service-desc#preventing-fusion

> On Sep 5, 2017, at 11:21, Jacob Marble <[email protected]> wrote:
>
> Good morning-
>
> Given a batch pipeline with 3 file inputs and 4 file outputs, is there a way
> to prevent the 4 TextIO.write() steps from starting until all of the
> TextIO.write() steps are ready?
>
> The idea here is to fail on any exceptions before persisting any output data,
> making cleanup easier.
>
> Thanks!
>
> Jacob
