I believe that by default windows will only trigger one time [1]. This has definitely caught me by surprise before.
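To make the "fires once" behavior concrete, here is a toy plain-Java simulation (not Beam APIs; all names are made up for illustration) of a fixed 10-minute event-time window with the default trigger: the window fires exactly once when the watermark passes its end, and with zero allowed lateness anything arriving after that is dropped. Note also that a window whose end the watermark never passes just keeps buffering, which matches the "data accumulates in the first stage" symptom.

```java
import java.util.*;

/**
 * Toy simulation (plain Java, NOT Beam APIs) of fixed 10-minute
 * event-time windows with the default trigger: each window fires exactly
 * once when the watermark passes its end, and with zero allowed
 * lateness any element arriving after that is dropped.
 */
public class DefaultTriggerSim {
    static final long WINDOW_MS = 10 * 60 * 1000;

    /** Start of the fixed window containing the given event timestamp. */
    static long windowStart(long ts) {
        return ts - Math.floorMod(ts, WINDOW_MS);
    }

    /** Processes (timestampMs, watermarkMs) pairs; returns "fired:<n> dropped:<m>". */
    static String simulate(long[][] events) {
        Map<Long, Integer> open = new HashMap<>(); // windowStart -> buffered element count
        Set<Long> fired = new HashSet<>();         // windows already emitted (fire once)
        int dropped = 0;
        for (long[] e : events) {
            long ts = e[0], watermark = e[1];
            long w = windowStart(ts);
            if (fired.contains(w)) {
                dropped++;                 // window already closed: element is dropped
            } else {
                open.merge(w, 1, Integer::sum);
            }
            // Fire every open window whose end the watermark has now passed.
            for (Iterator<Long> it = open.keySet().iterator(); it.hasNext(); ) {
                long start = it.next();
                if (watermark >= start + WINDOW_MS) {
                    fired.add(start);
                    it.remove();
                }
            }
        }
        return "fired:" + fired.size() + " dropped:" + dropped;
    }

    public static void main(String[] args) {
        long[][] events = {
            {60_000, 60_000},   // 00:01, lands in window [00:00, 00:10)
            {120_000, 120_000}, // 00:02, same window
            {700_000, 700_000}, // 00:11 -> watermark closes [00:00, 00:10), which fires
            {60_000, 700_000},  // late element for the closed window: dropped
        };
        System.out.println(simulate(events)); // prints: fired:1 dropped:1
    }
}
```

The window [00:10, 00:20) that received the third element never fires in this run, because the watermark never passes its end. That is the same stall you see when data piles up behind a default trigger.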
I think that default strategy might be fine for a batch pipeline, but it typically isn't for streaming (which I assume you're using, since you mentioned Flink). You'll want to add a non-default triggering mechanism to the window strategy you mentioned. I'd recommend reading through the triggering docs[2] for background. The Repeatedly.forever[3] function may work for your use case. Something like:

Window.into(FixedWindows.of(Duration.standardMinutes(10)))
    .triggering(Repeatedly.forever(AfterWatermark.pastEndOfWindow()))
    .withAllowedLateness(Duration.ZERO)
    .discardingFiredPanes();

[1] https://beam.apache.org/documentation/programming-guide/#default-trigger
[2] https://beam.apache.org/documentation/programming-guide/#triggers
[3] https://beam.apache.org/releases/javadoc/2.29.0/org/apache/beam/sdk/transforms/windowing/Repeatedly.html

On Mon, Jun 14, 2021 at 07:39 Eddy G <kaotix...@gmail.com> wrote:

> As seen in this image https://imgur.com/a/wrZET97, I'm trying to deal
> with late data (I intentionally stopped my consumer, so data has been
> accumulating for several days now). I'm using Beam 2.27 and Flink 1.12,
> with the following window:
>
> Window.into(FixedWindows.of(Duration.standardMinutes(10)))
>
> And several parsing stages after, once it's time to write within the
> ParquetIO stage...
>
> FileIO
>     .<String, MyClass>writeDynamic()
>     .by(...)
>     .via(...)
>     .to(...)
>     .withNaming(...)
>     .withDestinationCoder(StringUtf8Coder.of())
>     .withNumShards(options.getNumShards())
>
> it won't send bytes across all stages, so no data is being written; it
> still accumulates in the first stage seen in the image and goes no
> further than that.
>
> Any reason why this may be happening? A wrong windowing strategy?