Here is a pseudocode (sorry) of what I am doing right now:
PCollection> writtenFiles = dataStream
.withFixedWindow(
duration = 1H,
trigger = AfterWatermark.pastEndOfWindow()
.withLateFirings(AfterFirst.of(
AfterPane.elementCountAtLeast(lateCount),
Af
For what it's worth, in Flink directly we found that this pattern is
generally not a well working one: windowing data in large windows in order
to perform large bulk writes.
Instead, the sinks (to file systems) continuously write (possibly across
different destination files) files, ensure persiste
I looked into Wait.on() but doc say it waits untill window is completely
done, so it is not quite fir for my case, as my lateness can be a day or
two and I would like to compact and publish hourly data sooner.
What i am thinking of is write triggers under different location than
target. I will hav