Re: Write bulks files from streaming app

2018-07-22 Thread Jozef Vilcek
Here is a pseudocode (sorry) of what I am doing right now: PCollection> writtenFiles = dataStream .withFixedWindow( duration = 1H, trigger = AfterWatermark.pastEndOfWindow() .withLateFirings(AfterFirst.of( AfterPane.elementCountAtLeast(lateCount), Af

Re: Write bulks files from streaming app

2018-07-22 Thread Stephan Ewen
For what it's worth, in Flink directly we found that this pattern is generally not a well working one: windowing data in large windows in order to perform large bulk writes. Instead, the sinks (to file systems) continuously write (possibly across different destination files) files, ensure persiste

Re: Write bulks files from streaming app

2018-07-22 Thread Jozef Vilcek
I looked into Wait.on() but doc say it waits untill window is completely done, so it is not quite fir for my case, as my lateness can be a day or two and I would like to compact and publish hourly data sooner. What i am thinking of is write triggers under different location than target. I will hav