I would like to process a stream of data from different customers, producing output say once every 15 minutes. The results will then be loaded into another system for storage and querying.
I have been using TumblingEventTimeWindows in my prototype, but I am concerned that all the windows will start and stop at the same time and cause batch-load effects on the back-end data store.

What I think I would like is for the windows to have a different start offset for each key (using a hash function that I would supply). Thus, deterministically, key "ca:fe:ba:be" would always start based on an initial offset of 00:07 UTC, while say key "de:ad:be:ef" would always start based on an initial offset of say 00:02 UTC.

Is this possible? Or do I just have to find some way of queueing up my writes using back-pressure?

Thanks in advance

-stephenc

P.S. I can trade assistance with Flink for assistance with Maven or Jenkins if my questions are too wearisome!
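P.P.S. For concreteness, here is a sketch of the per-key staggering arithmetic I have in mind, as plain Java with no Flink dependency. The `offsetForKey` hash is just a placeholder for the hash function I would supply, and the window-start formula is my understanding of how tumbling windows with an offset are aligned; I may well have the details wrong, which is partly why I am asking.

```java
public class PerKeyWindowOffset {

    // 15-minute tumbling windows, matching my intended output cadence.
    static final long WINDOW_SIZE_MS = 15 * 60 * 1000L;

    // Placeholder hash: derive a deterministic offset in [0, WINDOW_SIZE_MS)
    // from the key. Math.floorMod keeps the result non-negative even when
    // hashCode() is negative.
    static long offsetForKey(String key) {
        return Math.floorMod((long) key.hashCode(), WINDOW_SIZE_MS);
    }

    // Align a timestamp to the start of its window, shifted by the key's offset.
    static long windowStart(long timestampMs, long offsetMs) {
        return timestampMs - Math.floorMod(timestampMs - offsetMs, WINDOW_SIZE_MS);
    }

    public static void main(String[] args) {
        long ts = 1_700_000_000_000L;
        long offA = offsetForKey("ca:fe:ba:be");
        long offB = offsetForKey("de:ad:be:ef");
        // Each key gets its own stable offset, so window boundaries are
        // staggered across keys but deterministic per key.
        System.out.println("ca:fe:ba:be offset=" + offA + " start=" + windowStart(ts, offA));
        System.out.println("de:ad:be:ef offset=" + offB + " start=" + windowStart(ts, offB));
    }
}
```

The intent is that each key's windows still tumble every 15 minutes, just on its own phase, so writes to the data store are spread out rather than arriving in one burst at each quarter-hour boundary.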