I would like to process a stream of data from different customers,
producing output, say, once every 15 minutes. The results will then be loaded
into another system for storage and querying.

I have been using TumblingEventTimeWindows in my prototype, but I am
concerned that all the windows will start and stop at the same time and
cause batch load effects on the back-end data store.
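For context, the relevant part of the prototype looks roughly like this (the
event type, aggregate, and sink names are simplified placeholders, not the
real code):

// All keys share epoch-aligned 15-minute windows, so everything fires at once.
events                                                     // DataStream<KeyedEvent>
    .keyBy(KeyedEvent::getKey)
    .window(TumblingEventTimeWindows.of(Time.minutes(15)))
    .aggregate(new MyAggregate())                          // placeholder aggregate
    .addSink(new BackendSink());                           // placeholder sink to the data store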

What I think I would like is for the windows to have a different start
offset for each key (derived from a hash function that I would supply).

Thus, deterministically, key "ca:fe:ba:be" would always start its windows at
an initial offset of 00:07 UTC, while key "de:ad:be:ef" would always start
its windows at an initial offset of 00:02 UTC.
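To make the question concrete, this is the sort of window assigner I imagine.
It is only a sketch, written under the assumption of a hypothetical KeyedEvent
type that exposes the key; I have not tried it, so the details may well be
wrong:

import java.util.Collection;
import java.util.Collections;

import org.apache.flink.api.common.ExecutionConfig;
import org.apache.flink.api.common.typeutils.TypeSerializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.WindowAssigner;
import org.apache.flink.streaming.api.windowing.triggers.EventTimeTrigger;
import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

/** Hypothetical event type; exposes the partition key (e.g. a MAC address). */
interface KeyedEvent {
    String getKey();
}

/**
 * Tumbling event-time windows whose start is shifted by a deterministic
 * hash of the element's key, so different keys fire at different offsets
 * within the window size instead of all firing at once.
 */
public class PerKeyOffsetTumblingWindows extends WindowAssigner<Object, TimeWindow> {

    private final long sizeMillis;

    private PerKeyOffsetTumblingWindows(long sizeMillis) {
        this.sizeMillis = sizeMillis;
    }

    public static PerKeyOffsetTumblingWindows of(long sizeMillis) {
        return new PerKeyOffsetTumblingWindows(sizeMillis);
    }

    @Override
    public Collection<TimeWindow> assignWindows(Object element, long timestamp,
                                                WindowAssignerContext context) {
        // Deterministic per-key offset in [0, size); the hash could be made pluggable.
        String key = ((KeyedEvent) element).getKey();
        long offset = Math.floorMod((long) key.hashCode(), sizeMillis);
        long start = TimeWindow.getWindowStartWithOffset(timestamp, offset, sizeMillis);
        return Collections.singletonList(new TimeWindow(start, start + sizeMillis));
    }

    @Override
    public Trigger<Object, TimeWindow> getDefaultTrigger(StreamExecutionEnvironment env) {
        return EventTimeTrigger.create();
    }

    @Override
    public TypeSerializer<TimeWindow> getWindowSerializer(ExecutionConfig executionConfig) {
        return new TimeWindow.Serializer();
    }

    @Override
    public boolean isEventTime() {
        return true;
    }
}

I would then use it in place of TumblingEventTimeWindows, e.g.
.window(PerKeyOffsetTumblingWindows.of(15 * 60 * 1000L)).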

Is this possible? Or do I just have to find some way of queuing up my
writes using back-pressure?

Thanks in advance

-stephenc

P.S. I can trade assistance with Flink for assistance with Maven or Jenkins
if my questions are too wearisome!
