It's something like:
DataStreamSource<String> stream =
    env.addSource(getKafkaConsumer(parameterTool));
stream
    .map(getEventToDomainMapper())
    .keyBy(getKeySelector())
    .window(ProcessingTimeSessionWindows.withGap(Time.minutes(90)))
    .reduce(getReducer())
    .map(getToJsonMapper())
    .addSink(getKafkaProducer(parameterTool));
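For what it's worth, the reducer attached to a window just merges two elements of the same key/session into one. A minimal sketch of what getReducer() might return, using a hypothetical Event class and java.util.function.BinaryOperator as a stand-in for Flink's ReduceFunction (both have the same (T, T) -> T shape), so it compiles without Flink on the classpath:

```java
import java.util.function.BinaryOperator;

// Hypothetical domain event; field names are illustrative only.
class Event {
    final String key;
    final long count;      // how many raw events were merged into this one
    final long lastSeenMs; // timestamp of the latest contributing event

    Event(String key, long count, long lastSeenMs) {
        this.key = key;
        this.count = count;
        this.lastSeenMs = lastSeenMs;
    }
}

public class ReducerSketch {
    // Merge two events of the same session into one aggregated event.
    // Flink's ReduceFunction<Event> would implement the same logic in
    // its reduce(Event, Event) method.
    static final BinaryOperator<Event> REDUCER = (a, b) ->
        new Event(a.key, a.count + b.count, Math.max(a.lastSeenMs, b.lastSeenMs));

    public static void main(String[] args) {
        Event merged = REDUCER.apply(
            new Event("user-1", 1, 1000L),
            new Event("user-1", 1, 5000L));
        System.out.println(merged.count + " " + merged.lastSeenMs);
    }
}
```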
Each new event may be reduced against the existing state of the
window, so normally it's okay to have only one row in memory per window.
I'm new to Flink and haven't yet reached the "incremental aggregates"
part, but if I understand correctly, it fits my case.
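That "one row in memory" behaviour can be sketched outside Flink: fold each arriving event into a single accumulated value per key instead of buffering the whole session. The class and key names below are illustrative, not Flink API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.BinaryOperator;

// Sketch of incremental (eager) aggregation: keep one accumulated value
// per key and fold each arriving event into it -- the same idea Flink
// applies when a ReduceFunction is attached to a window, instead of
// buffering every element until the window fires.
public class IncrementalReduceSketch {
    // One accumulated row per key: the only state that has to stay in memory.
    private final Map<String, Long> state = new HashMap<>();
    private final BinaryOperator<Long> reducer;

    public IncrementalReduceSketch(BinaryOperator<Long> reducer) {
        this.reducer = reducer;
    }

    // Called once per incoming event: O(1) state per key.
    public void onEvent(String key, long value) {
        state.merge(key, value, reducer);
    }

    public Long result(String key) {
        return state.get(key);
    }

    public static void main(String[] args) {
        IncrementalReduceSketch agg = new IncrementalReduceSketch(Long::sum);
        agg.onEvent("user-1", 3);
        agg.onEvent("user-1", 4);
        agg.onEvent("user-2", 10);
        System.out.println(agg.result("user-1")); // 7
        System.out.println(agg.result("user-2")); // 10
    }
}
```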
Vadim.
On 20.02.2017 17:47, Timo Walther wrote:
Hi Vadim,
this of course depends on your use case. The question is how large
your state per pane is and how much memory is available to Flink.
Are you using incremental aggregates such that only the aggregated
value per pane has to be kept in memory?
Regards,
Timo
On 20/02/17 at 16:34, Vadim Vararu wrote:
Hi guys,
Is it okay to have very many (tens of thousands or hundreds of
thousands) of session windows?
Thanks, Vadim.