Hello,

We are currently working on implementing a data pipeline using Beam on top
of Flink. We have an unbounded data source that sends us some financial
positions data. For each account, we perform certain aggregations (let’s
assume it’s summation for simplicity) across all products owned by the
account. I’m processing the data in windows of 30 seconds.
At any time when I am processing a window I want to be able to access the
aggregation from the previous window and add it to the current aggregation.
The stateful API in beam stores data by (key,window) and hence I can’t
really store a global state that I could access with account being the key.

The only way I could think was to write the data  of the previous window to
some DB and then read it in my following window which I don’t think is
efficient because I have to wait until the previous window is written to
the DB. Do you have suggestions on how to approach this problem?

Thank you.
-- 
Regards,
Harshvardhan

Reply via email to