You can re-window after the first aggregation (say, into the global
window) and state will be stored with respect to this window.
On Thu, Jun 21, 2018 at 12:02 PM Harshvardhan Agrawal
<harshvardhan.ag...@gmail.com> wrote:
>
> Hello,
>
> We are currently working on implementing a data pipeline using Beam on top of
> Flink. We have an unbounded data source that sends us some financial
> positions data. For each account, we perform certain aggregations (let’s
> assume it’s summation for simplicity) across all products owned by the
> account. I’m processing the data in windows of 30 seconds.
> At any time when I am processing a window I want to be able to access the
> aggregation from the previous window and add it to the current aggregation.
> The stateful API in beam stores data by (key,window) and hence I can’t really
> store a global state that I could access with account being the key.
>
> The only way I could think was to write the data of the previous window to
> some DB and then read it in my following window which I don’t think is
> efficient because I have to wait until the previous window is written to the
> DB. Do you have suggestions on how to approach this problem?
>
> Thank you.
> --
> Regards,
> Harshvardhan