Hi,

This is very typical usage. Beam abstraction requires that all events must belong to a window ( and default window is a single global window), so it is not possible to delete a window information. But what You really want is to overwrite the current window information set for the first aggregation with new window information set for second aggregation. This is very typical for Beam pipelines to set new windows per each aggregation operation.

In practice I imagine You would need to just

|"back to global window" >>WindowInto(GlobalWindows())

before second aggregation operation in case of bounded sources. If You are working with unbounded source (streaming aggregation), using  global window directly will fail as it needs to wait for infinite to get all data before emitting output. IN this case You would need to also set some trigger and accumulation_mode to tell beam when to emit elements for the aggregated keys.

Hope this helps. If not please provide a bit more details what exactly You are trying to do so that I can try to help. Also You could try to get more familiar with the abstraction with the help of https://tour.beam.apache.org/.

Best
Wiśniowski Piotr


On 1.08.2023 09:17, Guagliardo, Patrizio via user wrote:

Hi together,

I have a question regarding Apache Beam in Python:

When I create a window with timestamps and then make a groupby, then the information for the windows remains in the elements. Afterwards, I want to make another groupby (something like a cumsum) by certain keys, but that does not work as the windows are still there and so it takes the keys + windows to split my data in the combineFn. Question now: How can I “delete” the windows and timestamps so that I would combine records just by the defined keys in the last combineFn.

Hope this is clear.

Best,

Patrizio


------------------------------------------------------------------------
This e-mail and any attachments may be confidential or legally privileged. If you received this message in error or are not the intended recipient, you should destroy the e-mail message and any attachments or copies, and you are prohibited from retaining, distributing, disclosing or using any information contained herein. Please inform us of the erroneous delivery by return e-mail. Thank you for your cooperation. For more information on how we use your personal data please see our Privacy Notice <https://www.oliverwyman.com/policies/privacy-notice.html>.

Reply via email to