I also did not succeed in making this pattern work some time ago. In the
link below there's my mail thread with code example - do you have a similar
use-case?

https://lists.apache.org/thread/9l74o4vqbtfgc5vkj9qq0xofffmtxswc

Will keep watching this thread for insights.

Best Regards,
Pavel Solomin

Tel: +351 962 950 692 | Skype: pavel_solomin | Linkedin
<https://www.linkedin.com/in/pavelsolomin>





On Tue, 22 Feb 2022 at 18:46, Steve Niemitz <sniem...@twitter.com> wrote:

> We had a team try to use the "slowly updating global window side inputs"
> pattern (on dataflow) to update some metadata in their pipeline every
> minute, but surprisingly ran into errors that the side input PCollection
> contained more than one element, [1] although this only manifested
> intermittently.
>
> My theory on why this breaks is as follows, can someone check my logic?
>
> Given that GenerateSequence operates on processing time, (although this
> might not actually matter) it's possible that if processing the source is
> delayed for whatever reason, the source may emit multiple elements at once
> in a single bundle.  For example, if I configure the source to generate an
> element every 10 seconds, and the evaluation of the source is delayed for
> 30 seconds, I'd get a bundle with 3 elements in it. (or so it seems)  All
> elements are then windowed into the global window, so they all end up in
> the same window.
>
> If a bundle with 3 elements enters
> the AfterProcessingTime.pastFirstElementInPane() state machine, all 3
> elements will be emitted in that pane.  This will then propagate down and
> break on the singleton view combiner.
>
> Is my thought process here correct?  Is the example here just buggy?
>
> [1] "pcollection view being accessed as a singleton despite having more
> than one input."
>

Reply via email to