[DISCUSS] Side input consistency guarantees for triggers with multiple firings

Lukasz Cwik Thu, 11 Apr 2019 09:28:04 -0700

Today, we define that a side input becomes available to be consumed once at
least one firing occurs or when the runner detects that no such output
could be produced (e.g. watermark is beyond the end of the window when
using the default trigger). For triggers that fire at most once, consumers
are guaranteed to have a consistent view of the contents of the side input.
But what happens when the trigger fires multiple times?
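
To make the availability rule above concrete, here is a rough toy model in
plain Python. It is not Beam code: the class SideInputWindow and its methods
are hypothetical names made up for illustration, and it assumes that a reader
simply gets the most recent firing. It only shows that two reads of the same
window can return different contents once the trigger fires more than once.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class SideInputWindow:
    """Toy model of one window of a side input (hypothetical, not a Beam API)."""

    window_end: int
    # Panes emitted so far, one entry per trigger firing.
    firings: List[List[str]] = field(default_factory=list)

    def fire(self, pane: List[str]) -> None:
        """Record one trigger firing with the given contents."""
        self.firings.append(list(pane))

    def is_ready(self, watermark: int) -> bool:
        # Ready after at least one firing, or once no output can be produced
        # (modeled here as the watermark being past the end of the window).
        return bool(self.firings) or watermark > self.window_end

    def read(self, watermark: int) -> Optional[List[str]]:
        """Return the visible contents, or None if a consumer would block."""
        if not self.is_ready(watermark):
            return None
        # Assumption: a reader sees the latest firing; empty if none occurred.
        return self.firings[-1] if self.firings else []


side_input = SideInputWindow(window_end=10)
blocked = side_input.read(watermark=5)       # no firing yet, window still open
side_input.fire(["X"])
first_view = side_input.read(watermark=5)    # after the first firing
side_input.fire(["X", "Y"])
second_view = side_input.read(watermark=5)   # after the second firing
```

With a single firing, first_view would be the only view anyone could ever
read. With two firings, first_view and second_view differ, which is the
situation the questions below are about.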


Let's say we have a pipeline containing:
ParDo(A) --> PCollectionView S
         \-> PCollectionView T

  ...
   |
ParDo(C) <-(side input)- PCollectionView S and PCollectionView T
   |
  ...

1) Let's say ParDo(A) outputs (during a single bundle) X and Y to
PCollectionView S. Should ParDo(C) be guaranteed to see X only if it
can also see Y (and vice versa)?

2) Let's say ParDo(A) outputs (during a single bundle) X to PCollectionView
S and Y to PCollectionView T. Should ParDo(C) be guaranteed to see X only
if it can also see Y?
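
One way to phrase both questions is as a choice between two visibility
models for the outputs of a single bundle. The sketch below is a toy model in
plain Python, not Beam code; observable_snapshots and can_see_without are
hypothetical helpers. It assumes that without any guarantee a reader may
observe any subset of the bundle's writes, and that with a guarantee a reader
observes either none of them or all of them.

```python
from itertools import combinations
from typing import FrozenSet, List, Set, Tuple

Write = Tuple[str, str]      # (view name, element), e.g. ("S", "X")
Snapshot = FrozenSet[Write]  # what a reader sees across all of its side inputs


def observable_snapshots(bundle: List[Write], atomic: bool) -> Set[Snapshot]:
    """Snapshots a reader could observe for the writes of one bundle."""
    if atomic:
        # All-or-nothing: the bundle's writes become visible together.
        return {frozenset(), frozenset(bundle)}
    # Assumption: with no guarantee, any subset of the writes may be visible.
    return {
        frozenset(subset)
        for size in range(len(bundle) + 1)
        for subset in combinations(bundle, size)
    }


def can_see_without(snapshots: Set[Snapshot], seen: Write, missing: Write) -> bool:
    """True if some snapshot contains `seen` but not `missing`."""
    return any(seen in s and missing not in s for s in snapshots)


# Question 1: X and Y both go to view S.
same_view = [("S", "X"), ("S", "Y")]
# Question 2: X goes to view S, Y goes to view T.
cross_view = [("S", "X"), ("T", "Y")]

q1_without_guarantee = can_see_without(
    observable_snapshots(same_view, atomic=False), ("S", "X"), ("S", "Y"))
q1_with_guarantee = can_see_without(
    observable_snapshots(same_view, atomic=True), ("S", "X"), ("S", "Y"))
q2_without_guarantee = can_see_without(
    observable_snapshots(cross_view, atomic=False), ("S", "X"), ("T", "Y"))
q2_with_guarantee = can_see_without(
    observable_snapshots(cross_view, atomic=True), ("S", "X"), ("T", "Y"))
```

In this model, answering "yes" to 1) means requiring the atomic behavior
within a single view, and answering "yes" to 2) means requiring it across
views S and T as well. Whether a runner can provide either cheaply is
outside what this sketch covers.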
