On Thu, Feb 22, 2024 at 10:16 AM Robert Bradshaw <rober...@google.com> wrote:
>
> On Thu, Feb 22, 2024 at 9:37 AM Reuven Lax via dev <dev@beam.apache.org> wrote:
> >
> > On Thu, Feb 22, 2024 at 9:26 AM Kenneth Knowles <k...@apache.org> wrote:
> >>
> >> Wow I love your input Reuven. Of course "the source" that you are applying 
> >> backpressure to is often a runner's shuffle so it may be state anyhow, but 
> >> it is good to give the runner the choice of how to figure that out and 
> >> maybe chain backpressure further.
> >
> >
> > Sort of - however most (streaming) runners apply backpressure through 
> > shuffle as well. This means that while some amount of data will accumulate 
> > in shuffle, eventually the backpressure will push back to the source. 
> > Caveat of course is that this is mostly true for streaming runners, not 
> > batch runners.
>
> For batch it's still preferable to keep the data upstream in shuffle
> (which has fewer size limitations) than in state (which must reside in
> worker memory, though only one key at a time).

And for drain (or even cancel), it's preferable to have as much as
possible upstream in the source rather than sitting in state.
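
To make that concrete, here is a toy Python sketch. It is not Beam code, and the
function `run` and all of its parameters are made up for illustration. It models
a fast source feeding a slow consumer through a buffer that stands in for
downstream state, and reports where the elements sit at the moment a drain or
cancel would arrive.

```python
from collections import deque


def run(total, capacity, ticks, produce_per_tick=3):
    """Simulate a fast source feeding a slow consumer through a buffer.

    capacity=None models an unbounded buffer (no backpressure).
    Returns (left_in_source, buffered, processed) after `ticks` steps.
    """
    remaining = total      # elements still unread in the source
    buffer = deque()       # hypothetical stand-in for downstream state
    processed = 0
    for _ in range(ticks):
        # The source emits only while the buffer has room (backpressure).
        for _ in range(produce_per_tick):
            if remaining == 0:
                break
            if capacity is not None and len(buffer) >= capacity:
                break
            buffer.append(total - remaining)
            remaining -= 1
        # The consumer handles one element per tick.
        if buffer:
            buffer.popleft()
            processed += 1
    return remaining, len(buffer), processed


if __name__ == "__main__":
    print("bounded buffer:  ", run(100, 5, 10))
    print("unbounded buffer:", run(100, None, 10))
```

With the bounded buffer only a handful of elements are in flight when the run
stops, and the rest are still in the source. With the unbounded buffer the
consumer has made the same progress, but far more elements are sitting
downstream and would have to be handled on drain or discarded on cancel.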
