friendly bump 🙃 Anyone have thoughts or answers?  Thanks!

On Thu, Nov 3, 2022 at 3:07 PM Evan Galpin <egal...@apache.org> wrote:

> Hi folks,
>
> Hoping to get some definitive answers with respect to streaming pipeline
> bundle retry semantics on Dataflow.  I understand that a bundle containing
> a "poison pill" (bad data that, say, causes a null pointer exception
> when processed in a DoFn) will be retried indefinitely.  What I'm not clear
> on are the implications of those retries.
>
>
>    1. Is it the case that a worker will continuously retry the same
>    "poison pill" bundle, and not be able to work on any other/new bundles
>    indefinitely after receiving the first poison pill? I've noticed that a
>    small number of poison pills can cause all processing to stall, even if
>    the bad data represents only a very small percentage of the overall data
>    being processed.
>    2. Is there any implication with windowing and this retry/stall
>    scenario?  I've noticed that the scenario where all processing stalls
>    entirely is more common for a pipeline where all data is globally
>    windowed.  I don't, however, have a solid understanding of how to explain
>    that observation; I'd really appreciate any insights that can aid in
>    understanding.
>
> Thanks,
> Evan
>
