friendly bump 🙃 Anyone have thoughts or answers? Thanks!
On Thu, Nov 3, 2022 at 3:07 PM Evan Galpin wrote:
> Hi folks,
>
> Hoping to get some definitive answers with respect to streaming pipeline
> bundle retry semantics on Dataflow. I understand that a bundle containing
> a "poison pill" (bad data that, say, causes a null pointer exception
> when processed in a DoFn) will be retried indefinitely. What I'm not clear
> on are the implications of those retries.
>
>
> 1. Is it the case that a worker will continuously retry the same
>    "poison pill" bundle and be unable to work on any other/new bundles
>    indefinitely after receiving the first poison pill? I've noticed that a
>    small number of poison pills can cause all processing to stall, even if
>    the bad data represents only a very small percentage of the overall
>    data being processed.
> 2. Does windowing have any bearing on this retry/stall scenario? I've
>    noticed that the scenario where all processing stalls entirely is more
>    common for a pipeline where all data is globally windowed. I don't,
>    however, have a solid understanding of how to explain that
>    observation; I'd really appreciate any insights that can aid in
>    understanding.
>
> Thanks,
> Evan
>