friendly bump 🙃

Anyone have thoughts or answers? Thanks!

On Thu, Nov 3, 2022 at 3:07 PM Evan Galpin <egal...@apache.org> wrote:
> Hi folks,
>
> Hoping to get some definitive answers with respect to streaming pipeline
> bundle retry semantics on Dataflow. I understand that a bundle containing
> a "poison pill" (bad data, let's say it causes a null pointer exception
> when processing in a DoFn) will be retried indefinitely. What I'm not clear
> on are the implications of those retries.
>
>    1. Is it the case that a worker will continuously retry the same
>    "poison pill" bundle, and not be able to work on any other/new bundles
>    indefinitely after receiving the first poison pill? I've noticed that a
>    small number of poison pills can cause all processing to stall, even if
>    the bad data represents only a very small percentage of the overall data
>    being processed.
>    2. Is there any implication with windowing and this retry/stall
>    scenario? I've noticed that the scenario where all processing stalls
>    entirely is more common for a pipeline where all data is globally
>    windowed. I don't, however, have a solid understanding of how to explain
>    that observation; I'd really appreciate any insights that can aid in
>    understanding.
>
> Thanks,
> Evan
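
To make the scenario in the quoted message concrete, here is a toy sketch in plain Python. It is not Beam or Dataflow code and says nothing about how Dataflow schedules work across workers. It only illustrates why retrying a bundle that contains a "poison pill" never succeeds: the input is the same on every attempt. The names `process_element`, `process_bundle`, `run_with_retries`, and the `max_attempts` cap are all hypothetical; the cap exists only so the demo terminates, whereas the retries described above are unbounded.

```python
from typing import List, Optional


def process_element(element: Optional[str]) -> str:
    """Stand-in for a DoFn's per-element logic; fails on bad input."""
    if element is None:
        # Python analogue of the null pointer exception in the question.
        raise ValueError("poison pill: element is None")
    return element.upper()


def process_bundle(bundle: List[Optional[str]]) -> List[str]:
    """Process every element; a single failure fails the whole bundle."""
    return [process_element(e) for e in bundle]


def run_with_retries(bundle: List[Optional[str]], max_attempts: int = 3) -> int:
    """Retry a bundle until it succeeds; return the number of failed attempts.

    max_attempts is a hypothetical cap so this demo stops.
    """
    failures = 0
    for _ in range(max_attempts):
        try:
            process_bundle(bundle)
            break  # Bundle succeeded; no further retries needed.
        except ValueError:
            failures += 1  # Same input, so the next attempt fails the same way.
    return failures


good_bundle = ["a", "b"]
bad_bundle = ["a", None, "b"]

print(run_with_retries(good_bundle))     # succeeds on the first attempt
print(run_with_retries(bad_bundle, 5))   # every attempt fails
```

In this sketch the good elements that share a bundle with the bad one never produce output either, since the whole bundle fails together on each attempt.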