One other nice enhancement around this would be if a transform could indicate that it was executing a "slow" operation.
A good example is writing in BigQueryIO: it's very reasonable/normal for a load job to run for more than 5 minutes, and the "stuck" message can be confusing to users. The rewording to "operation ongoing" in the PR seems like a good improvement here as well, though.

On Thu, Jan 9, 2020 at 8:26 PM Pablo Estrada <pabl...@google.com> wrote:

> Hello Beam users and community,
>
> The Beam Python SDK, and Java workers have a utility where they will print
> a log message whenever there's an execution thread where no state
> transitions happen for over five minutes.
>
> These messages are common in two scenarios:
> 1. A deadlock happening in the worker (very uncommon, but possible)
> 2. An operation simply takes over 5 minutes (e.g. a slow RPC, waiting for
> an external event, etc).
>
> The old phrasing of these logs has often been a bit confusing, and
> led users to think that there was actual stuckness in the pipeline, when
> reality was more harmless: an operation was just slow.
>
> I am introducing a change[1] for the Apache Beam SDK to rephrase these
> logs, and make them less confusing.
>
> If you ever used these logs for your debugging, know that the string will
> change, but the logs will remain : ).
> If you didn't know about these, now you do, and hopefully they will be
> useful to you! : )
>
> Thanks!
> -P.
>
> [1] https://github.com/apache/beam/pull/10446/files
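
To make the idea concrete, here is a rough sketch of how a "no state transition for five minutes" check might look, including a way for a transform to mark a step as expected to be slow. This is purely illustrative and not how the Beam SDK implements it: `StateTracker`, `transition`, `check_lull`, the `expected_slow` flag, and the message text are all hypothetical names made up for this example.

```python
import time

# The five-minute threshold mentioned in the thread.
LULL_THRESHOLD_SECS = 5 * 60


class StateTracker:
    """Hypothetical tracker for one execution thread; not part of the Beam SDK."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the check can be exercised without waiting.
        self._clock = clock
        self.state = "start"
        self.expected_slow = False
        self._last_transition = clock()

    def transition(self, state, expected_slow=False):
        # Record a state change and reset the lull timer.
        # expected_slow is the (assumed) hint a transform could pass for
        # steps such as a long-running load job.
        self.state = state
        self.expected_slow = expected_slow
        self._last_transition = self._clock()

    def check_lull(self):
        # Return a log message if no transition happened within the
        # threshold, or None if there is nothing to report.
        elapsed = self._clock() - self._last_transition
        if elapsed <= LULL_THRESHOLD_SECS:
            return None
        if self.expected_slow:
            # The transform declared this step as slow, so stay quiet.
            return None
        return "Operation ongoing in state %s for at least %.0f seconds" % (
            self.state, elapsed)
```

A background thread could call `check_lull()` periodically and log any message it returns. Suppressing the message entirely for `expected_slow` steps would also hide a real deadlock in such a step, so a longer threshold or a lower log level might be a safer choice there.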