Re: [FYI] Rephrasing the 'lull'/processing stuck logs

2020-01-09 Thread Steve Niemitz
One other nice enhancement around this would be if a transform could
indicate that it was executing a "slow" operation.

A good example is writing with BigQueryIO: it's perfectly normal for a
load job to run for more than 5 minutes, and the "stuck" message can be
confusing to users.  The rewording to "operation ongoing" in the PR seems
like a good improvement here as well, though.

On Thu, Jan 9, 2020 at 8:26 PM Pablo Estrada wrote:

> Hello Beam users and community,
>
> The Beam Python SDK and the Java workers have a utility that prints a
> log message whenever an execution thread goes more than five minutes
> without a state transition.
>
> These messages are common in two scenarios:
> 1. A deadlock happening in the worker (very uncommon, but possible)
> 2. An operation simply takes over 5 minutes (e.g. a slow RPC, waiting for
> an external event, etc).
>
> The old phrasing of these logs has often been a bit confusing, and has
> led users to think that the pipeline was actually stuck, when the
> reality was more harmless: an operation was just slow.
>
> I am introducing a change[1] for the Apache Beam SDK to rephrase these
> logs, and make them less confusing.
>
> If you ever used these logs for your debugging, know that the string will
> change, but the logs will remain : ).
> If you didn't know about these, now you do, and hopefully they will be
> useful to you! : )
>
> Thanks!
> -P.
>
> [1] https://github.com/apache/beam/pull/10446/files
>


[FYI] Rephrasing the 'lull'/processing stuck logs

2020-01-09 Thread Pablo Estrada
Hello Beam users and community,

The Beam Python SDK and the Java workers have a utility that prints a
log message whenever an execution thread goes more than five minutes
without a state transition.

These messages are common in two scenarios:
1. A deadlock happening in the worker (very uncommon, but possible)
2. An operation simply takes over 5 minutes (e.g. a slow RPC, waiting for
an external event, etc).
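
For anyone who hasn't seen these logs, the check can be pictured roughly
like the sketch below. This is not the code in the Beam SDK or in the PR:
the StateTracker class, its method names, and the message text are all
made up for illustration. The only details taken from this email are the
five-minute threshold and the idea of tracking state transitions.

```python
import time

# Five minutes, as described above. The constant name is hypothetical.
LULL_THRESHOLD_SECS = 5 * 60


class StateTracker:
    """Hypothetical tracker that remembers when the state last changed."""

    def __init__(self, clock=time.monotonic):
        # The clock is injectable so the check can be exercised without
        # actually waiting five minutes.
        self._clock = clock
        self.state = "start"
        self.last_transition = clock()

    def transition(self, new_state):
        """Record a state transition and reset the lull timer."""
        self.state = new_state
        self.last_transition = self._clock()

    def lull_message(self, threshold=LULL_THRESHOLD_SECS):
        """Return a log line if the lull exceeds the threshold, else None."""
        elapsed = self._clock() - self.last_transition
        if elapsed <= threshold:
            return None
        # Illustrative wording only; not the string the SDK emits.
        return (
            "Operation ongoing in state %r for at least %.0f seconds "
            "without a state transition." % (self.state, elapsed)
        )
```

A monitoring thread would call lull_message() periodically and log
whatever it returns. Note that the check cannot tell scenario 1 from
scenario 2: a deadlock and a slow RPC both look like "no transition".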

The old phrasing of these logs has often been a bit confusing, and has
led users to think that the pipeline was actually stuck, when the
reality was more harmless: an operation was just slow.

I am introducing a change[1] for the Apache Beam SDK to rephrase these
logs, and make them less confusing.

If you ever used these logs for your debugging, know that the string will
change, but the logs will remain : ).
If you didn't know about these, now you do, and hopefully they will be
useful to you! : )

Thanks!
-P.

[1] https://github.com/apache/beam/pull/10446/files