damccorm opened a new pull request, #38894:
URL: https://github.com/apache/beam/pull/38894
Previously, when the service asked the worker to abort, the worker threw
ReadLoopAbortedException, which extended InterruptedException.
MapTaskExecutor caught this and, in an attempt to preserve the
interrupted status, set the interrupted bit on the thread.
However, since this was a logical abort and not a real thread interrupt,
setting the interrupted bit caused subsequent operations on the thread
(like the backoff sleep in DataflowBatchWorkerHarness) to immediately
fail with InterruptedException, leading to all worker threads dying
and the harness hanging.
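
As a rough illustration of the failure mode (not the actual Beam code), the
sketch below uses hypothetical stand-ins: OldAbortException plays the role of
the old ReadLoopAbortedException, runAndRestoreFlag mimics the catch block in
MapTaskExecutor, and backoffSleepFails mimics the backoff sleep. It assumes
only standard JDK behavior: Thread.sleep throws immediately if the
interrupted bit is already set.

```java
// Hypothetical stand-ins for illustration only; these are not the Beam classes.
final class InterruptFlagDemo {

  /** Old shape: a logical abort modelled as an InterruptedException subclass. */
  static final class OldAbortException extends InterruptedException {
    OldAbortException(String message) {
      super(message);
    }
  }

  /** Mimics an executor that catches the abort and sets the interrupted bit. */
  static void runAndRestoreFlag() {
    try {
      throw new OldAbortException("service requested abort");
    } catch (InterruptedException e) {
      // Standard idiom for real interrupts, but wrong for a logical abort.
      Thread.currentThread().interrupt();
    }
  }

  /** Returns true if a later sleep on the same thread fails immediately. */
  static boolean backoffSleepFails() {
    try {
      Thread.sleep(10);
      return false;
    } catch (InterruptedException e) {
      // Thread.sleep clears the interrupted bit when it throws.
      return true;
    }
  }

  /** Runs the abort followed by the sleep and reports whether the sleep failed. */
  static boolean demo() {
    runAndRestoreFlag();
    return backoffSleepFails();
  }

  public static void main(String[] args) {
    System.out.println("backoff sleep failed: " + demo());
  }
}
```

No thread ever called interrupt() on the worker from outside, yet the sleep
still fails, which matches the behavior described above.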
This fix:
1. Changes ReadLoopAbortedException to extend InterruptedIOException
instead of InterruptedException, as it is a logical I/O abort and
should not trigger thread interrupt handling.
2. Hardens DataflowBatchWorkerHarness to throw a RuntimeException
if all worker threads die, ensuring the JVM exits with a non-zero
code so the runner can restart it.
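
A minimal sketch of the two changes, under stated assumptions:
NewAbortException, abortLeavesFlagClear, and superviseWorkers are
hypothetical names for illustration and do not reflect the real
ReadLoopAbortedException or DataflowBatchWorkerHarness code. The sketch
relies on InterruptedIOException being a subclass of IOException rather than
InterruptedException, so interrupt-handling catch blocks no longer match it.

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for illustration only; these are not the Beam classes.
final class AbortFixSketch {

  /** New shape: the logical abort is an I/O exception, not a thread interrupt. */
  static final class NewAbortException extends InterruptedIOException {
    NewAbortException(String message) {
      super(message);
    }
  }

  /** Returns true if handling the abort leaves the interrupted bit untouched. */
  static boolean abortLeavesFlagClear() {
    try {
      throw new NewAbortException("service requested abort");
    } catch (IOException e) {
      // Handled as an I/O failure; nothing calls Thread.currentThread().interrupt().
    }
    return !Thread.currentThread().isInterrupted();
  }

  /** Starts the workers, waits for all of them to end, then fails loudly. */
  static void superviseWorkers(List<Runnable> workers) throws InterruptedException {
    List<Thread> threads = new ArrayList<>();
    for (Runnable worker : workers) {
      Thread thread = new Thread(worker);
      thread.start();
      threads.add(thread);
    }
    for (Thread thread : threads) {
      thread.join();
    }
    // Reaching this point means no worker thread is left alive.
    throw new RuntimeException("All worker threads died");
  }

  /** Returns true if supervising workers that all exit ends in a RuntimeException. */
  static boolean allWorkersDeadIsFatal() {
    List<Runnable> workers = new ArrayList<>();
    workers.add(() -> {});
    workers.add(() -> {});
    try {
      superviseWorkers(workers);
      return false;
    } catch (RuntimeException e) {
      return true;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("flag clear after abort: " + abortLeavesFlagClear());
    System.out.println("all workers dead is fatal: " + allWorkersDeadIsFatal());
  }
}
```

If the RuntimeException propagates out of main uncaught and no other
non-daemon threads remain, the JVM exits with a non-zero status, which is the
signal the description above relies on for a restart.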
Ref: b/512366613