damccorm opened a new pull request, #38894:
URL: https://github.com/apache/beam/pull/38894

   Previously, when the service asked the worker to abort, the worker threw
   ReadLoopAbortedException, which extended InterruptedException.
   MapTaskExecutor caught this exception and, to preserve the
   interrupted status, set the interrupted bit on the thread.
   However, since this was a logical abort and not a real thread interrupt,
   setting the interrupted bit caused subsequent operations on the thread
   (like the backoff sleep in DataflowBatchWorkerHarness) to immediately
   fail with InterruptedException, leading to all worker threads dying
   and the harness hanging.
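
The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the Beam code: the class and method names are hypothetical, and it only shows the standard JDK behavior that `Thread.sleep` throws `InterruptedException` immediately when the interrupted bit is already set on entry.

```java
public class InterruptFlagDemo {
    /**
     * Sets the interrupted bit on the current thread (mimicking a handler that
     * "restores" interrupt status after a logical abort) and then tries to sleep.
     * Returns true if the sleep was rejected with InterruptedException.
     */
    static boolean sleepFailsWhenFlagSet() {
        Thread.currentThread().interrupt();
        try {
            // Stands in for a backoff sleep; no other thread interrupts us.
            Thread.sleep(10_000);
            return false;
        } catch (InterruptedException e) {
            // Thrown without waiting, because the flag was set before the call.
            // Throwing also clears the flag.
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("sleep failed: " + sleepFailsWhenFlagSet());
        System.out.println("flag still set: " + Thread.currentThread().isInterrupted());
    }
}
```

No second thread is involved here: the stale flag alone is enough to make the next blocking call fail.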
   
   This fix:
   1. Changes ReadLoopAbortedException to extend InterruptedIOException
      instead of InterruptedException, as it is a logical I/O abort and
      should not trigger thread interrupt handling.
   2. Hardens DataflowBatchWorkerHarness to throw a RuntimeException
      if all worker threads die, ensuring the JVM exits with a non-zero
      exit code so the runner can restart it.
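
The two changes listed above can be sketched as follows. This is an illustration under stated assumptions, not the code in this pull request: `LogicalAbortException` is a hypothetical stand-in for ReadLoopAbortedException, `flagSetAfterHandling` assumes the handler restores the interrupted bit only for `InterruptedException`, and `runWorkers` assumes a harness that simply joins its worker threads.

```java
import java.io.InterruptedIOException;
import java.util.ArrayList;
import java.util.List;

public class AbortHandlingSketch {
    /** Hypothetical stand-in for ReadLoopAbortedException after the change. */
    static class LogicalAbortException extends InterruptedIOException {
        LogicalAbortException(String message) {
            super(message);
        }
    }

    /**
     * Assumed handler shape: restore the interrupted bit only for real interrupts.
     * Returns whether the bit ended up set, and clears it for the caller.
     */
    static boolean flagSetAfterHandling(Exception e) {
        if (e instanceof InterruptedException) {
            Thread.currentThread().interrupt();
        }
        return Thread.interrupted(); // reads and clears the bit
    }

    /**
     * Assumed harness shape: workers are expected to loop forever, so reaching
     * the end of the joins means every worker has died and we fail loudly.
     */
    static void runWorkers(List<Runnable> tasks) throws InterruptedException {
        List<Thread> threads = new ArrayList<>();
        for (Runnable task : tasks) {
            Thread t = new Thread(task);
            t.start();
            threads.add(t);
        }
        for (Thread t : threads) {
            t.join();
        }
        throw new RuntimeException("All worker threads exited unexpectedly");
    }

    public static void main(String[] args) throws Exception {
        // InterruptedIOException is an IOException, not an InterruptedException,
        // so the logical abort leaves the interrupted bit untouched.
        System.out.println(flagSetAfterHandling(new LogicalAbortException("abort requested")));
        System.out.println(flagSetAfterHandling(new InterruptedException()));
        // The uncaught RuntimeException terminates main with a non-zero exit code.
        runWorkers(List.of(() -> {}));
    }
}
```

Because `InterruptedIOException` sits under `IOException`, an `instanceof InterruptedException` check no longer matches the logical abort, so later blocking calls on the same thread are unaffected.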
   
   Ref: b/512366613
   

