janniklinde opened a new pull request, #2346:
URL: https://github.com/apache/systemds/pull/2346

   This patch introduces a new failure-propagation mechanism for out-of-core 
(OOC) tasks via the `LocalTaskQueue`.
   
   Previously, an OOC task that threw an unexpected exception could fail 
silently, leaving upstream tasks waiting indefinitely because their output 
streams were never closed. To address this, exceptions are now propagated 
through the queue hierarchy, so that both upstream and downstream threads are 
interrupted instead of blocking.
   
   `LocalTaskQueue` maintains an exception state that allows both enqueue and 
dequeue operations to rethrow the stored exception, propagating errors across 
dependent queues. When a failure occurs, all related queues are notified, 
cascading the exception until it reaches the main thread and any other affected 
tasks.
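
   As a rough illustration of the idea (not the code in this patch), the 
sketch below shows a queue that stores a failure and rethrows it from both 
enqueue and dequeue. The class `FailableQueue` and its methods `link` and 
`propagateFailure` are hypothetical names chosen for this example; the actual 
`LocalTaskQueue` API may differ.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Hypothetical sketch only; NOT the actual SystemDS LocalTaskQueue. */
class FailableQueue<T> {
    private final Queue<T> items = new ArrayDeque<>();
    private final List<FailableQueue<?>> linked = new ArrayList<>();
    private RuntimeException failure; // null while the queue is healthy
    private boolean closed;

    /** Registers a dependent queue that must fail when this one fails. */
    public synchronized void link(FailableQueue<?> other) {
        linked.add(other);
    }

    /** Adds an item, or rethrows the stored failure if there is one. */
    public synchronized void enqueue(T item) {
        rethrowIfFailed();
        items.add(item);
        notifyAll();
    }

    /** Blocks until an item, end of stream (returns null), or a failure. */
    public synchronized T dequeue() throws InterruptedException {
        while (items.isEmpty() && !closed && failure == null)
            wait();
        rethrowIfFailed();
        return items.poll();
    }

    /** Marks end of stream and wakes up all waiting consumers. */
    public synchronized void close() {
        closed = true;
        notifyAll();
    }

    /** Stores the failure, wakes up waiters, and cascades to linked queues. */
    public void propagateFailure(RuntimeException ex) {
        List<FailableQueue<?>> targets;
        synchronized (this) {
            if (failure != null)
                return; // already failed; this also terminates cyclic links
            failure = ex;
            notifyAll();
            targets = new ArrayList<>(linked);
        }
        // Notify dependents outside the lock to avoid lock-ordering issues.
        for (FailableQueue<?> q : targets)
            q.propagateFailure(ex);
    }

    // Callers must hold the monitor of this queue.
    private void rethrowIfFailed() {
        if (failure != null)
            throw new RuntimeException("queue failed", failure);
    }
}
```

   In this sketch a consumer blocked in `dequeue()` on any linked queue wakes 
up and receives the original exception as the cause, rather than waiting for 
a `close()` that never arrives.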
   
   Additionally, a common OOC task submission method was added to 
`OOCInstruction` to replace manual submission via `CommonThreadPool`. This 
ensures consistent exception propagation and simplifies OOC task management.

