danielhumanmod commented on issue #1344:
URL:
https://github.com/apache/datafusion-ballista/issues/1344#issuecomment-3693741643
Thanks for the insight, @milenkovicm. Job-level dependency tracking is a really
good idea and aligns well with where the scheduler could evolve. It also
fits nicely on top of the current stage/shuffle-level dependency logic.
I took a quick look at the implementation, and the main changes I’m thinking
about are:
1. If needed, split jobs and build the dependency context in the
scheduler (execute_query()).
2. Add job-level dependency tracking in TaskManager for pending
jobs.
3. Trigger downstream tasks on JobFinished events.
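
To make steps 2 and 3 concrete, here is a rough sketch of what the tracking could look like. It is only an illustration, not existing Ballista code: `JobDependencyTracker`, `add_pending`, and `on_job_finished` are hypothetical names, job ids are assumed to be plain strings, and it assumes upstream jobs are still unfinished when a dependent job is registered.

```rust
use std::collections::{HashMap, HashSet};

/// Assumption: job ids are plain strings.
type JobId = String;

/// Hypothetical tracker for pending jobs (not an existing Ballista type).
#[derive(Default)]
pub struct JobDependencyTracker {
    /// Pending job -> upstream jobs it is still waiting on.
    waiting_on: HashMap<JobId, HashSet<JobId>>,
    /// Upstream job -> downstream jobs that are waiting on it.
    dependents: HashMap<JobId, Vec<JobId>>,
}

impl JobDependencyTracker {
    pub fn new() -> Self {
        Self::default()
    }

    /// Registers a job together with its upstream jobs.
    /// Returns true if the job has no upstream jobs and can run immediately.
    pub fn add_pending(&mut self, job: &str, upstream: &[&str]) -> bool {
        if upstream.is_empty() {
            return true;
        }
        // Deduplicate so each upstream job is recorded once.
        let deps: HashSet<JobId> = upstream.iter().map(|s| s.to_string()).collect();
        for dep in &deps {
            self.dependents
                .entry(dep.clone())
                .or_default()
                .push(job.to_string());
        }
        self.waiting_on.insert(job.to_string(), deps);
        false
    }

    /// Meant to be called when a job-finished event arrives.
    /// Returns the pending jobs whose upstream jobs have now all finished.
    pub fn on_job_finished(&mut self, finished: &str) -> Vec<JobId> {
        let mut ready = Vec::new();
        for job in self.dependents.remove(finished).unwrap_or_default() {
            if let Some(deps) = self.waiting_on.get_mut(&job) {
                deps.remove(finished);
                if deps.is_empty() {
                    self.waiting_on.remove(&job);
                    ready.push(job);
                }
            }
        }
        ready
    }
}
```

In this sketch the scheduler would call `add_pending` while building the dependency context and submit whatever `on_job_finished` returns. Failure and cancellation of upstream jobs are left out on purpose.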
I’ll start with a small PoC around this, but happy to adjust the direction
based on your feedback.