thinkharderdev commented on issue #650:
URL: https://github.com/apache/arrow-ballista/issues/650#issuecomment-1422388406

   > We can just leverage the current DataFusion Metrics system and TaskStatus 
update rpc and add necessary throttling/checking/aborting logic when we handle 
the Task finish event in the Ballista Scheduler.
   
   This would be a good first step, but I don't think it really solves the issue. 
We would have to wait for tasks to finish, so if many long-running tasks were 
executing concurrently they could easily overload the system, with no way to 
cancel them because none of them completes soon enough to trigger the check. 
   
   Agreed that Spark's accumulator causes issues, but I think if we design 
something more purpose-built (used only internally by the scheduler and not 
exposed through the public API), it could be relatively lightweight. 
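
To make the idea concrete, here is a rough sketch of what an internal-only progress handle could look like. Everything in it is hypothetical: `TaskProgress`, `enforce_limit`, and the byte-based budget are illustrative names and choices, not existing Ballista or DataFusion APIs. It also assumes the counters are visible in-process; in a real deployment the executor would have to ship them to the scheduler by some periodic report.

```rust
use std::sync::atomic::{AtomicBool, AtomicU64, Ordering};
use std::sync::Arc;

/// Hypothetical handle shared between a running task and whatever
/// component enforces limits. Not part of Ballista or DataFusion.
#[derive(Clone, Default)]
pub struct TaskProgress {
    inner: Arc<Inner>,
}

#[derive(Default)]
struct Inner {
    bytes_produced: AtomicU64,
    cancelled: AtomicBool,
}

impl TaskProgress {
    /// Called by the task while it is still running, e.g. once per batch.
    pub fn add_bytes(&self, n: u64) {
        self.inner.bytes_produced.fetch_add(n, Ordering::Relaxed);
    }

    /// Current running total for this task.
    pub fn bytes_produced(&self) -> u64 {
        self.inner.bytes_produced.load(Ordering::Relaxed)
    }

    /// Request cancellation; the task is expected to poll `is_cancelled`.
    pub fn cancel(&self) {
        self.inner.cancelled.store(true, Ordering::Relaxed);
    }

    pub fn is_cancelled(&self) -> bool {
        self.inner.cancelled.load(Ordering::Relaxed)
    }
}

/// Sums progress across in-flight tasks and cancels all of them if the
/// (assumed) budget is exceeded. Returns true if cancellation was requested.
pub fn enforce_limit(tasks: &[TaskProgress], max_total_bytes: u64) -> bool {
    let total: u64 = tasks.iter().map(|t| t.bytes_produced()).sum();
    if total > max_total_bytes {
        for t in tasks {
            t.cancel();
        }
        true
    } else {
        false
    }
}
```

The point of the sketch is only that the check runs against tasks that are still in flight, rather than waiting for a task-finish event, and that a single counter plus a cancel flag per task is much smaller than a general accumulator facility.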

