[GitHub] [arrow-ballista] mingmwang commented on issue #650: Add Accumulator API

via GitHub Tue, 07 Feb 2023 23:19:52 -0800


mingmwang commented on issue #650:
URL: https://github.com/apache/arrow-ballista/issues/650#issuecomment-1422131979


   If we just want to cancel tasks early and protect the system from heavy
   queries or heavy scans, I think we do not need to introduce the Accumulator.
   Spark's Accumulator and Metrics system is very heavy and causes a lot of
   memory issues and management burden for the Spark Driver: every Accumulator
   needs to be registered in the Spark Driver, and its life cycle is managed by
   the Spark Driver.

   We can just leverage the current DataFusion Metrics system and the
   TaskStatus update RPC, and add the necessary throttling/checking/aborting
   logic where the Ballista Scheduler handles the Task finish event.
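
   As a rough illustration of the checking/aborting step, here is a minimal
   sketch of the idea. It is not Ballista code: `TaskMetricsSummary`,
   `JobLimits`, `Decision` and `JobUsageTracker` are all hypothetical names,
   and the sketch assumes the scheduler can reduce the metrics reported with a
   finished task to a couple of counters (bytes scanned, output rows). The
   tracker sums those counters per job and tells the caller whether the job
   should keep running or be aborted.

```rust
use std::collections::HashMap;

/// Hypothetical summary of the metrics reported with one finished task.
/// Assumption: the scheduler can extract these counters from the task status update.
#[derive(Debug, Clone, Copy)]
pub struct TaskMetricsSummary {
    pub bytes_scanned: u64,
    pub output_rows: u64,
}

/// Hypothetical per-job limits used to protect the cluster from heavy queries.
#[derive(Debug, Clone, Copy)]
pub struct JobLimits {
    pub max_bytes_scanned: u64,
    pub max_output_rows: u64,
}

/// What the scheduler should do after handling a task finish event.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Continue,
    Abort { reason: String },
}

/// Running totals per job, kept on the scheduler side.
/// Nothing is registered up front, unlike a Spark-style accumulator.
#[derive(Default)]
pub struct JobUsageTracker {
    totals: HashMap<String, (u64, u64)>, // job_id -> (bytes_scanned, output_rows)
}

impl JobUsageTracker {
    /// Fold one finished task's metrics into the job totals and check the limits.
    pub fn on_task_finished(
        &mut self,
        job_id: &str,
        metrics: TaskMetricsSummary,
        limits: JobLimits,
    ) -> Decision {
        let entry = self.totals.entry(job_id.to_string()).or_insert((0, 0));
        entry.0 = entry.0.saturating_add(metrics.bytes_scanned);
        entry.1 = entry.1.saturating_add(metrics.output_rows);

        if entry.0 > limits.max_bytes_scanned {
            return Decision::Abort {
                reason: format!(
                    "job {} scanned {} bytes, limit is {}",
                    job_id, entry.0, limits.max_bytes_scanned
                ),
            };
        }
        if entry.1 > limits.max_output_rows {
            return Decision::Abort {
                reason: format!(
                    "job {} produced {} rows, limit is {}",
                    job_id, entry.1, limits.max_output_rows
                ),
            };
        }
        Decision::Continue
    }

    /// Drop the totals once a job completes or is aborted.
    pub fn remove_job(&mut self, job_id: &str) {
        self.totals.remove(job_id);
    }
}
```

   On `Decision::Abort`, the scheduler would cancel the job's remaining tasks
   through whatever cancellation path it already has. The totals live only in
   the scheduler and are dropped with `remove_job`, so there is no separate
   registration or life cycle to manage.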


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

