thinkharderdev commented on issue #655:
URL: https://github.com/apache/arrow-ballista/issues/655#issuecomment-1424191101

   @yahoNanJing @mingmwang I prototyped something on our fork here: 
https://github.com/coralogix/arrow-ballista/commit/9887f7757f33225769de52874ef10aa7fa6e4b57
   
   The basic gist is:
   
   1. Add a new executor status `Fenced` indicating that the executor is shutting down
   2. When an executor begins shutdown, it immediately sends a heartbeat with status 
`Fenced`
   3. Schedulers should only consider executors with `Active` status as alive
   4. The executor still sends the `executor_stopped` rpc immediately when it begins 
shutdown
   5. But when the scheduler receives that rpc, it waits a configurable amount 
of time (default 30s) before removing the executor
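
As a rough illustration of the steps above, here is a minimal scheduler-side sketch. It is not the code from the linked commit: `ExecutorRegistry`, its fields, and its method names are hypothetical, and only the `Active`/`Fenced` statuses, the `executor_stopped` rpc name, and the 30s default come from the description.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Executor status as seen by the scheduler. `Fenced` is the proposed new
/// status; the enum layout itself is an assumption made for this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ExecutorStatus {
    Active,
    Fenced,
}

/// Hypothetical scheduler-side bookkeeping (not Ballista's real types).
pub struct ExecutorRegistry {
    statuses: HashMap<String, ExecutorStatus>,
    /// Executors that sent `executor_stopped`, mapped to their removal deadline.
    pending_removal: HashMap<String, Instant>,
    /// Configurable grace period before removal (30s by default in the proposal).
    removal_delay: Duration,
}

impl ExecutorRegistry {
    pub fn new(removal_delay: Duration) -> Self {
        Self {
            statuses: HashMap::new(),
            pending_removal: HashMap::new(),
            removal_delay,
        }
    }

    /// Steps 1-2: record the status carried by the latest heartbeat.
    pub fn heartbeat(&mut self, id: &str, status: ExecutorStatus) {
        self.statuses.insert(id.to_string(), status);
    }

    /// Step 3: only `Active` executors count as alive.
    pub fn alive_executors(&self) -> Vec<String> {
        let mut ids: Vec<String> = self
            .statuses
            .iter()
            .filter(|(_, status)| **status == ExecutorStatus::Active)
            .map(|(id, _)| id.clone())
            .collect();
        ids.sort();
        ids
    }

    /// Steps 4-5: on `executor_stopped`, schedule removal instead of removing now.
    pub fn executor_stopped(&mut self, id: &str, now: Instant) {
        if self.statuses.contains_key(id) {
            self.pending_removal
                .insert(id.to_string(), now + self.removal_delay);
        }
    }

    /// Remove executors whose grace period has elapsed; returns the removed ids.
    pub fn remove_expired(&mut self, now: Instant) -> Vec<String> {
        let mut expired: Vec<String> = self
            .pending_removal
            .iter()
            .filter(|(_, deadline)| **deadline <= now)
            .map(|(id, _)| id.clone())
            .collect();
        expired.sort();
        for id in &expired {
            self.pending_removal.remove(id);
            self.statuses.remove(id);
        }
        expired
    }

    pub fn is_known(&self, id: &str) -> bool {
        self.statuses.contains_key(id)
    }
}
```

In this sketch a fenced executor drops out of `alive_executors` as soon as its `Fenced` heartbeat arrives, but the scheduler keeps its entry until `remove_expired` is called after the delay has passed.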
   
   If this seems sensible I can work on upstreaming it. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
