I think a PodDisruptionBudget might actually work here. It can select the
Spark driver pod by label. Combined with an appropriate minAvailable value,
that should keep a node drain from evicting the driver while the job is
still running.
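
To make that concrete, here is a minimal sketch of what such a budget could
look like. It builds the manifest in Python and prints it as JSON, which
kubectl accepts as well as YAML. The name "spark-driver-pdb", the "default"
namespace, and the helper driver_pdb are made up for illustration. The
spark-role=driver label is my assumption about how your driver pods are
labelled, so check the actual labels first. policy/v1 assumes a reasonably
recent Kubernetes version.

```python
import json


def driver_pdb(name, namespace, labels, min_available=1):
    """Build a PodDisruptionBudget manifest as a plain dict.

    Hypothetical helper for illustration only; not part of Spark or of
    any Kubernetes client library.
    """
    return {
        "apiVersion": "policy/v1",  # assumption: cluster serves policy/v1
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # With a single driver pod, minAvailable=1 means no voluntary
            # eviction of that pod is allowed while it is running.
            "minAvailable": min_available,
            "selector": {"matchLabels": dict(labels)},
        },
    }


# Assumed label; verify with `kubectl get pods --show-labels`.
manifest = driver_pdb("spark-driver-pdb", "default", {"spark-role": "driver"})

if __name__ == "__main__":
    print(json.dumps(manifest, indent=2))
```

You could pipe the output into `kubectl apply -f -`. With this in place, a
drain should be refused permission to evict the driver until the pod has
finished, so the drain would wait (or time out) rather than kill the job.
Note that a budget only guards against voluntary evictions; it does not
help if the node itself fails.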
In a more general sense, we do plan on some future work to support driver
recovery which should help long running
Hi,
What would be the recommended approach to let the Spark driver pod
complete its currently running job before it gets evicted to a new node
while maintenance on the current node is going on (kernel upgrade, hardware
maintenance, etc.) using the drain command?
I don’t think I can use