Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/19145

    >But if we restart the RM, then, the lost containers in the NM will be reported to RM as lost again because of recovery

    Since you have already enabled RM and NM recovery, IIUC a failure of the RM/NM will not cause containers to exit. After the RM/NM restarts, it will recover the persisted container metadata, so I think no lost containers should be reported. Sorry, I'm not very familiar with this part of YARN.