GitHub user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/19145
  
    Did you enable RM or NM recovery? Could you please clarify?
    
    Normally, if we assume there are 2 containers running on this NM, then 
after 10 minutes the RM will detect the failure of the NM and relaunch the 2 
lost containers on other NMs, so the total number of executors should still be 
the same. But things will be different if NM recovery is enabled, because then 
the failure of the NM will not lead to lost containers.
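
    For reference, a minimal sketch of the yarn-site.xml settings this 
question is about. The property names are the stock Hadoop ones as I 
understand them, and the values are illustrative assumptions, not taken from 
the reporter's cluster. The 10 minutes above should correspond to the default 
NM liveness expiry interval.

```xml
<!-- Sketch only: values are assumptions, not the reporter's actual config. -->
<configuration>
  <!-- How long the RM waits without a heartbeat before declaring an NM lost
       (600000 ms = 10 minutes, believed to be the default). -->
  <property>
    <name>yarn.nm.liveness-monitor.expiry-interval-ms</name>
    <value>600000</value>
  </property>
  <!-- NM recovery: when enabled, an NM restart is not expected to cause
       its containers to be reported as lost. -->
  <property>
    <name>yarn.nodemanager.recovery.enabled</name>
    <value>true</value>
  </property>
  <!-- RM recovery: lets the RM restore application state after a restart. -->
  <property>
    <name>yarn.resourcemanager.recovery.enabled</name>
    <value>true</value>
  </property>
</configuration>
```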

