Hi all,

Here is my question: is there a mechanism by which, when a container exits abnormally, YARN prefers to schedule the replacement container on a different NM?
We have a cluster with 3 NMs (135 GB of memory each) and 1 RM, and we are running a job that starts 13 containers (1 AM + 12 executor containers). Each NM hosts 4 executor containers, and the memory configured for each executor container is 30 GB. We ran an interesting test: when we killed 4 containers on one NM (NM1), only 2 containers were restarted on NM1; the other 2 were reserved on NM2 and NM3. Any ideas?

Fei
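
P.S. To make the memory numbers above concrete, here is a rough sketch of the per-NM arithmetic. It only uses the figures from this post and makes some simplifying assumptions: the full 135 GB of each NM is available to YARN, the AM's memory is ignored (its size is not given here), and no scheduler rounding or overhead is applied. The function names are just for illustration and are not part of any YARN API.

```python
# Figures from the post; all memory values are in GB.
NM_MEMORY_GB = 135        # memory per NodeManager (assumed fully usable by YARN)
EXECUTOR_MEMORY_GB = 30   # memory configured per executor container
EXECUTORS_PER_NM = 4      # executor containers running on each NM


def executors_that_fit(free_gb, executor_gb=EXECUTOR_MEMORY_GB):
    """Number of whole executor containers that fit in the given free memory."""
    return free_gb // executor_gb


def free_after(executors, nm_gb=NM_MEMORY_GB, executor_gb=EXECUTOR_MEMORY_GB):
    """Memory left on one NM after placing the given number of executors."""
    return nm_gb - executors * executor_gb


# Headroom on an NM that already runs its 4 executors (AM memory not counted).
headroom_gb = free_after(EXECUTORS_PER_NM)

print(executors_that_fit(NM_MEMORY_GB))  # 4 executors fit on an empty NM
print(headroom_gb)                       # 15 GB left on a fully loaded NM
print(executors_that_fit(headroom_gb))   # 0 more 30 GB executors fit there
```

Under these assumptions, NM2 and NM3 each have only about 15 GB free while their 4 executors are running, which is less than one 30 GB executor container.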