Hi everyone,
I was trying to understand the process that makes the resources of a
container available again to the ResourceManager.
As far as I can tell from the logs, the AM:
- sends a stop request to the NodeManager for the specific container
- immediately tells the RM that the resources have been released, and they
become available (queues are re-sorted).
Actually, I was expecting the RM to wait for an acknowledgment from the NM
(through the NM->RM heartbeat) that the container has really ended, but it
looks to me as if the resources are made available as soon as the RM
receives this info from the AM (AM->RM heartbeat).
Maybe the container decommission time is so short as to be irrelevant?
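
To make the two alternatives concrete, here is a toy sketch of what I mean.
It is not YARN code: the class, its fields, and the container id are all
hypothetical, and it only models the accounting order (free on the AM->RM
heartbeat versus free on the NM->RM heartbeat), nothing else.

```python
from dataclasses import dataclass, field


@dataclass
class ToyResourceManager:
    """Toy model only; hypothetical names, not the real ResourceManager."""
    free_mb: int
    wait_for_nm_ack: bool
    pending_release: dict = field(default_factory=dict)  # container id -> MB

    def am_heartbeat(self, released: dict) -> None:
        # AM->RM heartbeat carrying the containers the AM has released.
        for cid, mb in released.items():
            if self.wait_for_nm_ack:
                self.pending_release[cid] = mb  # hold until the NM confirms
            else:
                self.free_mb += mb  # made available right away

    def nm_heartbeat(self, completed: list) -> None:
        # NM->RM heartbeat reporting containers that have really ended.
        for cid in completed:
            self.free_mb += self.pending_release.pop(cid, 0)


observed = ToyResourceManager(free_mb=0, wait_for_nm_ack=False)
expected = ToyResourceManager(free_mb=0, wait_for_nm_ack=True)

for rm in (observed, expected):
    rm.am_heartbeat({"container_01": 1024})
print(observed.free_mb, expected.free_mb)  # 1024 0

for rm in (observed, expected):
    rm.nm_heartbeat(["container_01"])
print(observed.free_mb, expected.free_mb)  # 1024 1024
```

The first variant is what the logs seem to show; the second is what I was
expecting. Both end up in the same state, and they differ only during the
interval between the two heartbeats.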

The logs are at INFO level, and I can't change them to DEBUG since I'm not
the only one using the cluster, so maybe I am missing something.

Thanks

Fabio
