Hello everyone,

We are working on adding support for agent lifecycle management [1] that will provide a feedback mechanism for frameworks in case of agent node failures. The existing agent lost [2] signal is not sufficient for frameworks to ascertain that a given agent node isn't coming back.

Here is a link to the design doc:
https://docs.google.com/document/d/1XvP0acT8xadSev8UG2BXtsPlEh0Rb7R3WV3s-TnTeqg

Please feel free to provide any feedback via comments on the doc.

[1] JIRA Epic: https://issues.apache.org/jira/browse/MESOS-7426
[2] https://github.com/apache/mesos/blob/master/include/mesos/v1/scheduler/scheduler.proto#L151

-anand