Hi,

While designing the correct behavior for one of our frameworks, we ran into some questions about the behavior of status updates:
The executor continuously polls the workload probe to get the current mode of the workload (a Cassandra server) and sends the corresponding status update (STARTING, RUNNING, FAILED, etc.). The executor polls every 30 seconds and sends a status update. We are seeing congestion on task update acknowledgements somewhere (the location is still unknown).

There are three scenarios that we want to understand:

1. The agent queue has task updates TS1, TS2 and TS3 (in this order) waiting on acknowledgement. If TS2 receives an acknowledgement, what happens to the TS1 update in the queue?
2. The agent queue has task updates TS1, TS2, TS3 and TASK_FAILED. Here, TS1, TS2 and TS3 are non-terminal updates. Once the agent has received a terminal status update, does it make sense to ignore the non-terminal updates in the queue?
3. As per the ExecutorDriver code comment <https://github.com/apache/mesos/blob/master/src/java/src/org/apache/mesos/ExecutorDriver.java#L86>, if the executor is terminated, does the agent send TASK_LOST? If so, does it send it once, or once for each unacknowledged status update?

I'll study the code in the status update manager and the agent separately, but an official answer would definitely help.

Many thanks!

--
Cheers,
Zhitao Li