Greg Mann created MESOS-10031:
---------------------------------

             Summary: Agent's 'executorTerminated()' can cause double task status update
                 Key: MESOS-10031
                 URL: https://issues.apache.org/jira/browse/MESOS-10031
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 1.9.0
            Reporter: Greg Mann
            Assignee: Greg Mann

When the agent first receives a task status update from an executor, it executes {{Slave::statusUpdate()}}, which adds the task ID to the {{Executor::pendingStatusUpdates}} map but leaves the ID in {{Executor::launchedTasks}}.

Meanwhile, the code in {{Slave::executorTerminated()}} does not handle the intermediate task state that exists between the execution of {{Slave::statusUpdate()}} and {{Slave::_statusUpdate()}}. If {{Slave::executorTerminated()}} executes in that window, the task can be transitioned to a terminal state twice. For example, it could be transitioned to TASK_FINISHED by the executor, then to TASK_FAILED by the agent if the executor suddenly terminates.

If the agent has already received a status update from an executor, that state transition should be honored even if the executor terminates immediately after sending it. We should ensure that {{Slave::executorTerminated()}} cannot cause a valid update received from an executor to be ignored.
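
The race described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: the {{State}} enum, the free functions, the {{Transitions}} log, and the {{guarded}} flag are hypothetical stand-ins and do not reproduce the real agent code or its signatures. The guard shown (skipping tasks that already have a pending terminal update) is one possible approach, not necessarily the fix adopted in Mesos.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified task states; not the Mesos TaskState enum.
enum class State { RUNNING, FINISHED, FAILED };

bool isTerminal(State s) {
  return s == State::FINISHED || s == State::FAILED;
}

// Toy stand-in for the agent's per-executor bookkeeping.
struct Executor {
  std::set<std::string> launchedTasks;
  std::map<std::string, State> pendingStatusUpdates;
};

// Log of every terminal transition recorded for a task.
using Transitions = std::vector<std::pair<std::string, State>>;

// Models the first half of update handling: the update becomes pending,
// but the task ID is still present in launchedTasks.
void statusUpdate(Executor& e, Transitions& out,
                  const std::string& id, State s) {
  assert(e.launchedTasks.count(id) == 1);
  e.pendingStatusUpdates[id] = s;
  if (isTerminal(s)) {
    out.emplace_back(id, s);
  }
}

// Models executor termination. Without the guard, every launched task is
// failed, including one whose terminal update is already pending.
void executorTerminated(Executor& e, Transitions& out, bool guarded) {
  for (const std::string& id : e.launchedTasks) {
    auto pending = e.pendingStatusUpdates.find(id);
    bool alreadyTerminal = pending != e.pendingStatusUpdates.end() &&
                           isTerminal(pending->second);
    if (guarded && alreadyTerminal) {
      continue;  // Honor the update the executor already sent.
    }
    out.emplace_back(id, State::FAILED);
  }
}

// Runs the interleaving from the report: a terminal update arrives, then
// the executor terminates before the second half of handling has run.
int countTerminalTransitions(bool guarded) {
  Executor e;
  e.launchedTasks.insert("task-1");
  Transitions out;
  statusUpdate(e, out, "task-1", State::FINISHED);
  executorTerminated(e, out, guarded);
  return static_cast<int>(out.size());
}
```

In this model, {{countTerminalTransitions(false)}} records two terminal transitions for the same task (FINISHED, then FAILED), while {{countTerminalTransitions(true)}} records only the executor's FINISHED update.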