Greg Mann created MESOS-10031:
---------------------------------
Summary: Agent's 'executorTerminated()' can cause double task status update
Key: MESOS-10031
URL: https://issues.apache.org/jira/browse/MESOS-10031
Project: Mesos
Issue Type: Bug
Affects Versions: 1.9.0
Reporter: Greg Mann
Assignee: Greg Mann
When the agent first receives a task status update from an executor, it
executes {{Slave::statusUpdate()}}, which adds the task ID to the
{{Executor::pendingStatusUpdates}} map, but leaves the ID in
{{Executor::launchedTasks}}.
Meanwhile, {{Slave::executorTerminated()}} does not handle the intermediate
task state that exists between the execution of {{Slave::statusUpdate()}} and
{{Slave::_statusUpdate()}}. If {{Slave::executorTerminated()}} runs in that
window, the task can be transitioned to a terminal state twice (for example, it
could be transitioned to TASK_FINISHED by the executor, then to TASK_FAILED by
the agent if the executor suddenly terminates).
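To make the window concrete, here is a minimal toy model of the sequence described above. It is a sketch only: {{ToyExecutor}}, {{ToyAgent}}, and {{simulateRace()}} are hypothetical names, the real {{Executor}} members are modeled as plain sets, and the order in which the two terminal states are recorded here is not claimed to match the real agent.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for the executor bookkeeping named in the report.
// NOT the real Mesos class; both members are simplified to sets.
struct ToyExecutor {
  std::set<std::string> launchedTasks;         // tasks believed to be running
  std::set<std::string> pendingStatusUpdates;  // updates received, not yet applied
};

struct ToyAgent {
  ToyExecutor executor;
  // Terminal states recorded per task ID (hypothetical bookkeeping).
  std::map<std::string, std::vector<std::string>> terminalStates;

  // First step: note the update as pending; the task stays in launchedTasks.
  void statusUpdate(const std::string& taskId) {
    executor.pendingStatusUpdates.insert(taskId);
  }

  // Second step, run later: apply the terminal update and drop the task.
  void _statusUpdate(const std::string& taskId, const std::string& state) {
    assert(executor.pendingStatusUpdates.count(taskId) > 0);
    executor.pendingStatusUpdates.erase(taskId);
    executor.launchedTasks.erase(taskId);
    terminalStates[taskId].push_back(state);
  }

  // Assumed buggy behavior: fail every launched task without looking at
  // pendingStatusUpdates.
  void executorTerminated() {
    for (const std::string& taskId : executor.launchedTasks) {
      terminalStates[taskId].push_back("TASK_FAILED");
    }
    executor.launchedTasks.clear();
  }
};

// Runs executorTerminated() between the two update steps.
std::map<std::string, std::vector<std::string>> simulateRace() {
  ToyAgent agent;
  agent.executor.launchedTasks.insert("task-1");
  agent.statusUpdate("task-1");                     // update received
  agent.executorTerminated();                       // executor dies in the window
  agent._statusUpdate("task-1", "TASK_FINISHED");   // update applied afterwards
  return agent.terminalStates;
}
```

In this model, {{simulateRace()}} records two terminal states for {{task-1}}, which is the double transition the report describes.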
If the agent has already received a status update from an executor, that state
transition should be honored even if the executor terminates immediately after
it's sent. We should ensure that {{Slave::executorTerminated()}} cannot cause a
valid update received from an executor to be ignored.
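One possible shape for such a guard, sketched on a toy model rather than the real agent code: when the executor terminates, skip any task that still has an entry in the pending-update set and let the second update step settle its state. All names here ({{SketchExecutor}}, {{GuardedAgent}}, {{simulateGuarded()}}) are hypothetical, the pending update is assumed to be terminal, and this is not presented as the fix actually adopted for this issue.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for the executor bookkeeping; NOT the real Mesos class.
struct SketchExecutor {
  std::set<std::string> launchedTasks;
  std::set<std::string> pendingStatusUpdates;
};

struct GuardedAgent {
  SketchExecutor executor;
  // Terminal states recorded per task ID (hypothetical bookkeeping).
  std::map<std::string, std::vector<std::string>> terminalStates;

  // First step: note the update as pending; the task stays in launchedTasks.
  void statusUpdate(const std::string& taskId) {
    executor.pendingStatusUpdates.insert(taskId);
  }

  // Second step: apply the (assumed terminal) update and drop the task.
  void _statusUpdate(const std::string& taskId, const std::string& state) {
    assert(executor.pendingStatusUpdates.count(taskId) > 0);
    executor.pendingStatusUpdates.erase(taskId);
    executor.launchedTasks.erase(taskId);
    terminalStates[taskId].push_back(state);
  }

  // Guarded variant: only fail tasks that have no update in flight.
  void executorTerminated() {
    auto it = executor.launchedTasks.begin();
    while (it != executor.launchedTasks.end()) {
      if (executor.pendingStatusUpdates.count(*it) > 0) {
        ++it;  // leave it for _statusUpdate() to honor the executor's update
        continue;
      }
      terminalStates[*it].push_back("TASK_FAILED");
      it = executor.launchedTasks.erase(it);
    }
  }
};

// task-1 has an update in flight when the executor dies; task-2 does not.
std::map<std::string, std::vector<std::string>> simulateGuarded() {
  GuardedAgent agent;
  agent.executor.launchedTasks.insert("task-1");
  agent.executor.launchedTasks.insert("task-2");
  agent.statusUpdate("task-1");
  agent.executorTerminated();
  agent._statusUpdate("task-1", "TASK_FINISHED");
  return agent.terminalStates;
}
```

With the guard, {{task-1}} ends up with only the executor's TASK_FINISHED, while {{task-2}} is still failed by the agent. A real fix would also need to cover a pending update that is non-terminal, which this sketch does not model.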