Greg Mann created MESOS-10031:
---------------------------------

             Summary: Agent's 'executorTerminated()' can cause double task status update
                 Key: MESOS-10031
                 URL: https://issues.apache.org/jira/browse/MESOS-10031
             Project: Mesos
          Issue Type: Bug
    Affects Versions: 1.9.0
            Reporter: Greg Mann
            Assignee: Greg Mann


When the agent first receives a task status update from an executor, it 
executes {{Slave::statusUpdate()}}, which adds the task ID to the 
{{Executor::pendingStatusUpdates}} map, but leaves the ID in 
{{Executor::launchedTasks}}.

Meanwhile, {{Slave::executorTerminated()}} does not handle the intermediate 
task state that exists between the execution of {{Slave::statusUpdate()}} and 
{{Slave::_statusUpdate()}}. If {{Slave::executorTerminated()}} executes in that 
window, the task may be transitioned to a terminal state twice (for example, 
to TASK_FINISHED by the executor, then to TASK_FAILED by the agent if the 
executor suddenly terminates).
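
The interleaving described above can be illustrated with a small standalone 
model. This is only a sketch under stated assumptions: {{AgentModel}}, 
{{ExecutorModel}}, {{TaskState}}, and the {{forwarded}} list are hypothetical 
stand-ins, not the real Mesos classes, and the asynchronous hop between 
{{statusUpdate()}} and {{_statusUpdate()}} is simulated by calling the two 
steps separately.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the agent's bookkeeping.
// These are NOT the real Mesos types.
enum class TaskState { RUNNING, FINISHED, FAILED };

struct ExecutorModel {
  std::set<std::string> launchedTasks;
  std::map<std::string, TaskState> pendingStatusUpdates;
};

struct AgentModel {
  ExecutorModel executor;

  // Every update the model would forward, in order.
  std::vector<std::pair<std::string, TaskState>> forwarded;

  // First step: record the update as pending, but leave the task ID in
  // launchedTasks.
  void statusUpdate(const std::string& taskId, TaskState state) {
    executor.pendingStatusUpdates[taskId] = state;
  }

  // Second step: runs later. Forwards the pending update and, for a
  // terminal state, drops the task from launchedTasks.
  void _statusUpdate(const std::string& taskId) {
    auto it = executor.pendingStatusUpdates.find(taskId);
    assert(it != executor.pendingStatusUpdates.end());
    forwarded.emplace_back(taskId, it->second);
    if (it->second != TaskState::RUNNING) {
      executor.launchedTasks.erase(taskId);
    }
    executor.pendingStatusUpdates.erase(it);
  }

  // Unguarded termination handling: every task still in launchedTasks is
  // transitioned to FAILED, regardless of any pending update.
  void executorTerminated() {
    for (const std::string& taskId : executor.launchedTasks) {
      forwarded.emplace_back(taskId, TaskState::FAILED);
    }
    executor.launchedTasks.clear();
  }
};
```

In this model, calling {{statusUpdate("t1", TaskState::FINISHED)}}, then 
{{executorTerminated()}}, then {{_statusUpdate("t1")}} leaves two terminal 
entries for the same task in {{forwarded}}.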

If the agent has already received a status update from an executor, that state 
transition should be honored even if the executor terminates immediately after 
sending it. We should ensure that {{Slave::executorTerminated()}} cannot cause a 
valid update received from an executor to be ignored.
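
One possible shape for such a guard is sketched below. It is an assumption for 
illustration only, not the fix actually adopted: the helper 
{{tasksNeedingAgentUpdate()}}, {{isTerminal()}}, and the simplified 
{{TaskState}} enum are hypothetical, and the pending-update map is reduced to 
one state per task ID. The idea is that the termination path skips any task 
whose executor-sent terminal update is still pending.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified task states; not the real Mesos enum.
enum class TaskState { RUNNING, FINISHED, FAILED };

bool isTerminal(TaskState state) {
  return state != TaskState::RUNNING;
}

// Returns the IDs of launched tasks for which the agent should generate its
// own terminal update when the executor terminates. Tasks that already have
// a pending terminal update from the executor are skipped, so that update
// is honored instead of being overridden.
std::vector<std::string> tasksNeedingAgentUpdate(
    const std::set<std::string>& launchedTasks,
    const std::map<std::string, TaskState>& pendingStatusUpdates) {
  std::vector<std::string> result;
  for (const std::string& taskId : launchedTasks) {
    auto pending = pendingStatusUpdates.find(taskId);
    if (pending != pendingStatusUpdates.end() &&
        isTerminal(pending->second)) {
      continue;  // The executor already reported a terminal state.
    }
    result.push_back(taskId);
  }
  return result;
}
```

A task with only a pending non-terminal update (for example, a running state) 
is still returned, since the executor never reported a terminal state for it.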



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
