[ https://issues.apache.org/jira/browse/MESOS-1799?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dominic Hamon updated MESOS-1799:
---------------------------------
    Sprint: Mesos Q3 Sprint 6, Mesos Q3 Sprint 7  (was: Mesos Q3 Sprint 6)

> Reconciliation can send out-of-order updates.
> ---------------------------------------------
>
>                 Key: MESOS-1799
>                 URL: https://issues.apache.org/jira/browse/MESOS-1799
>             Project: Mesos
>          Issue Type: Bug
>          Components: master, slave
>            Reporter: Benjamin Mahler
>            Assignee: Vinod Kone
>
> When a slave re-registers with the master, it currently sends the latest task
> state for all tasks that are not both terminal and acknowledged.
> However, reconciliation assumes that we always have the latest unacknowledged
> state of the task represented in the master.
> As a result, out-of-order updates are possible, e.g.
> (1) Slave has task T in TASK_FINISHED, with unacknowledged updates:
>     [TASK_RUNNING, TASK_FINISHED].
> (2) Master fails over.
> (3) New master re-registers the slave with T in TASK_FINISHED.
> (4) Reconciliation request arrives, master sends TASK_FINISHED.
> (5) Slave sends TASK_RUNNING to master, master sends TASK_RUNNING.
> I think the fix here is to preserve the task state invariants in the master,
> namely, that the master has the latest unacknowledged state of the task. This
> means when the slave re-registers, it should instead send the latest
> acknowledged state of each task.
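
The scenario in steps (1)-(5) can be replayed with a small toy model. This is
only a sketch, not Mesos code: simulate(), in_order(), the ORDER table and the
TASK_STAGING starting state are hypothetical names and assumptions made for
illustration, and the model assumes every update the slave sends is eventually
acknowledged so that the next pending update goes out.

```python
from collections import deque

# Toy ordering used only to detect regressions; not a Mesos API.
ORDER = {"TASK_STAGING": 0, "TASK_RUNNING": 1, "TASK_FINISHED": 2}


def simulate(report_latest_acknowledged):
    """Replay steps (1)-(5) and return the updates the framework sees."""
    # (1) Slave state for task T: one acknowledged state, two pending updates.
    acknowledged = "TASK_STAGING"  # assumed starting point, not from the issue
    pending = deque(["TASK_RUNNING", "TASK_FINISHED"])
    latest = pending[-1]

    # (2)-(3) Master fails over; the slave re-registers and reports one state.
    master_view = acknowledged if report_latest_acknowledged else latest

    seen = []
    # (4) Reconciliation: the master answers with whatever state it holds.
    seen.append(master_view)
    # (5) The slave sends its pending updates in order (assuming each one is
    #     acknowledged in turn); the master forwards each to the framework.
    while pending:
        master_view = pending.popleft()
        seen.append(master_view)
    return seen


def in_order(updates):
    """True if the sequence never moves back to an earlier state."""
    ranks = [ORDER[u] for u in updates]
    return ranks == sorted(ranks)


if __name__ == "__main__":
    for fixed in (False, True):
        updates = simulate(report_latest_acknowledged=fixed)
        print(fixed, updates, "in order" if in_order(updates) else "OUT OF ORDER")
```

With report_latest_acknowledged=False (the current behavior described above),
the framework in this model sees TASK_FINISHED before TASK_RUNNING. With
report_latest_acknowledged=True (the proposed fix), the reconciliation reply
carries the older acknowledged state and the pending updates follow in order.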