Benjamin Mahler created MESOS-1407:
--------------------------------------

             Summary: Provide state reconciliation for frameworks.
                 Key: MESOS-1407
                 URL: https://issues.apache.org/jira/browse/MESOS-1407
             Project: Mesos
          Issue Type: Epic
            Reporter: Benjamin Mahler


State inconsistencies can arise between the framework scheduler's view of tasks 
and the view of tasks within Mesos.

Frameworks such as Aurora have had to compensate for these inconsistencies by 
running a specialized executor on the slave that reconciles what happened on 
the slave against what the scheduler believes is the current state of tasks.

This ticket tracks ways to allow frameworks to detect state inconsistencies 
in both of the following cases:

(1) There are tasks known to the framework, but unknown to Mesos. This can 
arise when the framework's intent was not carried out, or when a terminal event 
is not delivered to the framework.

(2) There are tasks known to Mesos but unknown to the framework. This can arise 
when the framework suffered information loss, _assuming the framework always 
persists its intent prior to taking an action_.

We have recently added a reconciliation message that allows frameworks to deal 
with (1), but there is nothing for (2) yet. Handling (2) could be accomplished 
using an "implicit" form of the same reconciliation message, or we could 
consider providing a way for frameworks to receive a full list of the tasks, 
which would allow them to reconcile both (1) and (2).
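
To illustrate the "full list" option, here is a minimal sketch of how a 
framework might classify inconsistencies once it has both views in hand. It 
is not part of any Mesos API: the function name reconcile, the dict-based 
task views, and the choice to skip tasks the framework already considers 
terminal are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

# Task states the framework treats as terminal (illustrative subset).
TERMINAL_STATES = {"TASK_FINISHED", "TASK_FAILED", "TASK_KILLED", "TASK_LOST"}


def reconcile(
    framework_tasks: Dict[str, str],
    mesos_tasks: Dict[str, str],
) -> Tuple[Set[str], Set[str]]:
    """Compare two task views, each mapping task ID -> state.

    Hypothetical helper: returns (unknown_to_mesos, unknown_to_framework).
    """
    # Case (1): tasks the framework still considers active but that Mesos
    # does not report. Tasks the framework already knows to be terminal
    # are skipped, since Mesos is not expected to list them.
    active = {
        task_id
        for task_id, state in framework_tasks.items()
        if state not in TERMINAL_STATES
    }
    unknown_to_mesos = active - set(mesos_tasks)

    # Case (2): tasks Mesos reports that the framework has no record of.
    unknown_to_framework = set(mesos_tasks) - set(framework_tasks)

    return unknown_to_mesos, unknown_to_framework
```

A framework could then, for example, mark the tasks in the first set as lost 
and kill the tasks in the second set. Treating an unrecorded task as safe to 
kill is only sound under the assumption stated in (2), that the framework 
always persists its intent before taking an action.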



--
This message was sent by Atlassian JIRA
(v6.2#6252)
