+1 for the change.

On Wed, Aug 23, 2017 at 8:58 AM, Benno Evers <bev...@mesosphere.com> wrote:
> I think it's ultimately up to the executor to interpret what "running"
> means exactly. The closest thing to a general definition would probably be
> this from docs/high-availability-framework.md:
>
> > A task transitions to the `TASK_RUNNING` state after it has begun running
> > successfully (if the task fails to start, it transitions to one of the
> > terminal states listed below).
>
> For the current built-in executors, the CommandExecutor sends TASK_RUNNING
> when the process has forked successfully and health checks have been
> created, and the DockerExecutor sends it when it receives the output of the
> `docker inspect` command for the started container.
>
> On Wed, Aug 23, 2017 at 4:54 PM, James Peach <jor...@gmail.com> wrote:
>
> > > On Aug 23, 2017, at 2:38 AM, Benno Evers <bev...@mesosphere.com> wrote:
> > >
> > > Hi all,
> > >
> > > when starting a task, an executor can send out the following status
> > > updates:
> > >
> > > - [optional] TASK_STARTING: Sent by the executor when it received the
> > >   launch command
> > > - TASK_RUNNING: Sent by the executor when the task is running
> >
> > How is "running" defined?
> >
> > > The built-in executors currently don't send out TASK_STARTING updates. I
> > > think this discards potentially valuable information, because
> > > TASK_RUNNING informs us about the current status of the task, but not
> > > about the status change.
> > >
> > > For example, if the network connection between scheduler and master is
> > > interrupted during task start, the scheduler has no good way to estimate
> > > the task's start time, because the TASK_RUNNING update that it eventually
> > > gets might be a much later one. Also, for tasks with a long delay between
> > > STARTING and RUNNING, to an outside observer it will look the same as if
> > > the task was stuck in STAGING.
> > >
> > > There is a small risk that sending an additional update could break
> > > existing frameworks. We briefly looked through some of the most popular
> > > open-source frameworks and didn't find any major issues, but of course
> > > it's impossible to do an exhaustive check.
> > >
> > > In particular, a framework will break if
> > >
> > > 1. it runs tasks using one of the built-in Mesos executors, and
> > > 2. it doesn't handle the possibility of receiving a TASK_STARTING
> > >    update, and
> > > 3. it reports an error whenever it encounters an unexpected task state
> > >    in an update.
> > >
> > > If you are aware of any such framework, please speak up so we can
> > > consider it.
> > >
> > > Thanks,
> > > --
> > > Benno Evers
> > > Software Engineer, Mesosphere
>
> --
> Benno Evers
> Software Engineer, Mesosphere
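
For framework authors checking themselves against the three conditions in the quoted message, here is a rough sketch of what tolerant handling could look like on the scheduler side. It is not taken from any existing framework and does not use the Mesos API: `TaskTracker`, `on_status_update`, `EXPECTED_STATES` and the plain-string state names are all made up for illustration. The idea is only that an unfamiliar state gets recorded rather than treated as an error, and that the earliest TASK_STARTING or TASK_RUNNING timestamp is kept as the start-time estimate.

```python
from typing import Dict, List, Tuple

# Only the states named in this thread; a real scheduler knows many more.
EXPECTED_STATES = {"TASK_STAGING", "TASK_STARTING", "TASK_RUNNING"}


class TaskTracker:
    """Hypothetical scheduler-side bookkeeping; not part of any Mesos API."""

    def __init__(self) -> None:
        self.state: Dict[str, str] = {}
        self.start_estimate: Dict[str, float] = {}
        self.unexpected: List[Tuple[str, str]] = []

    def on_status_update(self, task_id: str, state: str, timestamp: float) -> None:
        if state not in EXPECTED_STATES:
            # Avoid condition 3: note the surprise instead of raising an error.
            self.unexpected.append((task_id, state))
        if state in ("TASK_STARTING", "TASK_RUNNING"):
            # Keep the earliest timestamp seen, so a late TASK_RUNNING update
            # does not overwrite a better estimate from TASK_STARTING.
            previous = self.start_estimate.get(task_id)
            if previous is None or timestamp < previous:
                self.start_estimate[task_id] = timestamp
        self.state[task_id] = state


tracker = TaskTracker()
tracker.on_status_update("task-1", "TASK_STARTING", 100.0)
tracker.on_status_update("task-1", "TASK_RUNNING", 160.0)
```

With only the TASK_RUNNING update, the estimate for `task-1` would be the later timestamp, which is the gap the proposal is meant to close.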