For JobServer integration with Mesos, I take a two-phase approach to
launching jobs on the slaves:

1) When my Framework needs to launch a job, it calls Scheduler.launchTask,
but at this point no job actually runs. If the launchTask call returns
successfully, my Framework assumes it now has "control" over the resources
and that the executor is ready to run jobs.

2) I then pass the Mesos slave/executor details to my internal job
framework, and once my internal scheduler has done its setup work I use
Scheduler.sendFrameworkMessage() to actually run the job.

3) If step #2 fails for any reason, my Framework assumes the slave/executor
is no longer available, submits another resource request, and waits for
another matching offer.

After the job completes, my Framework detects this and can potentially
recycle the executor for another job, provided the resources match the
next job's requirements.
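
A rough Java sketch of this flow is below (illustrative only, not the
actual JobServer code: the executor command, task names, and job-message
format are placeholders, and it assumes the Java binding's
SchedulerDriver.launchTasks() and sendFrameworkMessage() calls):

import java.nio.charset.StandardCharsets;
import java.util.Arrays;

import org.apache.mesos.Protos.*;
import org.apache.mesos.SchedulerDriver;

class TwoPhaseLaunch {

    // Phase 1: claim the offered resources by launching a long-lived
    // executor task. No job runs yet; success only means we "control"
    // the resources and the executor can come up.
    void claimOffer(SchedulerDriver driver, Offer offer) {
        ExecutorInfo executor = ExecutorInfo.newBuilder()
            .setExecutorId(ExecutorID.newBuilder()
                .setValue("jobserver-exec-" + offer.getSlaveId().getValue()))
            .setCommand(CommandInfo.newBuilder()
                .setValue("./run-jobserver-executor.sh"))  // placeholder command
            .build();

        TaskInfo placeholder = TaskInfo.newBuilder()
            .setName("jobserver-slot")
            .setTaskId(TaskID.newBuilder().setValue("slot-" + offer.getId().getValue()))
            .setSlaveId(offer.getSlaveId())
            .addAllResources(offer.getResourcesList())
            .setExecutor(executor)
            .build();

        driver.launchTasks(offer.getId(), Arrays.asList(placeholder),
            Filters.getDefaultInstance());
    }

    // Phase 2: after the internal scheduler has done its setup, tell the
    // executor which job to run via a framework message.
    // Phase 3 (failure handling): if this fails, give up on the executor
    // and request new resources.
    void runJob(SchedulerDriver driver, ExecutorID executorId, SlaveID slaveId,
                String jobSpec) {
        Status status = driver.sendFrameworkMessage(
            executorId, slaveId, jobSpec.getBytes(StandardCharsets.UTF_8));
        if (status != Status.DRIVER_RUNNING) {
            // Treat the slave/executor as gone and re-request resources.
        }
    }
}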

Thanks,
Sam Taha

http://www.grandlogic.com


On Wed, Sep 18, 2013 at 5:16 AM, Bernerd Schaefer <bern...@soundcloud.com> wrote:

> This is somewhat related to the existing thread about messaging
> reliability, though specific to launching tasks.
>
> I noticed that both Chronos[1] and Marathon[2] assume that after
> "launchTasks" is called, the task will be launched. However, as best I
> can tell, Mesos makes no such guarantee -- or, rather, libprocess makes no
> guarantee that a sent message will be received.
>
> Am I correct in thinking that to reliably launch tasks, one should do
> something like the scheduler's doReliableRegistration[3], i.e., ask the
> driver to launch a task until a task update message is received for it?
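>
> Concretely, something like the following Java sketch is what I have in
> mind (the retry timer and helper names are purely illustrative, not code
> from Chronos or Marathon):
>
> import java.util.Arrays;
> import java.util.concurrent.ConcurrentHashMap;
> import java.util.concurrent.ConcurrentMap;
> import java.util.concurrent.Executors;
> import java.util.concurrent.ScheduledExecutorService;
> import java.util.concurrent.TimeUnit;
>
> import org.apache.mesos.Protos.*;
> import org.apache.mesos.SchedulerDriver;
>
> class ReliableLauncher {
>     private final ScheduledExecutorService timer =
>         Executors.newSingleThreadScheduledExecutor();
>     // Task IDs we asked the master to launch but have seen no update for.
>     private final ConcurrentMap<String, Boolean> unacknowledged =
>         new ConcurrentHashMap<String, Boolean>();
>
>     void launchReliably(final SchedulerDriver driver, final OfferID offerId,
>                         final TaskInfo task) {
>         unacknowledged.put(task.getTaskId().getValue(), Boolean.TRUE);
>         driver.launchTasks(offerId, Arrays.asList(task),
>             Filters.getDefaultInstance());
>
>         // Resend the launch until some status update for the task arrives,
>         // analogous to the scheduler's doReliableRegistration loop.
>         timer.schedule(new Runnable() {
>             public void run() {
>                 if (unacknowledged.containsKey(task.getTaskId().getValue())) {
>                     launchReliably(driver, offerId, task);
>                 }
>             }
>         }, 10, TimeUnit.SECONDS);
>     }
>
>     // Called from Scheduler.statusUpdate(); any update (TASK_STARTING,
>     // TASK_RUNNING, TASK_LOST, ...) acknowledges the launch attempt.
>     void onStatusUpdate(TaskStatus status) {
>         unacknowledged.remove(status.getTaskId().getValue());
>     }
> }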
>
> One problem I see with this is that the master replies to repeated launch
> task messages[4] with "TASK_LOST". There are some cases where this would
> work effectively:
>
> - Framework sends task launch message. Master is down and never receives
> it. Master comes back up/fails over. Framework resends message, and
> receives TASK_LOST.
>
> At that point, the framework could re-queue the task to be launched
> against a new offer. There are, however, other cases where this creates a
> race condition:
>
> 1. Framework sends task launch message. Master begins to launch task.
> Framework resends message. The framework will receive both TASK_LOST and
> TASK_STARTING/RUNNING, in any order.
>
> 2. Framework sends task launch message. Master launches task, but dies
> before notifying framework. Master recovers/fails over. Framework resends
> message. It will receive both TASK_LOST and TASK_STARTING (or RUNNING), in
> any order.
>
> For case 1, I *think* this could be addressed if the master -- instead of
> only checking the validity of the offer -- also checked whether the task
> already exists in the framework. If so, it could either ignore the request
> or send a status update message with the current state.
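>
> Just to illustrate that idea (the real master is C++; the names below are
> stand-ins, not actual Mesos code):
>
> import java.util.HashMap;
> import java.util.Map;
>
> import org.apache.mesos.Protos.*;
>
> // Hypothetical illustration of the suggested duplicate-launch check.
> class LaunchDeduplication {
>     // Current state of tasks the master already knows for this framework.
>     private final Map<String, TaskState> knownTasks =
>         new HashMap<String, TaskState>();
>
>     void handleLaunch(TaskInfo task) {
>         String taskId = task.getTaskId().getValue();
>         TaskState current = knownTasks.get(taskId);
>         if (current != null) {
>             // Task already exists in the framework: instead of replying
>             // TASK_LOST, ignore the duplicate or answer with its state.
>             sendStatusUpdate(task.getTaskId(), current);
>             return;
>         }
>         knownTasks.put(taskId, TaskState.TASK_STAGING);
>         // ... otherwise validate the offer and launch as the master does today ...
>     }
>
>     // Stand-in for the master's status-update path.
>     void sendStatusUpdate(TaskID taskId, TaskState state) {}
> }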
>
> I'm unsure about addressing case 2, since, as far as I understand, the
> task list is lazily rebuilt from slave messages after a crash/fail-over.
>
> Perhaps someone here with more experience with Mesos and its frameworks
> can lend some insight here. Is this even a problem in practice?
>
> Thanks,
>
> Bernerd
> Engineer @ Soundcloud
>
> [1]:
> https://github.com/airbnb/chronos/blob/master/src/main/scala/com/airbnb/scheduler/mesos/MesosJobFramework.scala#L198-L203
> [2]:
> https://github.com/mesosphere/marathon/blob/master/src/main/scala/mesosphere/marathon/MarathonScheduler.scala#L57-L59
> [3]: https://github.com/apache/mesos/blob/master/src/sched/sched.cpp#L285
> [4]:
> https://github.com/apache/mesos/blob/master/src/master/master.cpp#L902
>
