[ https://issues.apache.org/jira/browse/MESOS-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140671#comment-14140671 ]
Tom Arnfeld edited comment on MESOS-1812 at 9/19/14 2:55 PM:
-------------------------------------------------------------

I think there are use cases for it. For example, the modifications I am making to the Hadoop framework. Ultimately I am trying to control how long an executor process lives, and to be able to trigger it to commit suicide from the framework. Framework/executor messages are currently not a reliable form of communication over Mesos (as far as I know), and after my tasks are done I need the executor to stay around for a specific amount of time.

Currently I am launching two kinds of tasks: one acts as a controller for the executor (issuing {{killTask}} on this task ID causes the executor to terminate), and another N tasks perform the actual work. I'd like to ensure the first task always launches first.

Perhaps what I really need here is some kind of {{shutdownExecutor}} driver call.

was (Author: tarnfeld):

I think there are use cases for it. For example, the modifications I am making to the Hadoop framework. Ultimately I am trying to control how long an executor process lives, and to be able to trigger it to commit suicide from the framework. Framework/executor messages are currently not a reliable form of communication over Mesos (as far as I know), and after my tasks are done I need the executor to stay around for a specific amount of time.

Perhaps what I really need here is some kind of {{shutdownExecutor}} driver call.
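The controller-task workaround above can be sketched as a toy simulation. To be clear, this is not real Mesos framework code: `build_launch_list`, the dict-based tasks, and `FakeDriver` are hypothetical stand-ins for TaskInfo protobufs and the scheduler driver's {{killTask}} call.

```python
# Hypothetical sketch of the controller-task workaround described above.
# In a real framework these would be TaskInfo protobufs and a SchedulerDriver;
# here plain dicts and a fake driver stand in for them.

def build_launch_list(executor_id, num_workers):
    """Build the task list for one offer. The controller task is placed
    first, since the intent is for it to launch before any worker task."""
    controller = {"task_id": "controller_%s" % executor_id, "role": "controller"}
    workers = [
        {"task_id": "slots_%s_%d" % (executor_id, i), "role": "worker"}
        for i in range(num_workers)
    ]
    return [controller] + workers


class FakeDriver:
    """Stand-in for the scheduler driver; records killTask calls."""
    def __init__(self):
        self.killed = []

    def kill_task(self, task_id):
        self.killed.append(task_id)


def shutdown_executor(driver, launch_list):
    """Terminate the executor by killing its controller (sentinel) task,
    i.e. the {{killTask}} trick from the comment above."""
    driver.kill_task(launch_list[0]["task_id"])
```

A first-class {{shutdownExecutor}} driver call would make the sentinel task unnecessary; until then, the executor exits when it sees the controller task killed.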
> Queued tasks are not actually launched in the order they were queued
> --------------------------------------------------------------------
>
>                 Key: MESOS-1812
>                 URL: https://issues.apache.org/jira/browse/MESOS-1812
>             Project: Mesos
>          Issue Type: Bug
>          Components: slave
>            Reporter: Tom Arnfeld
>
> Even though tasks are assigned and queued in the order in which they are
> launched (e.g. multiple tasks in reply to one offer), due to timing issues
> with the futures, this can sometimes break the causality and end up not being
> launched in order.
>
> Example trace from a slave... In this example the Task_Tracker_10 task should
> be launched before slots_Task_Tracker_10.
>
> {code}
> I0918 02:10:50.371445 17072 slave.cpp:933] Got assigned task Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.372110 17072 slave.cpp:933] Got assigned task slots_Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.372172 17073 gc.cpp:84] Unscheduling '/mnt/mesos-slave/slaves/20140915-112519-3171422218-5050-5016-6/frameworks/20140916-233111-3171422218-5050-14295-0015' from gc
> I0918 02:10:50.375018 17072 slave.cpp:1043] Launching task slots_Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.386282 17072 slave.cpp:1153] Queuing task 'slots_Task_Tracker_10' for executor executor_Task_Tracker_10 of framework '20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.386312 17070 mesos_containerizer.cpp:537] Starting container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' for executor 'executor_Task_Tracker_10' of framework '20140916-233111-3171422218-5050-14295-0015'
> I0918 02:10:50.388942 17072 slave.cpp:1043] Launching task Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.406277 17070 launcher.cpp:117] Forked child with pid '817' for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:10:50.406563 17072 slave.cpp:1153] Queuing task 'Task_Tracker_10' for executor executor_Task_Tracker_10 of framework '20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.408499 17069 mesos_containerizer.cpp:647] Fetching URIs for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' using command '/usr/local/libexec/mesos/mesos-fetcher'
> I0918 02:11:11.650687 17071 slave.cpp:2873] Current usage 17.34%. Max allowed age: 5.086371210668750days
> I0918 02:11:16.590270 17075 slave.cpp:2355] Monitoring executor 'executor_Task_Tracker_10' of framework '20140916-233111-3171422218-5050-14295-0015' in container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:11:17.701015 17070 slave.cpp:1664] Got registration for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:17.701897 17070 slave.cpp:1783] Flushing queued task slots_Task_Tracker_10 for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:17.702350 17070 slave.cpp:1783] Flushing queued task Task_Tracker_10 for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:18.588388 17070 mesos_containerizer.cpp:1112] Executor for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' has exited
> I0918 02:11:18.588665 17070 mesos_containerizer.cpp:996] Destroying container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:11:18.599234 17072 slave.cpp:2413] Executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015 has exited with status 1
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
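The trace above shows the two "Got assigned" callbacks arriving in order while the matching "Launching task" steps run reversed. A miniature, hypothetical model of the desired behaviour (not slave.cpp code): funnel every launch through a single serial queue, so launches complete in exactly the assignment order instead of racing on independent futures.

```python
from concurrent.futures import ThreadPoolExecutor

def launch_in_order(task_ids):
    """Dispatch each assigned task's launch asynchronously, but through a
    single-worker executor. One worker means FIFO execution, so the launch
    order matches the assignment order."""
    launched = []
    with ThreadPoolExecutor(max_workers=1) as serial_queue:
        for task_id in task_ids:
            serial_queue.submit(launched.append, task_id)
        # Leaving the with-block waits for all queued launches to finish.
    return launched
```

With `max_workers` greater than 1 the launches would race, which is the miniature analogue of the futures timing issue described in the report.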