[ https://issues.apache.org/jira/browse/MESOS-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14140671#comment-14140671 ]
Tom Arnfeld edited comment on MESOS-1812 at 9/19/14 2:55 PM:
-------------------------------------------------------------

I think there are use cases for it. For example, the modifications I am making to the Hadoop framework. Ultimately I am trying to control how long an executor process lives, and to be able to trigger it to commit suicide from the framework. Framework/executor messages are currently not a reliable form of communication over Mesos (as far as I know), and after my tasks are done I need the executor to stay around for a specific amount of time.

Currently I am launching two kinds of tasks: one acts as a controller for the executor (issuing {{killTask}} on this task ID causes the executor to terminate), and another N tasks perform the actual work. I'd like to ensure the first task always launches first.

Perhaps what I really need here is some kind of {{shutdownExecutor}} driver call.

was (Author: tarnfeld):

I think there are use cases for it. For example, the modifications I am making to the Hadoop framework. Ultimately I am trying to control how long an executor process lives, and to be able to trigger it to commit suicide from the framework. Framework/executor messages are currently not a reliable form of communication over Mesos (as far as I know), and after my tasks are done I need the executor to stay around for a specific amount of time.

Perhaps what I really need here is some kind of {{shutdownExecutor}} driver call.
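The controller-task workaround above can be sketched as a toy simulation. To be clear, this is not real Mesos framework code: `build_launch_list`, the dict-based tasks, and `FakeDriver` are hypothetical stand-ins for TaskInfo protobufs and the scheduler driver's {{killTask}} call.

```python
# Hypothetical sketch of the controller-task workaround described above.
# In a real framework these would be TaskInfo protobufs and a SchedulerDriver;
# here plain dicts and a fake driver stand in for them.

def build_launch_list(executor_id, num_workers):
    """Build the task list for one offer. The controller task is placed
    first, since the intent is for it to launch before any worker task."""
    controller = {"task_id": "controller_%s" % executor_id, "role": "controller"}
    workers = [
        {"task_id": "slots_%s_%d" % (executor_id, i), "role": "worker"}
        for i in range(num_workers)
    ]
    return [controller] + workers


class FakeDriver:
    """Stand-in for the scheduler driver; records killTask calls."""
    def __init__(self):
        self.killed = []

    def kill_task(self, task_id):
        self.killed.append(task_id)


def shutdown_executor(driver, launch_list):
    """Terminate the executor by killing its controller (sentinel) task,
    i.e. the {{killTask}} trick from the comment above."""
    driver.kill_task(launch_list[0]["task_id"])
```

A first-class {{shutdownExecutor}} driver call would make the sentinel task unnecessary; until then, the executor exits when it sees the controller task killed.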
> Queued tasks are not actually launched in the order they were queued
> --------------------------------------------------------------------
>
>                 Key: MESOS-1812
>                 URL: https://issues.apache.org/jira/browse/MESOS-1812
>             Project: Mesos
>          Issue Type: Bug
>          Components: slave
>            Reporter: Tom Arnfeld
>
> Even though tasks are assigned and queued in the order in which they are
> launched (e.g. multiple tasks in reply to one offer), due to timing issues
> with the futures, this can sometimes break the causality and end up not being
> launched in order.
>
> Example trace from a slave... In this example the Task_Tracker_10 task should
> be launched before slots_Task_Tracker_10.
>
> {code}
> I0918 02:10:50.371445 17072 slave.cpp:933] Got assigned task Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.372110 17072 slave.cpp:933] Got assigned task slots_Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.372172 17073 gc.cpp:84] Unscheduling '/mnt/mesos-slave/slaves/20140915-112519-3171422218-5050-5016-6/frameworks/20140916-233111-3171422218-5050-14295-0015' from gc
> I0918 02:10:50.375018 17072 slave.cpp:1043] Launching task slots_Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.386282 17072 slave.cpp:1153] Queuing task 'slots_Task_Tracker_10' for executor executor_Task_Tracker_10 of framework '20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.386312 17070 mesos_containerizer.cpp:537] Starting container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' for executor 'executor_Task_Tracker_10' of framework '20140916-233111-3171422218-5050-14295-0015'
> I0918 02:10:50.388942 17072 slave.cpp:1043] Launching task Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.406277 17070 launcher.cpp:117] Forked child with pid '817' for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:10:50.406563 17072 slave.cpp:1153] Queuing task 'Task_Tracker_10' for executor executor_Task_Tracker_10 of framework '20140916-233111-3171422218-5050-14295-0015
> I0918 02:10:50.408499 17069 mesos_containerizer.cpp:647] Fetching URIs for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' using command '/usr/local/libexec/mesos/mesos-fetcher'
> I0918 02:11:11.650687 17071 slave.cpp:2873] Current usage 17.34%. Max allowed age: 5.086371210668750days
> I0918 02:11:16.590270 17075 slave.cpp:2355] Monitoring executor 'executor_Task_Tracker_10' of framework '20140916-233111-3171422218-5050-14295-0015' in container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:11:17.701015 17070 slave.cpp:1664] Got registration for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:17.701897 17070 slave.cpp:1783] Flushing queued task slots_Task_Tracker_10 for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:17.702350 17070 slave.cpp:1783] Flushing queued task Task_Tracker_10 for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
> I0918 02:11:18.588388 17070 mesos_containerizer.cpp:1112] Executor for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' has exited
> I0918 02:11:18.588665 17070 mesos_containerizer.cpp:996] Destroying container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
> I0918 02:11:18.599234 17072 slave.cpp:2413] Executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015 has exited with status 1
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
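The trace above shows the two "Got assigned" callbacks arriving in order while the matching "Launching task" steps run reversed. A miniature, hypothetical model of the desired behaviour (not slave.cpp code): funnel every launch through a single serial queue, so launches complete in exactly the assignment order instead of racing on independent futures.

```python
from concurrent.futures import ThreadPoolExecutor

def launch_in_order(task_ids):
    """Dispatch each assigned task's launch asynchronously, but through a
    single-worker executor. One worker means FIFO execution, so the launch
    order matches the assignment order."""
    launched = []
    with ThreadPoolExecutor(max_workers=1) as serial_queue:
        for task_id in task_ids:
            serial_queue.submit(launched.append, task_id)
        # Leaving the with-block waits for all queued launches to finish.
    return launched
```

With `max_workers` greater than 1 the launches would race, which is the miniature analogue of the futures timing issue described in the report.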