Tom Arnfeld created MESOS-1812:
----------------------------------

Summary: Queued tasks are not actually launched in the order they were queued
Key: MESOS-1812
URL: https://issues.apache.org/jira/browse/MESOS-1812
Project: Mesos
Issue Type: Bug
Components: slave
Reporter: Tom Arnfeld

Even though tasks are assigned and queued in the order in which they are launched (e.g. multiple tasks in reply to one offer), timing issues with the futures can sometimes break causality, and the tasks end up not being launched in order.

Example trace from a slave follows. In this example, the Task_Tracker_10 task should be launched before slots_Task_Tracker_10.

{code}
I0918 02:10:50.371445 17072 slave.cpp:933] Got assigned task Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
I0918 02:10:50.372110 17072 slave.cpp:933] Got assigned task slots_Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
I0918 02:10:50.372172 17073 gc.cpp:84] Unscheduling '/mnt/mesos-slave/slaves/20140915-112519-3171422218-5050-5016-6/frameworks/20140916-233111-3171422218-5050-14295-0015' from gc
I0918 02:10:50.375018 17072 slave.cpp:1043] Launching task slots_Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
I0918 02:10:50.386282 17072 slave.cpp:1153] Queuing task 'slots_Task_Tracker_10' for executor executor_Task_Tracker_10 of framework '20140916-233111-3171422218-5050-14295-0015
I0918 02:10:50.386312 17070 mesos_containerizer.cpp:537] Starting container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' for executor 'executor_Task_Tracker_10' of framework '20140916-233111-3171422218-5050-14295-0015'
I0918 02:10:50.388942 17072 slave.cpp:1043] Launching task Task_Tracker_10 for framework 20140916-233111-3171422218-5050-14295-0015
I0918 02:10:50.406277 17070 launcher.cpp:117] Forked child with pid '817' for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
I0918 02:10:50.406563 17072 slave.cpp:1153] Queuing task 'Task_Tracker_10' for executor executor_Task_Tracker_10 of framework '20140916-233111-3171422218-5050-14295-0015
I0918 02:10:50.408499 17069 mesos_containerizer.cpp:647] Fetching URIs for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' using command '/usr/local/libexec/mesos/mesos-fetcher'
I0918 02:11:11.650687 17071 slave.cpp:2873] Current usage 17.34%. Max allowed age: 5.086371210668750days
I0918 02:11:16.590270 17075 slave.cpp:2355] Monitoring executor 'executor_Task_Tracker_10' of framework '20140916-233111-3171422218-5050-14295-0015' in container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
I0918 02:11:17.701015 17070 slave.cpp:1664] Got registration for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
I0918 02:11:17.701897 17070 slave.cpp:1783] Flushing queued task slots_Task_Tracker_10 for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
I0918 02:11:17.702350 17070 slave.cpp:1783] Flushing queued task Task_Tracker_10 for executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015
I0918 02:11:18.588388 17070 mesos_containerizer.cpp:1112] Executor for container '5f507f09-b48e-44ea-b74e-740b0e8bba4d' has exited
I0918 02:11:18.588665 17070 mesos_containerizer.cpp:996] Destroying container '5f507f09-b48e-44ea-b74e-740b0e8bba4d'
I0918 02:11:18.599234 17072 slave.cpp:2413] Executor 'executor_Task_Tracker_10' of framework 20140916-233111-3171422218-5050-14295-0015 has exited with status 1
{code}
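
The kind of reordering reported here can be illustrated with a small, deterministic simulation. This is only a sketch and not Mesos code: `launch_order`, `delays`, and `serialize` are hypothetical names, and the delay values are invented. It assumes each assigned task has some asynchronous pre-launch step whose future completes after an arbitrary delay, and that the launch continuation runs when that future completes. It also shows one possible mitigation, which is to hold each launch until all earlier-assigned tasks have launched.

```python
import heapq


def launch_order(tasks, delays, serialize=False):
    """Return the order in which tasks would reach the executor queue.

    tasks:     task names, in the order they were assigned.
    delays:    hypothetical duration of each task's async pre-launch step.
    serialize: if True, a task is only launched once every task assigned
               before it has been launched.
    """
    # Simulated event heap of (completion_time, assignment_index).
    # All pre-launch steps are assumed to start at time 0.
    events = [(delays[name], i) for i, name in enumerate(tasks)]
    heapq.heapify(events)

    ready = set()    # indices whose pre-launch step has completed
    next_index = 0   # next assignment index allowed to launch (serialize mode)
    queued = []

    while events:
        _, i = heapq.heappop(events)
        if not serialize:
            # Launch as soon as this task's own future completes.
            queued.append(tasks[i])
            continue
        # Otherwise launch strictly in assignment order.
        ready.add(i)
        while next_index in ready:
            queued.append(tasks[next_index])
            next_index += 1
    return queued


tasks = ["Task_Tracker_10", "slots_Task_Tracker_10"]
# Invented delays: the first task's step happens to take longer.
delays = {"Task_Tracker_10": 5, "slots_Task_Tracker_10": 1}

print(launch_order(tasks, delays))
print(launch_order(tasks, delays, serialize=True))
```

With these made-up delays, the first call queues slots_Task_Tracker_10 ahead of Task_Tracker_10, matching the inversion in the trace, while the second call preserves the assignment order. Whether serializing in this way is appropriate for the slave is an open question for the actual fix.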