Hello devs,

We recently ran into a situation where the agent began terminating a task's
executor because of a registration timeout, but neither the executor nor the
task was actually cleaned up, and the task has been stuck in queued_tasks
for days.

The relevant log:

I0305 08:43:59.069857  5215 slave.cpp:6803] Terminating executor
'<executor_id>' of framework <framework_id> because it did not
register within 15mins
I0305 09:16:28.266021  5200 slave.cpp:3644] Asked to kill task
<task_id> of framework <framework_id>
W0305 09:16:28.266063  5200 slave.cpp:3816] Ignoring kill task
<task_id> because the executor '<executor_id>' of framework
<framework_id> is terminating


After that, the kill request and the warning that it is being ignored just
keep repeating:

I0305 09:16:28.266021  5200 slave.cpp:3644] Asked to kill task
<task_id> of framework <framework_id>
W0305 09:16:28.266063  5200 slave.cpp:3816] Ignoring kill task
<task_id> because the executor '<executor_id>' of framework
<framework_id> is terminating


The agent state indicates that it has no active tasks but quite a few
queued tasks.
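
For anyone who wants to check for the same condition, here is a rough sketch
of how the queued tasks can be listed from a saved copy of the agent state
JSON. It assumes the layout frameworks -> executors -> tasks / queued_tasks
with an "id" field on each entry; the function name and the sample IDs are
made up for illustration, and it only walks an already-loaded dict rather
than talking to the agent.

```python
def summarize_queued_tasks(state):
    """List executors that still hold queued tasks.

    `state` is the agent state JSON already parsed into a dict. The key
    names used here ("frameworks", "executors", "tasks", "queued_tasks",
    "id") are an assumption about that layout.

    Returns a list of (framework_id, executor_id, active_count, queued_ids).
    """
    summary = []
    for framework in state.get("frameworks", []):
        for executor in framework.get("executors", []):
            queued_ids = [t.get("id") for t in executor.get("queued_tasks", [])]
            if queued_ids:
                active_count = len(executor.get("tasks", []))
                summary.append(
                    (framework.get("id"), executor.get("id"),
                     active_count, queued_ids)
                )
    return summary


# Hypothetical state resembling what we see: no active tasks, one queued.
sample_state = {
    "frameworks": [
        {
            "id": "fw-1",
            "executors": [
                {
                    "id": "exec-1",
                    "tasks": [],
                    "queued_tasks": [{"id": "task-1"}],
                }
            ],
        }
    ]
}

for fw_id, exec_id, active, queued in summarize_queued_tasks(sample_state):
    print(f"{fw_id} {exec_id}: {active} active, queued={queued}")
```

An executor that shows zero active tasks together with a non-empty queued
list is the pattern described above.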

Does anyone have any insight on why this might be happening?

Thanks,
Eric
