[ 
https://issues.apache.org/jira/browse/MESOS-6952?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sathish Kumar updated MESOS-6952:
---------------------------------
    Description: 
The task was stuck in the staging state for almost 6 hours even after its 
executor had terminated on the slave. Because the task was stuck in staging, 
the framework did not receive a status update from mesos-master.

The issue was resolved after a slave restart, and the task moved from staging 
to the task lost state.

I can see the following warning in the slave logs: "Asked to run task ... 
which is terminating/terminated"
{noformat}
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.089251 107759 status_update_manager.cpp:824] Checkpointing ACK for 
status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.097193 107774 slave.cpp:1361] Got assigned task 
ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.097453 107774 slave.cpp:1480] Launching task 
ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:W0119
 14:42:17.097527 107774 slave.cpp:1673] Asked to run task 
'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor 
'ct:1484816820000:0:foocare_zendesk_round_robin:' which is 
terminating/terminated
{noformat}
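
To check whether other tasks hit the same warning, the slave logs can be scanned with a small script. This is a minimal sketch, not part of Mesos: the function name `find_terminated_executor_warnings` and the regular expression are assumptions based only on the message text quoted above, and the pattern may need adjusting for other Mesos versions.

```python
import re

# Assumed pattern, derived from the warning text in the excerpt above.
WARNING_RE = re.compile(
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d+) \d+ slave\.cpp:\d+\] "
    r"Asked to run task '(?P<task>[^']+)' for framework (?P<framework>\S+) "
    r"with executor '(?P<executor>[^']+)' which is terminating/terminated"
)


def find_terminated_executor_warnings(log_text):
    """Return one dict per distinct warning found in log_text."""
    # Collapse newlines and repeated spaces so wrapped entries still match.
    flat = re.sub(r"\s+", " ", log_text)
    seen = set()
    hits = []
    for match in WARNING_RE.finditer(flat):
        key = (match["time"], match["task"], match["framework"])
        # The same entry is written to both the INFO and WARNING log files.
        if key not in seen:
            seen.add(key)
            hits.append(match.groupdict())
    return hits
```

Each returned dict carries the `time`, `task`, `framework`, and `executor` fields of one warning, which makes it easier to compare against the tasks the master still reports as staging.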

Full slave log:
{noformat}
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:15.066277 107763 slave.cpp:3012] Handling status update TASK_FAILED 
(UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 from executor(1)@10.14.38.239:43937
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:15.134692 107766 status_update_manager.cpp:320] Received status update 
TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:15.134753 107766 status_update_manager.cpp:824] Checkpointing UPDATE for 
status update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:15.142010 107767 slave.cpp:3410] Forwarding the update TASK_FAILED 
(UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 to master@10.14.23.181:5050
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:15.142119 107767 slave.cpp:3320] Sending acknowledgement for status 
update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 to executor(1)@10.14.38.239:43937
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:15.226682 107761 status_update_manager.cpp:392] Received status update 
acknowledgement (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:15.226759 107761 status_update_manager.cpp:824] Checkpointing ACK for 
status update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:15.858510 107759 slave.cpp:1361] Got assigned task 
ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:15.858762 107759 slave.cpp:1480] Launching task 
ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:15.859004 107759 slave.cpp:1711] Queuing task 
'ct:1484816820000:0:foocare_zendesk_round_robin:' for executor 
'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:15.939483 107759 slave.cpp:1863] Sending queued task 
'ct:1484816820000:0:foocare_zendesk_round_robin:' to executor 
'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:16.141394 107762 slave.cpp:3871] Executor 
'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 exited with status 0
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:16.141451 107762 slave.cpp:3012] Handling status update TASK_FAILED 
(UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 from @0.0.0.0:0
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:16.141849 107762 status_update_manager.cpp:320] Received status update 
TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:16.141989 107762 status_update_manager.cpp:824] Checkpointing UPDATE for 
status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:16.147343 107766 slave.cpp:3410] Forwarding the update TASK_FAILED 
(UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 to master@10.14.23.181:5050
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.089175 107759 status_update_manager.cpp:392] Received status update 
acknowledgement (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.089251 107759 status_update_manager.cpp:824] Checkpointing ACK for 
status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.097193 107774 slave.cpp:1361] Got assigned task 
ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.097453 107774 slave.cpp:1480] Launching task 
ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:W0119
 14:42:17.097527 107774 slave.cpp:1673] Asked to run task 
'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor 
'ct:1484816820000:0:foocare_zendesk_round_robin:' which is 
terminating/terminated
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.097568 107774 slave.cpp:3012] Handling status update TASK_LOST (UUID: 
b999fb64-34f0-496d-be19-f5a7f998230e) for task 
ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 from @0.0.0.0:0
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.097633 107774 slave.cpp:3975] Cleaning up executor 
'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.097790 107772 gc.cpp:55] Scheduling 
'/data/mesos/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:/runs/6b8922ff-3f57-42a0-97d1-d79c1de3d93b'
 for gc 6.99999886874074days in the future
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.097836 107772 gc.cpp:55] Scheduling 
'/data/mesos/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:'
 for gc 6.99999886832296days in the future
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.097869 107772 gc.cpp:55] Scheduling 
'/data/mesos/meta/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:/runs/6b8922ff-3f57-42a0-97d1-d79c1de3d93b'
 for gc 6.99999886819259days in the future
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
 14:42:17.097888 107772 gc.cpp:55] Scheduling 
'/data/mesos/meta/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:'
 for gc 6.99999886809185days in the future
mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.WARNING.20161004-154318.107733:W0119
 14:42:17.097527 107774 slave.cpp:1673] Asked to run task 
'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework 
19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor 
'ct:1484816820000:0:foocare_zendesk_round_robin:' which is 
terminating/terminated
{noformat}


Master logs
{noformat}






> Mesos task state was stuck in staging even after executor terminated
> --------------------------------------------------------------------
>
>                 Key: MESOS-6952
>                 URL: https://issues.apache.org/jira/browse/MESOS-6952
>             Project: Mesos
>          Issue Type: Bug
>          Components: executor
>    Affects Versions: 0.28.2
>         Environment: ubuntu 14.04
>            Reporter: Sathish Kumar
>
> The task was stuck in the staging state for almost 6 hours even after its 
> executor had terminated on the slave. Because the task was stuck in staging, 
> the framework did not receive a status update from mesos-master.
> The issue was resolved after a slave restart, and the task moved from 
> staging to the task lost state.
> I can see the following warning in the slave logs: "Asked to run task ... 
> which is terminating/terminated"
> {noformat}
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.089251 107759 status_update_manager.cpp:824] Checkpointing ACK for 
> status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for 
> task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.097193 107774 slave.cpp:1361] Got assigned task 
> ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.097453 107774 slave.cpp:1480] Launching task 
> ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:W0119
>  14:42:17.097527 107774 slave.cpp:1673] Asked to run task 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' which is 
> terminating/terminated
> {noformat}
> Full slave log:
> {noformat}
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:15.066277 107763 slave.cpp:3012] Handling status update TASK_FAILED 
> (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
> ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 from executor(1)@10.14.38.239:43937
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:15.134692 107766 status_update_manager.cpp:320] Received status update 
> TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
> ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:15.134753 107766 status_update_manager.cpp:824] Checkpointing UPDATE 
> for status update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) 
> for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:15.142010 107767 slave.cpp:3410] Forwarding the update TASK_FAILED 
> (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
> ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 to master@10.14.23.181:5050
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:15.142119 107767 slave.cpp:3320] Sending acknowledgement for status 
> update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
> ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 to executor(1)@10.14.38.239:43937
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:15.226682 107761 status_update_manager.cpp:392] Received status update 
> acknowledgement (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for task 
> ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:15.226759 107761 status_update_manager.cpp:824] Checkpointing ACK for 
> status update TASK_FAILED (UUID: 5e4147e8-f11c-4950-ba7b-c4e7f8bc5932) for 
> task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:15.858510 107759 slave.cpp:1361] Got assigned task 
> ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:15.858762 107759 slave.cpp:1480] Launching task 
> ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:15.859004 107759 slave.cpp:1711] Queuing task 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' for executor 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:15.939483 107759 slave.cpp:1863] Sending queued task 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' to executor 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:16.141394 107762 slave.cpp:3871] Executor 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 exited with status 0
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:16.141451 107762 slave.cpp:3012] Handling status update TASK_FAILED 
> (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task 
> ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 from @0.0.0.0:0
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:16.141849 107762 status_update_manager.cpp:320] Received status update 
> TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task 
> ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:16.141989 107762 status_update_manager.cpp:824] Checkpointing UPDATE 
> for status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) 
> for task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:16.147343 107766 slave.cpp:3410] Forwarding the update TASK_FAILED 
> (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task 
> ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 to master@10.14.23.181:5050
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.089175 107759 status_update_manager.cpp:392] Received status update 
> acknowledgement (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for task 
> ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.089251 107759 status_update_manager.cpp:824] Checkpointing ACK for 
> status update TASK_FAILED (UUID: 247bbeed-1d60-4d33-ac1e-9282266c54ee) for 
> task ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.097193 107774 slave.cpp:1361] Got assigned task 
> ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.097453 107774 slave.cpp:1480] Launching task 
> ct:1484816820000:0:foocare_zendesk_round_robin: for framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:W0119
>  14:42:17.097527 107774 slave.cpp:1673] Asked to run task 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' which is 
> terminating/terminated
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.097568 107774 slave.cpp:3012] Handling status update TASK_LOST 
> (UUID: b999fb64-34f0-496d-be19-f5a7f998230e) for task 
> ct:1484816820000:0:foocare_zendesk_round_robin: of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 from @0.0.0.0:0
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.097633 107774 slave.cpp:3975] Cleaning up executor 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' of framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 at executor(1)@10.14.38.239:43937
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.097790 107772 gc.cpp:55] Scheduling 
> '/data/mesos/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:/runs/6b8922ff-3f57-42a0-97d1-d79c1de3d93b'
>  for gc 6.99999886874074days in the future
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.097836 107772 gc.cpp:55] Scheduling 
> '/data/mesos/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:'
>  for gc 6.99999886832296days in the future
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.097869 107772 gc.cpp:55] Scheduling 
> '/data/mesos/meta/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:/runs/6b8922ff-3f57-42a0-97d1-d79c1de3d93b'
>  for gc 6.99999886819259days in the future
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.INFO.20161004-154315.107733:I0119
>  14:42:17.097888 107772 gc.cpp:55] Scheduling 
> '/data/mesos/meta/slaves/22c4f06b-d107-4cf4-86b1-81a6cce5441a-S56/frameworks/19393553-2061-4d2f-8c05-a0ba688334f4-0001/executors/ct:1484816820000:0:foocare_zendesk_round_robin:'
>  for gc 6.99999886809185days in the future
> mesos-slave.distancematrix8.prod-foo-dcos.foobar.net.invalid-user.log.WARNING.20161004-154318.107733:W0119
>  14:42:17.097527 107774 slave.cpp:1673] Asked to run task 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' for framework 
> 19393553-2061-4d2f-8c05-a0ba688334f4-0001 with executor 
> 'ct:1484816820000:0:foocare_zendesk_round_robin:' which is 
> terminating/terminated
> {noformat}
> Master logs
> {noformat}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
