Meng Zhu created MESOS-9673:
-------------------------------

             Summary: Add timeout mechanism to GC incomplete task.
                 Key: MESOS-9673
                 URL: https://issues.apache.org/jira/browse/MESOS-9673
             Project: Mesos
          Issue Type: Improvement
          Components: containerization
            Reporter: Meng Zhu


Currently, an executor's meta and sandbox directories are GCed only when a
task is completed, i.e. the task is terminal and all of its status updates
have been acked.

However, if a status update is never acked, the agent will keep resending it
and keep the directories forever.

One issue is that the agent will keep recovering this executor upon every
failover, and if a later executor happens to use the same pid (almost a
certainty, considering the old meta dir will never be GCed), it will send the
agent into a crash loop (MESOS-9672).

We should consider introducing a timeout mechanism to GC incomplete tasks.
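
As a rough illustration of what such a timeout could look like, here is a minimal Python sketch of the GC eligibility check. It is not Mesos code: `TaskRecord`, `should_gc`, and `unacked_timeout` are hypothetical names made up for this example, and the real agent tracks this state differently. The sketch keeps the existing rule (terminal and acked) and adds one more case: a terminal task whose updates are still unacked becomes collectable once it has been terminal for longer than the timeout.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class TaskRecord:
    """Hypothetical per-task bookkeeping; not an actual Mesos structure."""
    terminal: bool                     # task has reached a terminal state
    acked: bool                        # all status updates have been acked
    terminal_time: Optional[datetime]  # when it became terminal, else None


def should_gc(tasks: List[TaskRecord],
              now: datetime,
              unacked_timeout: timedelta) -> bool:
    """Return True if the executor's meta and sandbox dirs may be GCed."""
    for task in tasks:
        if not task.terminal:
            # A non-terminal task always blocks GC.
            return False
        if task.acked:
            # Completed task: this is the existing behaviour.
            continue
        # Terminal but unacked: the proposed rule. Block GC only until the
        # task has been terminal for at least `unacked_timeout`.
        if task.terminal_time is None:
            return False
        if now - task.terminal_time < unacked_timeout:
            return False
    return True
```

With an assumed timeout of seven days, an executor whose only task became terminal eight days ago without an ack would be collectable, while one that became terminal yesterday would not. What the timeout should default to, and whether the agent should also stop resending the update at that point, is left open here.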



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
