[jira] [Commented] (MESOS-4565) slave recovers and attempt to destroy executor's child containers, then begins rejecting task status updates

Chanh Le (JIRA) Mon, 23 May 2016 21:36:21 -0700

    [ https://issues.apache.org/jira/browse/MESOS-4565?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15297666#comment-15297666 ]


Chanh Le commented on MESOS-4565:
---------------------------------

Any update on this?
I am still seeing this issue.

> slave recovers and attempt to destroy executor's child containers, then 
> begins rejecting task status updates
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: MESOS-4565
>                 URL: https://issues.apache.org/jira/browse/MESOS-4565
>             Project: Mesos
>          Issue Type: Bug
>          Components: docker
>    Affects Versions: 0.26.0
>            Reporter: James DeFelice
>              Labels: mesosphere
>
> AFAICT the slave is doing this:
> 1) recovering from some kind of failure
> 2) checking the containers that it pulled from its state store
> 3) complaining about cgroup children hanging off of executor containers
> 4) rejecting task status updates related to the executor container, the first 
> of which in the logs is:
> {code}
> E0130 02:22:21.979852 12683 slave.cpp:2963] Failed to update resources for 
> container 1d965a20-849c-40d8-9446-27cb723220a9 of executor 
> 'd701ab48a0c0f13_k8sm-executor' running task 
> pod.f2dc2c43-c6f7-11e5-ad28-0ad18c5e6c7f on status update for terminal task, 
> destroying container: Container '1d965a20-849c-40d8-9446-27cb723220a9' not 
> found
> {code}
> To be fair, I don't believe that my custom executor is re-registering 
> properly with the slave prior to attempting to send these (failing) status 
> updates. But the slave doesn't complain about that; it complains that it 
> can't find the **container**.
> Slave log here:
> https://gist.github.com/jdef/265663461156b7a7ed4e
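
When comparing slave logs for this issue, it may help to pull the container,
executor, and task IDs out of the "Failed to update resources" line quoted
above. The following is only a minimal sketch: the helper name
parse_update_failure and both regular expressions are hypothetical, not part
of Mesos, and they assume the usual glog prefix (severity, MMDD, time, thread
ID, file:line) and the exact message wording shown in the quoted log. Lines
that were wrapped by the mail client are rejoined with single spaces first.

```python
import re

# Assumed glog prefix layout: <severity><MMDD> <HH:MM:SS.uuuuuu> <thread> <file>:<line>] <message>
GLOG_RE = re.compile(
    r"^(?P<severity>[IWEF])(?P<month>\d{2})(?P<day>\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}\.\d{6})\s+(?P<thread>\d+) "
    r"(?P<source>[^:\s]+):(?P<line>\d+)\] (?P<message>.*)$"
)

# Assumed message wording, taken from the single log line quoted in this issue.
IDS_RE = re.compile(
    r"container (?P<container>\S+) of executor '(?P<executor>[^']+)' "
    r"running task (?P<task>\S+)"
)

# The log line as it appears in the issue, wrapped across several lines.
SAMPLE = """
E0130 02:22:21.979852 12683 slave.cpp:2963] Failed to update resources for
container 1d965a20-849c-40d8-9446-27cb723220a9 of executor
'd701ab48a0c0f13_k8sm-executor' running task
pod.f2dc2c43-c6f7-11e5-ad28-0ad18c5e6c7f on status update for terminal task,
destroying container: Container '1d965a20-849c-40d8-9446-27cb723220a9' not
found
"""


def parse_update_failure(raw):
    """Return the IDs from one (possibly wrapped) log line, or None if it does not match."""
    # Rejoin wrapped fragments with single spaces; assumes wraps fell on spaces.
    text = " ".join(part.strip() for part in raw.splitlines() if part.strip())
    header = GLOG_RE.match(text)
    if header is None:
        return None
    ids = IDS_RE.search(header.group("message"))
    if ids is None:
        return None
    return {
        "severity": header.group("severity"),
        "source": header.group("source"),
        "line": int(header.group("line")),
        **ids.groupdict(),
    }


if __name__ == "__main__":
    print(parse_update_failure(SAMPLE))
```

Grouping the parsed entries by container ID should make it easier to see
whether the rejected status updates all refer to containers that the slave
tried to destroy during recovery.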



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
