[ 
https://issues.apache.org/jira/browse/MESOS-9709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16813390#comment-16813390
 ] 

Andrei Budnik commented on MESOS-9709:
--------------------------------------

It's a Linux kernel bug: [https://github.com/lxc/lxc/issues/2141]

> Docker executor can become stuck terminating
> --------------------------------------------
>
>                 Key: MESOS-9709
>                 URL: https://issues.apache.org/jira/browse/MESOS-9709
>             Project: Mesos
>          Issue Type: Bug
>          Components: containerization
>    Affects Versions: 1.8.0
>            Reporter: Greg Mann
>            Priority: Major
>              Labels: containerization, mesosphere
>         Attachments: docker-executor-stuck.txt
>
>
> See attached agent log; the executor container ID is 
> {{d2bfec33-f6bd-44ee-9345-b5710780bb59}} and the executor ID contains the 
> string {{819f7ef7-4f42-11e9-a566-72ec67496045}}.
> After launching the executor, we see
> {code}
> Mar 29 18:23:36 int-mountvolumeagent9-soak113s.testing.mesosphe.re 
> mesos-agent[10238]: I0329 18:23:36.967316 10257 slave.cpp:3550] Launching 
> container d2bfec33-f6bd-44ee-9345-b5710780bb59 for executor 
> 'datastax-dse.instance-819f7ef7-4f42-11e9-a566-72ec67496045._app.339' of 
> framework a221eeb3-b9c0-4e92-ae20-1e1d4af25321-0000
> Mar 29 18:23:36 int-mountvolumeagent9-soak113s.testing.mesosphe.re 
> mesos-agent[10238]: I0329 18:23:36.968968 10253 docker.cpp:1161] No container 
> info found, skipping launch
> {code}
> I'm not sure why the container info was not set. Once the executor 
> reregistration timeout elapses, the agent attempts to terminate the executor, 
> but the termination does not appear to succeed. The scheduler keeps trying to 
> kill the task, but we repeatedly see
> {code}
> Mar 29 18:35:19 int-mountvolumeagent9-soak113s.testing.mesosphe.re 
> mesos-agent[10238]: W0329 18:35:19.855063 10253 slave.cpp:3823] Ignoring kill 
> task datastax-dse.instance-819f7ef7-4f42-11e9-a566-72ec67496045._app.339 
> because the executor 
> 'datastax-dse.instance-819f7ef7-4f42-11e9-a566-72ec67496045._app.339' of 
> framework a221eeb3-b9c0-4e92-ae20-1e1d4af25321-0000 is terminating
> {code}
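
To gauge how often the agent rejects kill requests for a stuck executor, the warning quoted above can be counted per executor ID. The following is only a rough sketch, not part of Mesos: the function name `count_ignored_kills` is hypothetical, and the pattern assumes the warning text matches the wording in the quoted log excerpt exactly (it may differ in other Mesos versions).

```python
import re
from collections import Counter

# Pattern derived from the warning text in the quoted agent log.
# Assumption: the message wording is stable; adjust if your version differs.
IGNORED_KILL = re.compile(
    r"Ignoring kill task (?P<task>\S+) because the executor "
    r"'(?P<executor>[^']+)' of framework (?P<framework>\S+) is terminating"
)


def count_ignored_kills(log_text):
    """Count 'Ignoring kill task' warnings per executor ID.

    Whitespace is collapsed first so that log lines wrapped by a mail
    client or pager still match the single-line pattern.
    """
    normalized = " ".join(log_text.split())
    counts = Counter()
    for match in IGNORED_KILL.finditer(normalized):
        counts[match.group("executor")] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python count_ignored_kills.py < agent.log
    for executor, n in count_ignored_kills(sys.stdin.read()).most_common():
        print(f"{n}\t{executor}")
```

A count that keeps growing for one executor long after the reregistration timeout would be consistent with the stuck termination described in this issue, though it does not by itself identify the cause.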



