Thomas Rampelberg created MESOS-1473:
----------------------------------------

             Summary: Mesos slave becomes unresponsive after launching 8 external containers
                 Key: MESOS-1473
                 URL: https://issues.apache.org/jira/browse/MESOS-1473
             Project: Mesos
          Issue Type: Bug
          Components: containerization
    Affects Versions: 0.19.0
            Reporter: Thomas Rampelberg


After Marathon/Mesos has launched 8 tasks that use the external 
containerizer, the Mesos slave becomes unresponsive and is eventually removed 
from the master (requiring a restart of the process).

Replication steps:
:; git clone g...@github.com:mesosphere/playa-mesos.git
:; cd playa-mesos
:; vagrant up
:; vagrant ssh
:; sudo mkdir -p /etc/mesos-slave
:; sudo mkdir -p /etc/mesos-master
:; echo /usr/bin/deimos       | sudo dd of=/etc/mesos-slave/containerizer_path
:; echo external              | sudo dd of=/etc/mesos-slave/isolation
:; curl -H "Content-Type: application/json" -X POST localhost:8080/v2/apps \
     -d '{"id": "sleep", "cmd": "while true; do sleep 10; done", "instances": 8, "cpus": 0.1, "mem": 16.0}'

Once the 8 instances have finished launching (you can verify with `docker ps`), 
the Mesos slave is completely unresponsive.
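
To see roughly when the slave stops answering, a small probe like the one below could be run in a second terminal while the instances launch. This is only a sketch and not part of the original reproduction: the URL assumes the slave's default HTTP port (5051) and its `state.json` endpoint, and the function names `probe_slave` and `watch_slave` are made up for illustration.

```shell
#!/bin/sh
# Assumed values, not taken from the report: the slave's default HTTP port
# (5051) and its state endpoint. Override SLAVE_URL if your setup differs.
SLAVE_URL="${SLAVE_URL:-http://localhost:5051/slave(1)/state.json}"

# Print "responsive" if the URL answers successfully within the timeout,
# otherwise print "unresponsive".
#   $1 = URL to probe, $2 = timeout in seconds
probe_slave() {
    # -s: silent, -f: treat HTTP errors as failure,
    # --max-time: bound the whole request so a hung slave cannot block us
    if curl -sf --max-time "$2" -o /dev/null "$1"; then
        echo responsive
    else
        echo unresponsive
    fi
}

# Probe SLAVE_URL a fixed number of times, one timestamped line per probe.
#   $1 = number of probes, $2 = seconds to sleep between probes
watch_slave() {
    i=0
    while [ "$i" -lt "$1" ]; do
        echo "$(date '+%H:%M:%S') $(probe_slave "$SLAVE_URL" 5)"
        i=$((i + 1))
        sleep "$2"
    done
}
```

For example, after sourcing the file, `watch_slave 60 2` probes every 2 seconds for about two minutes. The first "unresponsive" line can then be compared against the timestamps in the slave log.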

Here's a gist with the slave logs during the event:

https://gist.github.com/pyronicide/9dc68332a29faf38c890



--
This message was sent by Atlassian JIRA
(v6.2#6252)
