Thomas Rampelberg created MESOS-1473:
----------------------------------------

             Summary: Mesos slave becomes unresponsive after launching 8 external containers
                 Key: MESOS-1473
                 URL: https://issues.apache.org/jira/browse/MESOS-1473
             Project: Mesos
          Issue Type: Bug
          Components: containerization
    Affects Versions: 0.19.0
            Reporter: Thomas Rampelberg


After Marathon/Mesos has launched 8 tasks that use the external containerizer, the Mesos slave becomes unresponsive and is eventually removed from the master. Recovering requires restarting the slave process.

Replication steps:

:; git clone g...@github.com:mesosphere/playa-mesos.git
:; cd playa-mesos
:; vagrant up
:; vagrant ssh
:; sudo mkdir -p /etc/mesos-slave
:; sudo mkdir -p /etc/mesos-master
:; echo /usr/bin/deimos | sudo dd of=/etc/mesos-slave/containerizer_path
:; echo external | sudo dd of=/etc/mesos-slave/isolation
:; curl -H "Content-Type: application/json" -X POST localhost:8080/v2/apps -d '{"id": "sleep", "cmd": "while true; do sleep 10; done","instances":8,"cpus":0.1,"mem":16.0}'

Once the 8 instances have finished launching (you can verify this via `docker ps`), the Mesos slave will be completely unresponsive.

Here's a gist with the slave logs during the event:

https://gist.github.com/pyronicide/9dc68332a29faf38c890



--
This message was sent by Atlassian JIRA
(v6.2#6252)