[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.
[ https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16914034#comment-16914034 ] Qian Zhang commented on MESOS-9836:
---
Master:
commit 5db1835531edd16b8a7b550a944c27b0dacafbf7
Author: Qian Zhang
Date: Wed Aug 21 16:50:06 2019 +0800
Used cached cgroups for updating resources in Docker containerizer.
Review: https://reviews.apache.org/r/71335

1.8.x:
commit ee01c8d479b34ced35ee6bd172108a128086277e
Author: Qian Zhang
Date: Wed Aug 21 16:50:06 2019 +0800
Used cached cgroups for updating resources in Docker containerizer.
Review: https://reviews.apache.org/r/71335

1.7.x:
commit 7b01d0d35e8e17434d6f8e8840c8586565fe8d6c
Author: Qian Zhang
Date: Wed Aug 21 16:50:06 2019 +0800
Used cached cgroups for updating resources in Docker containerizer.
Review: https://reviews.apache.org/r/71335

1.6.x:
commit 1c70f29bdd270cdff8ce55bdfaab56581829017c (HEAD -> ci/qzhang/bp_9836_1.6.x)
Author: Qian Zhang
Date: Wed Aug 21 16:50:06 2019 +0800
Used cached cgroups for updating resources in Docker containerizer.
Review: https://reviews.apache.org/r/71335

> Docker containerizer overwrites `/mesos/slave` cgroups.
> ---
>
> Key: MESOS-9836
> URL: https://issues.apache.org/jira/browse/MESOS-9836
> Project: Mesos
> Issue Type: Bug
> Components: containerization
> Reporter: Chun-Hung Hsiao
> Assignee: Qian Zhang
> Priority: Critical
> Labels: docker, mesosphere
>
> The following bug was observed on our internal testing cluster.
> The docker containerizer launched a container on an agent:
> {noformat}
> I0523 06:00:53.888579 21815 docker.cpp:1195] Starting container 'f69c8a8c-eba4-4494-a305-0956a44a6ad2' for task 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1' (and executor 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1') of framework 415284b7-2967-407d-b66f-f445e93f064e-0011
> I0523 06:00:54.524171 21815 docker.cpp:783] Checkpointing pid 13716 to '/var/lib/mesos/slave/meta/slaves/60c42ab7-eb1a-4cec-b03d-ea06bff00c3f-S2/frameworks/415284b7-2967-407d-b66f-f445e93f064e-0011/executors/apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1/runs/f69c8a8c-eba4-4494-a305-0956a44a6ad2/pids/forked.pid'
> {noformat}
> After the container was launched, the docker containerizer did a {{docker inspect}} on the container and cached the pid:
> [https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1764]
> The pid should be slightly greater than 13716.
> The docker executor sent a {{TASK_FINISHED}} status update around 16 minutes later:
> {noformat}
> I0523 06:16:17.287595 21809 slave.cpp:5566] Handling status update TASK_FINISHED (Status UUID: 4e00b786-b773-46cd-8327-c7deb08f1de9) for task apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1 of framework 415284b7-2967-407d-b66f-f445e93f064e-0011 from executor(1)@172.31.1.7:36244
> {noformat}
> After receiving the terminal status update, the agent asked the docker containerizer to update {{cpu.cfs_period_us}}, {{cpu.cfs_quota_us}} and {{memory.soft_limit_in_bytes}} of the container through the cached pid:
> [https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1696]
> {noformat}
> I0523 06:16:17.290447 21815 docker.cpp:1868] Updated 'cpu.shares' to 102 at /sys/fs/cgroup/cpu,cpuacct/mesos/slave for container f69c8a8c-eba4-4494-a305-0956a44a6ad2
> I0523 06:16:17.290660 21815 docker.cpp:1895] Updated 'cpu.cfs_period_us' to 100ms and 'cpu.cfs_quota_us' to 10ms (cpus 0.1) for container f69c8a8c-eba4-4494-a305-0956a44a6ad2
> I0523 06:16:17.889816 21815 docker.cpp:1937] Updated 'memory.soft_limit_in_bytes' to 32MB for container f69c8a8c-eba4-4494-a305-0956a44a6ad2
> {noformat}
> Note that the cgroup of {{cpu.shares}} was {{/mesos/slave}}. This was possibly because the pid got reused over the 16 minutes:
> {noformat}
> # zgrep 'systemd.cpp:98\]' /var/log/mesos/archive/mesos-agent.log.12.gz
> ...
> I0523 06:00:54.525178 21815 systemd.cpp:98] Assigned child process '13716' to 'mesos_executors.slice'
> I0523 06:00:55.078546 21808 systemd.cpp:98] Assigned child process '13798' to 'mesos_executors.slice'
> I0523 06:00:55.134096 21808 systemd.cpp:98] Assigned child process '13799' to 'mesos_executors.slice'
> ...
> I0523 06:06:30.997439 21808 systemd.cpp:98] Assigned child process '32689' to 'mesos_executors.slice'
> I0523 06:06:31.050976 21808 systemd.cpp:98] Assigned child process '32690' to 'mesos_executors.slice'
> I0523 06:06:31.110514 21815 systemd.cpp:98] Assigned child process '32692' to 'mesos_executors.slice'
> I0523 06:06:33.143726 21818 systemd.cpp:98] Assigned child process '446' to 'mesos_executors.slice'
> I0523 06:06:33.196251 21818 systemd.cpp:98] Assigned child process '447' to 'mesos_executors.slice'
> I0523 06:06:33.266332 21816 systemd.cpp:98] Assigned child process '449' to 'mesos_executors.slice'
> ...
> I0523 06:09:34.870056 21808 systemd.cpp:98] Assigned child process '13717' to 'mesos_executors.slice'
> I0523 06:09:34.937762 21813 systemd.cpp:98] Assigned child process '13744' to 'mesos_executors.slice'
> I0523 06:09:35.073971 21817 systemd.cpp:98] Assigned child process '13754' to 'mesos_executors.slice'
> ...
> {noformat}
> It was highly likely that the container itself exited around 06:09:35, way before the docker executor detected and reported the terminal status update, and then its pid
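The destructive write happens because the containerizer resolves the container's cgroup from the cached pid by reading `/proc/<pid>/cgroup`; once the pid has been recycled by a child of the agent itself, that lookup resolves to the agent's own `/mesos/slave` cgroup. A minimal sketch of the resolution step (`cgroupFor` is an illustrative helper, not Mesos source):

```cpp
#include <sstream>
#include <string>

// Extract the cgroup path for a given subsystem from the contents of
// /proc/<pid>/cgroup. Each line has the form
//   "<hierarchy-id>:<comma-separated-subsystems>:<path>"
// e.g. "4:cpu,cpuacct:/mesos/slave". Returns "" when not found.
std::string cgroupFor(const std::string& procCgroupContents,
                      const std::string& subsystem)
{
  std::istringstream in(procCgroupContents);
  std::string line;
  while (std::getline(in, line)) {
    size_t first = line.find(':');
    if (first == std::string::npos) continue;
    size_t second = line.find(':', first + 1);
    if (second == std::string::npos) continue;

    // Match the requested subsystem against e.g. "cpu,cpuacct".
    std::istringstream subs(line.substr(first + 1, second - first - 1));
    std::string s;
    while (std::getline(subs, s, ',')) {
      if (s == subsystem) {
        return line.substr(second + 1);
      }
    }
  }
  return "";
}
```

A pid that was recycled by an agent child would yield exactly the `/mesos/slave` path seen in the log above, while a live Docker container's pid would yield a `/docker/<id>`-style path.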
[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.
[ https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16912257#comment-16912257 ] Qian Zhang commented on MESOS-9836:
---
On second thought, I would prefer Chun's solution: after obtaining the pid through `docker inspect`, we look into the proc filesystem, get the cgroup right away, and cache it. We do not need to checkpoint the cached cgroup, since after the agent restarts we can still recover it through `docker inspect` and the proc filesystem.
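The solution Qian prefers (cache the cgroup, not the pid, immediately after `docker inspect`) can be sketched as follows. The names `DockerContainer`, `resolveCpuCgroup`, and `onInspected` are illustrative assumptions, not the actual Mesos patch:

```cpp
#include <fstream>
#include <string>

struct DockerContainer
{
  int pid = -1;
  std::string cpuCgroup;  // e.g. "/docker/<id>"; cached at inspect time.
};

// Read /proc/<pid>/cgroup and return the path of the cpu,cpuacct hierarchy.
// Returns "" when the pid is gone or the hierarchy is absent.
std::string resolveCpuCgroup(int pid)
{
  std::ifstream file("/proc/" + std::to_string(pid) + "/cgroup");
  std::string line;
  while (std::getline(file, line)) {
    size_t first = line.find(':');
    if (first == std::string::npos) continue;
    size_t second = line.find(':', first + 1);
    if (second != std::string::npos &&
        line.substr(first + 1, second - first - 1).find("cpuacct") !=
          std::string::npos) {
      return line.substr(second + 1);
    }
  }
  return "";
}

// Called right after `docker inspect` returns: the pid is known to be valid
// at this instant, so resolving and caching the cgroup path now closes the
// window in which a recycled pid could redirect a later resource update.
void onInspected(DockerContainer& container, int inspectedPid)
{
  container.pid = inspectedPid;
  container.cpuCgroup = resolveCpuCgroup(inspectedPid);
}
```

A later `update()` would then write to `container.cpuCgroup` directly, without touching `/proc/<pid>` again.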
[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.
[ https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911305#comment-16911305 ] Andrei Budnik commented on MESOS-9836:
---
Shall we deprecate the option to run a custom executor in a Docker container? If no one responds to our proposal on the dev@ and user@ mailing lists, then we can safely deprecate this feature.
[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.
[ https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16911293#comment-16911293 ] Qian Zhang commented on MESOS-9836:
---
I think there is a case where we need to update a Docker container's CPU and memory directly in the Docker containerizer: a custom executor is launched as a Docker container and launches its tasks as processes inside that container. In this case, every time a new task is sent to the custom executor or a running task terminates, the Docker containerizer needs to update the Docker container's CPU and memory in cgroups.
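The cgroup values the containerizer writes follow directly from the declared resources: the issue's log shows 0.1 cpus mapped to `cpu.shares` 102 and a 10ms `cpu.cfs_quota_us` against a 100ms period. A sketch of that arithmetic (constants inferred from the log values, not necessarily the exact Mesos source):

```cpp
#include <algorithm>
#include <cstdint>

const uint64_t SHARES_PER_CPU = 1024;   // 0.1 cpus -> 102 shares, as logged.
const uint64_t MIN_SHARES = 2;          // Kernel-imposed lower bound.
const uint64_t CFS_PERIOD_US = 100000;  // 100ms period, as logged.

// cpu.shares: relative weight when CPU is contended.
uint64_t cpuShares(double cpus)
{
  return std::max(static_cast<uint64_t>(cpus * SHARES_PER_CPU), MIN_SHARES);
}

// cpu.cfs_quota_us: hard cap per period. 0.1 cpus -> 10000us (10ms).
uint64_t cfsQuotaUs(double cpus)
{
  return static_cast<uint64_t>(cpus * CFS_PERIOD_US);
}
```

These writes are harmless when aimed at the container's own cgroup; the bug is that they landed in `/mesos/slave`, throttling the agent's cgroup instead.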
[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.
[ https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1690#comment-1690 ] Qian Zhang commented on MESOS-9836:
---
{quote}As Mesos provides an option to run a Docker image as an (custom?) executor, it might make sense to update the Docker container's resources (executor+its tasks running in the Docker container) in cgroups
{quote}
[~abudnik] You are referring to the `--docker_mesos_image` flag, right? When this flag is specified, the Docker executor will run in a Docker container with the image specified via the flag, but the task will be launched by the Docker executor in *another* Docker container, because the Docker executor always calls `docker run` to launch the task as a Docker container, and in the Docker containerizer we mount the Docker socket for the Docker executor so it can communicate with the Docker daemon on the agent host (see [here|https://github.com/apache/mesos/blob/1.8.1/src/slave/containerizer/docker.cpp#L333:L339] for details). Another important point: in any case, what `DockerContainerizerProcess::update` tries to update is the task's Docker container rather than the executor's Docker container (because [container->containerName|https://github.com/apache/mesos/blob/1.8.1/src/slave/containerizer/docker.hpp#L486] is always the task container's name), which is really confusing, as I mentioned in my previous comment.
[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.
[ https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16908250#comment-16908250 ] Andrei Budnik commented on MESOS-9836:
---
{quote}So what is the purpose of Docker containerizer's update method?
{quote}
As Mesos provides an option to run a Docker image as a (custom?) executor, it might make sense to update the Docker container's resources (the executor plus its tasks running in the Docker container) in cgroups. If this is the case, we probably should deprecate such an option? Ignoring `update` for the Docker c'zer sounds like a good idea.
[jira] [Commented] (MESOS-9836) Docker containerizer overwrites `/mesos/slave` cgroups.
[ https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16907981#comment-16907981 ] Qian Zhang commented on MESOS-9836:
---
{quote}A fix would be after obtaining the pid through docker inspect, we look into the proc filesystem and get the cgroup right away and cache it, as well as other properties we originally retrieve through pid if any.
{quote}
[~chhsia0] Yeah, we could get and cache the cgroup, but we would also need to recover it after the agent restarts, which might be complicated.
{quote}Probably we should leave out all cgroups not containing "docker" substring instead of (or in addition to) filtering [the system root cgroup|https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1783-L1788]. It's ugly, hacky, and introduces a dependency on Docker's runtime.
{quote}
[~abudnik] I think a typical cgroup for a Docker container looks like:
{code:java}
/sys/fs/cgroup/<subsystem>/docker/<container-id>
{code}
Are you saying that in `DockerContainerizerProcess::__update` we should not update anything if the cgroup does not contain the "docker" substring (i.e., it is not a Docker container)? Yeah, it should work, but as you mentioned it introduces a dependency on Docker, which is unfortunate. I am actually wondering why we need to update a Docker container's CPU and memory in the Docker containerizer at all. For a task launched via the Docker containerizer, the method `DockerContainerizerProcess::update` will be called twice:
# When the Docker executor registers with the agent (see [here|https://github.com/apache/mesos/blob/1.8.1/src/slave/slave.cpp#L5210] for details): the resources passed into `DockerContainerizerProcess::update` are the same as the container's resources, so we will not update anything (see [here|https://github.com/apache/mesos/blob/1.8.1/src/slave/containerizer/docker.cpp#L1675:L1680] for details).
# When the agent receives a terminal status update for the task (see [here|https://github.com/apache/mesos/blob/1.8.1/src/slave/slave.cpp#L5841:L5846] for details): since the container has terminated, what `docker inspect` returns will not contain a pid, so we will not update anything either (see [here|https://github.com/apache/mesos/blob/1.8.1/src/slave/containerizer/docker.cpp#L1754:L1756] for details).

The reason we hit the issue in this ticket is that between #1 and #2 (i.e., while the Docker container is still running), `DockerContainerizerProcess::usage` may be called (e.g., when the agent's /monitor/statistics endpoint is hit); internally it calls `docker inspect`, which returns a valid pid, and we cache that pid (see [here|https://github.com/apache/mesos/blob/1.8.1/src/slave/containerizer/docker.cpp#L2025] for details). After that but before #2, the pid may be reused by a process that is a child of the agent process, so in #2 we overwrite the agent's cgroup. It is the Docker executor that launches the task as a Docker container, so it is a bit weird for the Docker containerizer to directly update that container's resources in cgroups; that should be the job of the Docker executor. And I do not see why we need to support updating a Docker container's resources at runtime. So what is the purpose of the Docker containerizer's update method? To me, it should just do nothing.
> The docker containerizer launched a container on an agent: > {noformat} > I0523 06:00:53.888579 21815 docker.cpp:1195] Starting container > 'f69c8a8c-eba4-4494-a305-0956a44a6ad2' for task > 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1' (and executor > 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1') of framework > 415284b7-2967-407d-b66f-f445e93f064e-0011 > I0523 06:00:54.524171 21815 docker.cpp:783] Checkpointing pid 13716 to > '/var/lib/mesos/slave/meta/slaves/60c42ab7-eb1a-4cec-b03d-ea06bff00c3f-S2/frameworks/415284b7-2967-407d-b66f-f445e93f064e-0011/executors/apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1/runs/f69c8a8c-eba4-4494-a305-0956a44a6ad2/pids/forked.pid' > {noformat} > After the container was launched, the docker containerizer did a {{docker > inspect}} on the container and cached the pid: > > [https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1764] > The pid should be slightly greater than 13716. > The docker executor sent a {{TASK_FINISHED}} status update around 16 minutes >
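The fix quoted at the top of this comment (resolve the container's cgroups from the proc filesystem immediately after `docker inspect` and cache them, instead of re-resolving through the possibly-reused pid later) can be sketched as follows. This is an illustrative Python sketch, not Mesos source; the function name and sample content are hypothetical, but the line format matches `/proc/<pid>/cgroup` on cgroup v1:

```python
def parse_proc_cgroup(text: str) -> dict:
    """Parse the contents of /proc/<pid>/cgroup into {subsystem: cgroup path}.

    Each line has the form "<hierarchy-id>:<subsystems>:<path>", e.g.
    "4:cpu,cpuacct:/docker/<container-id>" for a Docker container, or
    "4:cpu,cpuacct:/mesos/slave" for the agent's own cgroup.
    """
    cgroups = {}
    for line in text.strip().splitlines():
        # Split on the first two colons only; the path may contain more.
        _, subsystems, path = line.split(":", 2)
        for subsystem in subsystems.split(","):
            cgroups[subsystem] = path
    return cgroups

# Hypothetical /proc/<pid>/cgroup content for a Docker container; in the
# proposed fix this would be read (and cached) right after `docker inspect`.
sample = "4:cpu,cpuacct:/docker/abc123\n3:memory:/docker/abc123\n"
cgroups = parse_proc_cgroup(sample)
```

Caching the resolved paths this way means a later pid reuse cannot silently redirect an update at `/mesos/slave`, though as noted above the cache would also have to survive agent restarts.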
[ https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892844#comment-16892844 ] Andrei Budnik commented on MESOS-9836: -- A typical cgroup for Docker containers looks like: {code:java} /system.slice/docker-3a91c29381522918a2f2cad05583b172f415da4010bad672c21a19356aec1d69.scope {code} Probably we should leave out all cgroups not containing the "docker" substring, instead of (or in addition to) filtering [the system root cgroup|https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1783-L1788]. It's ugly and hacky, and it introduces a dependency on Docker's runtime.
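The "docker" substring filter proposed above can be sketched as a guard in front of any cgroup write. This is a hypothetical Python illustration (the function names are not from the Mesos patch); it shows why the heuristic protects `/mesos/slave` but couples the check to Docker's cgroup naming:

```python
def is_docker_cgroup(cgroup_path: str) -> bool:
    # Docker places containers under paths containing "docker", e.g.
    # "/docker/<id>" (cgroupfs driver) or
    # "/system.slice/docker-<id>.scope" (systemd driver).
    return "docker" in cgroup_path

def maybe_update(cgroup_path: str, apply_update) -> bool:
    """Apply a resource update only to Docker-owned cgroups; leave
    everything else (e.g. the agent's /mesos/slave cgroup) untouched."""
    if not is_docker_cgroup(cgroup_path):
        return False
    apply_update(cgroup_path)
    return True
```

With this guard, an update aimed through a reused pid whose cgroup resolved to `/mesos/slave` would be skipped rather than clobbering the agent's limits.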
[ https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16892622#comment-16892622 ] Benjamin Bannier commented on MESOS-9836: - Any update on this, [~gilbert]? We saw this again. This is a bad issue, as it mainly manifests as degraded agent performance and could only be caught by monitoring the agent's cgroups. Can you please bump up its priority?
[ https://issues.apache.org/jira/browse/MESOS-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16861255#comment-16861255 ] Chun-Hung Hsiao commented on MESOS-9836: A fix would be: after obtaining the pid through {{docker inspect}}, look into the proc filesystem, get the cgroup right away, and cache it, along with any other properties we originally retrieve through the pid. > Docker containerizer overwrites `/mesos/slave` cgroups. > --- > > Key: MESOS-9836 > URL: https://issues.apache.org/jira/browse/MESOS-9836 > Project: Mesos > Issue Type: Bug > Components: containerization >Reporter: Chun-Hung Hsiao >Priority: Critical > > The following bug was observed on our internal testing cluster. > The docker containerizer launched a container on an agent: > {noformat} > I0523 06:00:53.888579 21815 docker.cpp:1195] Starting container > 'f69c8a8c-eba4-4494-a305-0956a44a6ad2' for task > 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1' (and executor > 'apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1') of framework > 415284b7-2967-407d-b66f-f445e93f064e-0011 > I0523 06:00:54.524171 21815 docker.cpp:783] Checkpointing pid 13716 to > '/var/lib/mesos/slave/meta/slaves/60c42ab7-eb1a-4cec-b03d-ea06bff00c3f-S2/frameworks/415284b7-2967-407d-b66f-f445e93f064e-0011/executors/apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1/runs/f69c8a8c-eba4-4494-a305-0956a44a6ad2/pids/forked.pid' > {noformat} > After the container was launched, the docker containerizer did a {{docker > inspect}} on the container and cached the pid: > > [https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1764] > The pid should be slightly greater than 13716. 
> The docker executor sent a {{TASK_FINISHED}} status update around 16 minutes > later: > {noformat} > I0523 06:16:17.287595 21809 slave.cpp:5566] Handling status update > TASK_FINISHED (Status UUID: 4e00b786-b773-46cd-8327-c7deb08f1de9) for task > apps_docker-sleep-app.1fda5b8e-7d20-11e9-9717-7aa030269ee1 of framework > 415284b7-2967-407d-b66f-f445e93f064e-0011 from executor(1)@172.31.1.7:36244 > {noformat} > After receiving the terminal status update, the agent asked the docker > containerizer to update {{cpu.cfs_period_us}}, {{cpu.cfs_quota_us}} and > {{memory.soft_limit_in_bytes}} of the container through the cached pid: > > [https://github.com/apache/mesos/blob/0c431dd60ae39138cc7e8b099d41ad794c02c9a9/src/slave/containerizer/docker.cpp#L1696] > {noformat} > I0523 06:16:17.290447 21815 docker.cpp:1868] Updated 'cpu.shares' to 102 at > /sys/fs/cgroup/cpu,cpuacct/mesos/slave for container > f69c8a8c-eba4-4494-a305-0956a44a6ad2 > I0523 06:16:17.290660 21815 docker.cpp:1895] Updated 'cpu.cfs_period_us' to > 100ms and 'cpu.cfs_quota_us' to 10ms (cpus 0.1) for container > f69c8a8c-eba4-4494-a305-0956a44a6ad2 > I0523 06:16:17.889816 21815 docker.cpp:1937] Updated > 'memory.soft_limit_in_bytes' to 32MB for container > f69c8a8c-eba4-4494-a305-0956a44a6ad2 > {noformat} > Note that the cgroup of {{cpu.shares}} was {{/mesos/slave}}. This was > possibly because that over the 16 minutes the pid got reused: > {noformat} > # zgrep 'systemd.cpp:98\]' /var/log/mesos/archive/mesos-agent.log.12.gz > ... > I0523 06:00:54.525178 21815 systemd.cpp:98] Assigned child process '13716' to > 'mesos_executors.slice' > I0523 06:00:55.078546 21808 systemd.cpp:98] Assigned child process '13798' to > 'mesos_executors.slice' > I0523 06:00:55.134096 21808 systemd.cpp:98] Assigned child process '13799' to > 'mesos_executors.slice' > ... 
> I0523 06:06:30.997439 21808 systemd.cpp:98] Assigned child process '32689' to > 'mesos_executors.slice' > I0523 06:06:31.050976 21808 systemd.cpp:98] Assigned child process '32690' to > 'mesos_executors.slice' > I0523 06:06:31.110514 21815 systemd.cpp:98] Assigned child process '32692' to > 'mesos_executors.slice' > I0523 06:06:33.143726 21818 systemd.cpp:98] Assigned child process '446' to > 'mesos_executors.slice' > I0523 06:06:33.196251 21818 systemd.cpp:98] Assigned child process '447' to > 'mesos_executors.slice' > I0523 06:06:33.266332 21816 systemd.cpp:98] Assigned child process '449' to > 'mesos_executors.slice' > ... > I0523 06:09:34.870056 21808 systemd.cpp:98] Assigned child process '13717' to > 'mesos_executors.slice' > I0523 06:09:34.937762 21813 systemd.cpp:98] Assigned child process '13744' to > 'mesos_executors.slice' > I0523 06:09:35.073971 21817 systemd.cpp:98] Assigned child process '13754' to > 'mesos_executors.slice' > ... > {noformat} > It was highly likely that the container itself exited around 06:09:35, way > before the docker executor detected and reported the terminal status update, > and then its pid was reused
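The log excerpt above shows pid 13716 being handed out again within roughly 16 minutes. One general way to detect such reuse (an illustration of the general technique, not what Mesos does) is to record the process start time alongside the pid: field 22 ("starttime") of `/proc/<pid>/stat` changes when a pid is recycled, even though the pid itself is identical. The helper names below are hypothetical:

```python
def starttime_from_stat(stat_line: str) -> int:
    """Extract the 'starttime' field (field 22) from a /proc/<pid>/stat line.

    The comm field (field 2) is parenthesized and may contain spaces,
    so split off everything after the closing parenthesis first; the
    remaining tokens start at field 3, making starttime index 19.
    """
    after_comm = stat_line.rsplit(")", 1)[1].split()
    return int(after_comm[19])

def same_process(pid_a: int, start_a: int, pid_b: int, start_b: int) -> bool:
    # A (pid, starttime) pair identifies a process instance; a reused
    # pid has a different starttime than the cached one.
    return pid_a == pid_b and start_a == start_b
```

Had the cached pid been stored together with its start time, the update at 06:16:17 could have noticed that pid 13716 no longer named the original container process and skipped the cgroup write.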