[ https://issues.apache.org/jira/browse/MESOS-941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13880624#comment-13880624 ]
Vinod Kone commented on MESOS-941:
----------------------------------
https://reviews.apache.org/r/17295/
> Memory limit not correctly set when no memory resource set on executor level
> ----------------------------------------------------------------------------
>
> Key: MESOS-941
> URL: https://issues.apache.org/jira/browse/MESOS-941
> Project: Mesos
> Issue Type: Bug
> Components: slave
> Reporter: Lin Zhao
> Assignee: Vinod Kone
> Fix For: 0.17.0
>
>
> When a framework is launched with memory resources set only on the tasks, and
> none set at the executor level, the slave fails to apply the memory control
> needed to limit the executor's memory usage. The executor process can use
> more resident memory than specified in the tasks.
> Example framework: https://gist.github.com/lin-zhao/8544495. This framework
> was tested with Mesos 0.14.2 on CentOS 6, kernel 3.10.11-1.el6.x86_64.
> According to Benjamin Mahler:
> What's happening is that you're launching an executor with no resources.
> Consequently, before we fork, we attempt to update the memory control, but we
> don't call the memory handler since the executor has no memory resources:
> I0121 19:39:01.660071 8566 cgroups_isolator.cpp:516] Launching default (/home/lin/test-executor) in /tmp/mesos/slaves/201312032357-3645772810-5050-2033-0/frameworks/201401171812-2907575306-5050-19011-0020/executors/default/runs/8bc2ab10-8988-4b22-afa2-3433bbedc3ed with resources for framework 201401171812-2907575306-5050-19011-0020 in cgroup mesos/framework_201401171812-2907575306-5050-19011-0020_executor_default_tag_8bc2ab10-8988-4b22-afa2-3433bbedc3ed
> I0121 19:39:01.663082 8566 cgroups_isolator.cpp:709] Changing cgroup controls for executor default of framework 201401171812-2907575306-5050-19011-0020 with resources
> I0121 19:39:01.667129 8566 cgroups_isolator.cpp:1163] Started listening for OOM events for executor default of framework 201401171812-2907575306-5050-19011-0020
> I0121 19:39:01.681857 8566 cgroups_isolator.cpp:568] Forked executor at = 27609
> Then, later, when we are updating the resources for your 128MB task, we set
> the soft limit, but we don't set the hard limit because the following buggy
> check is not satisfied:
> // Determine whether to set the hard limit. If this is the first
> // time (info->pid.isNone()), or we're raising the existing limit,
> // then we can update the hard limit safely. Otherwise, if we need
> // to decrease 'memory.limit_in_bytes' we may induce an OOM if too
> // much memory is in use. As a result, we only update the soft
> // limit when the memory reservation is being reduced. This is
> // probably okay if the machine has available resources.
> // TODO(benh): Introduce a MemoryWatcherProcess which monitors the
> // discrepancy between usage and soft limit and introduces a
> // "manual oom" if necessary.
> if (info->pid.isNone() || limit > currentLimit.get()) {
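The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the isolator code and not the patch under review: `Info`, `hardLimitSet`, `shouldSetHardLimit`, `shouldSetHardLimitTracked`, and the two constants are hypothetical names introduced for illustration. It also assumes that a cgroup whose hard limit was never written reports an effectively unlimited `memory.limit_in_bytes`, which would explain why `limit > currentLimit.get()` is false for the 128MB task.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the isolator's per-executor bookkeeping.
struct Info {
  bool hasPid;        // True once the executor has been forked.
  bool hardLimitSet;  // Hypothetical flag; not present in the quoted code.
};

// Assumption: an untouched cgroup reports an effectively unlimited hard limit.
const uint64_t kUnlimited = std::numeric_limits<uint64_t>::max();

// The 128MB task from the report.
const uint64_t kTaskLimit = 128ull * 1024 * 1024;

// Mirrors the quoted condition:
//   info->pid.isNone() || limit > currentLimit.get()
bool shouldSetHardLimit(const Info& info, uint64_t limit, uint64_t currentLimit) {
  return !info.hasPid || limit > currentLimit;
}

// One possible alternative (an assumption, not the actual fix): decide
// "first time" by whether the hard limit was ever written, rather than by
// whether the executor has a pid.
bool shouldSetHardLimitTracked(const Info& info, uint64_t limit, uint64_t currentLimit) {
  return !info.hardLimitSet || limit > currentLimit;
}
```

With an executor that was forked without memory resources (`hasPid` true, `hardLimitSet` false), `shouldSetHardLimit(info, kTaskLimit, kUnlimited)` evaluates to false, so only the soft limit would be written, while the tracked variant evaluates to true.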