> On Jan. 8, 2020, 5:43 a.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Lines 3140-3201 (patched)
> > <https://reviews.apache.org/r/71858/diff/4/?file=2191535#file2191535line3140>
> >
> >     I think maybe this logic could be easier to read if we do something like:
> >
> >     ```
> >     auto limitsAreSet = [](const vector<TaskInfo>& tasks) {
> >       foreach (const TaskInfo& task, tasks) {
> >         if (!task.limits().empty()) {
> >           return true;
> >         }
> >       }
> >
> >       return false;
> >     };
> >
> >     Option<Map<string, Value::Scalar>> executorLimits;
> >     if (limitsAreSet(tasks)) {
> >       executorLimits = Map<string, Value::Scalar>();
> >       foreach (const TaskInfo& task, tasks) {
> >         // Add up task limits/requests here.
> >       }
> >     }
> >     ```
> >
> >     What do you think?
> 
> Qian Zhang wrote:
>     I am a bit confused about how this will simplify the logic here. For instance, how would you do the `Add up task limits/requests here` part? I guess you still need the code from L3140 to L3201, right?
> 
> Greg Mann wrote:
>     Ah sorry, before I answer your question I have another one: currently, your code will only set the executor's CPU limit if one or more tasks have a CPU limit set, and will only set the executor's memory limit if one or more tasks have a memory limit set. However, I think we also want to set the CPU limit if one or more tasks has a _memory_ limit set, and we want to set the memory limit if one or more tasks has a _CPU_ limit set, right? This way, if a single task under an executor sets either a CPU or memory limit, then all tasks will have both the CPU and memory limits set (and if a limit wasn't specified for a particular task, it will be set to the default, which is the value of the request).
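For illustration, the symmetric aggregation Greg describes could be sketched roughly as below. This is a simplified sketch, not the actual patch code: `Task`, its plain `std::map` members, and `computeExecutorLimits` are hypothetical stand-ins for Mesos' protobuf `TaskInfo` and `Map<string, Value::Scalar>` types.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical simplified stand-in for Mesos' TaskInfo.
struct Task {
  std::map<std::string, double> requests;  // e.g. {"cpus": 1.0, "mem": 128}
  std::map<std::string, double> limits;    // may be empty or partial
};

// If ANY task sets any limit, compute executor-level limits for BOTH
// "cpus" and "mem", falling back to each task's request for a resource
// whose limit that task did not set.
std::map<std::string, double> computeExecutorLimits(
    const std::vector<Task>& tasks) {
  bool anyLimitSet = false;
  for (const Task& task : tasks) {
    if (!task.limits.empty()) {
      anyLimitSet = true;
      break;
    }
  }

  std::map<std::string, double> executorLimits;
  if (!anyLimitSet) {
    return executorLimits;  // Empty: no task opted in to limits.
  }

  for (const std::string resource : {"cpus", "mem"}) {
    double total = 0.0;
    for (const Task& task : tasks) {
      auto it = task.limits.find(resource);
      // Use the task's limit when set, otherwise default to its request.
      total += (it != task.limits.end()) ? it->second
                                         : task.requests.at(resource);
    }
    executorLimits[resource] = total;
  }

  return executorLimits;
}
```

With this shape, a single task setting only a CPU limit still produces executor-level hard limits for both CPU and memory, which is the behavior Greg is asking about.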
> Qian Zhang wrote:
>     So if a framework launches two tasks (t1 and t2) with the same executor, where t1 has a CPU limit specified but no memory limit, and t2 has neither a CPU nor a memory limit specified, then we should set not only the CPU hard limit but also the memory hard limit (i.e., set it to t1's memory request + t2's memory request) in the executor container's cgroups, right? I think we have already done that, because the executor's resource requests always include all tasks' resource requests (see L3209 in this patch: https://reviews.apache.org/r/71858/diff/6), and in the memory cgroups (`memory.cpp`) we will set the executor container's memory hard limit (`memory.limit_in_bytes`) to its memory request if its memory limit is not specified (see https://reviews.apache.org/r/71943/ for details).
> 
>     Similarly, if t1 has a memory limit specified but no CPU limit specified, then in the CPU cgroups we will set the executor container's CPU hard limit (CFS quota) to the executor's CPU request if `--cgroups_enable_cfs` is true.
> 
> Greg Mann wrote:
>     > I think we have already done that, because the executor's resource requests always include all tasks' resource requests (see L3209 in this patch: https://reviews.apache.org/r/71858/diff/6)
> 
>     That works when the executor is first launched, but will it be updated when additional task groups are sent to the same executor?
> 
>     > Similarly, if t1 has a memory limit specified but no CPU limit specified, then in the CPU cgroups we will set the executor container's CPU hard limit (CFS quota) to the executor's CPU request if `--cgroups_enable_cfs` is true.
> 
>     If t1 has a memory limit specified, then don't we want to set the CFS quota regardless of whether or not the `--cgroups_enable_cfs` flag is set?
> 
> Qian Zhang wrote:
>     > That works when the executor is first launched, but will it be updated when additional task groups are sent to the same executor?
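The `memory.limit_in_bytes` fallback Qian refers to can be summarized in a few lines. This is only a sketch of the rule as described in the thread, not the actual `memory.cpp` code; `memoryHardLimitBytes` is a hypothetical helper name.

```cpp
#include <optional>

// Sketch of the fallback described above: the memory cgroup's hard limit
// (memory.limit_in_bytes) is set to the task's memory limit when one is
// specified, otherwise to its memory request, which preserves the
// pre-existing behavior of hard limit == request.
long long memoryHardLimitBytes(
    long long requestBytes,
    const std::optional<long long>& limitBytes) {
  return limitBytes.has_value() ? *limitBytes : requestBytes;
}
```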
>     Yes, see https://reviews.apache.org/r/71952/ for details.
> 
>     > If t1 has a memory limit specified, then don't we want to set the CFS quota regardless of whether or not the `--cgroups_enable_cfs` flag is set?
> 
>     I do not think we want to do that. In that case, no CPU limit is specified for either t1 or t2 (which means the user does not want a CPU hard limit set) and the CFS quota is disabled via the `--cgroups_enable_cfs` flag, so why would we set the CFS quota when the user does not want it?
> 
>     For the memory limit, it is a bit different: the original behavior is that we always set both the soft limit and the hard limit to the memory allocated to the task (i.e., the memory request) when launching a container, and there is no agent flag to control it. So to keep backward compatibility, we want to set the memory hard limit even when there is no memory limit specified in the task (i.e., set the hard limit to the memory request). What we want to change here is to set the hard limit to a value larger than the soft limit when a memory limit is specified in the task.
> 
> Greg Mann wrote:
>     In the design doc, the table outlining different states of `share_cgroups` says that when `share_cgroups = false`, we would always set the hard limits for both CPU and memory cgroups, even if the user has not set limits. If we still go with that approach, then I think we would ignore the `--cgroups_enable_cfs` flag in the above case of t1.
> 
>     Is there a good reason we should pay attention to the `--cgroups_enable_cfs` flag when `share_cgroups = false`? It seems intuitive to me that opting in to per-nested-container cgroups means that hard limits will also be set for those cgroups. WDYT?
> 
> Qian Zhang wrote:
>     Basically, I think we should set the CPU limit and the memory limit independently rather than coupling them.
>     For example, if a memory limit is specified for a task but no CPU limit is specified, then we should leave the CPU hard limit (CFS quota) at the original behavior for backward compatibility, i.e., whether to set the CFS quota depends on the `--cgroups_enable_cfs` flag. Otherwise, if we set the CPU hard limit too and `--cgroups_enable_cfs` is set to false in the user's environment, then the user may be confused about why their task's CPU usage is throttled, since they did not specify a CPU limit for the task at all (i.e., what the user really wants is just a memory constraint on their tasks).
> 
> Greg Mann wrote:
>     I think that exposing the CFS behavior via the agent flag is confusing API design, when we already have a mechanism for specifying no CPU limit in the new resource limits API. If a user wants a container to have no CPU limit, then they can set it to Infinity.
> 
>     It seems a bit confusing because, for example, if the user sets the `--cgroups_enable_cfs` flag to 'false', then we would not pay attention to that flag when a user has set a CPU limit for their task with `share_cgroups == false`, but we _would_ pay attention to that flag when they do not set a CPU limit for a task with `share_cgroups == false`. It's also confusing because we don't have a corresponding agent flag for memory limits.
> 
>     Seems more consistent to me to never pay attention to the `--cgroups_enable_cfs` flag when resource limits are being used (in other words, when `share_cgroups == false`).
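The two positions in this exchange reduce to two different predicates for whether the CFS quota gets set. The helpers below are purely illustrative (hypothetical names, not code from either patch): one encodes Qian's proposal, where a task-level CPU limit wins and otherwise the agent flag decides, and one encodes Greg's proposal, where opting out of shared cgroups always sets hard limits regardless of the flag.

```cpp
// Qian's proposal: a task-level CPU limit always enables the CFS quota;
// without one, fall back to the agent-wide `--cgroups_enable_cfs` flag.
bool shouldSetCfsQuotaQian(bool taskHasCpuLimit, bool cgroupsEnableCfs) {
  return taskHasCpuLimit || cgroupsEnableCfs;
}

// Greg's proposal: whenever `share_cgroups == false`, always set the CFS
// quota and ignore the flag; the flag only matters for legacy shared
// cgroups.
bool shouldSetCfsQuotaGreg(bool shareCgroups, bool cgroupsEnableCfs) {
  return !shareCgroups || cgroupsEnableCfs;
}
```

The disagreement is thus only about the case "no CPU limit, `share_cgroups == false`, flag false": Qian's predicate skips the quota there, Greg's sets it.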
> if the user sets the --cgroups_enable_cfs flag to 'false', then we would not
> pay attention to that flag when a user has set a CPU limit for their task
> with share_cgroups==false.

If a task in a task group has `share_cgroups == false` and has a CPU limit set, then we will just go ahead and set the CFS quota for this task, and will not care about the `--cgroups_enable_cfs` flag at all. So `share_cgroups` has higher priority than `--cgroups_enable_cfs`, not the reverse.

> but we would pay attention to that flag when they do not set a CPU limit for
> a task with share_cgroups==false.

Could you please elaborate a bit on why this is confusing? If a task in a task group has `share_cgroups == false` but has no CPU limit set, then we just fall back to the previous behavior (i.e., whether to set the CFS quota depends on the `--cgroups_enable_cfs` flag) for backward compatibility. I think that makes sense, right?

> Seems more consistent to me to never pay attention to the
> --cgroups_enable_cfs flag when resource limits are being used (in other
> words, when share_cgroups=false).

I agree that when a CPU limit is set for a task, we do not need to care about the `--cgroups_enable_cfs` flag; actually, that is exactly what I do in this patch: https://reviews.apache.org/r/71886/. However, I think setting resource limits and setting `share_cgroups == false` are two separate things: setting `share_cgroups == false` does not mean resource limits are being used, since it is allowed to set `share_cgroups == false` but not set any resource limits at all for a task.


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219154
-----------------------------------------------------------


On Feb. 25, 2020, 9:46 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail.
> To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2020, 9:46 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2
>   src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/10/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 