> On Jan. 8, 2020, 5:43 a.m., Greg Mann wrote:
> > src/slave/slave.cpp
> > Lines 3140-3201 (patched)
> > <https://reviews.apache.org/r/71858/diff/4/?file=2191535#file2191535line3140>
> >
> >     I think maybe this logic could be easier to read if we do something like:
> >
> >     ```
> >     auto limitsAreSet = [](const vector<TaskInfo>& tasks) {
> >       foreach (const TaskInfo& task, tasks) {
> >         if (!task.limits().empty()) {
> >           return true;
> >         }
> >       }
> >
> >       return false;
> >     };
> >
> >     Option<Map<string, Value::Scalar>> executorLimits;
> >     if (limitsAreSet(tasks)) {
> >       executorLimits = Map<string, Value::Scalar>();
> >       foreach (const TaskInfo& task, tasks) {
> >         // Add up task limits/requests here.
> >       }
> >     }
> >     ```
> >
> >     What do you think?
> 
> Qian Zhang wrote:
>     I am a bit confused about how this will simplify the logic here. For instance, how would you do the `Add up task limits/requests here` part? I guess you still need the code from L3140 to L3201, right?
> 
> Greg Mann wrote:
>     Ah sorry, before I answer your question I have another one: currently, your code will only set the executor's CPU limit if one or more tasks have a CPU limit set, and will only set the executor's memory limit if one or more tasks have a memory limit set. However, I think we also want to set the CPU limit if one or more tasks has a _memory_ limit set, and we want to set the memory limit if one or more tasks has a _CPU_ limit set, right? This way, if a single task under an executor sets either a CPU or memory limit, then all tasks will have both the CPU and memory limits set (and if a limit wasn't specified for a particular task, it will be set to the default, which is the value of the request).
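For illustration, the symmetric aggregation Greg describes could be sketched roughly as below. This is a simplified sketch, not the actual patch code: `Task`, its plain `std::map` members, and `computeExecutorLimits` are hypothetical stand-ins for Mesos' protobuf `TaskInfo` and `Map<string, Value::Scalar>` types.

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical simplified stand-in for Mesos' TaskInfo.
struct Task {
  std::map<std::string, double> requests;  // e.g. {"cpus": 1.0, "mem": 128}
  std::map<std::string, double> limits;    // may be empty or partial
};

// If ANY task sets any limit, compute executor-level limits for BOTH
// "cpus" and "mem", falling back to each task's request for a resource
// whose limit that task did not set.
std::map<std::string, double> computeExecutorLimits(
    const std::vector<Task>& tasks) {
  bool anyLimitSet = false;
  for (const Task& task : tasks) {
    if (!task.limits.empty()) {
      anyLimitSet = true;
      break;
    }
  }

  std::map<std::string, double> executorLimits;
  if (!anyLimitSet) {
    return executorLimits;  // Empty: no task opted in to limits.
  }

  for (const std::string resource : {"cpus", "mem"}) {
    double total = 0.0;
    for (const Task& task : tasks) {
      auto it = task.limits.find(resource);
      // Use the task's limit when set, otherwise default to its request.
      total += (it != task.limits.end()) ? it->second
                                         : task.requests.at(resource);
    }
    executorLimits[resource] = total;
  }

  return executorLimits;
}
```

With this shape, a single task setting only a CPU limit still produces executor-level hard limits for both CPU and memory, which is the behavior Greg is asking about.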
> Qian Zhang wrote:
>     So if a framework launches two tasks (t1 and t2) with the same executor, where t1 has a CPU limit specified but no memory limit, and t2 has neither a CPU nor a memory limit specified, then we should set not only the CPU hard limit but also the memory hard limit (i.e., set it to t1's memory request + t2's memory request) in the executor container's cgroups, right? I think we have already done that, because the executor's resource requests always include all tasks' resource requests (see L3209 in this patch: https://reviews.apache.org/r/71858/diff/6), and in the memory cgroups (`memory.cpp`) we will set the executor container's memory hard limit (`memory.limit_in_bytes`) to its memory request if its memory limit is not specified (see https://reviews.apache.org/r/71943/ for details).
> 
>     Similarly, if t1 has a memory limit specified but no CPU limit specified, then in the CPU cgroups we will set the executor container's CPU hard limit (CFS quota) to the executor's CPU request if `--cgroups_enable_cfs` is true.
> 
> Greg Mann wrote:
>     > I think we have already done that, because the executor's resource requests always include all tasks' resource requests (see L3209 in this patch: https://reviews.apache.org/r/71858/diff/6)
> 
>     That works when the executor is first launched, but will it be updated when additional task groups are sent to the same executor?
> 
>     > Similarly, if t1 has a memory limit specified but no CPU limit specified, then in the CPU cgroups we will set the executor container's CPU hard limit (CFS quota) to the executor's CPU request if `--cgroups_enable_cfs` is true.
> 
>     If t1 has a memory limit specified, then don't we want to set the CFS quota regardless of whether or not the `--cgroups_enable_cfs` flag is set?
> 
> Qian Zhang wrote:
>     > That works when the executor is first launched, but will it be updated when additional task groups are sent to the same executor?
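The `memory.limit_in_bytes` fallback Qian refers to can be summarized in a few lines. This is only a sketch of the rule as described in the thread, not the actual `memory.cpp` code; `memoryHardLimitBytes` is a hypothetical helper name.

```cpp
#include <optional>

// Sketch of the fallback described above: the memory cgroup's hard limit
// (memory.limit_in_bytes) is set to the task's memory limit when one is
// specified, otherwise to its memory request, which preserves the
// pre-existing behavior of hard limit == request.
long long memoryHardLimitBytes(
    long long requestBytes,
    const std::optional<long long>& limitBytes) {
  return limitBytes.has_value() ? *limitBytes : requestBytes;
}
```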
>     Yes, see https://reviews.apache.org/r/71952/ for details.
> 
>     > If t1 has a memory limit specified, then don't we want to set the CFS quota regardless of whether or not the `--cgroups_enable_cfs` flag is set?
> 
>     I do not think we want to do that. In that case, no CPU limit is specified for either t1 or t2 (which means the user does not want a CPU hard limit set) and the CFS quota is disabled via the `--cgroups_enable_cfs` flag, so why would we set the CFS quota when the user does not want it?
> 
>     For the memory limit, it is a bit different: the original behavior is that we always set both the soft limit and the hard limit to the memory allocated to the task (i.e., the memory request) when launching a container, and there is no agent flag to control it. So to keep backward compatibility, we want to set the memory hard limit even when there is no memory limit specified in the task (i.e., set the hard limit to the memory request). What we want to change here is to set the hard limit to a value larger than the soft limit when a memory limit is specified in the task.
> 
> Greg Mann wrote:
>     In the design doc, the table outlining different states of `share_cgroups` says that when `share_cgroups = false`, we would always set the hard limits for both CPU and memory cgroups, even if the user has not set limits. If we still go with that approach, then I think we would ignore the `--cgroups_enable_cfs` flag in the above case of t1.
> 
>     Is there a good reason we should pay attention to the `--cgroups_enable_cfs` flag when `share_cgroups = false`? It seems intuitive to me that opting in to per-nested-container cgroups means that hard limits will also be set for those cgroups. WDYT?
> 
> Qian Zhang wrote:
>     Basically, I think we should set the CPU limit and the memory limit independently rather than coupling them.
>     For example, if a memory limit is specified for a task but no CPU limit is specified, then we should leave the CPU hard limit (CFS quota) at the original behavior for backward compatibility, i.e., whether to set the CFS quota depends on the `--cgroups_enable_cfs` flag. Otherwise, if we set the CPU hard limit too and `--cgroups_enable_cfs` is set to false in the user's environment, then the user may be confused about why their task's CPU usage is throttled, since they did not specify a CPU limit for the task at all (i.e., what the user really wants is just a memory constraint on their tasks).
> 
> Greg Mann wrote:
>     I think that exposing the CFS behavior via the agent flag is confusing API design, when we already have a mechanism for specifying no CPU limit in the new resource limits API. If a user wants a container to have no CPU limit, then they can set it to Infinity.
> 
>     It seems a bit confusing because, for example, if the user sets the `--cgroups_enable_cfs` flag to 'false', then we would not pay attention to that flag when a user has set a CPU limit for their task with `share_cgroups == false`, but we _would_ pay attention to that flag when they do not set a CPU limit for a task with `share_cgroups == false`. It's also confusing because we don't have a corresponding agent flag for memory limits.
> 
>     Seems more consistent to me to never pay attention to the `--cgroups_enable_cfs` flag when resource limits are being used (in other words, when `share_cgroups == false`).
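The two positions in this exchange reduce to two different predicates for whether the CFS quota gets set. The helpers below are purely illustrative (hypothetical names, not code from either patch): one encodes Qian's proposal, where a task-level CPU limit wins and otherwise the agent flag decides, and one encodes Greg's proposal, where opting out of shared cgroups always sets hard limits regardless of the flag.

```cpp
// Qian's proposal: a task-level CPU limit always enables the CFS quota;
// without one, fall back to the agent-wide `--cgroups_enable_cfs` flag.
bool shouldSetCfsQuotaQian(bool taskHasCpuLimit, bool cgroupsEnableCfs) {
  return taskHasCpuLimit || cgroupsEnableCfs;
}

// Greg's proposal: whenever `share_cgroups == false`, always set the CFS
// quota and ignore the flag; the flag only matters for legacy shared
// cgroups.
bool shouldSetCfsQuotaGreg(bool shareCgroups, bool cgroupsEnableCfs) {
  return !shareCgroups || cgroupsEnableCfs;
}
```

The disagreement is thus only about the case "no CPU limit, `share_cgroups == false`, flag false": Qian's predicate skips the quota there, Greg's sets it.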
> if the user sets the --cgroups_enable_cfs flag to 'false', then we would not
> pay attention to that flag when a user has set a CPU limit for their task
> with share_cgroups==false.

If a task in a task group has `share_cgroups == false` and has a CPU limit set, then we will just go ahead and set the CFS quota for this task, and will not care about the `--cgroups_enable_cfs` flag at all. So `share_cgroups` has higher priority than `--cgroups_enable_cfs`, not the reverse.

> but we would pay attention to that flag when they do not set a CPU limit for
> a task with share_cgroups==false.

Could you please elaborate a bit on why this is confusing? If a task in a task group has `share_cgroups == false` but has no CPU limit set, then we just fall back to the previous behavior (i.e., whether to set the CFS quota depends on the `--cgroups_enable_cfs` flag) for backward compatibility. I think that makes sense, right?

> Seems more consistent to me to never pay attention to the
> --cgroups_enable_cfs flag when resource limits are being used (in other
> words, when share_cgroups=false).

I agree that when a CPU limit is set for a task, we do not need to care about the `--cgroups_enable_cfs` flag; actually, that is exactly what I do in this patch: https://reviews.apache.org/r/71886/. However, I think setting resource limits and setting `share_cgroups == false` are two separate things: setting `share_cgroups == false` does not mean resource limits are being used, since it is allowed to set `share_cgroups == false` but not set any resource limits at all for a task.


- Qian


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/71858/#review219154
-----------------------------------------------------------


On Feb. 25, 2020, 9:46 a.m., Qian Zhang wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail.
> To reply, visit:
> https://reviews.apache.org/r/71858/
> -----------------------------------------------------------
> 
> (Updated Feb. 25, 2020, 9:46 a.m.)
> 
> 
> Review request for mesos, Andrei Budnik and Greg Mann.
> 
> 
> Bugs: MESOS-10046
>     https://issues.apache.org/jira/browse/MESOS-10046
> 
> 
> Repository: mesos
> 
> 
> Description
> -------
> 
> Set resource limits when launching executor container.
> 
> 
> Diffs
> -----
> 
>   src/slave/slave.hpp 03279db082bdba23dbfeb2d93081a908e609aec2
>   src/slave/slave.cpp cce275a504effae7a6b71dd333ce8a300d1ce5be
> 
> 
> Diff: https://reviews.apache.org/r/71858/diff/10/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Qian Zhang
> 