Another idea: maybe instead of faking the cpuset cgroup we could just allow this controller in containers?

The main idea of faking/hiding cpuset was: cpuset is not virtualized (we don't have virtual processors), so a container can bind itself to physical CPUs and memory nodes. If several containers bind to the same CPU, they end up competing for its resources, which can hurt performance badly. https://jira.sw.ru/browse/PSBM-30541
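Just to illustrate the failure mode, here is a minimal userspace sketch (the mount path inside the CT is an assumption) that pins a cpuset to physical CPU 0; two CTs doing the same end up sharing that one core:

#include <stdio.h>
#include <stdlib.h>

int main(void)
{
        /* Assumed path: wherever the CT's cpuset hierarchy is mounted. */
        FILE *f = fopen("/sys/fs/cgroup/cpuset/cpuset.cpus", "w");

        if (!f) {
                perror("fopen");
                return EXIT_FAILURE;
        }
        /* Bind the whole group to physical CPU 0; two containers both
         * writing "0" here compete for one core. */
        if (fputs("0", f) == EOF || fclose(f) == EOF) {
                perror("cpuset.cpus");
                return EXIT_FAILURE;
        }
        return 0;
}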

But AFAICS performance degrades only for containers that set up cpuset badly; all the others are still scheduled on all cores and are fine. So we would be protecting customers from themselves.

We could even add a feature to enable/disable cpuset per CT: e.g. vzctl sets ve.cpuset_enabled in the ve cgroup before the CT starts, and after that ctinit mounts cpuset in the CT if it is listed in /proc/cgroups (a rough sketch follows). Note we would also need to do the same on criu restore.
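Roughly, the ctinit side could look like this minimal sketch. The ve.cpuset_enabled knob, the mount target, and the idea that a hidden controller simply drops out of /proc/cgroups are all assumptions here, not the real ctinit code:

#include <stdio.h>
#include <string.h>
#include <sys/mount.h>

/* Return 1 if the cpuset controller is listed (and enabled) in
 * /proc/cgroups, 0 otherwise. */
static int cpuset_listed(void)
{
        FILE *f = fopen("/proc/cgroups", "r");
        char line[256];
        int found = 0;

        if (!f)
                return 0;
        while (fgets(line, sizeof(line), f)) {
                char name[32];
                int hier, num, enabled;

                if (sscanf(line, "%31s %d %d %d",
                           name, &hier, &num, &enabled) == 4 &&
                    !strcmp(name, "cpuset") && enabled) {
                        found = 1;
                        break;
                }
        }
        fclose(f);
        return found;
}

int main(void)
{
        if (!cpuset_listed())
                return 0;
        /* Target directory must already exist inside the CT. */
        return mount("cgroup", "/sys/fs/cgroup/cpuset", "cgroup", 0, "cpuset");
}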

On 12/13/2017 07:52 PM, Stanislav Kinsburskiy wrote:
Any changes to this cgroup are skipped in a container, but a success code
is returned.
The idea is to fool Docker/Kubernetes.

https://jira.sw.ru/browse/PSBM-58423

This patch obsoletes "ve/proc/cpuset: do not show cpuset in CT"

v2:
Do not attach tasks in cpuset_change_cpumask on a cpuset change if it is
requested from a non-super VE.
This is the second part of the logic.
The first part was to not change the cpuset for a newly added task. This
one is to not set a new cpuset for all the tasks already in the cgroup.

Signed-off-by: Stanislav Kinsburskiy <skinsbur...@virtuozzo.com>
---
  kernel/cpuset.c |   12 ++++++++++++
  1 file changed, 12 insertions(+)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 26d88eb..43b1410 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -848,6 +848,9 @@ static int cpuset_test_cpumask(struct task_struct *tsk,
  static void cpuset_change_cpumask(struct task_struct *tsk,
                                  struct cgroup_scanner *scan)
  {
+       if (!ve_is_super(get_exec_env()))
+               return;
+

Likely we would have to do the same for the nodemask too if we choose to fake the cpuset cgroup, and maybe for some other files as well (a sketch follows the list below):

ls /sys/fs/cgroup/cpuset/cpuset.*
/sys/fs/cgroup/cpuset/cpuset.cpu_exclusive
/sys/fs/cgroup/cpuset/cpuset.cpus
/sys/fs/cgroup/cpuset/cpuset.mem_exclusive
/sys/fs/cgroup/cpuset/cpuset.mem_hardwall
/sys/fs/cgroup/cpuset/cpuset.memory_migrate
/sys/fs/cgroup/cpuset/cpuset.memory_pressure
/sys/fs/cgroup/cpuset/cpuset.memory_pressure_enabled
/sys/fs/cgroup/cpuset/cpuset.memory_spread_page
/sys/fs/cgroup/cpuset/cpuset.memory_spread_slab
/sys/fs/cgroup/cpuset/cpuset.mems
/sys/fs/cgroup/cpuset/cpuset.sched_load_balance
/sys/fs/cgroup/cpuset/cpuset.sched_relax_domain_level
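If we do fake them, here is a minimal sketch of the write side, modeled on the 3.10-era cpuset_write_resmask() handler (the exact name and prototype in our tree are an assumption), reusing the same check the patch adds elsewhere:

static int cpuset_write_resmask(struct cgroup *cgrp, struct cftype *cft,
                                const char *buf)
{
        /* Skip the change but report success for in-CT writes, so
         * Docker/Kubernetes keep working; cpus_allowed/mems_allowed
         * of the cpuset stay untouched. */
        if (!ve_is_super(get_exec_env()))
                return 0;

        /* ... the original cpus/mems update logic would run here ... */
        return 0;
}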

        set_cpus_allowed_ptr(tsk, ((cgroup_cs(scan->cg))->cpus_allowed));
  }
@@ -1441,6 +1444,9 @@ static int cpuset_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
        struct task_struct *task;
        int ret;

+       if (!ve_is_super(get_exec_env()))
+               return 0;
+
        mutex_lock(&cpuset_mutex);

        ret = -ENOSPC;
@@ -1470,6 +1476,9 @@ static int cpuset_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
  static void cpuset_cancel_attach(struct cgroup *cgrp,
                                 struct cgroup_taskset *tset)
  {
+       if (!ve_is_super(get_exec_env()))
+               return;
+
        mutex_lock(&cpuset_mutex);
        cgroup_cs(cgrp)->attach_in_progress--;
        mutex_unlock(&cpuset_mutex);
@@ -1494,6 +1503,9 @@ static void cpuset_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
        struct cpuset *cs = cgroup_cs(cgrp);
        struct cpuset *oldcs = cgroup_cs(oldcgrp);

+       if (!ve_is_super(get_exec_env()))
+               return;
+
        mutex_lock(&cpuset_mutex);

        /* prepare for attach */


--
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.