[Devel] [PATCH v2] cgroup/cpuset: emulate cgroup in container

2017-12-13 Thread Stanislav Kinsburskiy
Any changes to this cgroup made from inside a container are skipped, but a
success code is returned.
The idea is to fool Docker/Kubernetes.

https://jira.sw.ru/browse/PSBM-58423

This patch obsoletes "ve/proc/cpuset: do not show cpuset in CT"

v2:
Do not attach tasks in cpuset_change_cpumask on a cpuset change if it is
requested from a non-super VE.
This is the second part of the logic.
The first part was to not change the cpuset for a newly added task. This one is
to not set the new cpuset for all the tasks in the cgroup.

Signed-off-by: Stanislav Kinsburskiy 
---
 kernel/cpuset.c |   12 ++++++++++++
 1 file changed, 12 insertions(+)

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 26d88eb..43b1410 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -848,6 +848,9 @@ static int cpuset_test_cpumask(struct task_struct *tsk,
 static void cpuset_change_cpumask(struct task_struct *tsk,
 				  struct cgroup_scanner *scan)
 {
+	if (!ve_is_super(get_exec_env()))
+		return;
+
 	set_cpus_allowed_ptr(tsk, ((cgroup_cs(scan->cg))->cpus_allowed));
 }
 
@@ -1441,6 +1444,9 @@ static int cpuset_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 	struct task_struct *task;
 	int ret;
 
+	if (!ve_is_super(get_exec_env()))
+		return 0;
+
 	mutex_lock(&cpuset_mutex);
 
 	ret = -ENOSPC;
@@ -1470,6 +1476,9 @@ static int cpuset_can_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 static void cpuset_cancel_attach(struct cgroup *cgrp,
 				 struct cgroup_taskset *tset)
 {
+	if (!ve_is_super(get_exec_env()))
+		return;
+
 	mutex_lock(&cpuset_mutex);
 	cgroup_cs(cgrp)->attach_in_progress--;
 	mutex_unlock(&cpuset_mutex);
@@ -1494,6 +1503,9 @@ static void cpuset_attach(struct cgroup *cgrp, struct cgroup_taskset *tset)
 	struct cpuset *cs = cgroup_cs(cgrp);
 	struct cpuset *oldcs = cgroup_cs(oldcgrp);
 
+	if (!ve_is_super(get_exec_env()))
+		return;
+
 	mutex_lock(&cpuset_mutex);
 
 	/* prepare for attach */
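
For readers who want to see the effect of these early returns in isolation,
here is a minimal user-space sketch of the same pattern. It is not kernel code:
the model_* types and functions are hypothetical stand-ins, the is_super flag
models the result of ve_is_super(get_exec_env()), and a plain bitmask replaces
the real cpumask. It only shows that a container caller gets success while no
state changes.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the kernel objects touched by the patch. */
struct model_ve {
	bool is_super;			/* models ve_is_super(get_exec_env()) */
};

struct model_cpuset {
	unsigned long cpus_allowed;	/* bitmask instead of a real cpumask */
	int attach_in_progress;
};

struct model_task {
	unsigned long cpus_allowed;
};

/* Models cpuset_can_attach(): a container caller always gets success. */
static int model_can_attach(const struct model_ve *ve, struct model_cpuset *cs)
{
	if (!ve->is_super)
		return 0;		/* report success, touch nothing */
	if (cs->cpus_allowed == 0)
		return -ENOSPC;
	cs->attach_in_progress++;
	return 0;
}

/* Models cpuset_attach(): only the host really rebinds the task. */
static void model_attach(const struct model_ve *ve, struct model_cpuset *cs,
			 struct model_task *tsk)
{
	if (!ve->is_super)
		return;
	assert(cs->attach_in_progress > 0);
	tsk->cpus_allowed = cs->cpus_allowed;
	cs->attach_in_progress--;
}

/* Models cpuset_change_cpumask(): skipped for a container caller. */
static void model_change_cpumask(const struct model_ve *ve,
				 const struct model_cpuset *cs,
				 struct model_task *tsk)
{
	if (!ve->is_super)
		return;
	tsk->cpus_allowed = cs->cpus_allowed;
}
```

Note that the counter stays balanced in the model because the can_attach and
attach paths are skipped together for a container caller, which mirrors why the
patch guards cpuset_can_attach, cpuset_cancel_attach and cpuset_attach alike.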

___
Devel mailing list
Devel@openvz.org
https://lists.openvz.org/mailman/listinfo/devel


Re: [Devel] [PATCH v2] cgroup/cpuset: emulate cgroup in container

2017-12-14 Thread Pavel Tikhomirov
Another idea: maybe we should not fake the cpuset cgroup at all, but just
allow this controller in the container?


The main idea behind faking/hiding cpuset was that cpuset is not virtualized
(we don't have virtual processors), so a container can bind itself to physical
CPUs and memory nodes. If several containers bind to the same CPU, they end up
competing for that CPU's resources, which can hurt performance badly.
https://jira.sw.ru/browse/PSBM-30541


But AFAICS performance is degraded only for containers which set up cpuset
badly; all others are still scheduled on all cores and are fine. So by hiding
cpuset we only protect customers from themselves.
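
To make "bind itself to physical CPUs" concrete, here is a rough sketch of what
such a binding amounts to from user space: writing a CPU list into the
cpuset.cpus file of a cgroup directory. The helper name bind_cpuset_cpus and
its parameters are made up for illustration; the directory is whatever the
caller passes in (on a cgroup v1 host it would live under a cpuset mount such
as /sys/fs/cgroup/cpuset).

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write a CPU list such as "0" or "0-1" to <cgroup_dir>/cpuset.cpus.
 * Returns 0 on success and -1 on failure.
 */
static int bind_cpuset_cpus(const char *cgroup_dir, const char *cpu_list)
{
	char path[4096];
	FILE *f;
	int n, ok;

	assert(cgroup_dir && cpu_list);

	n = snprintf(path, sizeof(path), "%s/cpuset.cpus", cgroup_dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	f = fopen(path, "w");
	if (!f)
		return -1;

	ok = fputs(cpu_list, f) >= 0;
	if (fclose(f) != 0)	/* a rejected write may only surface on close */
		ok = 0;
	return ok ? 0 : -1;
}
```

Two containers that each call something like this with the same list end up
pinned to the same physical CPU, which is the contention case described above.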


We could even add a feature to enable/disable cpuset per CT: e.g. vzctl sets
ve.cpuset_enabled in the ve cgroup before the CT starts, and after that ctinit
(from the ve cgroup) mounts cpuset in the CT if it is listed in /proc/cgroups.
Note that we would also need to do the same on criu restore.
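
As a rough illustration of the "if it is listed in /proc/cgroups" check, the
sketch below scans a /proc/cgroups-style stream for a controller name. The
function name controller_listed is hypothetical and this is not what ctinit
does today; it only assumes the usual four-column layout (subsys_name,
hierarchy, num_cgroups, enabled) with '#' header lines, and it additionally
requires the enabled column to be non-zero.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Return 1 if `controller` is listed and enabled in a /proc/cgroups-style
 * stream, 0 otherwise. Lines starting with '#' are treated as headers and
 * lines that do not have four fields are ignored.
 */
static int controller_listed(FILE *f, const char *controller)
{
	char line[256], name[64];
	int hierarchy, num_cgroups, enabled;

	assert(f && controller);

	while (fgets(line, sizeof(line), f)) {
		if (line[0] == '#')
			continue;
		if (sscanf(line, "%63s %d %d %d", name, &hierarchy,
			   &num_cgroups, &enabled) != 4)
			continue;
		if (strcmp(name, controller) == 0)
			return enabled != 0;
	}
	return 0;
}
```

A caller would open /proc/cgroups inside the CT, pass the stream together with
"cpuset", and only attempt the mount when the function returns 1.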


On 12/13/2017 07:52 PM, Stanislav Kinsburskiy wrote:

diff --git a/kernel/cpuset.c b/kernel/cpuset.c
index 26d88eb..43b1410 100644
--- a/kernel/cpuset.c
+++ b/kernel/cpuset.c
@@ -848,6 +848,9 @@ static int cpuset_test_cpumask(struct task_struct *tsk,
 static void cpuset_change_cpumask(struct task_struct *tsk,
 				  struct cgroup_scanner *scan)
 {
+	if (!ve_is_super(get_exec_env()))
+		return;
+


Likely we have to do the same for the nodemask too if we choose to fake the
cpuset cgroup, and maybe for some of the other files:


ls /sys/fs/cgroup/cpuset/cpuset.*
/sys/fs/cgroup/cpuset/cpuset.cpu_exclusive
/sys/fs/cgroup/cpuset/cpuset.cpus
/sys/fs/cgroup/cpuset/cpuset.mem_exclusive
/sys/fs/cgroup/cpuset/cpuset.mem_hardwall
/sys/fs/cgroup/cpuset/cpuset.memory_migrate
/sys/fs/cgroup/cpuset/cpuset.memory_pressure
/sys/fs/cgroup/cpuset/cpuset.memory_pressure_enabled
/sys/fs/cgroup/cpuset/cpuset.memory_spread_page
/sys/fs/cgroup/cpuset/cpuset.memory_spread_slab
/sys/fs/cgroup/cpuset/cpuset.mems
/sys/fs/cgroup/cpuset/cpuset.sched_load_balance
/sys/fs/cgroup/cpuset/cpuset.sched_relax_domain_leve


--
Best regards, Tikhomirov Pavel
Software Developer, Virtuozzo.