[Devel] [PATCH v7 00/11] per-cgroup cpu-stat

2013-05-29 Thread Glauber Costa
Peter et al., I am coming again with this series, hoping this is a better time for you all to look at it. I am *not* going as far as marking cpuacct deprecated, because I think it deserves a special discussion (even though my position in this matter is widely known), but all the infrastructure to

[Devel] [PATCH v7 01/11] don't call cpuacct_charge in stop_task.c

2013-05-29 Thread Glauber Costa
Commit 8f618968 changed stop_task to do the same bookkeeping as the other classes. However, the call to cpuacct_charge() doesn't affect the scheduler decisions at all, and doesn't need to be moved over. Moreover, being a kthread, the migration thread won't belong to any cgroup anyway, rendering thi

[Devel] [PATCH v7 04/11] sched: adjust exec_clock to use it as cpu usage metric

2013-05-29 Thread Glauber Costa
exec_clock already provides per-group cpu usage metrics, and can be reused by cpuacct in case cpu and cpuacct are co-mounted. However, it is only provided by tasks in the fair class. Doing the same for rt is easy, and can be done in an already existing hierarchy loop. This is an improvement over the in

[Devel] [PATCH v7 06/11] sched: document the cpu cgroup.

2013-05-29 Thread Glauber Costa
The CPU cgroup is, so far, undocumented. Although data exists in the Documentation directory about its functioning, it is usually spread out, and/or presented in the context of something else. This file consolidates all cgroup-related information about it. Signed-off-by: Glauber Costa --- Documentati

[Devel] [PATCH v7 07/11] sched: account guest time per-cgroup as well.

2013-05-29 Thread Glauber Costa
We already track multiple tick statistics per-cgroup, using the task_group_account_field facility. This patch accounts guest_time in that manner as well. Signed-off-by: Glauber Costa CC: Peter Zijlstra CC: Paul Turner --- kernel/sched/cputime.c | 10 -- 1 file changed, 4 insertions(+),

[Devel] [PATCH v7 08/11] sched: Push put_prev_task() into pick_next_task()

2013-05-29 Thread Glauber Costa
From: Peter Zijlstra In order to avoid having to do put/set on a whole cgroup hierarchy when we context switch, push the put into pick_next_task() so that both operations are in the same function. Further changes then allow us to possibly optimize away redundant work. [ glom...@openvz.org: incor

[Devel] [PATCH v7 09/11] sched: record per-cgroup number of context switches

2013-05-29 Thread Glauber Costa
Context switches are, to this moment, a property of the runqueue. When running containers, we would like to be able to present a separate figure for each container (or cgroup, in this context). The chosen way to accomplish this is to increment a per cfs_rq or rt_rq, depending on the task, for each

[Devel] [PATCH v7 10/11] sched: change nr_context_switches calculation.

2013-05-29 Thread Glauber Costa
This patch changes the calculation of nr_context_switches. The variable "nr_switches" is now used to account for the number of transitions to the idle task, or stop task. It is removed from the schedule() path. The total calculation can be made using the fact that the transitions to fair and rt cla

[Devel] [PATCH v7 11/11] sched: introduce cgroup file stat_percpu

2013-05-29 Thread Glauber Costa
The file cpu.stat_percpu will show various scheduler-related information that is usually available to the top level through other files. For instance, most of the meaningful data in /proc/stat is presented here. Given this file, a container can easily construct a local copy of /proc/stat for int

[Devel] [PATCH v7 03/11] cgroup, sched: let cpu serve the same files as cpuacct

2013-05-29 Thread Glauber Costa
From: Tejun Heo cpuacct being on a separate hierarchy is one of the main cgroup related complaints from scheduler side and the consensus seems to be * Allowing cpuacct to be a separate controller was a mistake. In general multiple controllers on the same type of resource should be avoided,

[Devel] [PATCH v7 05/11] cpuacct: don't actually do anything.

2013-05-29 Thread Glauber Costa
All the information we have that is needed for cpuusage (and cpuusage_percpu) is present in schedstats. It is already recorded in a sane hierarchical way. If we have CONFIG_SCHEDSTATS, we don't really need to do any extra work. All former functions become empty inlines. Signed-off-by: Glauber Cos

[Devel] [PATCH v7 02/11] cgroup: implement CFTYPE_NO_PREFIX

2013-05-29 Thread Glauber Costa
From: Tejun Heo When cgroup files are created, cgroup core automatically prepends the name of the subsystem as prefix. This patch adds CFTYPE_NO_PREFIX which disables the automatic prefix. This will be used to deprecate cpuacct which will make cpu create and serve the cpuacct files. Signed-off

Re: [Devel] [PATCH 3/3] cpt: restore veth devices with correct names

2013-05-29 Thread Kir Kolyshkin
On 05/27/2013 02:12 PM, Andrey Vagin wrote: transmit pair of veth names to criu via the option --veth-pair v2: unset IFS and delete eval from vps-rst Signed-off-by: Andrey Vagin --- scripts/vps-rst.in | 15 +-- src/lib/hooks_ct.c | 18 ++ 2 files changed, 27 ins

[Devel] [PATCH 1/2] vzctl: add ability to skip creation of veth devices (v2)

2013-05-29 Thread Andrey Vagin
It will be used for resuming CT with the help of CRIU. CRIU restores veth devices and configures them inside CT, so vzctl should configure them on the host side. Signed-off-by: Andrey Vagin --- include/types.h | 2 ++ scripts/vps-netns_dev_add.in | 5 - src/lib/hooks_ct.c

[Devel] [PATCH 2/2] cpt: restore veth devices with correct names (v3)

2013-05-29 Thread Andrey Vagin
transmit pair of veth names to criu via the option --veth-pair v2: unset IFS and delete eval from vps-rst v3: fix comments from Kir Signed-off-by: Andrey Vagin --- scripts/vps-rst.in | 12 ++-- src/lib/hooks_ct.c | 16 +--- 2 files changed, 23 insertions(+), 5 deletions(-)