[Devel] [PATCH rh7 3/3] Port cpustat related patches

2015-05-29 Thread Vladimir Davydov
This patch ports: diff-sched-rework-_proc_stat-output diff-sched-fix-output-of-vestat-idle diff-sched-make-allowance-for-vcpu-rate-in-_proc_stat diff-sched-hide-steal-time-from-inside-CT diff-sched-cpu.proc.stat-always-count-nr_running-and-co-on-all-cpus Author: Vladimir Davydov Email: vdavy...@p

[Devel] [PATCH rh7 2/3] sched: use cpuacct->cpustat for showing cpu stats

2015-05-29 Thread Vladimir Davydov
In contrast to RH6, where tg->cpustat was used, now cpu stats are accounted in the cpuacct cgroup. So zap tg->cpustat and use cpuacct->cpustat for showing cpu.proc.stat instead. Fortunately cpu and cpuacct cgroups are always mounted together (even by systemd by default), so this will work. Signed-

[Devel] [PATCH rh7 1/3] Revert "SCHED: rework cputime accounting (v2)"

2015-05-29 Thread Vladimir Davydov
This reverts commit 6071473d0440fcfd128f3243dfb82d19f6aef668. The above-mentioned commit dramatically complicates porting of cpu acct patches from RH6, so revert it. The next patch will fix cpu accounting once again. Signed-off-by: Vladimir Davydov Conflicts: kernel/sched/core.c --- dr

Re: [Devel] [PATCH] ve: Kill tcp_v4_kill_ve_sockets()

2015-05-29 Thread Andrew Vagin
Acked-by: Andrew Vagin I agree that we need to remove this function, but I don't know how it fixes the bug. On Fri, May 29, 2015 at 04:53:39PM +0300, Kirill Tkhai wrote: > This is a leftover from earlier versions of PCS, > and we do not need that functionality in 3.10. > > It was the reason o

[Devel] [PATCH 7/7] packet: Pre-account maximum socket buffer into cg memory

2015-05-29 Thread Pavel Emelyanov
Packet sockets have an incoming queue of packets that is limited only by the per-socket wmem buffer. Strictly speaking, we should sum up all the queues and charge them into kmem once a new packet arrives, but this would result in a huge patch. Since there's typically quite a few of packet sockets in container
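A minimal sketch of the pre-accounting idea named in the subject line, assuming a toy memory counter: the maximum buffer size is charged once at socket creation and returned at destruction, instead of charging each queued packet. All names here (cg_mem, toy_packet_sock and the helpers) are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a container's memory counter (not the kernel's memcg). */
struct cg_mem {
    size_t usage;  /* bytes currently charged */
    size_t limit;  /* maximum bytes the container may be charged */
};

/* Charge bytes to the counter; fails without side effects when over limit. */
static bool cg_charge(struct cg_mem *cg, size_t bytes)
{
    assert(cg->usage <= cg->limit);
    if (bytes > cg->limit - cg->usage)
        return false;
    cg->usage += bytes;
    return true;
}

/* Return previously charged bytes to the counter. */
static void cg_uncharge(struct cg_mem *cg, size_t bytes)
{
    assert(bytes <= cg->usage);
    cg->usage -= bytes;
}

/* Hypothetical packet socket that remembers what was pre-charged for it. */
struct toy_packet_sock {
    struct cg_mem *cg;
    size_t precharged;
};

/* Pre-account the whole maximum buffer at creation time. */
static bool toy_packet_sock_create(struct toy_packet_sock *sk,
                                   struct cg_mem *cg, size_t max_buf)
{
    if (!cg_charge(cg, max_buf))
        return false;
    sk->cg = cg;
    sk->precharged = max_buf;
    return true;
}

/* Release the pre-charged amount when the socket goes away. */
static void toy_packet_sock_destroy(struct toy_packet_sock *sk)
{
    cg_uncharge(sk->cg, sk->precharged);
    sk->precharged = 0;
}
```

The trade-off in this sketch is that a socket is charged for its full buffer even while its queue is empty, in exchange for having no per-packet accounting on the receive path.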

[Devel] [PATCH 6/7] netlink: Make all in-cg memory be kmem accounted

2015-05-29 Thread Pavel Emelyanov
So, this one is tricky. Right now most (all but one place) of the memory allocations in netlink code happen in process context and are done via kmalloc/slub. Thus they are auto-accounted into kmem. The single exceptional place is in netlink_alloc_large_skb where big sending packets are allocated w

[Devel] [PATCH 5/7] unix: Charge outgoing buffers into cg memory

2015-05-29 Thread Pavel Emelyanov
For unix sockets there's no such thing as "read buffers" as all the data is accounted on the send paths. Fortunately, most of the stuff is already kmem-auto-charged except one thing -- paged dgram skbs. Signed-off-by: Pavel Emelyanov --- net/core/sock.c | 2 +- 1 file changed, 1 insertion(+

[Devel] [PATCH 4/7] udp: Charge ingress buffers into cg memory

2015-05-29 Thread Pavel Emelyanov
Right now UDP outgoing traffic is kmem-auto-charged into cg kmem. Incoming traffic is not, but it has a TCP-like memory scheduler (but simpler, with just one limit). So here's the per-cgroup UDP read buffers limiting, done in the same way as for TCP. Signed-off-by: Pavel Emelyanov --- include/net/

[Devel] [PATCH 3/7] tcp: Limit orphan sockets per-cg

2015-05-29 Thread Pavel Emelyanov
The kernel limits the total number of orphan TCP sockets in the system. One container can eat up this whole limit and make other containers suffer from TCP state machine breakage. Thus here's a per-CT limit on the number of orphans that doesn't affect the others. The limit is set to be 1/4 of the
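A rough sketch of what a per-container orphan cap could look like, assuming each container carries its own counter so that one container hitting its cap leaves the others untouched. The structure and function names are hypothetical, and the cap is passed in as a plain number because the snippet above is cut off before saying what it is derived from.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-container orphan accounting (illustrative names only). */
struct ct_orphans {
    long count;  /* orphan sockets currently charged to this container */
    long limit;  /* per-container cap; its derivation is not shown here */
};

/* Try to account one more orphan; returns false when the cap is reached. */
static bool ct_orphan_charge(struct ct_orphans *ct)
{
    assert(ct->limit >= 0);
    if (ct->count >= ct->limit)
        return false;
    ct->count++;
    return true;
}

/* Drop one orphan from the container's count. */
static void ct_orphan_uncharge(struct ct_orphans *ct)
{
    if (ct->count > 0)
        ct->count--;
}
```

Because each container owns its counter, exhausting one container's cap has no effect on what another container is allowed to charge.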

[Devel] [PATCH 2/7] tcp: Charge socket buffers into cg memory

2015-05-29 Thread Pavel Emelyanov
TCP code already has internal memory management for both -- in and out traffic. The outgoing packets are also already auto accounted into kmem (and into cg memory), incoming traffic is not accounted into kmem. And this management is already per-cg thanks to Glauber's work some time ago. So TCP mm fi

[Devel] [PATCH 1/7] bc: Rip old network buffers and sockets accounting

2015-05-29 Thread Pavel Emelyanov
The previous BC approach was based on two ideas. The first is that we know the maximum number of sockets and their buffers a container needs. The second was about "poll semantics" i.e. when a poll() reports "writable", the write/send system call cannot block on buffer limit hit. To address the latt

[Devel] [PATCH rh7 0/7] Rework containers' network memory management

2015-05-29 Thread Pavel Emelyanov
Hi, This is the set that replaces the old limit-based network BC code with a simpler approach -- all the socket buffers are just charged into the container's memory. Such things as TCP memory/window management are already per-CG upstream, so we'll just re-use this code. -- Pavel

[Devel] [PATCH] ve: Kill tcp_v4_kill_ve_sockets()

2015-05-29 Thread Kirill Tkhai
This is a leftover from earlier versions of PCS, and we do not need that functionality in 3.10. It was the reason for the kernel panic in 2.6.32: https://jira.sw.ru/browse/PSBM-33755, in the test of forced CT destruction. Also, tcp_v4_kill_ve_sockets() looks like the only user of synchronize_net(), s

[Devel] [PATCH rh7] Port diff-sched-return-only-virtual-cpus-in-sched_getaffinity

2015-05-29 Thread Vladimir Davydov
And diff-sched-fix-sched_getaffinity-on-unexisting-tasks-inside-VE, which is a quickfix for the original patch. Author: Kirill Tkhai Email: ktk...@parallels.com Subject: sched: Return only virtual cpus in sched_getaffinity() Date: Mon, 28 Apr 2014 12:45:38 +0400 Fill mask using virtual cpus which

[Devel] [PATCH rh7] Port diff-sched-clear-prev-entity-if-curr-is-dequeued

2015-05-29 Thread Vladimir Davydov
Author: Vladimir Davydov Email: vdavy...@parallels.com Subject: sched: clear prev entity if curr is dequeued Date: Fri, 20 Sep 2013 16:55:23 +0400 cfs_rq->prev is used for ctxsw accounting: on put_prev_entity() cfs_rq->prev is set to curr if curr is on rq, and on set_next_entity() nr_switches is i

[Devel] [PATCH rh7] Port diff-sched-increase-SCHED_LOAD_SCALE-resolution

2015-05-29 Thread Vladimir Davydov
This patch is already upstream, but the feature is disabled by default. This patch only enables it. Author: Vladimir Davydov Email: vdavy...@parallels.com Subject: sched: Increase SCHED_LOAD_SCALE resolution Date: Tue, 25 Dec 2012 13:33:21 +0400 Mainstream commit c8b281161dfa4bb5d5be63fb036ce1934

[Devel] [PATCH rh7] Port diff-sched-initialize-runtime-to-non-zero-on-cfs-bw-set

2015-05-29 Thread Vladimir Davydov
Author: Vladimir Davydov Email: vdavy...@parallels.com Subject: sched: initialize runtime to non-zero on cfs bw set Date: Mon, 21 Jan 2013 11:44:58 +0400 * [sched] running tasks could be throttled and never unthrottled thus causing random node hangs. (PSBM-17658) If cfs_rq->runtime_remain

[Devel] [PATCH rh7] Port diff-fairsched-do-not-account-iothrottled-tasks-in-loadavg-core

2015-05-29 Thread Vladimir Davydov
Author: Vladimir Davydov Email: vdavy...@parallels.com Subject: sched: do not account iothrottled tasks in loadavg Date: Mon, 22 Oct 2012 14:27:25 +0400 Changes from v1: * do not account tasks throttled while doing reclaim too Signed-off-by: Vladimir Davydov Acked-by: Konstantin Khlebnikov ===

[Devel] [PATCH RHEL7 COMMIT] ub/netfilter: account x_tables to ub

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit b103b8fdcd475a79e13c3e8f5325c92da90fd401 Author: Vladimir Davydov Date: Fri May 29 17:01:34 2015 +0400 ub/netfilter: accoun

[Devel] [PATCH RHEL7 COMMIT] cgroup: Mangle cgroups root from inside of VE view

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit e5f41176c3b8653ff237f1fd695d608a176aa46b Author: Cyrill Gorcunov Date: Fri May 29 16:50:55 2015 +0400 cgroup: Mangle cgroup

[Devel] [PATCH RHEL7 COMMIT] cgroup: mount -- Disable mounting from inside of VE context

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit bd28914a36ef98c893dbeb269a0bd4859151936e Author: Cyrill Gorcunov Date: Fri May 29 16:50:49 2015 +0400 cgroup: mount -- Disa

Re: [Devel] [PATCH rh7] netfilter: account x_tables to ub

2015-05-29 Thread Andrew Vagin
Acked-by: Andrew Vagin On Thu, May 28, 2015 at 05:43:18PM +0300, Vladimir Davydov wrote: > This patch ports the code accounting netfilter/x_tables to ub > (UB_NUMXTENT) from RH6. > > Related to https://jira.sw.ru/browse/PSBM-20089 > > Signed-off-by: Vladimir Davydov > --- > include/linux/netf

Re: [Devel] [PATCH rh7] netfilter: account x_tables to ub

2015-05-29 Thread Vladimir Davydov
On Fri, May 29, 2015 at 03:07:07PM +0300, Andrew Vagin wrote: > > +static int recharge_xtables(struct xt_table_info *new, struct > > xt_table_info *old) > > +{ > > + struct user_beancounter *ub, *old_ub; > > + long change; > > + > > + ub = new->ub; > > + old_ub = old->number ? old->ub : ub

Re: [Devel] [PATCH rh7] netfilter: account x_tables to ub

2015-05-29 Thread Andrew Vagin
On Thu, May 28, 2015 at 05:43:18PM +0300, Vladimir Davydov wrote: > This patch ports the code accounting netfilter/x_tables to ub > (UB_NUMXTENT) from RH6. > > Related to https://jira.sw.ru/browse/PSBM-20089 > > Signed-off-by: Vladimir Davydov > --- > include/linux/netfilter/x_tables.h | 4 +++

[Devel] [PATCH RHEL7 COMMIT] fairsched: Port diff-fairsched-optimize-sys_fairsched_mvpr

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit 901bf5dd708537e483a90ebcf62af57239e63702 Author: Vladimir Davydov Date: Fri May 29 15:49:38 2015 +0400 fairsched: Port diff

[Devel] [PATCH RHEL7 COMMIT] ve: remove fairsched node only in the legacy mode

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit 72bcfd69eb80e5ca7624a6c0665494990fd06faf Author: Vladimir Davydov Date: Fri May 29 15:45:19 2015 +0400 ve: remove fairsched

[Devel] [PATCH rh7] Port diff-fairsched-optimize-sys_fairsched_mvpr

2015-05-29 Thread Vladimir Davydov
Author: Konstantin Khlebnikov Email: khlebni...@openvz.org Subject: fairsched: optimize sys_fairsched_mvpr() Date: Wed, 08 May 2013 13:59:04 +0400 Remove task_list_lock and optimize for current. https://jira.sw.ru/browse/PCLIN-26766 Signed-off-by: Konstantin Khlebnikov =

[Devel] [PATCH rh7] ve: remove fairsched node only in the legacy mode

2015-05-29 Thread Vladimir Davydov
Currently, we try and fail to remove the fairsched node of a UUID-named container from the kernel on stop: CT: 956ebfc3-3ca9-44e0-9739-ab8abbe50edc: started Can't remove fairsched node 1073741823 err=-2 CT: 956ebfc3-3ca9-44e0-9739-ab8abbe50edc: stopped We should only do that for containers

[Devel] [PATCH RHEL7 COMMIT] ms/memcg: do not call high reclaim if !__GFP_WAIT

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit c238509b21ac68629812f1a27d7c4e95a75a815a Author: Vladimir Davydov Date: Fri May 29 14:53:11 2015 +0400 ms/memcg: do not cal

[Devel] [PATCH RHEL7 COMMIT] fairsched: fix output of /proc/fairsched[2]

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit 7b768085e56d53826162aef58ace5f58adc1b4c9 Author: Vladimir Davydov Date: Fri May 29 14:53:03 2015 +0400 fairsched: fix outpu

Re: [Devel] [PATCH rh7] net: Use {get, put}_net() in inet_twsk_{alloc, free}()

2015-05-29 Thread Kirill Tkhai
On Thu, 28/05/2015 at 18:56 +0300, Andrew Vagin wrote: > On Wed, May 27, 2015 at 02:32:22PM +0300, Kirill Tkhai wrote: > > hold_net() doesn't increment net refcounter if NETNS_REFCNT_DEBUG > > is not defined. In this case inet_twdr_do_twkill_work() may happen > > after network is destroyed and lead to
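The difference described in the quoted text can be illustrated with a toy model, assuming a "hold" that only bumps a debug counter (and compiles to nothing without the debug macro) next to a "get" that bumps the lifetime refcount. This is not kernel code: toy_net, its fields, and TOY_REFCNT_DEBUG are invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy object with a lifetime refcount and a separate debug-only counter. */
struct toy_net {
    int count;        /* lifetime refcount: object is torn down at zero */
    int debug_holds;  /* only touched when TOY_REFCNT_DEBUG is defined */
    bool alive;
};

/* Debug-style hold: never touches the lifetime refcount. */
static void toy_hold_net(struct toy_net *net)
{
#ifdef TOY_REFCNT_DEBUG
    net->debug_holds++;
#else
    (void)net;  /* no-op in non-debug builds */
#endif
}

/* Real reference: pins the object until the matching put. */
static void toy_get_net(struct toy_net *net)
{
    net->count++;
}

/* Drop a reference; the object is torn down when the last one goes. */
static void toy_put_net(struct toy_net *net)
{
    assert(net->count > 0);
    if (--net->count == 0)
        net->alive = false;
}
```

In this model, deferred work that only took a hold can find the object already torn down once the owner drops its reference, while work that took a get keeps it alive until its own put.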

Re: [Devel] [patch rh7 2/2] cgroup: Mangle cgroups root from inside of VE view

2015-05-29 Thread Vladimir Davydov
On Tue, May 26, 2015 at 06:00:52PM +0300, Cyrill Gorcunov wrote: > We're bindmounting cgroups for container so if say a container > is having CTID=200 then @cgroups and @mountinfo output will > contain /200 as a root. Which makes Docker look for the > appropriate directory inside /sys/fs/cgroup/ >

Re: [Devel] [patch rh7 2/2] cgroup: Mangle cgroups root from inside of VE view

2015-05-29 Thread Vladimir Davydov
On Fri, May 29, 2015 at 11:30:00AM +0300, Cyrill Gorcunov wrote: > On Fri, May 29, 2015 at 11:18:52AM +0300, Vladimir Davydov wrote: > > Hi Cyrill, > > > > On Tue, May 26, 2015 at 06:00:52PM +0300, Cyrill Gorcunov wrote: > > > We're bindmounting cgroups for container so if say a container > > > is

[Devel] [PATCH RHEL7 COMMIT] ub: ub_dirty_limits: obtain ram size from memcg

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit 12d5e82b5955aab42daf5b688d2cba662de8111b Author: Vladimir Davydov Date: Fri May 29 12:36:01 2015 +0400 ub: ub_dirty_limits:

[Devel] [PATCH RHEL7 COMMIT] shmem: fix tmpfs_ram_pages

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit 2d3d5fbf8c45b0e297ff429f0fbaa15f8e35d647 Author: Vladimir Davydov Date: Fri May 29 12:35:55 2015 +0400 shmem: fix tmpfs_ram

[Devel] [PATCH RHEL7 COMMIT] memcg: add function to get container's ram size

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit a67159e9a53cc3328df60c472c3cbb850c6a5fbb Author: Vladimir Davydov Date: Fri May 29 12:35:45 2015 +0400 memcg: add function

Re: [Devel] [patch rh7 2/2] cgroup: Mangle cgroups root from inside of VE view

2015-05-29 Thread Vladimir Davydov
Hi Cyrill, On Tue, May 26, 2015 at 06:00:52PM +0300, Cyrill Gorcunov wrote: > We're bindmounting cgroups for container so if say a container > is having CTID=200 then @cgroups and @mountinfo output will > contain /200 as a root. Which makes Docker look for the > appropriate directory inside /sys/

Re: [Devel] [patch rh7 1/2] cgroup: mount -- Disable mounting from inside of VE context

2015-05-29 Thread Vladimir Davydov
On Tue, May 26, 2015 at 06:00:51PM +0300, Cyrill Gorcunov wrote: > Even mounting known cgroups (i.e. ones which are already known to VE and > have been mounted by vzctl or any other tool for the container's sake) is not > as harmless as it might look. In particular this introduces > additional performance

[Devel] [PATCH RHEL7 COMMIT] ve/kmod: Add rules for new {ip, ip6, x}table modules

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit d2e9d1ba7e3acc37c18ae91a11df1fb5bba2972c Author: Kirill Tkhai Date: Fri May 29 12:02:00 2015 +0400 ve/kmod: Add rules for n

[Devel] [PATCH RHEL7 COMMIT] ve/kmod: Add rules for autoloading (new) nf_tables

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit 5f6dbce004ffb21b500d930b46d2b85287619f6d Author: Kirill Tkhai Date: Fri May 29 12:01:52 2015 +0400 ve/kmod: Add rules for a

[Devel] [PATCH RHEL7 COMMIT] ms/memcg: port memory.high

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit 4038cd0e029ddee1c3308216bdc5da6c4485656b Author: Vladimir Davydov Date: Fri May 29 11:55:46 2015 +0400 ms/memcg: port memor

[Devel] [PATCH RHEL7 COMMIT] ms/memcg: use CFTYPE_NOT_ON_ROOT for memory.low and memory.oom_guarantee

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit 24dbf07906c8e288ed18f895f49c36dc850213bb Author: Vladimir Davydov Date: Fri May 29 11:55:40 2015 +0400 ms/memcg: use CFTYPE

[Devel] [PATCH RHEL7 COMMIT] ms/memcg: rename memory.low_limit_in_bytes to memory.low

2015-05-29 Thread Konstantin Khorenko
The commit is pushed to "branch-rh7-3.10.0-123.1.2-ovz" and will appear at https://src.openvz.org/scm/ovz/vzkernel.git after rh7-3.10.0-123.1.2.vz7.5.7 --> commit 9eafb07898336d06d2003d45e8c682837be2ae2e Author: Vladimir Davydov Date: Fri May 29 11:55:35 2015 +0400 ms/memcg: rename mem