Re: [PATCH] cpuidle: menu: use nr_running instead of cpuload for calculating perf mult

2012-11-27 Thread Vladimir Davydov
small loads, which would probably lead to the cpuidle governor making wrong decisions due to overestimating the system load. So, this seems to be another reason to use some different performance multiplier in cpuidle governor. On Jun 4, 2012, at 2:24 PM, Vladimir Davydov wrote: > rq->cpulo

[PATCH] mqueue: sys_mq_open: do not call mnt_drop_write() if read-only

2013-03-19 Thread Vladimir Davydov
mnt_drop_write() must be called only if mnt_want_write() succeeded, otherwise the mnt_writers counter will diverge. Signed-off-by: Vladimir Davydov Cc: Doug Ledford Cc: Andrew Morton Cc: KOSAKI Motohiro Cc: "Eric W. Biederman" --- ipc/mqueue.c |3 ++- 1 files changed, 2 insert
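
The pairing rule stated in this patch description can be sketched in plain userspace C. This is only a model under stated assumptions: `struct mock_mount`, `mock_want_write()`, `mock_drop_write()` and `mock_open()` are hypothetical stand-ins, not the kernel's `struct vfsmount`, `mnt_want_write()` or `mnt_drop_write()`, and the real change to ipc/mqueue.c is not reproduced here.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a mount; NOT the kernel's struct vfsmount. */
struct mock_mount {
    bool read_only;
    int writers;            /* models the mnt_writers counter */
};

/* Models mnt_want_write(): takes a writer reference unless read-only. */
static int mock_want_write(struct mock_mount *m)
{
    assert(m != NULL);
    if (m->read_only)
        return -EROFS;
    m->writers++;
    return 0;
}

/* Models mnt_drop_write(): releases a reference taken earlier. */
static void mock_drop_write(struct mock_mount *m)
{
    assert(m != NULL);
    m->writers--;
}

/* Open-like operation that may proceed even on a read-only mount. */
static int mock_open(struct mock_mount *m)
{
    int ro = mock_want_write(m);    /* remember whether the want succeeded */

    /* ... lookup or create would happen here ... */

    if (!ro)                        /* drop only what was actually taken */
        mock_drop_write(m);
    return 0;
}
```

Remembering the return value of the "want" call and testing it before the "drop" keeps the counter balanced on both read-write and read-only mounts.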

Re: [PATCH] mqueue: sys_mq_open: do not call mnt_drop_write() if read-only

2013-03-19 Thread Vladimir Davydov
On Mar 20, 2013, at 1:09 AM, Andrew Morton wrote: > On Tue, 19 Mar 2013 13:31:18 +0400 Vladimir Davydov > wrote: > >> mnt_drop_write() must be called only if mnt_want_write() succeeded, >> otherwise the mnt_writers counter will diverge. >> >> ... >

[PATCH] sched: initialize runtime to non-zero on cfs bw set

2013-02-07 Thread Vladimir Davydov
>tg->cfs_bandwidth.timer_active=0 which conforms pretty nice to the explanation given above. Signed-off-by: Vladimir Davydov --- kernel/sched/core.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) diff --git a/kernel/sched/core.c b/kernel/sched/core.c index 26058d0..c7a078f 1006

Re: [PATCH] sched: initialize runtime to non-zero on cfs bw set

2013-02-08 Thread Vladimir Davydov
On Feb 8, 2013, at 6:46 PM, Paul Turner wrote: > On Fri, Feb 08, 2013 at 11:10:46AM +0400, Vladimir Davydov wrote: >> If cfs_rq->runtime_remaining is <= 0 then either >> - cfs_rq is throttled and waiting for quota redistribution, or >> - cfs_rq is currently executi

Re: [PATCH] sched: initialize runtime to non-zero on cfs bw set

2013-02-08 Thread Vladimir Davydov
On Feb 8, 2013, at 7:26 PM, Vladimir Davydov wrote: > On Feb 8, 2013, at 6:46 PM, Paul Turner wrote: > >> On Fri, Feb 08, 2013 at 11:10:46AM +0400, Vladimir Davydov wrote: >>> If cfs_rq->runtime_remaining is <= 0 then either >>> - cfs_rq is throttled and

[PATCH] netfilter: nf_conntrack: Batch cleanup

2013-03-14 Thread Vladimir Davydov
track # time modprobe -r nf_conntrack real 0m10.337s user 0m0.000s sys 0m0.376s # modprobe nf_conntrack # time modprobe -r nf_conntrack real 0m5.661s user 0m0.000s sys 0m0.216s Signed-off-by: Vladimir Davydov Cc: Patrick McHardy Cc: "David S. Miller"

[PATCH] net: batch nf_conntrack_net_exit

2012-07-30 Thread Vladimir Davydov
The patch introduces nf_conntrack_cleanup_list(), which cleans up nf_conntracks for a list of netns and calls synchronize_net() only once for them all. --- include/net/netfilter/nf_conntrack_core.h | 10 +- net/netfilter/nf_conntrack_core.c | 21 + net/netfil

[PATCH RFC] sched: move h_load calculation to task_h_load

2013-07-13 Thread Vladimir Davydov
merqueue_add 5.93% libc-2.12.so [.] usleep Signed-off-by: Vladimir Davydov --- kernel/sched/fair.c | 56 ++ kernel/sched/sched.h |7 +++ 2 files changed, 28 insertions(+), 35 deletions(-) diff --git a/kernel/sched/fair.c b/kern

[PATCH] sched: Fix task_h_load calculation

2013-09-14 Thread Vladimir Davydov
s of all runnable tasks there instead. Fix it. Signed-off-by: Vladimir Davydov --- kernel/sched/fair.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index 9b3fe1c..13abc29 100644 --- a/kernel/sched/fair.c +++ b/kernel/sche

[PATCH 2/2] sched: fix_small_imbalance: Fix local->avg_load > busiest->avg_load case

2013-09-15 Thread Vladimir Davydov
alent addition in the check. -- The bug can be caught by running 2*N cpuhogs pinned to two logical cpus belonging to different cores on an HT-enabled machine with N logical cpus: just look at se.nr_migrations growth. Signed-off-by: Vladimir Davydov --- kernel/sched/fair.c |4 ++-- 1 file

[PATCH 1/2] sched: calculate_imbalance: Fix local->avg_load > sds->avg_load case

2013-09-15 Thread Vladimir Davydov
d. -- The bug can be caught by running 2*N cpuhogs pinned to two logical cpus belonging to different cores on an HT-enabled machine with N logical cpus: just look at se.nr_migrations growth. Signed-off-by: Vladimir Davydov --- kernel/sched/fair.c |3 ++- 1 file changed, 2 insertions(+), 1 delet

[PATCH 1/2] sched: load_balance: Prevent reselect prev dst_cpu if some pinned

2013-09-15 Thread Vladimir Davydov
Currently new_dst_cpu is prevented from being reselected actually, not dst_cpu. This can result in attempting to pull tasks to this_cpu twice. Signed-off-by: Vladimir Davydov --- kernel/sched/fair.c |6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/sched/fair.c

[PATCH 2/2] sched: load_balance: Reset env when going to redo due to all pinned

2013-09-15 Thread Vladimir Davydov
s to allow handling 'some pinned' case when pulling tasks from a new busiest cpu. Signed-off-by: Vladimir Davydov --- kernel/sched/fair.c | 12 ++-- 1 file changed, 10 insertions(+), 2 deletions(-) diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c index cd59640..d840e51

Re: [PATCH 1/2] sched: calculate_imbalance: Fix local->avg_load > sds->avg_load case

2013-09-16 Thread Vladimir Davydov
On 09/16/2013 09:52 AM, Peter Zijlstra wrote: On Sun, Sep 15, 2013 at 05:49:13PM +0400, Vladimir Davydov wrote: In busiest->group_imb case we can come to calculate_imbalance() with local->avg_load >= busiest->avg_load >= sds->avg_load. This can result in imbalance overf

Re: [PATCH 2/2] sched: load_balance: Reset env when going to redo due to all pinned

2013-09-16 Thread Vladimir Davydov
On 09/16/2013 09:43 AM, Peter Zijlstra wrote: On Sun, Sep 15, 2013 at 09:30:14PM +0400, Vladimir Davydov wrote: Firstly, reset env.dst_cpu/dst_rq to this_cpu/this_rq, because it could have changed in 'some pinned' case. Otherwise, should_we_balance() can stop balancing beforehand.

Re: [PATCH 00/19] pramfs

2013-09-08 Thread Vladimir Davydov
the code is more or less the same but, with a really big thanks to Vladimir Davydov and Parallels, the development of fsck has been started and we have now the possibility to correct fs errors due to corruption. It's a "young" tool but we are working on it. You can clone the code from

Re: [tip:sched/core] sched/balancing: Fix cfs_rq-> task_h_load calculation

2013-09-30 Thread Vladimir Davydov
On 09/29/2013 01:47 PM, Yuanhan Liu wrote: On Fri, Sep 20, 2013 at 06:46:59AM -0700, tip-bot for Vladimir Davydov wrote: Commit-ID: 7e3115ef5149fc502e3a2e80719dba54a8e7409d Gitweb: http://git.kernel.org/tip/7e3115ef5149fc502e3a2e80719dba54a8e7409d Author: Vladimir Davydov AuthorDate: Sat

[PATCH RFC] sched: boost throttled entities on wakeups

2012-10-18 Thread Vladimir Davydov
If several tasks in different cpu cgroups are contending for the same resource (e.g. a semaphore) and one of those task groups is cpu limited (using cfs bandwidth control), the priority inversion problem is likely to arise: if a cpu limited task goes to sleep holding the resource (e.g. trying to ta

Re: [Devel] [PATCH RFC] sched: boost throttled entities on wakeups

2012-10-18 Thread Vladimir Davydov
There is an error in the test script: I forgot to initialize cpuset.mems of test cgroups - without it it is impossible to add a task into a cpuset cgroup. Sorry for that. Fixed version of the test script is attached. On Oct 18, 2012, at 11:32 AM, Vladimir Davydov wrote: > If several tasks

Re: [PATCH RFC] sched: boost throttled entities on wakeups

2012-10-19 Thread Vladimir Davydov
Thank you for the answer. On Oct 19, 2012, at 6:24 PM, Peter Zijlstra wrote: > its a quick hack similar to existing hacks done for rt, preferably we'd > do smarter things though. If you have any ideas how to fix this in a better way, please share.

[PATCH RFC 00/13] PRAM: Persistent over-kexec memory storage

2013-07-01 Thread Vladimir Davydov
splicing data from/to shmem. This would allow avoiding memory copying on checkpoint/restore. * Save uptodate fs cache on umount to be restored on mount after kexec. Thanks, Vladimir Davydov (13): mm: add PRAM API stubs and Kconfig mm: PRAM: implement node load and save functions mm:

[PATCH RFC 13/13] mm: shmem: enable saving to PRAM

2013-07-01 Thread Vladimir Davydov
This patch illustrates how PRAM API can be used for making tmpfs 'persistent'. It adds 'pram=' option to tmpfs, which specifies the PRAM node to load/save FS tree from/to. If the option is passed on mount, shmem will look for the corresponding PRAM node and load the FS tree from it. On the subsequ

[PATCH RFC 12/13] mm: shmem: introduce shmem_insert_page

2013-07-01 Thread Vladimir Davydov
The function inserts a memory page to a shmem file under an arbitrary offset. If there is something at the specified offset (page or swap), the function fails. The function will be used by the next patch. --- include/linux/shmem_fs.h |3 ++ mm/shmem.c | 68

[PATCH RFC 11/13] mm: PRAM: allow to free persistent memory from userspace

2013-07-01 Thread Vladimir Davydov
To free all space utilized for persistent memory, one can write 0 to /sys/kernel/pram. This will destroy all PRAM nodes that are not currently being read or written. --- mm/pram.c | 39 ++- 1 file changed, 38 insertions(+), 1 deletion(-) diff --git a/mm/pram.

[PATCH RFC 07/13] mm: PRAM: preserve persistent memory at boot

2013-07-01 Thread Vladimir Davydov
Persistent memory preservation is done by reserving memory pages belonging to PRAM at early boot so that they will not be recycled. If memory reservation fails for some reason (e.g. memory region is busy), persistent memory will be lost. Currently, PRAM preservation is only implemented for x86. --

[PATCH RFC 10/13] mm: PRAM: allow to ban arbitrary memory ranges

2013-07-01 Thread Vladimir Davydov
Banning for PRAM memory ranges that have been reserved at boot time is not enough for avoiding all conflicts. The point is that kexec may load the new kernel code to some address range that has never been reserved, possibly overwriting persistent data. Fortunately, it is possible to specify a memo

[PATCH RFC 09/13] mm: PRAM: ban pages that have been reserved at boot time

2013-07-01 Thread Vladimir Davydov
Obviously, not all memory ranges can be used for saving persistent over-kexec data, because some of them are reserved by the system core and various device drivers at boot time. If a memory range used for initialization of a particular device turns out to be busy because PRAM uses it for storing it

[PATCH RFC 06/13] mm: PRAM: introduce super block

2013-07-01 Thread Vladimir Davydov
The PRAM super block is the starting point for restoring persistent memory. If the kernel locates the super block at boot time, it will preserve the persistent memory structure from the previous kernel. To point the kernel to the location of the super block, one should pass its pfn via the 'pram' b

[PATCH RFC 05/13] mm: PRAM: link nodes by pfn before reboot

2013-07-01 Thread Vladimir Davydov
Since page structs, which are used for linking PRAM nodes, are cleared on boot, organize all PRAM nodes into a list singly-linked by pfn's before reboot to facilitate the node list restore in the new kernel. --- mm/pram.c | 50 ++ 1 file changed, 5

[PATCH RFC 02/13] mm: PRAM: implement node load and save functions

2013-07-01 Thread Vladimir Davydov
Persistent memory is divided into nodes, which can be saved and loaded independently of each other. PRAM nodes are kept on the list and identified by unique names. Whenever a save operation is initiated by calling pram_prepare_save(), a new node is created and linked to the list. When the save oper

[PATCH RFC 03/13] mm: PRAM: implement page stream operations

2013-07-01 Thread Vladimir Davydov
Using the pram_save_page() function, one can populate PRAM nodes with memory pages, which can be then loaded using the pram_load_page() function. Saving a memory page to PRAM is implemented as storing the pfn in the PRAM node and incrementing its ref count so that it will not get freed after the la

[PATCH RFC 01/13] mm: add PRAM API stubs and Kconfig

2013-07-01 Thread Vladimir Davydov
Persistent memory subsys or PRAM is intended to be used for saving memory pages of the currently executing kernel and restoring them after a kexec in the newly booted one. This can be utilized for speeding up reboot by leaving process memory and/or FS caches in-place. The proposed API: * Persist

[PATCH RFC 08/13] mm: PRAM: checksum saved data

2013-07-01 Thread Vladimir Davydov
Checksum PRAM pages with crc32 to ensure persistent memory is not corrupted during reboot. --- mm/Kconfig |4 ++ mm/pram.c | 128 +++- 2 files changed, 130 insertions(+), 2 deletions(-) diff --git a/mm/Kconfig b/mm/Kconfig index f1e11a
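
The idea of sealing a saved page with a checksum and verifying it after reboot can be sketched as below. This is a userspace illustration only: `demo_crc32()` is the common reflected CRC-32 (polynomial 0xEDB88320, the variant zlib uses), which may differ in initial value and final XOR from whatever mm/pram.c calls, and `struct demo_saved_page`, `demo_seal()` and `demo_verify()` are hypothetical names that do not appear in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_PAGE_SIZE 4096u    /* illustrative page size */

/* Bitwise reflected CRC-32; slow but short and table-free. */
static uint32_t demo_crc32(const unsigned char *buf, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return crc ^ 0xFFFFFFFFu;
}

/* Hypothetical record: page contents plus the checksum stored at save time. */
struct demo_saved_page {
    unsigned char data[DEMO_PAGE_SIZE];
    uint32_t csum;
};

/* Called when the page is saved. */
static void demo_seal(struct demo_saved_page *p)
{
    p->csum = demo_crc32(p->data, sizeof p->data);
}

/* Called when the page is loaded; returns 1 if the contents are intact. */
static int demo_verify(const struct demo_saved_page *p)
{
    return p->csum == demo_crc32(p->data, sizeof p->data);
}
```

A single flipped bit in `data` after `demo_seal()` makes `demo_verify()` return 0, which is the property the checksum is there to provide.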

[PATCH RFC 04/13] mm: PRAM: implement byte stream operations

2013-07-01 Thread Vladimir Davydov
This patch adds ability to save arbitrary byte strings to PRAM using pram_write() to be restored later using pram_read(). These two operations are implemented on top of pram_save_page() and pram_load_page() respectively. --- include/linux/pram.h |4 +++ mm/pram.c| 86

Re: [PATCH RFC] sched: move h_load calculation to task_h_load

2013-07-15 Thread Vladimir Davydov
On 07/15/2013 12:28 PM, Peter Zijlstra wrote: OK, fair enough. It does somewhat rely on us getting the single rq->clock update thing right, but that should be ok. Frankly, I doubt that rq->clock is the right thing to use here, because it can be updated very frequently under some conditions, so

[PATCH v2] sched: move h_load calculation to task_h_load

2013-07-15 Thread Vladimir Davydov
merqueue_add 5.93% libc-2.12.so [.] usleep Changes in v2: * use jiffies instead of rq->clock for last_h_load_update. Signed-off-by: Vladimir Davydov --- kernel/sched/fair.c | 58 +++--- kernel/sched/sched.h |7 +++--- 2 files ch

Re: [PATCH RFC] pram: persistent over-kexec memory file system

2013-07-27 Thread Vladimir Davydov
On 07/27/2013 07:41 PM, Marco Stornelli wrote: Il 26/07/2013 14:29, Vladimir Davydov ha scritto: Hi, We want to propose a way to upgrade a kernel on a machine without restarting all the user-space services. This is to be done with CRIU project, but we need help from the kernel to preserve some

Re: [PATCH RFC] pram: persistent over-kexec memory file system

2013-07-28 Thread Vladimir Davydov
On 07/27/2013 09:37 PM, Marco Stornelli wrote: Il 27/07/2013 19:35, Vladimir Davydov ha scritto: On 07/27/2013 07:41 PM, Marco Stornelli wrote: Il 26/07/2013 14:29, Vladimir Davydov ha scritto: Hi, We want to propose a way to upgrade a kernel on a machine without restarting all the user

Re: [PATCH RFC] pram: persistent over-kexec memory file system

2013-07-28 Thread Vladimir Davydov
On 07/28/2013 03:02 PM, Marco Stornelli wrote: Il 28/07/2013 12:05, Vladimir Davydov ha scritto: On 07/27/2013 09:37 PM, Marco Stornelli wrote: Il 27/07/2013 19:35, Vladimir Davydov ha scritto: On 07/27/2013 07:41 PM, Marco Stornelli wrote: Il 26/07/2013 14:29, Vladimir Davydov ha scritto

[PATCH RFC] pram: persistent over-kexec memory file system

2013-07-26 Thread Vladimir Davydov
Hi, We want to propose a way to upgrade a kernel on a machine without restarting all the user-space services. This is to be done with CRIU project, but we need help from the kernel to preserve some data in memory while doing kexec. The key point of our implementation is leaving process memory in-

[PATCH 1/2] cpu: common: make clearcpuid option take bits list

2012-07-20 Thread Vladimir Davydov
It is more convenient to write 'clearcpuid=147,148,...' than 'clearcpuid=147 clearcpuid=148 ...' --- arch/x86/kernel/cpu/common.c |8 1 files changed, 4 insertions(+), 4 deletions(-) diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c index 6b9333b..8ffe1b9 10064
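
Parsing a comma-separated bit list such as '147,148' into a bitmask could look roughly like this in userspace C. It is a sketch, not the code from arch/x86/kernel/cpu/common.c: `demo_parse_bit_list()`, `demo_test_bit()` and the `DEMO_NCAPS` limit are all made up for illustration, and the kernel may well use its own option-parsing helpers instead of `strtoul()`.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

#define DEMO_NCAPS 320u     /* hypothetical number of feature bits */
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define DEMO_NLONGS ((DEMO_NCAPS + DEMO_BITS_PER_LONG - 1) / DEMO_BITS_PER_LONG)

/*
 * Parse "147,148,..." and set the corresponding bits in mask.
 * Returns the number of bits parsed, or -1 on malformed input.
 */
static int demo_parse_bit_list(const char *arg, unsigned long *mask)
{
    int count = 0;

    assert(arg != NULL && mask != NULL);
    while (*arg) {
        char *end;
        unsigned long bit = strtoul(arg, &end, 10);

        if (end == arg || bit >= DEMO_NCAPS)
            return -1;              /* no digits, or bit out of range */
        if (*end != ',' && *end != '\0')
            return -1;              /* unexpected separator */
        mask[bit / DEMO_BITS_PER_LONG] |= 1UL << (bit % DEMO_BITS_PER_LONG);
        count++;
        arg = (*end == ',') ? end + 1 : end;
    }
    return count;
}

/* Returns nonzero if the given bit is set in mask. */
static int demo_test_bit(const unsigned long *mask, unsigned int bit)
{
    return (mask[bit / DEMO_BITS_PER_LONG] >> (bit % DEMO_BITS_PER_LONG)) & 1UL;
}
```

With a parser of this shape, a single 'clearcpuid=147,148' argument has the same effect as repeating the option once per bit.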

[PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-20 Thread Vladimir Davydov
If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be reported in /proc/cpuinfo and used by the kernel. However, if a userspace process checks CPU features directly using the cpuid instruction, it will be reported about all features supported by the CPU irrespective of what featur

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-20 Thread Vladimir Davydov
On Jul 20, 2012, at 9:20 PM, H. Peter Anvin wrote: > On 07/20/2012 09:37 AM, Vladimir Davydov wrote: >> If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be >> reported in /proc/cpuinfo and used by the kernel. However, if a >> userspace

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-20 Thread Vladimir Davydov
On Jul 21, 2012, at 12:19 AM, H. Peter Anvin wrote: > On 07/20/2012 11:21 AM, Vladimir Davydov wrote: >>> >>> I am a bit concerned about this patch: >>> >>> 1. it silently changes existing behavior. >> >> Yes, but who needs the current imp

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-24 Thread Vladimir Davydov
On 07/21/2012 02:37 PM, Borislav Petkov wrote: (+ Andre who's been doing some cross vendor stuff) On Fri, Jul 20, 2012 at 08:37:33PM +0400, Vladimir Davydov wrote: If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be reported in /proc/cpuinfo and used by

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-24 Thread Vladimir Davydov
On 07/24/2012 12:14 PM, Andre Przywara wrote: On 07/24/2012 09:06 AM, Vladimir Davydov wrote: On 07/21/2012 02:37 PM, Borislav Petkov wrote: (+ Andre who's been doing some cross vendor stuff) On Fri, Jul 20, 2012 at 08:37:33PM +0400, Vladimir Davydov wrote: If 'clearcpuid=N'

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-24 Thread Vladimir Davydov
On 07/24/2012 02:10 PM, Borislav Petkov wrote: On Tue, Jul 24, 2012 at 12:29:19PM +0400, Vladimir Davydov wrote: I guess that when the more advanced features become widely-used, vendors will offer new MSRs and/or CPUID faulting. And this right there is the dealbreaker: So what are you doing

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-24 Thread Vladimir Davydov
On 07/25/2012 04:57 AM, H. Peter Anvin wrote: On 07/24/2012 04:09 AM, Vladimir Davydov wrote: We have not encountered this situation in our environments and I hope we won't :-) But look, these CPUID functions cover majority of CPU features, don't they? So, most of "normal"

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/24/2012 04:34 PM, Andre Przywara wrote: On 07/24/2012 01:09 PM, Vladimir Davydov wrote: On 07/24/2012 02:10 PM, Borislav Petkov wrote: On Tue, Jul 24, 2012 at 12:29:19PM +0400, Vladimir Davydov wrote: I guess that when the more advanced features become widely-used, vendors will offer

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/24/2012 04:44 PM, Alan Cox wrote: This approach does not need any kernel support (except for the /proc/cpuinfo filtering). Does this address the issues you have? You can do the /proc/cpuinfo filtering in user space too How?

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/25/2012 02:58 PM, Andre Przywara wrote: On 07/25/2012 12:31 PM, Vladimir Davydov wrote: On 07/24/2012 04:44 PM, Alan Cox wrote: This approach does not need any kernel support (except for the /proc/cpuinfo filtering). Does this address the issues you have? You can do the /proc/cpuinfo

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/25/2012 02:43 PM, Borislav Petkov wrote: On Wed, Jul 25, 2012 at 02:31:23PM +0400, Vladimir Davydov wrote: So, you prefer adding some filtering of /proc/cpuinfo into the mainstream kernel That's already there right? And your 1/2 patch was making toggling those bits easier. (no

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/25/2012 03:17 PM, Andre Przywara wrote: On 07/25/2012 01:02 PM, Vladimir Davydov wrote: On 07/25/2012 02:58 PM, Andre Przywara wrote: On 07/25/2012 12:31 PM, Vladimir Davydov wrote: On 07/24/2012 04:44 PM, Alan Cox wrote: This approach does not need any kernel support (except for the

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/25/2012 03:31 PM, Alan Cox wrote: On Wed, 25 Jul 2012 14:31:30 +0400 Vladimir Davydov wrote: On 07/24/2012 04:44 PM, Alan Cox wrote: This approach does not need any kernel support (except for the /proc/cpuinfo filtering). Does this address the issues you have? You can do the /proc

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/25/2012 04:57 AM, H. Peter Anvin wrote: On 07/24/2012 04:09 AM, Vladimir Davydov wrote: We have not encountered this situation in our environments and I hope we won't :-) But look, these CPUID functions cover majority of CPU features, don't they? So, most of "normal"

Re: [PATCH 2/2] cpu: intel, amd: mask cleared cpuid features

2012-07-25 Thread Vladimir Davydov
On 07/20/2012 09:10 PM, Andi Kleen wrote: + unsigned int *msr_ext_cpuid_mask) +{ + unsigned int msr, msr_ext; + + msr = msr_ext = 0; + + switch (c->x86_model) { You have to check the family too. + + return msr; +} + +static void __c

[PATCH 2/2] block: account iowait time when waiting for completion of IO request

2013-02-14 Thread Vladimir Davydov
. Signed-off-by: Vladimir Davydov --- block/blk-exec.c |4 ++-- block/blk-flush.c |2 +- block/blk-lib.c |6 +++--- 3 files changed, 6 insertions(+), 6 deletions(-) diff --git a/block/blk-exec.c b/block/blk-exec.c index 74638ec..f634de7 100644 --- a/block/blk-exec.c +++ b/block/blk

[PATCH 1/2] sched: add wait_for_completion_io[_timeout]

2013-02-14 Thread Vladimir Davydov
accounting when the completion struct is actually used for waiting for IO (e.g. completion of a bio request in the block layer). Signed-off-by: Vladimir Davydov --- include/linux/completion.h |3 ++ kernel/sched/core.c| 57 2 files

Re: [PATCH] update sc->nr_reclaimed after each shrink_slab

2016-07-22 Thread Vladimir Davydov
On Fri, Jul 22, 2016 at 09:49:13AM +0200, Michal Hocko wrote: > On Fri 22-07-16 11:43:30, Zhou Chengming wrote: > > In !global_reclaim(sc) case, we should update sc->nr_reclaimed after each > > shrink_slab in the loop. Because we need the correct sc->nr_reclaimed > > value to see if we can break ou

Re: [PATCH v2] mm: oom: deduplicate victim selection code for memcg and global oom

2016-07-23 Thread Vladimir Davydov
On Thu, Jul 21, 2016 at 08:41:44AM -0400, Johannes Weiner wrote: > On Mon, Jun 27, 2016 at 07:39:54PM +0300, Vladimir Davydov wrote: > > When selecting an oom victim, we use the same heuristic for both memory > > cgroup and global oom. The only difference is the scope of tasks to

Re: [PATCH 1/3] mm: memcontrol: make per-cpu charge cache IRQ-safe for socket accounting

2016-09-19 Thread Vladimir Davydov
rain operation > will put references of in-use pages, thus causing the imbalance. > > Disable IRQs during all per-cpu charge cache operations. > > Fixes: f7e1cb6ec51b ("mm: memcontrol: account socket memory in unified > hierarchy memory controller") > Cc: # 4.5+ > Signed-off-by: Johannes Weiner Acked-by: Vladimir Davydov

Re: [PATCH 2/3] cgroup: duplicate cgroup reference when cloning sockets

2016-09-19 Thread Vladimir Davydov
s are destroyed later on. > > Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup") > Cc: # 4.5+ > Signed-off-by: Johannes Weiner Reviewed-by: Vladimir Davydov

Re: [PATCH v3 04/13] mm: Track NR_KERNEL_STACK in KiB instead of number of stacks

2016-06-21 Thread Vladimir Davydov
> Since frv has THREAD_SIZE < PAGE_SIZE, we need to track kernel stack > allocations in a unit that divides both THREAD_SIZE and PAGE_SIZE on > all architectures. Keep it simple and use KiB. > > Cc: Vladimir Davydov > Cc: Johannes Weiner > Cc: Michal Hocko > Cc: linux...@kva
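
The arithmetic behind the quoted reasoning can be shown with a small sketch. The sizes below are illustrative only (a stack smaller than a page, as the quoted text says is the case on frv); `demo_stack_kb`, `demo_account_stack()` and `demo_pages_truncated()` are hypothetical names, not kernel symbols.

```c
#include <assert.h>

/* Illustrative sizes only; real values are architecture-specific. */
#define DEMO_PAGE_SIZE   16384L
#define DEMO_THREAD_SIZE  8192L

static long demo_stack_kb;      /* models a counter kept in KiB */

/* Account nr_stacks stacks (negative to unaccount) in KiB. */
static void demo_account_stack(int nr_stacks)
{
    /* Both sizes must be whole KiB for the counter to stay exact. */
    assert(DEMO_THREAD_SIZE % 1024 == 0 && DEMO_PAGE_SIZE % 1024 == 0);
    demo_stack_kb += nr_stacks * (DEMO_THREAD_SIZE / 1024);
}

/* What a page-granular counter would report: the fraction is lost. */
static long demo_pages_truncated(int nr_stacks)
{
    return nr_stacks * DEMO_THREAD_SIZE / DEMO_PAGE_SIZE;
}
```

With these sizes, three stacks are exactly 24 KiB but only one whole page, so a page-based counter would under-report, while a KiB-based counter stays exact.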

Re: [PATCH v3 05/13] mm: Fix memcg stack accounting for sub-page stacks

2016-06-21 Thread Vladimir Davydov
12580e4b54ba8 ("mm: memcontrol: report kernel stack usage in cgroup2 > memory.stat") > Cc: Vladimir Davydov > Cc: Johannes Weiner > Cc: Michal Hocko > Cc: linux...@kvack.org > Signed-off-by: Andy Lutomirski Reviewed-by: Vladimir Davydov This patch is going to have

Re: [PATCH 3/3] mm: memcontrol: fix cgroup creation failure after many small jobs

2016-06-21 Thread Vladimir Davydov
65K cgroups it will take the reclaimer a substantial amount of time to iterate over all of them, which might result in latency spikes. Probably, to avoid that, we could move pages from a dead cgroup's lru to its parent's one on offline while still leaving dead cgroups pinned, like we do i

Re: [PATCH] memcg: mem_cgroup_migrate() may be called with irq disabled

2016-06-21 Thread Vladimir Davydov
<0038fdaa>] aio_migratepage+0x16a/0x1e8) > ([<00310568>] move_to_new_page+0xb0/0x260) > ([<003111b4>] migrate_pages+0x8f4/0x9f0) > ([<002c507c>] compact_zone+0x4dc/0xdc8) > ([<002c5e22>] kcompactd_do_work+0x1aa/0x358) > ([<002c608a>] kcompactd+0xba/0x2c8) > ([<0016b09a>] kthread+0x10a/0x110) > ([<0095315a>] kernel_thread_starter+0x6/0xc) > ([<00953154>] kernel_thread_starter+0x0/0xc) > INFO: lockdep is turned off. > > Signed-off-by: Tejun Heo > Reported-by: Christian Borntraeger > Link: http://lkml.kernel.org/g/5767cfe5.7080...@de.ibm.com Reviewed-by: Vladimir Davydov

[PATCH] mm: vmscan: fix memcg-aware shrinkers not called on global reclaim

2016-08-01 Thread Vladimir Davydov
never invoked on global reclaim. Fix that. Fixes: d71df22b55099 ("mm, vmscan: begin reclaiming pages on a per-node basis") Signed-off-by: Vladimir Davydov --- mm/vmscan.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/mm/vmscan.c b/mm/vmscan.c index 650d26832569..37

[PATCH 2/3] mm: memcontrol: fix memcg id ref counter on swap charge move

2016-08-01 Thread Vladimir Davydov
many small jobs") Signed-off-by: Vladimir Davydov --- mm/memcontrol.c | 24 ++-- 1 file changed, 18 insertions(+), 6 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 5fe285f27ea7..58c229071fb1 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -4030,9 +

[PATCH 1/3] mm: memcontrol: fix swap counter leak on swapout from offline cgroup

2016-08-01 Thread Vladimir Davydov
") Signed-off-by: Vladimir Davydov --- mm/memcontrol.c | 27 +-- 1 file changed, 21 insertions(+), 6 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index b5804e4e6324..5fe285f27ea7 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -4035,6 +4035,13 @@ static v

Re: [PATCH] memcg: put soft limit reclaim out of way if the excess tree is empty

2016-08-01 Thread Vladimir Davydov
On Mon, Aug 01, 2016 at 12:00:21PM +0200, Michal Hocko wrote: ... > diff --git a/mm/memcontrol.c b/mm/memcontrol.c > index c265212bec8c..eb7e39c2d948 100644 > --- a/mm/memcontrol.c > +++ b/mm/memcontrol.c > @@ -2543,6 +2543,11 @@ static int mem_cgroup_resize_memsw_limit(struct > mem_cgroup *memcg,

[PATCH] radix-tree: account nodes to memcg only if explicitly requested

2016-08-01 Thread Vladimir Davydov
radix tree nodes if it was explicitly requested by passing __GFP_ACCOUNT to INIT_RADIX_TREE. Currently, we only want to account page cache entries, so mark mapping->page_tree so. Signed-off-by: Vladimir Davydov --- fs/inode.c | 2 +- lib/radix-tree.c | 14 ++ 2 files cha

[PATCH v3] mm: oom: deduplicate victim selection code for memcg and global oom

2016-08-01 Thread Vladimir Davydov
l oom heuristic related code private to oom_kill.c and make oom_kill.c use exported memcg functions when it's really necessary (like in case of iterating over memcg tasks). Signed-off-by: Vladimir Davydov Acked-by: Johannes Weiner --- Changes in v3: - rebase on top of v4.7-mmotm-2016-07-28-16-33

[PATCH] Update my e-mail address

2016-08-31 Thread Vladimir Davydov
vdavydov@{parallels,virtuozzo}.com will bounce from now on. Signed-off-by: Vladimir Davydov --- .mailmap| 2 ++ MAINTAINERS | 2 +- 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/.mailmap b/.mailmap index b18912c5121e..de22daefd9da 100644 --- a/.mailmap +++ b/.mailmap

Re: [PATCH] Update my e-mail address

2016-08-31 Thread Vladimir Davydov
On Wed, Aug 31, 2016 at 07:26:43AM -0700, Greg wrote: > On Wed, 2016-08-31 at 15:01 +0300, Vladimir Davydov wrote: > > vdavydov@{parallels,virtuozzo}.com will bounce from now on. > > > > Signed-off-by: Vladimir Davydov > > Shouldn't MAINTAINERS be in the subject

Re: [BUG] Bad page states

2016-08-08 Thread Vladimir Davydov
On Mon, Aug 08, 2016 at 10:48:45AM -0700, Linus Torvalds wrote: ... > > [ 43.477693] BUG: Bad page state in process S05containers pfn:1ff02a3 > > [ 43.484417] page:ea007fc0a8c0 count:0 mapcount:-511 mapping: > > (null) index:0x0 > > [ 43.492737] flags: 0x1000() > >

[PATCH] mm: memcontrol: only mark charged pages with PageKmemcg

2016-08-08 Thread Vladimir Davydov
") Reported-by: Eric Dumazet Signed-off-by: Vladimir Davydov Cc: [4.7+] --- fs/pipe.c | 4 +--- mm/memcontrol.c | 14 -- mm/page_alloc.c | 14 +- 3 files changed, 18 insertions(+), 14 deletions(-) diff --git a/fs/pipe.c b/fs/pipe.c index 4b32928f5426..4ebe6b2e5217

Re: [PATCH] mm: memcontrol: avoid unused function warning

2016-08-24 Thread Vladimir Davydov
l: fix swap counter leak on swapout from > offline cgroup") Acked-by: Vladimir Davydov

Re: [PATCH v2] fs/dcache.c: fix spin lockup issue on nlru->lock

2017-06-21 Thread Vladimir Davydov
x10 > > Fix this lockup by reducing the number of entries to be shrinked > from the lru list to 1024 at once. Also, add cond_resched() before > processing the lru list again. > > Link: http://marc.info/?t=14972286491&r=1&w=2 > Fix-suggested-by: Jan kara >

[PATCH 1/2] mm/slab: skip memcg reclaim only if in atomic context

2015-08-30 Thread Vladimir Davydov
used without it is fallback_alloc(), which, in contrast to other cache_grow() users, preallocates a page and passes it to cache_grow() so that the latter does not need to invoke kmem_getpages() by itself. Reported-by: Tejun Heo Signed-off-by: Vladimir Davydov

[PATCH 2/2] mm/slub: do not bypass memcg reclaim for high-order page allocation

2015-08-30 Thread Vladimir Davydov
orwarding it to memcg_charge_slab() if the context allows. Reported-by: Tejun Heo Signed-off-by: Vladimir Davydov --- mm/slub.c | 24 +++- 1 file changed, 11 insertions(+), 13 deletions(-) diff --git a/mm/slub.c b/mm/slub.c index e180f8dcd06d..416a332277cb 100644 --- a/mm/slub.c

[PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

2015-08-30 Thread Vladimir Davydov
pages, memcg reclaim will not get invoked on kmem allocations, which will lead to uncontrollable growth of memory usage no matter what memory.high is set to. This patch set attempts to fix this issue. For more details please see comments to individual patches. Thanks, Vladimir Davydov (2): mm

Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

2015-08-31 Thread Vladimir Davydov
On Mon, Aug 31, 2015 at 03:24:15PM +0200, Michal Hocko wrote: > On Sun 30-08-15 22:02:16, Vladimir Davydov wrote: > > Tejun reported that sometimes memcg/memory.high threshold seems to be > > silently ignored if kmem accounting is enabled: > > > > http://www.spinics.

Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

2015-08-31 Thread Vladimir Davydov
On Mon, Aug 31, 2015 at 09:43:35AM -0400, Tejun Heo wrote: > On Mon, Aug 31, 2015 at 03:24:15PM +0200, Michal Hocko wrote: > > Right but isn't that what the caller explicitly asked for? Why should we > > ignore that for kmem accounting? It seems like a fix at a wrong layer to > > me. Either we shou

Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

2015-08-31 Thread Vladimir Davydov
On Mon, Aug 31, 2015 at 10:39:39AM -0400, Tejun Heo wrote: > On Mon, Aug 31, 2015 at 05:30:08PM +0300, Vladimir Davydov wrote: > > slab/slub can issue alloc_pages() any time with any flags they want and > > it won't be accounted to memcg, because kmem is accounted at slab/sl

Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

2015-08-31 Thread Vladimir Davydov
On Mon, Aug 31, 2015 at 10:46:04AM -0400, Tejun Heo wrote: > Hello, Vladimir. > > On Mon, Aug 31, 2015 at 05:20:49PM +0300, Vladimir Davydov wrote: > ... > > That being said, this is the fix at the right layer. > > While this *might* be a necessary workaround for the har

Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

2015-08-31 Thread Vladimir Davydov
On Mon, Aug 31, 2015 at 11:47:56AM -0400, Tejun Heo wrote: > On Mon, Aug 31, 2015 at 06:18:14PM +0300, Vladimir Davydov wrote: > > We have to be cautious about placing memcg_charge in slab/slub. To > > understand why, consider SLAB case, which first tries to allocate from >

Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

2015-08-31 Thread Vladimir Davydov
On Mon, Aug 31, 2015 at 01:03:09PM -0400, Tejun Heo wrote: > On Mon, Aug 31, 2015 at 07:51:32PM +0300, Vladimir Davydov wrote: > ... > > If we want to allow slab/slub implementation to invoke try_charge > > wherever it wants, we need to introduce an asynchronous thread doing

Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

2015-09-01 Thread Vladimir Davydov
On Mon, Aug 31, 2015 at 03:22:22PM -0500, Christoph Lameter wrote: > On Mon, 31 Aug 2015, Vladimir Davydov wrote: > > > I totally agree that we should strive to make a kmem user feel roughly > > the same in memcg as if it were running on a host with equal amount of > > RA

Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

2015-09-01 Thread Vladimir Davydov
On Tue, Sep 01, 2015 at 02:36:12PM +0200, Michal Hocko wrote: > On Mon 31-08-15 17:20:49, Vladimir Davydov wrote: > > On Mon, Aug 31, 2015 at 03:24:15PM +0200, Michal Hocko wrote: > > > On Sun 30-08-15 22:02:16, Vladimir Davydov wrote: > > > > > > Tejun repor

Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

2015-09-01 Thread Vladimir Davydov
On Tue, Sep 01, 2015 at 05:01:20PM +0200, Michal Hocko wrote: > On Tue 01-09-15 16:40:03, Vladimir Davydov wrote: > > On Tue, Sep 01, 2015 at 02:36:12PM +0200, Michal Hocko wrote: > > > On Mon 31-08-15 17:20:49, Vladimir Davydov wrote: > {...} > > > > 1. SLAB. S

Re: [PATCH 3/3] memcg: simplify and inline __mem_cgroup_from_kmem

2015-10-16 Thread Vladimir Davydov
On Fri, Oct 16, 2015 at 04:17:26PM +0300, Kirill A. Shutemov wrote: > On Mon, Oct 05, 2015 at 01:21:43AM +0300, Vladimir Davydov wrote: > > Before the previous patch, __mem_cgroup_from_kmem had to handle two > > types of kmem - slab pages and pages allocated with alloc_kmem_pages -

Re: [PATCH 3/3] memcg: simplify and inline __mem_cgroup_from_kmem

2015-10-17 Thread Vladimir Davydov
On Fri, Oct 16, 2015 at 03:12:23PM -0700, Hugh Dickins wrote: ... > Are you expecting to use mem_cgroup_from_kmem() from other places > in future? Seems possible; but at present it's called from only Not in the near future. At least, currently I can't think of any other use for it except list_lru

Re: [PATCH 2/3] memcg: unify slab and other kmem pages charging

2015-10-17 Thread Vladimir Davydov
On Fri, Oct 16, 2015 at 05:19:32PM -0700, Johannes Weiner wrote: ... > I think it'd be better to have an outer function than a magic > parameter for the memcg lookup. Could we fold this in there? Yeah, that looks neater. Thanks! Andrew, could you please fold this one too? > > --- > > Signed-of

Re: [PATCH 0/8] mm: memcontrol: account socket memory in unified hierarchy

2015-10-22 Thread Vladimir Davydov
Hi Johannes, On Thu, Oct 22, 2015 at 12:21:28AM -0400, Johannes Weiner wrote: ... > Patch #5 adds accounting and tracking of socket memory to the unified > hierarchy memory controller, as described above. It uses the existing > per-cpu charge caches and triggers high limit reclaim asynchronously.

Re: [PATCH 3/8] net: consolidate memcg socket buffer tracking and accounting

2015-10-22 Thread Vladimir Davydov
On Thu, Oct 22, 2015 at 12:21:31AM -0400, Johannes Weiner wrote: > The tcp memory controller has extensive provisions for future memory > accounting interfaces that won't materialize after all. Cut the code > base down to what's actually used, now and in the likely future. > > - There won't be any

Re: [PATCH 5/8] mm: memcontrol: account socket memory on unified hierarchy

2015-10-22 Thread Vladimir Davydov
On Thu, Oct 22, 2015 at 12:21:33AM -0400, Johannes Weiner wrote: ... > @@ -5500,13 +5524,38 @@ void sock_release_memcg(struct sock *sk) > */ > bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages) > { > + unsigned int batch = max(CHARGE_BATCH, nr_pages); > stru

Re: [PATCH 7/8] mm: vmscan: report vmpressure at the level of reclaim activity

2015-10-22 Thread Vladimir Davydov
On Thu, Oct 22, 2015 at 12:21:35AM -0400, Johannes Weiner wrote: ... > @@ -2437,6 +2439,10 @@ static bool shrink_zone(struct zone *zone, struct > scan_control *sc, > } > } > > + vmpressure(sc->gfp_mask, memcg, > +

Re: [PATCH 8/8] mm: memcontrol: hook up vmpressure to socket pressure

2015-10-22 Thread Vladimir Davydov
On Thu, Oct 22, 2015 at 12:21:36AM -0400, Johannes Weiner wrote: ... > @@ -185,8 +183,29 @@ static void vmpressure_work_fn(struct work_struct *work) > vmpr->reclaimed = 0; > spin_unlock(&vmpr->sr_lock); > > + level = vmpressure_calc_level(scanned, reclaimed); > + > + if (level

Re: [PATCH 0/2] Fix memcg/memory.high in case kmem accounting is enabled

2015-09-02 Thread Vladimir Davydov
ng Johannes to Cc (I noticed that I accidentally left him out), because this discussion seems to be fundamental and may affect our further steps dramatically. ] On Tue, Sep 01, 2015 at 08:38:50PM +0200, Michal Hocko wrote: > On Tue 01-09-15 19:55:54, Vladimir Davydov wrote: > > On Tue, Sep
