small loads,
which would probably lead to the cpuidle governor making wrong decisions due to
overestimating the system load.
So this seems to be another reason to use a different performance
multiplier in the cpuidle governor.
On Jun 4, 2012, at 2:24 PM, Vladimir Davydov wrote:
> rq->cpulo
mnt_drop_write() must be called only if mnt_want_write() succeeded,
otherwise the mnt_writers counter will diverge.
Signed-off-by: Vladimir Davydov
Cc: Doug Ledford
Cc: Andrew Morton
Cc: KOSAKI Motohiro
Cc: "Eric W. Biederman"
---
ipc/mqueue.c | 3 ++-
1 files changed, 2 insert
On Mar 20, 2013, at 1:09 AM, Andrew Morton
wrote:
> On Tue, 19 Mar 2013 13:31:18 +0400 Vladimir Davydov
> wrote:
>
>> mnt_drop_write() must be called only if mnt_want_write() succeeded,
>> otherwise the mnt_writers counter will diverge.
>>
>> ...
> tg->cfs_bandwidth.timer_active = 0
which conforms pretty nicely to the explanation given above.
Signed-off-by: Vladimir Davydov
---
kernel/sched/core.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 26058d0..c7a078f 1006
On Feb 8, 2013, at 6:46 PM, Paul Turner wrote:
> On Fri, Feb 08, 2013 at 11:10:46AM +0400, Vladimir Davydov wrote:
>> If cfs_rq->runtime_remaining is <= 0 then either
>> - cfs_rq is throttled and waiting for quota redistribution, or
>> - cfs_rq is currently executi
On Feb 8, 2013, at 7:26 PM, Vladimir Davydov wrote:
> On Feb 8, 2013, at 6:46 PM, Paul Turner wrote:
>
>> On Fri, Feb 08, 2013 at 11:10:46AM +0400, Vladimir Davydov wrote:
>>> If cfs_rq->runtime_remaining is <= 0 then either
>>> - cfs_rq is throttled and
track
# time modprobe -r nf_conntrack
real 0m10.337s
user 0m0.000s
sys 0m0.376s
# modprobe nf_conntrack
# time modprobe -r nf_conntrack
real 0m5.661s
user 0m0.000s
sys 0m0.216s
Signed-off-by: Vladimir Davydov
Cc: Patrick McHardy
Cc: "David S. Miller"
The patch introduces nf_conntrack_cleanup_list(), which cleans up
nf_conntracks for a list of netns and calls synchronize_net() only
once for all of them.
---
include/net/netfilter/nf_conntrack_core.h | 10 +-
net/netfilter/nf_conntrack_core.c | 21 +
net/netfil
merqueue_add
5.93% libc-2.12.so [.] usleep
Signed-off-by: Vladimir Davydov
---
kernel/sched/fair.c | 56 ++
kernel/sched/sched.h | 7 +++
2 files changed, 28 insertions(+), 35 deletions(-)
diff --git a/kernel/sched/fair.c b/kern
s of all runnable tasks there instead. Fix it.
Signed-off-by: Vladimir Davydov
---
kernel/sched/fair.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 9b3fe1c..13abc29 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sche
alent addition in
the check.
--
The bug can be caught by running 2*N cpuhogs pinned to two logical cpus
belonging to different cores on an HT-enabled machine with N logical
cpus: just look at se.nr_migrations growth.
Signed-off-by: Vladimir Davydov
---
kernel/sched/fair.c | 4 ++--
1 file
d.
--
The bug can be caught by running 2*N cpuhogs pinned to two logical cpus
belonging to different cores on an HT-enabled machine with N logical
cpus: just look at se.nr_migrations growth.
Signed-off-by: Vladimir Davydov
---
kernel/sched/fair.c | 3 ++-
1 file changed, 2 insertions(+), 1 delet
Currently it is actually new_dst_cpu, not dst_cpu, that is prevented
from being reselected. This can result in attempting to pull tasks to
this_cpu twice.
Signed-off-by: Vladimir Davydov
---
kernel/sched/fair.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/fair.c
s to allow handling 'some
pinned' case when pulling tasks from a new busiest cpu.
Signed-off-by: Vladimir Davydov
---
kernel/sched/fair.c | 12 ++--
1 file changed, 10 insertions(+), 2 deletions(-)
diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index cd59640..d840e51
On 09/16/2013 09:52 AM, Peter Zijlstra wrote:
On Sun, Sep 15, 2013 at 05:49:13PM +0400, Vladimir Davydov wrote:
In busiest->group_imb case we can come to calculate_imbalance() with
local->avg_load >= busiest->avg_load >= sds->avg_load. This can result
in imbalance overf
On 09/16/2013 09:43 AM, Peter Zijlstra wrote:
On Sun, Sep 15, 2013 at 09:30:14PM +0400, Vladimir Davydov wrote:
Firstly, reset env.dst_cpu/dst_rq to this_cpu/this_rq, because it could
have changed in 'some pinned' case. Otherwise, should_we_balance() can
stop balancing beforehand.
the code is more
or less the same but, with big thanks to Vladimir Davydov and
Parallels, development of fsck has started and we now have the
ability to correct fs errors caused by corruption. It's a "young"
tool, but we are working on it. You can clone the code from
On 09/29/2013 01:47 PM, Yuanhan Liu wrote:
On Fri, Sep 20, 2013 at 06:46:59AM -0700, tip-bot for Vladimir Davydov wrote:
Commit-ID: 7e3115ef5149fc502e3a2e80719dba54a8e7409d
Gitweb: http://git.kernel.org/tip/7e3115ef5149fc502e3a2e80719dba54a8e7409d
Author: Vladimir Davydov
AuthorDate: Sat
If several tasks in different cpu cgroups are contending for the same resource
(e.g. a semaphore) and one of those task groups is cpu limited (using cfs
bandwidth control), the priority inversion problem is likely to arise: if a cpu
limited task goes to sleep holding the resource (e.g. trying to ta
There is an error in the test script: I forgot to initialize cpuset.mems of
the test cgroups; without it, it is impossible to add a task to a cpuset
cgroup. Sorry about that.
Fixed version of the test script is attached.
On Oct 18, 2012, at 11:32 AM, Vladimir Davydov wrote:
> If several tasks
Thank you for the answer.
On Oct 19, 2012, at 6:24 PM, Peter Zijlstra wrote:
> its a quick hack similar to existing hacks done for rt, preferably we'd
> do smarter things though.
If you have any ideas how to fix this in a better way, please share.
--
To unsubscribe from this list: send the line
splicing data from/to
shmem. This would allow avoiding memory copying on checkpoint/restore.
* Save uptodate fs cache on umount to be restored on mount after kexec.
Thanks,
Vladimir Davydov (13):
mm: add PRAM API stubs and Kconfig
mm: PRAM: implement node load and save functions
mm:
This patch illustrates how PRAM API can be used for making tmpfs
'persistent'. It adds 'pram=' option to tmpfs, which specifies the PRAM
node to load/save FS tree from/to.
If the option is passed on mount, shmem will look for the corresponding
PRAM node and load the FS tree from it. On the subsequ
The function inserts a memory page to a shmem file under an arbitrary
offset. If there is something at the specified offset (page or swap),
the function fails.
The function will be used by the next patch.
---
include/linux/shmem_fs.h | 3 ++
mm/shmem.c | 68
To free all space utilized for persistent memory, one can write 0 to
/sys/kernel/pram. This will destroy all PRAM nodes that are not
currently being read or written.
---
mm/pram.c | 39 ++-
1 file changed, 38 insertions(+), 1 deletion(-)
diff --git a/mm/pram.
Persistent memory preservation is done by reserving memory pages
belonging to PRAM at early boot so that they will not be recycled. If
memory reservation fails for some reason (e.g. memory region is busy),
persistent memory will be lost.
Currently, PRAM preservation is only implemented for x86.
--
Banning kexec from PRAM memory ranges that have been reserved at boot
time is not enough to avoid all conflicts. The point is that kexec may
load the new kernel code into an address range that has never been
reserved, possibly overwriting persistent data.
Fortunately, it is possible to specify a memo
Obviously, not all memory ranges can be used for saving persistent
over-kexec data, because some of them are reserved by the system core
and various device drivers at boot time. If a memory range used for
initialization of a particular device turns out to be busy because PRAM
uses it for storing it
The PRAM super block is the starting point for restoring persistent
memory. If the kernel locates the super block at boot time, it will
preserve the persistent memory structure from the previous kernel. To
point the kernel to the location of the super block, one should pass its
pfn via the 'pram' b
Since page structs, which are used for linking PRAM nodes, are cleared
on boot, organize all PRAM nodes into a list singly-linked by pfn's
before reboot to facilitate the node list restore in the new kernel.
---
mm/pram.c | 50 ++
1 file changed, 5
Persistent memory is divided into nodes, which can be saved and loaded
independently of each other. PRAM nodes are kept on the list and
identified by unique names. Whenever a save operation is initiated by
calling pram_prepare_save(), a new node is created and linked to the
list. When the save oper
Using the pram_save_page() function, one can populate PRAM nodes with
memory pages, which can be then loaded using the pram_load_page()
function. Saving a memory page to PRAM is implemented as storing the pfn
in the PRAM node and incrementing its ref count so that it will not get
freed after the la
Persistent memory subsys or PRAM is intended to be used for saving
memory pages of the currently executing kernel and restoring them after
a kexec in the newly booted one. This can be utilized for speeding up
reboot by leaving process memory and/or FS caches in-place.
The proposed API:
* Persist
Checksum PRAM pages with crc32 to ensure persistent memory is not
corrupted during reboot.
---
mm/Kconfig | 4 ++
mm/pram.c | 128 +++-
2 files changed, 130 insertions(+), 2 deletions(-)
diff --git a/mm/Kconfig b/mm/Kconfig
index f1e11a
This patch adds ability to save arbitrary byte strings to PRAM using
pram_write() to be restored later using pram_read(). These two
operations are implemented on top of pram_save_page() and
pram_load_page() respectively.
---
include/linux/pram.h | 4 +++
mm/pram.c | 86
On 07/15/2013 12:28 PM, Peter Zijlstra wrote:
OK, fair enough. It does somewhat rely on us getting the single
rq->clock update thing right, but that should be ok.
Frankly, I doubt that rq->clock is the right thing to use here, because
it can be updated very frequently under some conditions, so
merqueue_add
5.93% libc-2.12.so [.] usleep
Changes in v2:
* use jiffies instead of rq->clock for last_h_load_update.
Signed-off-by: Vladimir Davydov
---
kernel/sched/fair.c | 58 +++---
kernel/sched/sched.h | 7 +++---
2 files ch
On 07/27/2013 07:41 PM, Marco Stornelli wrote:
Il 26/07/2013 14:29, Vladimir Davydov ha scritto:
Hi,
We want to propose a way to upgrade a kernel on a machine without
restarting all the user-space services. This is to be done with CRIU
project, but we need help from the kernel to preserve some
On 07/27/2013 09:37 PM, Marco Stornelli wrote:
Il 27/07/2013 19:35, Vladimir Davydov ha scritto:
On 07/27/2013 07:41 PM, Marco Stornelli wrote:
Il 26/07/2013 14:29, Vladimir Davydov ha scritto:
Hi,
We want to propose a way to upgrade a kernel on a machine without
restarting all the user
On 07/28/2013 03:02 PM, Marco Stornelli wrote:
Il 28/07/2013 12:05, Vladimir Davydov ha scritto:
On 07/27/2013 09:37 PM, Marco Stornelli wrote:
Il 27/07/2013 19:35, Vladimir Davydov ha scritto:
On 07/27/2013 07:41 PM, Marco Stornelli wrote:
Il 26/07/2013 14:29, Vladimir Davydov ha scritto
Hi,
We want to propose a way to upgrade a kernel on a machine without
restarting all the user-space services. This is to be done with CRIU
project, but we need help from the kernel to preserve some data in
memory while doing kexec.
The key point of our implementation is leaving process memory in-
It is more convenient to write 'clearcpuid=147,148,...' than
'clearcpuid=147 clearcpuid=148 ...'
---
arch/x86/kernel/cpu/common.c | 8
1 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/common.c b/arch/x86/kernel/cpu/common.c
index 6b9333b..8ffe1b9 10064
If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be
reported in /proc/cpuinfo or used by the kernel. However, if a
userspace process checks CPU features directly using the cpuid
instruction, it will see all features supported by the CPU
irrespective of what featur
On Jul 20, 2012, at 9:20 PM, H. Peter Anvin wrote:
> On 07/20/2012 09:37 AM, Vladimir Davydov wrote:
>> If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be
>> reported in /proc/cpuinfo and used by the kernel. However, if a
>> userspace
On Jul 21, 2012, at 12:19 AM, H. Peter Anvin wrote:
> On 07/20/2012 11:21 AM, Vladimir Davydov wrote:
>>>
>>> I am a bit concerned about this patch:
>>>
>>> 1. it silently changes existing behavior.
>>
>> Yes, but who needs the current imp
On 07/21/2012 02:37 PM, Borislav Petkov wrote:
(+ Andre who's been doing some cross vendor stuff)
On Fri, Jul 20, 2012 at 08:37:33PM +0400, Vladimir Davydov wrote:
If 'clearcpuid=N' is specified in boot options, CPU feature #N won't be
reported in /proc/cpuinfo and used by
On 07/24/2012 12:14 PM, Andre Przywara wrote:
On 07/24/2012 09:06 AM, Vladimir Davydov wrote:
On 07/21/2012 02:37 PM, Borislav Petkov wrote:
(+ Andre who's been doing some cross vendor stuff)
On Fri, Jul 20, 2012 at 08:37:33PM +0400, Vladimir Davydov wrote:
If 'clearcpuid=N'
On 07/24/2012 02:10 PM, Borislav Petkov wrote:
On Tue, Jul 24, 2012 at 12:29:19PM +0400, Vladimir Davydov wrote:
I guess that when the more advanced features become widely-used,
vendors will offer new MSRs and/or CPUID faulting.
And this right there is the dealbreaker:
So what are you doing
On 07/25/2012 04:57 AM, H. Peter Anvin wrote:
On 07/24/2012 04:09 AM, Vladimir Davydov wrote:
We have not encountered this situation in our environments and I hope we
won't :-)
But look, these CPUID functions cover the majority of CPU features, don't
they? So, most of "normal"
On 07/24/2012 04:34 PM, Andre Przywara wrote:
On 07/24/2012 01:09 PM, Vladimir Davydov wrote:
On 07/24/2012 02:10 PM, Borislav Petkov wrote:
On Tue, Jul 24, 2012 at 12:29:19PM +0400, Vladimir Davydov wrote:
I guess that when the more advanced features become widely-used,
vendors will offer
On 07/24/2012 04:44 PM, Alan Cox wrote:
This approach does not need any kernel support (except for the
/proc/cpuinfo filtering). Does this address the issues you have?
You can do the /proc/cpuinfo filtering in user space too
How?
--
To unsubscribe from this list: send the line "unsubscribe li
On 07/25/2012 02:58 PM, Andre Przywara wrote:
On 07/25/2012 12:31 PM, Vladimir Davydov wrote:
On 07/24/2012 04:44 PM, Alan Cox wrote:
This approach does not need any kernel support (except for the
/proc/cpuinfo filtering). Does this address the issues you have?
You can do the /proc/cpuinfo
On 07/25/2012 02:43 PM, Borislav Petkov wrote:
On Wed, Jul 25, 2012 at 02:31:23PM +0400, Vladimir Davydov wrote:
So, you prefer adding some filtering of /proc/cpuinfo into the
mainstream kernel
That's already there right? And your 1/2 patch was making toggling those
bits easier.
(no
On 07/25/2012 03:17 PM, Andre Przywara wrote:
On 07/25/2012 01:02 PM, Vladimir Davydov wrote:
On 07/25/2012 02:58 PM, Andre Przywara wrote:
On 07/25/2012 12:31 PM, Vladimir Davydov wrote:
On 07/24/2012 04:44 PM, Alan Cox wrote:
This approach does not need any kernel support (except for the
On 07/25/2012 03:31 PM, Alan Cox wrote:
On Wed, 25 Jul 2012 14:31:30 +0400
Vladimir Davydov wrote:
On 07/24/2012 04:44 PM, Alan Cox wrote:
This approach does not need any kernel support (except for the
/proc/cpuinfo filtering). Does this address the issues you have?
You can do the /proc
On 07/20/2012 09:10 PM, Andi Kleen wrote:
+ unsigned int *msr_ext_cpuid_mask)
+{
+ unsigned int msr, msr_ext;
+
+ msr = msr_ext = 0;
+
+ switch (c->x86_model) {
You have to check the family too.
+
+ return msr;
+}
+
+static void __c
.
Signed-off-by: Vladimir Davydov
---
block/blk-exec.c | 4 ++--
block/blk-flush.c | 2 +-
block/blk-lib.c | 6 +++---
3 files changed, 6 insertions(+), 6 deletions(-)
diff --git a/block/blk-exec.c b/block/blk-exec.c
index 74638ec..f634de7 100644
--- a/block/blk-exec.c
+++ b/block/blk
accounting when the
completion struct is actually used for waiting for IO (e.g. completion
of a bio request in the block layer).
Signed-off-by: Vladimir Davydov
---
include/linux/completion.h | 3 ++
kernel/sched/core.c | 57
2 files
On Fri, Jul 22, 2016 at 09:49:13AM +0200, Michal Hocko wrote:
> On Fri 22-07-16 11:43:30, Zhou Chengming wrote:
> > In !global_reclaim(sc) case, we should update sc->nr_reclaimed after each
> > shrink_slab in the loop. Because we need the correct sc->nr_reclaimed
> > value to see if we can break ou
On Thu, Jul 21, 2016 at 08:41:44AM -0400, Johannes Weiner wrote:
> On Mon, Jun 27, 2016 at 07:39:54PM +0300, Vladimir Davydov wrote:
> > When selecting an oom victim, we use the same heuristic for both memory
> > cgroup and global oom. The only difference is the scope of tasks to
rain operation
> will put references of in-use pages, thus causing the imbalance.
>
> Disable IRQs during all per-cpu charge cache operations.
>
> Fixes: f7e1cb6ec51b ("mm: memcontrol: account socket memory in unified
> hierarchy memory controller")
> Cc: # 4.5+
> Signed-off-by: Johannes Weiner
Acked-by: Vladimir Davydov
s are destroyed later on.
>
> Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup")
> Cc: # 4.5+
> Signed-off-by: Johannes Weiner
Reviewed-by: Vladimir Davydov
> Since frv has THREAD_SIZE < PAGE_SIZE, we need to track kernel stack
> allocations in a unit that divides both THREAD_SIZE and PAGE_SIZE on
> all architectures. Keep it simple and use KiB.
>
> Cc: Vladimir Davydov
> Cc: Johannes Weiner
> Cc: Michal Hocko
> Cc: linux...@kva
12580e4b54ba8 ("mm: memcontrol: report kernel stack usage in cgroup2
> memory.stat")
> Cc: Vladimir Davydov
> Cc: Johannes Weiner
> Cc: Michal Hocko
> Cc: linux...@kvack.org
> Signed-off-by: Andy Lutomirski
Reviewed-by: Vladimir Davydov
This patch is going to have
65K cgroups it will take the reclaimer a substantial amount of time
to iterate over all of them, which might result in latency spikes.
Probably, to avoid that, we could move pages from a dead cgroup's lru to
its parent's one on offline while still leaving dead cgroups pinned,
like we do i
> ([<0038fdaa>] aio_migratepage+0x16a/0x1e8)
> ([<00310568>] move_to_new_page+0xb0/0x260)
> ([<003111b4>] migrate_pages+0x8f4/0x9f0)
> ([<002c507c>] compact_zone+0x4dc/0xdc8)
> ([<002c5e22>] kcompactd_do_work+0x1aa/0x358)
> ([<002c608a>] kcompactd+0xba/0x2c8)
> ([<0016b09a>] kthread+0x10a/0x110)
> ([<0095315a>] kernel_thread_starter+0x6/0xc)
> ([<00953154>] kernel_thread_starter+0x0/0xc)
> INFO: lockdep is turned off.
>
> Signed-off-by: Tejun Heo
> Reported-by: Christian Borntraeger
> Link: http://lkml.kernel.org/g/5767cfe5.7080...@de.ibm.com
Reviewed-by: Vladimir Davydov
never invoked on global reclaim. Fix that.
Fixes: d71df22b55099 ("mm, vmscan: begin reclaiming pages on a per-node basis")
Signed-off-by: Vladimir Davydov
---
mm/vmscan.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
index 650d26832569..37
many
small jobs")
Signed-off-by: Vladimir Davydov
---
mm/memcontrol.c | 24 ++--
1 file changed, 18 insertions(+), 6 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 5fe285f27ea7..58c229071fb1 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4030,9 +
")
Signed-off-by: Vladimir Davydov
---
mm/memcontrol.c | 27 +--
1 file changed, 21 insertions(+), 6 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index b5804e4e6324..5fe285f27ea7 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -4035,6 +4035,13 @@ static v
On Mon, Aug 01, 2016 at 12:00:21PM +0200, Michal Hocko wrote:
...
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index c265212bec8c..eb7e39c2d948 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -2543,6 +2543,11 @@ static int mem_cgroup_resize_memsw_limit(struct
> mem_cgroup *memcg,
radix tree nodes if it was explicitly requested by passing
__GFP_ACCOUNT to INIT_RADIX_TREE. Currently, we only want to account
page cache entries, so mark mapping->page_tree so.
Signed-off-by: Vladimir Davydov
---
fs/inode.c | 2 +-
lib/radix-tree.c | 14 ++
2 files cha
l oom heuristic related code
private to oom_kill.c and make oom_kill.c use exported memcg functions
when it's really necessary (like in case of iterating over memcg tasks).
Signed-off-by: Vladimir Davydov
Acked-by: Johannes Weiner
---
Changes in v3:
- rebase on top of v4.7-mmotm-2016-07-28-16-33
vdavydov@{parallels,virtuozzo}.com will bounce from now on.
Signed-off-by: Vladimir Davydov
---
.mailmap| 2 ++
MAINTAINERS | 2 +-
2 files changed, 3 insertions(+), 1 deletion(-)
diff --git a/.mailmap b/.mailmap
index b18912c5121e..de22daefd9da 100644
--- a/.mailmap
+++ b/.mailmap
On Wed, Aug 31, 2016 at 07:26:43AM -0700, Greg wrote:
> On Wed, 2016-08-31 at 15:01 +0300, Vladimir Davydov wrote:
> > vdavydov@{parallels,virtuozzo}.com will bounce from now on.
> >
> > Signed-off-by: Vladimir Davydov
>
> Shouldn't MAINTAINERS be in the subject
On Mon, Aug 08, 2016 at 10:48:45AM -0700, Linus Torvalds wrote:
...
> > [ 43.477693] BUG: Bad page state in process S05containers pfn:1ff02a3
> > [ 43.484417] page:ea007fc0a8c0 count:0 mapcount:-511 mapping:
> > (null) index:0x0
> > [ 43.492737] flags: 0x1000()
> >
")
Reported-by: Eric Dumazet
Signed-off-by: Vladimir Davydov
Cc: [4.7+]
---
fs/pipe.c | 4 +---
mm/memcontrol.c | 14 --
mm/page_alloc.c | 14 +-
3 files changed, 18 insertions(+), 14 deletions(-)
diff --git a/fs/pipe.c b/fs/pipe.c
index 4b32928f5426..4ebe6b2e5217
l: fix swap counter leak on swapout from
> offline cgroup")
Acked-by: Vladimir Davydov
x10
>
> Fix this lockup by reducing the number of entries to be shrinked
> from the lru list to 1024 at once. Also, add cond_resched() before
> processing the lru list again.
>
> Link: http://marc.info/?t=14972286491&r=1&w=2
> Fix-suggested-by: Jan Kara
>
used
without it is fallback_alloc(), which, in contrast to other
cache_grow() users, preallocates a page and passes it to cache_grow()
so that the latter does not need to invoke kmem_getpages() by itself.
Reported-by: Tejun Heo
Signed-off-by: Vladimir Davydov
orwarding it to memcg_charge_slab()
if the context allows.
Reported-by: Tejun Heo
Signed-off-by: Vladimir Davydov
---
mm/slub.c | 24 +++-
1 file changed, 11 insertions(+), 13 deletions(-)
diff --git a/mm/slub.c b/mm/slub.c
index e180f8dcd06d..416a332277cb 100644
--- a/mm/slub.c
pages, memcg reclaim will
not get invoked on kmem allocations, which will lead to uncontrollable
growth of memory usage no matter what memory.high is set to.
This patch set attempts to fix this issue. For more details please see
comments to individual patches.
Thanks,
Vladimir Davydov (2):
mm
On Mon, Aug 31, 2015 at 03:24:15PM +0200, Michal Hocko wrote:
> On Sun 30-08-15 22:02:16, Vladimir Davydov wrote:
> > Tejun reported that sometimes memcg/memory.high threshold seems to be
> > silently ignored if kmem accounting is enabled:
> >
> > http://www.spinics.
On Mon, Aug 31, 2015 at 09:43:35AM -0400, Tejun Heo wrote:
> On Mon, Aug 31, 2015 at 03:24:15PM +0200, Michal Hocko wrote:
> > Right but isn't that what the caller explicitly asked for? Why should we
> > ignore that for kmem accounting? It seems like a fix at a wrong layer to
> > me. Either we shou
On Mon, Aug 31, 2015 at 10:39:39AM -0400, Tejun Heo wrote:
> On Mon, Aug 31, 2015 at 05:30:08PM +0300, Vladimir Davydov wrote:
> > slab/slub can issue alloc_pages() any time with any flags they want and
> > it won't be accounted to memcg, because kmem is accounted at slab/sl
On Mon, Aug 31, 2015 at 10:46:04AM -0400, Tejun Heo wrote:
> Hello, Vladimir.
>
> On Mon, Aug 31, 2015 at 05:20:49PM +0300, Vladimir Davydov wrote:
> ...
> > That being said, this is the fix at the right layer.
>
> While this *might* be a necessary workaround for the har
On Mon, Aug 31, 2015 at 11:47:56AM -0400, Tejun Heo wrote:
> On Mon, Aug 31, 2015 at 06:18:14PM +0300, Vladimir Davydov wrote:
> > We have to be cautious about placing memcg_charge in slab/slub. To
> > understand why, consider SLAB case, which first tries to allocate from
>
On Mon, Aug 31, 2015 at 01:03:09PM -0400, Tejun Heo wrote:
> On Mon, Aug 31, 2015 at 07:51:32PM +0300, Vladimir Davydov wrote:
> ...
> > If we want to allow slab/slub implementation to invoke try_charge
> > wherever it wants, we need to introduce an asynchronous thread doing
On Mon, Aug 31, 2015 at 03:22:22PM -0500, Christoph Lameter wrote:
> On Mon, 31 Aug 2015, Vladimir Davydov wrote:
>
> > I totally agree that we should strive to make a kmem user feel roughly
> > the same in memcg as if it were running on a host with equal amount of
> > RA
On Tue, Sep 01, 2015 at 02:36:12PM +0200, Michal Hocko wrote:
> On Mon 31-08-15 17:20:49, Vladimir Davydov wrote:
> > On Mon, Aug 31, 2015 at 03:24:15PM +0200, Michal Hocko wrote:
> > > On Sun 30-08-15 22:02:16, Vladimir Davydov wrote:
> >
> > > > Tejun repor
On Tue, Sep 01, 2015 at 05:01:20PM +0200, Michal Hocko wrote:
> On Tue 01-09-15 16:40:03, Vladimir Davydov wrote:
> > On Tue, Sep 01, 2015 at 02:36:12PM +0200, Michal Hocko wrote:
> > > On Mon 31-08-15 17:20:49, Vladimir Davydov wrote:
> {...}
> > > > 1. SLAB. S
On Fri, Oct 16, 2015 at 04:17:26PM +0300, Kirill A. Shutemov wrote:
> On Mon, Oct 05, 2015 at 01:21:43AM +0300, Vladimir Davydov wrote:
> > Before the previous patch, __mem_cgroup_from_kmem had to handle two
> > types of kmem - slab pages and pages allocated with alloc_kmem_pages -
On Fri, Oct 16, 2015 at 03:12:23PM -0700, Hugh Dickins wrote:
...
> Are you expecting to use mem_cgroup_from_kmem() from other places
> in future? Seems possible; but at present it's called from only
Not in the near future. At least, currently I can't think of any other
use for it except list_lru
On Fri, Oct 16, 2015 at 05:19:32PM -0700, Johannes Weiner wrote:
...
> I think it'd be better to have an outer function than a magic
> parameter for the memcg lookup. Could we fold this in there?
Yeah, that looks neater. Thanks!
Andrew, could you please fold this one too?
>
> ---
>
> Signed-of
Hi Johannes,
On Thu, Oct 22, 2015 at 12:21:28AM -0400, Johannes Weiner wrote:
...
> Patch #5 adds accounting and tracking of socket memory to the unified
> hierarchy memory controller, as described above. It uses the existing
> per-cpu charge caches and triggers high limit reclaim asynchroneously.
On Thu, Oct 22, 2015 at 12:21:31AM -0400, Johannes Weiner wrote:
> The tcp memory controller has extensive provisions for future memory
> accounting interfaces that won't materialize after all. Cut the code
> base down to what's actually used, now and in the likely future.
>
> - There won't be any
On Thu, Oct 22, 2015 at 12:21:33AM -0400, Johannes Weiner wrote:
...
> @@ -5500,13 +5524,38 @@ void sock_release_memcg(struct sock *sk)
> */
> bool mem_cgroup_charge_skmem(struct mem_cgroup *memcg, unsigned int nr_pages)
> {
> + unsigned int batch = max(CHARGE_BATCH, nr_pages);
> stru
On Thu, Oct 22, 2015 at 12:21:35AM -0400, Johannes Weiner wrote:
...
> @@ -2437,6 +2439,10 @@ static bool shrink_zone(struct zone *zone, struct
> scan_control *sc,
> }
> }
>
> + vmpressure(sc->gfp_mask, memcg,
> +
On Thu, Oct 22, 2015 at 12:21:36AM -0400, Johannes Weiner wrote:
...
> @@ -185,8 +183,29 @@ static void vmpressure_work_fn(struct work_struct *work)
> vmpr->reclaimed = 0;
> spin_unlock(&vmpr->sr_lock);
>
> + level = vmpressure_calc_level(scanned, reclaimed);
> +
> + if (level
ng Johannes to Cc (I noticed that I accidentally left him
out), because this discussion seems to be fundamental and may affect
our further steps dramatically.
]
On Tue, Sep 01, 2015 at 08:38:50PM +0200, Michal Hocko wrote:
> On Tue 01-09-15 19:55:54, Vladimir Davydov wrote:
> > On Tue, Sep