Re: [PATCH v3 4/7] mm: unify SLAB and SLUB page accounting

2019-05-10 Thread Shakeel Butt
From: Roman Gushchin Date: Wed, May 8, 2019 at 1:40 PM To: Andrew Morton, Shakeel Butt Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin > Currently the page accounting code is duplicated in SLAB and SLUB > internals. Let'

Re: [PATCH v3 5/7] mm: rework non-root kmem_cache lifecycle management

2019-05-10 Thread Shakeel Butt
From: Roman Gushchin Date: Wed, May 8, 2019 at 1:41 PM To: Andrew Morton, Shakeel Butt Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin > This commit makes several important changes in the lifecycle > of a non-root kmem_cache,

Re: [PATCH v3 6/7] mm: reparent slab memory on cgroup removal

2019-05-10 Thread Shakeel Butt
From: Roman Gushchin Date: Wed, May 8, 2019 at 1:41 PM To: Andrew Morton, Shakeel Butt Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin > Let's reparent memcg slab memory on memcg offlining. This allows us > to release the memory

Re: [PATCH v3 7/7] mm: fix /proc/kpagecgroup interface for slab pages

2019-05-10 Thread Shakeel Butt
From: Roman Gushchin Date: Wed, May 8, 2019 at 1:40 PM To: Andrew Morton, Shakeel Butt Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin > Switching to an indirect scheme of getting mem_cgroup pointer for > !root slab pages broke

[RESEND PATCH v2 2/2] memcg, fsnotify: no oom-kill for remote memcg charging

2019-05-12 Thread Shakeel Butt
emcg, explicitly add __GFP_RETRY_MAYFAIL to the fanotify and inotify event allocations. Signed-off-by: Shakeel Butt Reviewed-by: Roman Gushchin --- Changelog since v1: - Fixed usage of __GFP_RETRY_MAYFAIL flag. fs/notify/fanotify/fanotify.c| 5 - fs/notify/inotify/inotify_fsnotify.c

[RESEND PATCH v2 1/2] memcg, oom: no oom-kill for __GFP_RETRY_MAYFAIL

2019-05-12 Thread Shakeel Butt
-killer in the charging path for fanotify and inotify event allocations. Signed-off-by: Shakeel Butt Acked-by: Michal Hocko --- Changelog since v1: - commit message updated. mm/memcontrol.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c

Re: [PATCH] memcg: refill_stock for kmem uncharging too

2019-04-19 Thread Shakeel Butt
On Fri, Apr 19, 2019 at 1:07 PM Roman Gushchin wrote: > > On Thu, Apr 18, 2019 at 02:42:24PM -0700, Shakeel Butt wrote: > > The commit 475d0487a2ad ("mm: memcontrol: use per-cpu stocks for socket > > memory uncharging") added refill_stock() for skmem uncharging pa

[PATCH v2] memcg: refill_stock for kmem uncharging too

2019-04-23 Thread Shakeel Butt
ned memcgs but it may impact the performance of network traffic for the sockets used by other cgroups. Signed-off-by: Shakeel Butt Cc: Roman Gushchin Cc: Johannes Weiner Cc: Michal Hocko Cc: Vladimir Davydov Cc: Andrew Morton --- Changelog since v1: - No need to bypass offline memcgs in the re

Re: [PATCH v3 0/7] mm: reparent slab memory on cgroup removal

2019-05-14 Thread Shakeel Butt
From: Roman Gushchin Date: Mon, May 13, 2019 at 1:22 PM To: Shakeel Butt Cc: Andrew Morton, Linux MM, LKML, Kernel Team, Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Cgroups > On Fri, May 10, 2019 at 05:32:15PM -0700, Shakeel Butt wrote: > > Fr

[PATCH v3 2/2] memcg, fsnotify: no oom-kill for remote memcg charging

2019-05-14 Thread Shakeel Butt
emcg, explicitly add __GFP_RETRY_MAYFAIL to the fanotify and inotify event allocations. Signed-off-by: Shakeel Butt Reviewed-by: Roman Gushchin --- Changelog since v2: - updated the comments. Changelog since v1: - Fixed usage of __GFP_RETRY_MAYFAIL flag. fs/notify/fanotify/fanotify.c

[PATCH v3 1/2] memcg, oom: no oom-kill for __GFP_RETRY_MAYFAIL

2019-05-14 Thread Shakeel Butt
-killer in the charging path for fanotify and inotify event allocations. Signed-off-by: Shakeel Butt Acked-by: Michal Hocko --- Changelog since v2: - None Changelog since v1: - commit message updated mm/memcontrol.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/mm

Re: [PATCH v4 5/7] mm: rework non-root kmem_cache lifecycle management

2019-05-14 Thread Shakeel Butt
From: Roman Gushchin Date: Tue, May 14, 2019 at 2:55 PM To: Andrew Morton, Shakeel Butt Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin > This commit makes several important changes in the lifecycle > of a non-root kmem_cache,

Re: [PATCH v4 6/7] mm: reparent slab memory on cgroup removal

2019-05-14 Thread Shakeel Butt
From: Roman Gushchin Date: Tue, May 14, 2019 at 2:54 PM To: Andrew Morton, Shakeel Butt Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin > Let's reparent memcg slab memory on memcg offlining. This allows us > to release the

Re: [PATCH v4 7/7] mm: fix /proc/kpagecgroup interface for slab pages

2019-05-14 Thread Shakeel Butt
From: Roman Gushchin Date: Tue, May 14, 2019 at 2:54 PM To: Andrew Morton, Shakeel Butt Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin > Switching to an indirect scheme of getting mem_cgroup pointer for > !root slab pages broke

Re: [PATCH v4 5/7] mm: rework non-root kmem_cache lifecycle management

2019-05-15 Thread Shakeel Butt
From: Christopher Lameter Date: Wed, May 15, 2019 at 7:00 AM To: Roman Gushchin Cc: Andrew Morton, Shakeel Butt, Johannes Weiner, Michal Hocko, Rik van Riel, Vladimir Davydov > On Tue, 14 May 2019, Roman Gushchin wrote: > > > To make this possible we need to introduce

[PATCH v2] memcg, fsnotify: no oom-kill for remote memcg charging

2019-05-04 Thread Shakeel Butt
emcg, explicitly add __GFP_RETRY_MAYFAIL to the fanotify and inotify event allocations. Signed-off-by: Shakeel Butt --- Changelog since v1: - Fixed usage of __GFP_RETRY_MAYFAIL flag. fs/notify/fanotify/fanotify.c| 5 - fs/notify/inotify/inotify_fsnotify.c | 7 +-- 2 files changed

Re: [PATCH v2] memcg: refill_stock for kmem uncharging too

2019-04-28 Thread Shakeel Butt
On Wed, Apr 24, 2019 at 11:49 PM Michal Hocko wrote: > > On Tue 23-04-19 08:44:05, Shakeel Butt wrote: > > The commit 475d0487a2ad ("mm: memcontrol: use per-cpu stocks for socket > > memory uncharging") added refill_stock() for skmem uncharging path to > > opti

[PATCH] memcg, oom: no oom-kill for __GFP_RETRY_MAYFAIL

2019-04-28 Thread Shakeel Butt
The documentation of __GFP_RETRY_MAYFAIL clearly mentioned that the OOM killer will not be triggered and indeed the page alloc does not invoke OOM killer for such allocations. However we do trigger memcg OOM killer for __GFP_RETRY_MAYFAIL. Fix that. Signed-off-by: Shakeel Butt --- mm

Re: [PATCH] memcg, oom: no oom-kill for __GFP_RETRY_MAYFAIL

2019-04-29 Thread Shakeel Butt
On Mon, Apr 29, 2019 at 5:22 AM Michal Hocko wrote: > > On Sun 28-04-19 16:56:13, Shakeel Butt wrote: > > The documentation of __GFP_RETRY_MAYFAIL clearly mentioned that the > > OOM killer will not be triggered and indeed the page alloc does not > > invoke OOM killer for s

[PATCH 1/2] memcg, oom: no oom-kill for __GFP_RETRY_MAYFAIL

2019-04-29 Thread Shakeel Butt
-killer in the charging path for fanotify and inotify event allocations. Signed-off-by: Shakeel Butt Acked-by: Michal Hocko --- Changelog since v1: - commit message updated. mm/memcontrol.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c

[PATCH 2/2] memcg, fsnotify: no oom-kill for remote memcg charging

2019-04-29 Thread Shakeel Butt
emcg, explicitly add __GFP_RETRY_MAYFAIL to the fanotify and inotify event allocations. Signed-off-by: Shakeel Butt --- fs/notify/fanotify/fanotify.c| 4 +++- fs/notify/inotify/inotify_fsnotify.c | 7 +-- 2 files changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/notify/fanotify/fa

Re: [PATCH 2/2] memcg, fsnotify: no oom-kill for remote memcg charging

2019-04-29 Thread Shakeel Butt
On Mon, Apr 29, 2019 at 5:41 PM Michal Hocko wrote: > > On Mon 29-04-19 10:13:32, Shakeel Butt wrote: > [...] > > /* > >* For queues with unlimited length lost events are not expected and > >* can possibly have security implication

Re: [PATCH v2] memcg: make it work on sparse non-0-node systems

2019-05-17 Thread Shakeel Butt
eness > by a bool flag in struct list_lru. > > [v2] use the idea proposed by Vladimir -- the bool flag. > > Signed-off-by: Jiri Slaby Reviewed-by: Shakeel Butt > Cc: Johannes Weiner > Cc: Michal Hocko > Suggested-by: Vladimir Davydov > Acked-by: Vladimir Davydov > Cc:

[PATCH] mm, memcg: introduce memory.events.local

2019-05-17 Thread Shakeel Butt
there will not be any process in the internal nodes and thus no chance of local pressure. Signed-off-by: Shakeel Butt --- include/linux/memcontrol.h | 7 ++- mm/memcontrol.c| 25 + 2 files changed, 31 insertions(+), 1 deletion(-) diff --git a/include/linux

[PATCH v2] mm, memcg: introduce memory.events.local

2019-05-17 Thread Shakeel Butt
there will not be any process in the internal nodes and thus no chance of local pressure. Signed-off-by: Shakeel Butt --- Changelog since v1: - refactor memory_events_show to share between events and events.local include/linux/memcontrol.h | 7 ++- mm/memcontrol.c| 34

Re: [PATCH v2] mm, memcg: introduce memory.events.local

2019-05-17 Thread Shakeel Butt
On Fri, May 17, 2019 at 5:59 PM Roman Gushchin wrote: > > On Fri, May 17, 2019 at 05:18:18PM -0700, Shakeel Butt wrote: > > The memory controller in cgroup v2 exposes memory.events file for each > > memcg which shows the number of times events like low, high, max, oom >

[PATCH] cgroup: remove extra cgroup_migrate_finish() call

2019-04-03 Thread Shakeel Butt
The callers of cgroup_migrate_prepare_dst() correctly call cgroup_migrate_finish() in both the success and failure cases. No need to call it in cgroup_migrate_prepare_dst() in the failure case. Signed-off-by: Shakeel Butt --- kernel/cgroup/cgroup.c | 5 + 1 file changed, 1 insertion(+), 4

[RFC PATCH] mm, kvm: account kvm_vcpu_mmap to kmemcg

2019-03-28 Thread Shakeel Butt
. The simplest solution is to remove the assumption of no mmapping PageKmemcg() pages to user space. Signed-off-by: Shakeel Butt --- arch/s390/kvm/kvm-s390.c | 2 +- arch/x86/kvm/x86.c | 2 +- include/linux/page-flags.h | 26 ++ include/trace/events

Re: [RFC PATCH] mm, kvm: account kvm_vcpu_mmap to kmemcg

2019-03-28 Thread Shakeel Butt
On Thu, Mar 28, 2019 at 7:36 PM Matthew Wilcox wrote: > > On Thu, Mar 28, 2019 at 06:28:36PM -0700, Shakeel Butt wrote: > > A VCPU of a VM can allocate up to three pages which can be mmap'ed by the > > user space application. At the moment this memory is not charged. On a >

Re: [RFC PATCH] mm, kvm: account kvm_vcpu_mmap to kmemcg

2019-03-29 Thread Shakeel Butt
On Fri, Mar 29, 2019 at 12:52 AM Michal Hocko wrote: > > On Thu 28-03-19 18:28:36, Shakeel Butt wrote: > > A VCPU of a VM can allocate up to three pages which can be mmap'ed by the > > user space application. At the moment this memory is not charged. On a > > large mach

Re: [PATCH 3/4] mm: memcontrol: fix recursive statistics correctness & scalabilty

2019-04-12 Thread Shakeel Butt
the accuracy of stats is getting worse. Internally we have an additional interface memory.stat_exact for that. However, I am not sure whether, for the upstream kernel, an additional interface would be better or something like /proc/sys/vm/stat_refresh, which syncs all per-cpu stats. > Signed-off-by: Johannes Weiner

Re: [PATCH 0/4] mm: memcontrol: memory.stat cost & correctness

2019-04-12 Thread Shakeel Butt
ange. > > The upward spilling is batched using the existing per-cpu cache. In a > sparse file stress test with 5 level cgroup nesting, the additional > cost of the flushing was negligible (a little under 1% of CPU at 100% > CPU utilization, compared to the 5% of reading memory.s

Re: [PATCH 3/4] mm: memcontrol: fix recursive statistics correctness & scalabilty

2019-04-12 Thread Shakeel Butt
On Fri, Apr 12, 2019 at 1:10 PM Johannes Weiner wrote: > > On Fri, Apr 12, 2019 at 12:55:10PM -0700, Shakeel Butt wrote: > > We also faced this exact same issue and had a similar solution. > > > > > Signed-off-by: Johannes Weiner > > > >

Re: [PATCH 3/4] mm: memcontrol: fix recursive statistics correctness & scalabilty

2019-04-12 Thread Shakeel Butt
On Fri, Apr 12, 2019 at 1:16 PM Roman Gushchin wrote: > > On Fri, Apr 12, 2019 at 12:55:10PM -0700, Shakeel Butt wrote: > > On Fri, Apr 12, 2019 at 8:15 AM Johannes Weiner wrote: > > > > > > Right now, when somebody needs to know the recursive memory statistic

Re: KASAN: null-ptr-deref Read in reclaim_high

2019-03-12 Thread Shakeel Butt
> > > > > > > > On Mon, 11 Mar 2019 06:08:01 -0700 syzbot > > > > wrote: > > > > > > > > > syzbot has bisected this bug to: > > > > > > > > > > commit 29a4b8e275d1f10c51c7891362877ef6cffae9e7 > > >

Re: [PATCH v3 1/2] mm, oom: fix use-after-free in oom_kill_process

2019-01-23 Thread Shakeel Butt
On Wed, Jan 23, 2019 at 2:57 PM Sasha Levin wrote: > > Hi, > > [This is an automated email] > > This commit has been processed because it contains a "Fixes:" tag, > fixing commit: 6b0c81b3be11 mm, oom: reduce dependency on tasklist_lock. > > The bot has tested the following trees: v4.20.3,

[RFC PATCH] mm, oom: fix use-after-free in oom_kill_process

2019-01-18 Thread Shakeel Butt
the history but it seems like this is there before git history. Signed-off-by: Shakeel Butt --- mm/oom_kill.c | 8 1 file changed, 8 insertions(+) diff --git a/mm/oom_kill.c b/mm/oom_kill.c index 0930b4365be7..1a007dae1e8f 100644 --- a/mm/oom_kill.c +++ b/mm/oom_kill.c @@ -981,6 +981,13 @@

Re: [RFC PATCH] mm, oom: fix use-after-free in oom_kill_process

2019-01-20 Thread Shakeel Butt
On Fri, Jan 18, 2019 at 7:35 PM Tetsuo Handa wrote: > > On 2019/01/19 9:50, Shakeel Butt wrote: > > On looking further it seems like the process selected to be oom-killed > > has exited even before reaching read_lock(&tasklist_lock) in > > oom_kill_process(). More specifically t

Re: [RFC PATCH] mm, oom: fix use-after-free in oom_kill_process

2019-01-20 Thread Shakeel Butt
On Fri, Jan 18, 2019 at 11:09 PM Michal Hocko wrote: > > On Fri 18-01-19 16:50:22, Shakeel Butt wrote: > [...] > > On looking further it seems like the process selected to be oom-killed > > has exited even before reaching read_lock(&tasklist_lock) in > > oom_kill_process()

Re: [RFC PATCH] mm, oom: fix use-after-free in oom_kill_process

2019-01-20 Thread Shakeel Butt
On Fri, Jan 18, 2019 at 5:58 PM Roman Gushchin wrote: > > Hi Shakeel! > > > > > On looking further it seems like the process selected to be oom-killed > > has exited even before reaching read_lock(&tasklist_lock) in > > oom_kill_process(). More specifically the tsk->usage is 1 which is due > > to

[PATCH] mm, oom: remove 'prefer children over parent' heuristic

2019-01-21 Thread Shakeel Butt
led before the parent. The select_bad_process() has already selected the worst process in the system/memcg. There is no need to recheck the badness of its children and hoping to find a worse candidate. That's a lot of unneeded racy work. So, let's remove this whole heuristic. Signed-off-by: Shakeel B

Re: [PATCH] mm, oom: remove 'prefer children over parent' heuristic

2019-01-21 Thread Shakeel Butt
On Sun, Jan 20, 2019 at 5:23 PM Tetsuo Handa wrote: > > Shakeel Butt wrote: > > + pr_err("%s: Kill process %d (%s) score %lu or sacrifice child\n", > > + message, task_pid_nr(p), p->comm, oc->chosen_points); > > This patch is to make

Re: [PATCH] mm, oom: remove 'prefer children over parent' heuristic

2019-01-21 Thread Shakeel Butt
On Mon, Jan 21, 2019 at 1:19 AM Michal Hocko wrote: > > On Sun 20-01-19 13:50:59, Shakeel Butt wrote: > > From the start of the git history of Linux, the kernel after selecting > > the worst process to be oom-killed, prefer to kill its child (if the > > child does n

[PATCH v2 1/2] mm, oom: fix use-after-free in oom_kill_process

2019-01-21 Thread Shakeel Butt
xes: 6b0c81b3be11 ("mm, oom: reduce dependency on tasklist_lock") Signed-off-by: Shakeel Butt Reviewed-by: Roman Gushchin Acked-by: Michal Hocko Cc: Andrew Morton Cc: David Rientjes Cc: Johannes Weiner Cc: Tetsuo Handa Cc: sta...@kernel.org Cc: linux...@kvack.org Cc: linux-kernel@vg

[PATCH v2 2/2] mm, oom: remove 'prefer children over parent' heuristic

2019-01-21 Thread Shakeel Butt
ike workloads to recover much later because we constantly pick and kill processes which are not memory hogs. So, let's remove this whole heuristic. Signed-off-by: Shakeel Butt Acked-by: Michal Hocko Cc: Roman Gushchin Cc: Andrew Morton Cc: David Rientjes Cc: Johannes Weiner Cc: Tetsuo Handa

[PATCH v3 1/2] mm, oom: fix use-after-free in oom_kill_process

2019-01-21 Thread Shakeel Butt
xes: 6b0c81b3be11 ("mm, oom: reduce dependency on tasklist_lock") Signed-off-by: Shakeel Butt Reviewed-by: Roman Gushchin Acked-by: Michal Hocko Cc: Andrew Morton Cc: David Rientjes Cc: Johannes Weiner Cc: Tetsuo Handa Cc: sta...@kernel.org Cc: linux...@kvack.org Cc: linux-kernel@vg

[PATCH v3 2/2] mm, oom: remove 'prefer children over parent' heuristic

2019-01-21 Thread Shakeel Butt
ike workloads to recover much later because we constantly pick and kill processes which are not memory hogs. So, let's remove this whole heuristic. Signed-off-by: Shakeel Butt Acked-by: Michal Hocko Cc: Roman Gushchin Cc: Andrew Morton Cc: David Rientjes Cc: Johannes Weiner Cc: Tetsuo Handa

Re: [PATCH v3 2/2] mm, oom: remove 'prefer children over parent' heuristic

2019-01-21 Thread Shakeel Butt
On Mon, Jan 21, 2019 at 1:59 PM Shakeel Butt wrote: > > From the start of the git history of Linux, the kernel after selecting > the worst process to be oom-killed, prefer to kill its child (if the > child does not share mm with the parent). Later it was changed to prefer > to k

Re: general protection fault in put_pid

2019-01-07 Thread Shakeel Butt
On Mon, Jan 7, 2019 at 10:04 AM Manfred Spraul wrote: > > On 1/3/19 11:18 PM, Shakeel Butt wrote: > > Hi Manfred, > > > > On Sun, Dec 23, 2018 at 4:26 AM Manfred Spraul > > wrote: > >> Hello Dmitry, > >> > >> On 12/23/18 10:57 AM, Dmitry

Re: [PATCH] memcg: schedule high reclaim for remote memcgs on high_work

2019-01-08 Thread Shakeel Butt
On Tue, Jan 8, 2019 at 6:59 AM Michal Hocko wrote: > > On Wed 02-01-19 17:56:38, Shakeel Butt wrote: > > If a memcg is over high limit, memory reclaim is scheduled to run on > > return-to-userland. However it is assumed that the memcg is the current > > process's memcg. Wi

[PATCH] fork, memcg: fix cached_stacks case

2019-01-02 Thread Shakeel Butt
ched stack is failed. Fixes: 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg charge fail") Signed-off-by: Shakeel Butt Cc: Rik van Riel Cc: Roman Gushchin Cc: Michal Hocko Cc: Johannes Weiner Cc: Tejun Heo Cc: --- kernel/fork.c | 1 + 1 file changed, 1 inse

Re: [PATCH 1/3] doc: memcontrol: fix the obsolete content about force empty

2019-01-02 Thread Shakeel Butt
On Wed, Jan 2, 2019 at 12:07 PM Yang Shi wrote: > > We don't do page cache reparent anymore when offlining memcg, so update > force empty related content accordingly. > > Cc: Michal Hocko > Cc: Johannes Weiner > Signed-off-by: Yang Shi Reviewed-by: Shakeel Butt > ---

Re: [PATCH 2/3] mm: memcontrol: do not try to do swap when force empty

2019-01-02 Thread Shakeel Butt
On Wed, Jan 2, 2019 at 12:06 PM Yang Shi wrote: > > The typical usecase of force empty is to try to reclaim as much as > possible memory before offlining a memcg. Since there should be no > attached tasks to offlining memcg, the tasks anonymous pages would have > already been freed or uncharged.

[PATCH] memcg: localize memcg_kmem_enabled() check

2019-01-02 Thread Shakeel Butt
but functionally it will be the same. This should not matter as memcg_charge_slab() is not in the hot path. Signed-off-by: Shakeel Butt --- fs/pipe.c | 3 +-- include/linux/memcontrol.h | 28 mm/memcontrol.c| 16 mm/page_alloc.c

[PATCH] memcg: schedule high reclaim for remote memcgs on high_work

2019-01-02 Thread Shakeel Butt
, scheduling reclaim on return-to-userland for remote memcgs will ignore the high reclaim altogether. So, punt the high reclaim of remote memcgs to high_work. Signed-off-by: Shakeel Butt --- mm/memcontrol.c | 20 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/mm

[PATCH v2] netfilter: account ebt_table_info to kmemcg

2019-01-02 Thread Shakeel Butt
allocations, we need to fix vmalloc. Reported-by: syzbot+7713f3aa67be76b15...@syzkaller.appspotmail.com Signed-off-by: Shakeel Butt Cc: Florian Westphal Cc: Michal Hocko Cc: Kirill Tkhai Cc: Pablo Neira Ayuso Cc: Jozsef Kadlecsik Cc: Roopa Prabhu Cc: Nikolay Aleksandrov Cc: Andrew Morton Cc

[PATCH v2] memcg: localize memcg_kmem_enabled() check

2019-01-03 Thread Shakeel Butt
but functionally it will be the same. This should not matter as memcg_charge_slab() is not in the hot path. Signed-off-by: Shakeel Butt --- Changelog since v1: - Fixed the build when CONFIG_MEMCG is not set fs/pipe.c | 3 +-- include/linux/memcontrol.h | 37

Re: [PATCH v2] netfilter: account ebt_table_info to kmemcg

2019-01-03 Thread Shakeel Butt
On Thu, Jan 3, 2019 at 2:15 AM William Kucharski wrote: > > > > > On Jan 2, 2019, at 8:14 PM, Shakeel Butt wrote: > > > > countersize = COUNTER_OFFSET(tmp.nentries) * nr_cpu_ids; > > - newinfo = vmalloc(sizeof(*newinfo) + countersize); > > +

Re: [PATCH 2/3] mm: memcontrol: do not try to do swap when force empty

2019-01-03 Thread Shakeel Butt
On Thu, Jan 3, 2019 at 8:57 AM Yang Shi wrote: > > > > On 1/2/19 1:45 PM, Shakeel Butt wrote: > > On Wed, Jan 2, 2019 at 12:06 PM Yang Shi wrote: > >> The typical usecase of force empty is to try to reclaim as much as > >> possible memory before of

Re: [PATCH] netfilter: account ebt_table_info to kmemcg

2019-01-03 Thread Shakeel Butt
On Mon, Dec 31, 2018 at 2:12 AM Michal Hocko wrote: > > On Sun 30-12-18 19:59:53, Shakeel Butt wrote: > > On Sun, Dec 30, 2018 at 12:00 AM Michal Hocko wrote: > > > > > > On Sun 30-12-18 08:45:13, Michal Hocko wrote: > > > > On Sat 29-12-18 11:34:29, Sh

Re: general protection fault in put_pid

2019-01-03 Thread Shakeel Butt
Hi Manfred, On Sun, Dec 23, 2018 at 4:26 AM Manfred Spraul wrote: > > Hello Dmitry, > > On 12/23/18 10:57 AM, Dmitry Vyukov wrote: > > > > I can reproduce this infinite memory consumption with the C program: > >

Re: [v2 PATCH 2/5] mm: memcontrol: do not try to do swap when force empty

2019-01-04 Thread Shakeel Butt
On Fri, Jan 4, 2019 at 4:21 PM Yang Shi wrote: > > The typical usecase of force empty is to try to reclaim as much as > possible memory before offlining a memcg. Since there should be no > attached tasks to offlining memcg, the tasks anonymous pages would have > already been freed or uncharged.

Re: [v2 PATCH 3/5] mm: memcontrol: introduce wipe_on_offline interface

2019-01-04 Thread Shakeel Butt
On Fri, Jan 4, 2019 at 4:21 PM Yang Shi wrote: > > We have some usecases which create and remove memcgs very frequently, > and the tasks in the memcg may just access the files which are unlikely > accessed by anyone else. So, we prefer force_empty the memcg before > rmdir'ing it to reclaim the

[PATCH v2] memcg: schedule high reclaim for remote memcgs on high_work

2019-01-08 Thread Shakeel Butt
is not the descendant of the memcg needing high reclaim, punt the high reclaim to the work queue. Signed-off-by: Shakeel Butt --- Changelog since v1: - Punt high reclaim of a memcg to work queue only if the recorded memcg is not its descendant. include/linux/sched.h | 3 +++ kernel/fork.c | 1

Re: [PATCH] mm,slab,memcg: call memcg kmem put cache with same condition as get

2019-01-08 Thread Shakeel Butt
On Tue, Jan 8, 2019 at 8:01 PM Rik van Riel wrote: > > There is an imbalance between when slab_pre_alloc_hook calls > memcg_kmem_get_cache and when slab_post_alloc_hook calls > memcg_kmem_put_cache. > Can you explain how there is an imbalance? If the returned kmem cache from

Re: [PATCH] mm,slab,memcg: call memcg kmem put cache with same condition as get

2019-01-08 Thread Shakeel Butt
On Tue, Jan 8, 2019 at 9:36 PM Shakeel Butt wrote: > > On Tue, Jan 8, 2019 at 8:01 PM Rik van Riel wrote: > > > > There is an imbalance between when slab_pre_alloc_hook calls > > memcg_kmem_get_cache and when slab_post_alloc_hook calls > > memcg_kmem_put_cach

Re: [PATCH RFC 0/3] mm: Reduce IO by improving algorithm of memcg pagecache pages eviction

2019-01-09 Thread Shakeel Butt
Hi Kirill, On Wed, Jan 9, 2019 at 4:20 AM Kirill Tkhai wrote: > > On nodes without memory overcommit, it's common a situation, > when memcg exceeds its limit and pages from pagecache are > shrinked on reclaim, while node has a lot of free memory. > Further access to the pages requires real

Re: [PATCH RFC 0/3] mm: Reduce IO by improving algorithm of memcg pagecache pages eviction

2019-01-09 Thread Shakeel Butt
Hi Johannes, On Wed, Jan 9, 2019 at 8:45 AM Johannes Weiner wrote: > > On Wed, Jan 09, 2019 at 03:20:18PM +0300, Kirill Tkhai wrote: > > On nodes without memory overcommit, it's common a situation, > > when memcg exceeds its limit and pages from pagecache are > > shrinked on reclaim, while node

[PATCH v3] memcg: schedule high reclaim for remote memcgs on high_work

2019-01-10 Thread Shakeel Butt
is not the descendant of the memcg needing high reclaim, punt the high reclaim to the work queue. Signed-off-by: Shakeel Butt --- Changelog since v2: - TIF_NOTIFY_RESUME can be set from places other than try_charge() in which case current->memcg_high_reclaim will be null. Correctly han

Re: [PATCH RFC 0/3] mm: Reduce IO by improving algorithm of memcg pagecache pages eviction

2019-01-10 Thread Shakeel Butt
On Thu, Jan 10, 2019 at 1:46 AM Kirill Tkhai wrote: > > Hi, Shakeel, > > On 09.01.2019 20:37, Shakeel Butt wrote: > > Hi Kirill, > > > > On Wed, Jan 9, 2019 at 4:20 AM Kirill Tkhai wrote: > >> > >> On nodes without memory overcommit, it's commo

Re: [PATCH v3] memcg: schedule high reclaim for remote memcgs on high_work

2019-01-11 Thread Shakeel Butt
Hi Johannes, On Fri, Jan 11, 2019 at 12:59 PM Johannes Weiner wrote: > > Hi Shakeel, > > On Thu, Jan 10, 2019 at 09:44:32AM -0800, Shakeel Butt wrote: > > If a memcg is over high limit, memory reclaim is scheduled to run on > > return-to-userland. However it i

Re: [PATCH] mm: Introduce GFP_PGTABLE

2019-01-12 Thread Shakeel Butt
On Sat, Jan 12, 2019 at 2:27 AM Anshuman Khandual wrote: > > All architectures have been defining their own PGALLOC_GFP as (GFP_KERNEL | > __GFP_ZERO) and using it for allocating page table pages. This causes some > code duplication which can be easily avoided. GFP_KERNEL allocated and > cleared

Re: [PATCH] mm: Introduce GFP_PGTABLE

2019-01-12 Thread Shakeel Butt
On Sat, Jan 12, 2019 at 7:50 AM Matthew Wilcox wrote: > > On Sat, Jan 12, 2019 at 02:49:29PM +0100, Christophe Leroy wrote: > > As far as I can see, > > > > #define GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT) > > > > So what's the difference between: > > > > (GFP_KERNEL_ACCOUNT | __GFP_ZERO)

[PATCH] percpu: plumb gfp flag to pcpu_get_pages

2018-12-28 Thread Shakeel Butt
__alloc_percpu_gfp() can be called from atomic context, so, make pcpu_get_pages use the gfp provided to the higher layer. Signed-off-by: Shakeel Butt --- mm/percpu-vm.c | 9 + 1 file changed, 5 insertions(+), 4 deletions(-) diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c index

[PATCH] netfilter: account ebt_table_info to kmemcg

2018-12-28 Thread Shakeel Butt
+7713f3aa67be76b15...@syzkaller.appspotmail.com Signed-off-by: Shakeel Butt --- net/bridge/netfilter/ebtables.c | 6 -- 1 file changed, 4 insertions(+), 2 deletions(-) diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c index 491828713e0b..5e55cef0cec3 100644 --- a/net/bridge

Re: [PATCH] netfilter: account ebt_table_info to kmemcg

2018-12-29 Thread Shakeel Butt
On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko wrote: > > On Sat 29-12-18 10:52:15, Florian Westphal wrote: > > Michal Hocko wrote: > > > On Fri 28-12-18 17:55:24, Shakeel Butt wrote: > > > > The [ip,ip6,arp]_tables use x_tables_info internally and the und

Re: Re: [PATCH] netfilter: account ebt_table_info to kmemcg

2018-12-29 Thread Shakeel Butt
Hi Kirill, On Sat, Dec 29, 2018 at 1:52 AM Kirill Tkhai wrote: > > Hi, Michal! > > On 29.12.2018 10:33, Michal Hocko wrote: > > On Fri 28-12-18 17:55:24, Shakeel Butt wrote: > >> The [ip,ip6,arp]_tables use x_tables_info internally and the underlying > >> m

Re: [PATCH] percpu: plumb gfp flag to pcpu_get_pages

2018-12-29 Thread Shakeel Butt
Hi Dennis, On Sat, Dec 29, 2018 at 1:26 PM Dennis Zhou wrote: > > Hi Andrew, > > On Sat, Dec 29, 2018 at 01:03:52PM -0800, Andrew Morton wrote: > > On Fri, 28 Dec 2018 17:31:47 -0800 Shakeel Butt wrote: > > > > > __alloc_percpu_gfp() can be cal

Re: [PATCH] netfilter: account ebt_table_info to kmemcg

2018-12-30 Thread Shakeel Butt
On Sun, Dec 30, 2018 at 12:00 AM Michal Hocko wrote: > > On Sun 30-12-18 08:45:13, Michal Hocko wrote: > > On Sat 29-12-18 11:34:29, Shakeel Butt wrote: > > > On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko wrote: > > > > > > > > On Sat 29-12-18 10:52

Re: [PATCH] netfilter: account ebt_table_info to kmemcg

2018-12-30 Thread Shakeel Butt
On Sat, Dec 29, 2018 at 11:45 PM Michal Hocko wrote: > > On Sat 29-12-18 11:34:29, Shakeel Butt wrote: > > On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko wrote: > > > > > > On Sat 29-12-18 10:52:15, Florian Westphal wrote: > > > > Michal Hocko wrote:

[PATCH] memcg: reduce memcg tree traversals for stats collection

2018-07-24 Thread Shakeel Butt
memcgs on cgroup-v1. The results are: Without the patch: $ time ./read-root-stat-1000-times real 0m1.663s user 0m0.000s sys 0m1.660s With the patch: $ time ./read-root-stat-1000-times real 0m0.468s user 0m0.000s sys 0m0.467s Signed-off-by: Shakeel Butt --- mm/memcontrol.c
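The source of ./read-root-stat-1000-times is not part of this excerpt. A minimal sketch of what such a reader might look like is given here, assuming it simply opens a stat file and reads it to the end a fixed number of times. The helper name read_stat_repeatedly and the 4096-byte buffer are hypothetical, and the caller supplies the path (on a cgroup-v1 system the root memcg stat file is commonly mounted at /sys/fs/cgroup/memory/memory.stat, but that location is an assumption about the test machine).

```c
#include <stdio.h>

/*
 * Hypothetical helper, not taken from the patch: open the file at `path`
 * and read it to EOF, `iterations` times in a row. Each open/read cycle
 * makes the kernel regenerate the stat output, which is the cost the
 * benchmark above is timing.
 *
 * Returns the total number of bytes read, or -1 if the file cannot be
 * opened.
 */
long read_stat_repeatedly(const char *path, int iterations)
{
    char buf[4096];
    long total = 0;

    for (int i = 0; i < iterations; i++) {
        FILE *f = fopen(path, "r");
        size_t n;

        if (!f)
            return -1;
        while ((n = fread(buf, 1, sizeof(buf), f)) > 0)
            total += (long)n;
        fclose(f);
    }
    return total;
}
```

Calling read_stat_repeatedly(path, 1000) under time(1) would give numbers comparable in kind to the ones quoted above; the absolute values depend on the number of memcgs and CPUs.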

Re: [PATCH] memcg: reduce memcg tree traversals for stats collection

2018-07-25 Thread Shakeel Butt
On Wed, Jul 25, 2018 at 4:26 AM Bruce Merry wrote: > > On 25 July 2018 at 00:46, Shakeel Butt wrote: > > I ran a simple benchmark which reads the root_mem_cgroup's stat file > > 1000 times in the presence of 2500 memcgs on cgroup-v1. The results are: > > > > Withou

[PATCH v3] mm: fix race between kmem_cache destroy, create and deactivate

2018-05-29 Thread Shakeel Butt
kmemcg_deactivate_workfn process_one_work worker_thread kthread ret_from_fork+0x35/0x40 To fix this race, on root kmem cache destruction, mark the cache as dying and flush the workqueue used for memcg kmem cache creation and deactivation. Signed-off-by: Shakeel Butt --- Changelog

Re: [PATCH] memcg: force charge kmem counter too

2018-05-30 Thread Shakeel Butt
On Tue, May 29, 2018 at 1:31 AM, Michal Hocko wrote: > On Mon 28-05-18 10:23:07, Shakeel Butt wrote: >> On Mon, May 28, 2018 at 2:11 AM, Michal Hocko wrote: >> Though is there a precedent where the broken feature is not fixed >> because an alternative is available?

Re: [PATCH v3] mm: fix race between kmem_cache destroy, create and deactivate

2018-05-31 Thread Shakeel Butt
On Thu, May 31, 2018 at 5:18 PM, Andrew Morton wrote: > On Tue, 29 May 2018 17:12:04 -0700 Shakeel Butt wrote: > >> The memcg kmem cache creation and deactivation (SLUB only) is >> asynchronous. If a root kmem cache is destroyed whose memcg cache is in >> the process of

[PATCH] block, mm: remove unnecessary __GFP_HIGH flag

2018-07-03 Thread Shakeel Butt
The flag GFP_ATOMIC already contains __GFP_HIGH. There is no need to explicitly OR in __GFP_HIGH again. So, just remove the unnecessary __GFP_HIGH. Signed-off-by: Shakeel Butt --- block/blk-ioc.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/block/blk-ioc.c b/block/blk-ioc.c
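The reasoning in this message is plain bitmask arithmetic: OR-ing a bit into a mask that already has it set changes nothing. The sketch below illustrates that in userspace C. All DEMO_* names and bit values are made up for illustration; the real GFP definitions live in include/linux/gfp.h and vary between kernel versions. The only property assumed is the one the message states, namely that the composite mask includes the high-priority bit.

```c
/* Illustrative stand-ins only; these are NOT the kernel's values. */
#define DEMO_GFP_HIGH    0x20u     /* stands in for __GFP_HIGH */
#define DEMO_GFP_OTHER   0x80000u  /* stands in for any other bits in the mask */
#define DEMO_GFP_ATOMIC  (DEMO_GFP_HIGH | DEMO_GFP_OTHER)

/* Compile-time check: adding a bit the mask already contains is a no-op. */
_Static_assert((DEMO_GFP_ATOMIC | DEMO_GFP_HIGH) == DEMO_GFP_ATOMIC,
               "OR-ing an already-set bit must not change the mask");

/* Hypothetical helper mirroring the redundant pattern the patch removes. */
unsigned int demo_with_high(unsigned int gfp)
{
    return gfp | DEMO_GFP_HIGH;
}
```

With these definitions, demo_with_high(DEMO_GFP_ATOMIC) returns DEMO_GFP_ATOMIC unchanged, which is why dropping the extra flag does not alter behavior.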

Re: [PATCH v8 03/17] mm: Assign id to every memcg-aware shrinker

2018-07-03 Thread Shakeel Butt
On Tue, Jul 3, 2018 at 12:13 PM Kirill Tkhai wrote: > > On 03.07.2018 20:58, Matthew Wilcox wrote: > > On Tue, Jul 03, 2018 at 06:46:57PM +0300, Kirill Tkhai wrote: > >> shrinker_idr now contains only memcg-aware shrinkers, so all bits from > >> memcg map > >> may be potentially populated. In

Re: [PATCH v8 03/17] mm: Assign id to every memcg-aware shrinker

2018-07-03 Thread Shakeel Butt
On Tue, Jul 3, 2018 at 12:25 PM Matthew Wilcox wrote: > > On Tue, Jul 03, 2018 at 12:19:35PM -0700, Shakeel Butt wrote: > > On Tue, Jul 3, 2018 at 12:13 PM Kirill Tkhai wrote: > > > > Do we really have so very many !memcg-aware shrinkers? > > > > > &g

Re: [PATCH] mm: Cleanup in do_shrink_slab()

2018-07-19 Thread Shakeel Butt
f-by: Kirill Tkhai Reviewed-by: Shakeel Butt > --- > mm/vmscan.c | 11 +++ > 1 file changed, 3 insertions(+), 8 deletions(-) > > diff --git a/mm/vmscan.c b/mm/vmscan.c > index 9918bfc1d2f9..636657213b9b 100644 > --- a/mm/vmscan.c > +++ b/mm/vmscan.c > @@ -445,16

Re: [PATCH] mm: memcg: fix use after free in mem_cgroup_iter()

2018-07-19 Thread Shakeel Butt
On Thu, Jul 19, 2018 at 3:43 AM Michal Hocko wrote: > > [CC Andrew] > > On Thu 19-07-18 18:06:47, Jing Xia wrote: > > It was reported that a kernel crash happened in mem_cgroup_iter(), > > which can be triggered if the legacy cgroup-v1 non-hierarchical > > mode is used. > > > > Unable to handle

Re: [PATCH] proc: fixup PDE allocation bloat

2018-07-19 Thread Shakeel Butt
On Sun, Jun 17, 2018 at 2:57 PM Alexey Dobriyan wrote: > > commit 24074a35c5c975c94cd9691ae962855333aac47f > ("proc: Make inline name size calculation automatic") > started to put PDE allocations into kmalloc-256 which is unnecessary as > ~40 character names are very rare. > > Put allocation back

Re: [PATCH] mm: memcg: fix use after free in mem_cgroup_iter()

2018-07-23 Thread Shakeel Butt
On Sun, Jul 22, 2018 at 11:44 PM Michal Hocko wrote: > > On Thu 19-07-18 09:23:10, Shakeel Butt wrote: > > On Thu, Jul 19, 2018 at 3:43 AM Michal Hocko wrote: > > > > > > [CC Andrew] > > > > > > On Thu 19-07-18 18:06:47, Jing Xia wrote: >

[PATCH v4] mm: fix race between kmem_cache destroy, create and deactivate

2018-06-11 Thread Shakeel Butt
includes RCU callback and thus make sure all previous registered RCU callbacks have completed as well. Signed-off-by: Shakeel Butt --- Changelog since v3: - Handle the RCU callbacks for SLUB deactivation Changelog since v2: - Rewrote the patch and used workqueue flushing instead of refcount

Re: [PATCH v2] mm: slowly shrink slabs with a relatively small number of objects

2018-09-05 Thread Shakeel Butt
On Wed, Sep 5, 2018 at 2:23 PM Roman Gushchin wrote: > > On Wed, Sep 05, 2018 at 01:51:52PM -0700, Andrew Morton wrote: > > On Tue, 4 Sep 2018 15:47:07 -0700 Roman Gushchin wrote: > > > > > Commit 9092c71bb724 ("mm: use sc->priority for slab shrink targets") > > > changed the way how the target

Re: [PATCH] mm: avoid bothering interrupted task when charge memcg in softirq

2018-07-14 Thread Shakeel Butt
On Sat, Jul 14, 2018 at 1:32 AM Yafang Shao wrote: > > try_charge maybe executed in packet receive path, which is in interrupt > context. > In this situation, the 'current' is the interrupted task, which may has > no relation to the rx softirq, So it is nonsense to use 'current'. > Have you

Re: [PATCH] mm: avoid bothering interrupted task when charge memcg in softirq

2018-07-14 Thread Shakeel Butt
On Sat, Jul 14, 2018 at 7:10 PM Yafang Shao wrote: > > On Sat, Jul 14, 2018 at 11:38 PM, Shakeel Butt wrote: > > On Sat, Jul 14, 2018 at 1:32 AM Yafang Shao wrote: > >> > >> try_charge maybe executed in packet receive path, which is in interrupt > &

Re: [PATCH] mm: avoid bothering interrupted task when charge memcg in softirq

2018-07-15 Thread Shakeel Butt
On Sat, Jul 14, 2018 at 10:26 PM Yafang Shao wrote: > > On Sun, Jul 15, 2018 at 12:25 PM, Shakeel Butt wrote: > > On Sat, Jul 14, 2018 at 7:10 PM Yafang Shao wrote: > >> > >> On Sat, Jul 14, 2018 at 11:38 PM, Shakeel Butt wrote: > >> > On Sat,

Re: [PATCH] mm: avoid bothering interrupted task when charge memcg in softirq

2018-07-15 Thread Shakeel Butt
On Sun, Jul 15, 2018 at 1:02 AM Yafang Shao wrote: > > On Sun, Jul 15, 2018 at 2:34 PM, Shakeel Butt wrote: > > On Sat, Jul 14, 2018 at 10:26 PM Yafang Shao wrote: > >> > >> On Sun, Jul 15, 2018 at 12:25 PM, Shakeel Butt wrote: > >> > On Sat,

Re: [PATCH] mm: avoid bothering interrupted task when charge memcg in softirq

2018-07-15 Thread Shakeel Butt
On Sun, Jul 15, 2018 at 6:50 PM Yafang Shao wrote: > > On Sun, Jul 15, 2018 at 11:04 PM, Shakeel Butt wrote: > > On Sun, Jul 15, 2018 at 1:02 AM Yafang Shao wrote: > >> > >> On Sun, Jul 15, 2018 at 2:34 PM, Shakeel Butt wrote: > >> > On Sat, Jul 14, 2

Re: [PATCH v7 00/17] Improve shrink_slab() scalability (old complexity was O(n^2), new is O(n))

2018-06-06 Thread Shakeel Butt
On Tue, May 22, 2018 at 3:07 AM Kirill Tkhai wrote: > > Hi, > > this patches solves the problem with slow shrink_slab() occurring > on the machines having many shrinkers and memory cgroups (i.e., > with many containers). The problem is complexity of shrink_slab() > is O(n^2) and it grows too fast
