From: Roman Gushchin
Date: Wed, May 8, 2019 at 1:40 PM
To: Andrew Morton, Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin
> Currently the page accounting code is duplicated in SLAB and SLUB
> internals. Let'
From: Roman Gushchin
Date: Wed, May 8, 2019 at 1:41 PM
To: Andrew Morton, Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin
> This commit makes several important changes in the lifecycle
> of a non-root kmem_cache,
From: Roman Gushchin
Date: Wed, May 8, 2019 at 1:41 PM
To: Andrew Morton, Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin
> Let's reparent memcg slab memory on memcg offlining. This allows us
> to release the memory
From: Roman Gushchin
Date: Wed, May 8, 2019 at 1:40 PM
To: Andrew Morton, Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin
> Switching to an indirect scheme of getting mem_cgroup pointer for
> !root slab pages broke
emcg, explicitly add
__GFP_RETRY_MAYFAIL to the fanotify and inotify event allocations.
Signed-off-by: Shakeel Butt
Reviewed-by: Roman Gushchin
---
Changelog since v1:
- Fixed usage of __GFP_RETRY_MAYFAIL flag.
fs/notify/fanotify/fanotify.c | 5 -
fs/notify/inotify/inotify_fsnotify.c
-killer in the charging path for fanotify and inotify
event allocations.
Signed-off-by: Shakeel Butt
Acked-by: Michal Hocko
---
Changelog since v1:
- commit message updated.
mm/memcontrol.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
On Fri, Apr 19, 2019 at 1:07 PM Roman Gushchin wrote:
>
> On Thu, Apr 18, 2019 at 02:42:24PM -0700, Shakeel Butt wrote:
> > The commit 475d0487a2ad ("mm: memcontrol: use per-cpu stocks for socket
> > memory uncharging") added refill_stock() for skmem uncharging pa
ned
memcgs but it may impact the performance of network traffic for the
sockets used by other cgroups.
Signed-off-by: Shakeel Butt
Cc: Roman Gushchin
Cc: Johannes Weiner
Cc: Michal Hocko
Cc: Vladimir Davydov
Cc: Andrew Morton
---
Changelog since v1:
- No need to bypass offline memcgs in the re
From: Roman Gushchin
Date: Mon, May 13, 2019 at 1:22 PM
To: Shakeel Butt
Cc: Andrew Morton, Linux MM, LKML, Kernel Team, Johannes Weiner,
Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov,
Cgroups
> On Fri, May 10, 2019 at 05:32:15PM -0700, Shakeel Butt wrote:
> > Fr
emcg, explicitly add
__GFP_RETRY_MAYFAIL to the fanotify and inotify event allocations.
Signed-off-by: Shakeel Butt
Reviewed-by: Roman Gushchin
---
Changelog since v2:
- updated the comments.
Changelog since v1:
- Fixed usage of __GFP_RETRY_MAYFAIL flag.
fs/notify/fanotify/fanotify.c
-killer in the charging path for fanotify and inotify
event allocations.
Signed-off-by: Shakeel Butt
Acked-by: Michal Hocko
---
Changelog since v2:
- None
Changelog since v1:
- commit message updated
mm/memcontrol.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/mm
From: Roman Gushchin
Date: Tue, May 14, 2019 at 2:55 PM
To: Andrew Morton, Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin
> This commit makes several important changes in the lifecycle
> of a non-root kmem_cache,
From: Roman Gushchin
Date: Tue, May 14, 2019 at 2:54 PM
To: Andrew Morton, Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin
> Let's reparent memcg slab memory on memcg offlining. This allows us
> to release the
From: Roman Gushchin
Date: Tue, May 14, 2019 at 2:54 PM
To: Andrew Morton, Shakeel Butt
Cc: Johannes Weiner, Michal Hocko, Rik van Riel, Christoph Lameter, Vladimir Davydov, Roman Gushchin
> Switching to an indirect scheme of getting mem_cgroup pointer for
> !root slab pages broke
From: Christopher Lameter
Date: Wed, May 15, 2019 at 7:00 AM
To: Roman Gushchin
Cc: Andrew Morton, Shakeel Butt, Johannes Weiner, Michal Hocko, Rik van Riel, Vladimir Davydov
> On Tue, 14 May 2019, Roman Gushchin wrote:
>
> > To make this possible we need to introduce
emcg, explicitly add
__GFP_RETRY_MAYFAIL to the fanotify and inotify event allocations.
Signed-off-by: Shakeel Butt
---
Changelog since v1:
- Fixed usage of __GFP_RETRY_MAYFAIL flag.
fs/notify/fanotify/fanotify.c | 5 -
fs/notify/inotify/inotify_fsnotify.c | 7 +--
2 files changed
On Wed, Apr 24, 2019 at 11:49 PM Michal Hocko wrote:
>
> On Tue 23-04-19 08:44:05, Shakeel Butt wrote:
> > The commit 475d0487a2ad ("mm: memcontrol: use per-cpu stocks for socket
> > memory uncharging") added refill_stock() for skmem uncharging path to
> > opti
The documentation of __GFP_RETRY_MAYFAIL clearly states that the
OOM killer will not be triggered, and indeed the page allocator does not
invoke the OOM killer for such allocations. However, we do trigger the memcg
OOM killer for __GFP_RETRY_MAYFAIL. Fix that.
Signed-off-by: Shakeel Butt
---
mm
On Mon, Apr 29, 2019 at 5:22 AM Michal Hocko wrote:
>
> On Sun 28-04-19 16:56:13, Shakeel Butt wrote:
> > The documentation of __GFP_RETRY_MAYFAIL clearly mentioned that the
> > OOM killer will not be triggered and indeed the page alloc does not
> > invoke OOM killer for s
-killer in the charging path for fanotify and inotify
event allocations.
Signed-off-by: Shakeel Butt
Acked-by: Michal Hocko
---
Changelog since v1:
- commit message updated.
mm/memcontrol.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
emcg, explicitly add
__GFP_RETRY_MAYFAIL to the fanotify and inotify event allocations.
Signed-off-by: Shakeel Butt
---
fs/notify/fanotify/fanotify.c | 4 +++-
fs/notify/inotify/inotify_fsnotify.c | 7 +--
2 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/fs/notify/fanotify/fa
On Mon, Apr 29, 2019 at 5:41 PM Michal Hocko wrote:
>
> On Mon 29-04-19 10:13:32, Shakeel Butt wrote:
> [...]
> > /*
> >  * For queues with unlimited length lost events are not expected and
> >  * can possibly have security implication
eness
> by a bool flag in struct list_lru.
>
> [v2] use the idea proposed by Vladimir -- the bool flag.
>
> Signed-off-by: Jiri Slaby
Reviewed-by: Shakeel Butt
> Cc: Johannes Weiner
> Cc: Michal Hocko
> Suggested-by: Vladimir Davydov
> Acked-by: Vladimir Davydov
> Cc:
there will not be
any process in the internal nodes and thus no chance of local pressure.
Signed-off-by: Shakeel Butt
---
include/linux/memcontrol.h | 7 ++-
mm/memcontrol.c | 25 +
2 files changed, 31 insertions(+), 1 deletion(-)
diff --git a/include/linux
there will not be
any process in the internal nodes and thus no chance of local pressure.
Signed-off-by: Shakeel Butt
---
Changelog since v1:
- refactor memory_events_show to share between events and events.local
include/linux/memcontrol.h | 7 ++-
mm/memcontrol.c | 34
On Fri, May 17, 2019 at 5:59 PM Roman Gushchin wrote:
>
> On Fri, May 17, 2019 at 05:18:18PM -0700, Shakeel Butt wrote:
> > The memory controller in cgroup v2 exposes memory.events file for each
> > memcg which shows the number of times events like low, high, max, oom
>
The callers of cgroup_migrate_prepare_dst() correctly call
cgroup_migrate_finish() for success and failure cases both. No need to
call it in cgroup_migrate_prepare_dst() in failure case.
Signed-off-by: Shakeel Butt
---
kernel/cgroup/cgroup.c | 5 +
1 file changed, 1 insertion(+), 4
. The simplest solution is to remove the
assumption of no mmapping PageKmemcg() pages to user space.
Signed-off-by: Shakeel Butt
---
arch/s390/kvm/kvm-s390.c | 2 +-
arch/x86/kvm/x86.c | 2 +-
include/linux/page-flags.h | 26 ++
include/trace/events
On Thu, Mar 28, 2019 at 7:36 PM Matthew Wilcox wrote:
>
> On Thu, Mar 28, 2019 at 06:28:36PM -0700, Shakeel Butt wrote:
> > A VCPU of a VM can allocate up to three pages which can be mmap'ed by the
> > user space application. At the moment this memory is not charged. On a
> &g
On Fri, Mar 29, 2019 at 12:52 AM Michal Hocko wrote:
>
> On Thu 28-03-19 18:28:36, Shakeel Butt wrote:
> > A VCPU of a VM can allocate up to three pages which can be mmap'ed by the
> > user space application. At the moment this memory is not charged. On a
> > large mach
the accuracy of stats is getting worse.
Internally we have an additional interface, memory.stat_exact, for that.
However I am not sure whether, for the upstream kernel, an additional
interface is better or something like /proc/sys/vm/stat_refresh which
syncs all per-cpu stats.
> Signed-off-by: Johannes Weiner
ange.
>
> The upward spilling is batched using the existing per-cpu cache. In a
> sparse file stress test with 5 level cgroup nesting, the additional
> cost of the flushing was negligible (a little under 1% of CPU at 100%
> CPU utilization, compared to the 5% of reading memory.s
On Fri, Apr 12, 2019 at 1:10 PM Johannes Weiner wrote:
>
> On Fri, Apr 12, 2019 at 12:55:10PM -0700, Shakeel Butt wrote:
> > We also faced this exact same issue as well and had the similar solution.
> >
> > > Signed-off-by: Johannes Weiner
> >
> >
On Fri, Apr 12, 2019 at 1:16 PM Roman Gushchin wrote:
>
> On Fri, Apr 12, 2019 at 12:55:10PM -0700, Shakeel Butt wrote:
> > On Fri, Apr 12, 2019 at 8:15 AM Johannes Weiner wrote:
> > >
> > > Right now, when somebody needs to know the recursive memory statistic
> > > >
> > > > On Mon, 11 Mar 2019 06:08:01 -0700 syzbot
> > > > wrote:
> > > >
> > > > > syzbot has bisected this bug to:
> > > > >
> > > > > commit 29a4b8e275d1f10c51c7891362877ef6cffae9e7
> > >
On Wed, Jan 23, 2019 at 2:57 PM Sasha Levin wrote:
>
> Hi,
>
> [This is an automated email]
>
> This commit has been processed because it contains a "Fixes:" tag,
> fixing commit: 6b0c81b3be11 mm, oom: reduce dependency on tasklist_lock.
>
> The bot has tested the following trees: v4.20.3,
the history but it
seems like this has been there since before git history.
Signed-off-by: Shakeel Butt
---
mm/oom_kill.c | 8
1 file changed, 8 insertions(+)
diff --git a/mm/oom_kill.c b/mm/oom_kill.c
index 0930b4365be7..1a007dae1e8f 100644
--- a/mm/oom_kill.c
+++ b/mm/oom_kill.c
@@ -981,6 +981,13 @@
On Fri, Jan 18, 2019 at 7:35 PM Tetsuo Handa
wrote:
>
> On 2019/01/19 9:50, Shakeel Butt wrote:
> > On looking further it seems like the process selected to be oom-killed
> > has exited even before reaching read_lock(_lock) in
> > oom_kill_process(). More specifically t
On Fri, Jan 18, 2019 at 11:09 PM Michal Hocko wrote:
>
> On Fri 18-01-19 16:50:22, Shakeel Butt wrote:
> [...]
> > On looking further it seems like the process selected to be oom-killed
> > has exited even before reaching read_lock(_lock) in
> > oom_kill_process()
On Fri, Jan 18, 2019 at 5:58 PM Roman Gushchin wrote:
>
> Hi Shakeel!
>
> >
> > On looking further it seems like the process selected to be oom-killed
> > has exited even before reaching read_lock(_lock) in
> > oom_kill_process(). More specifically the tsk->usage is 1 which is due
> > to
led
before the parent.
The select_bad_process() has already selected the worst process in the
system/memcg. There is no need to recheck the badness of its children
and hoping to find a worse candidate. That's a lot of unneeded racy
work. So, let's remove this whole heuristic.
Signed-off-by: Shakeel B
On Sun, Jan 20, 2019 at 5:23 PM Tetsuo Handa
wrote:
>
> Shakeel Butt wrote:
> > + pr_err("%s: Kill process %d (%s) score %lu or sacrifice child\n",
> > + message, task_pid_nr(p), p->comm, oc->chosen_points);
>
> This patch is to make &quo
On Mon, Jan 21, 2019 at 1:19 AM Michal Hocko wrote:
>
> On Sun 20-01-19 13:50:59, Shakeel Butt wrote:
> > >From the start of the git history of Linux, the kernel after selecting
> > the worst process to be oom-killed, prefer to kill its child (if the
> > child does n
Fixes: 6b0c81b3be11 ("mm, oom: reduce dependency on tasklist_lock")
Signed-off-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
Cc: Andrew Morton
Cc: David Rientjes
Cc: Johannes Weiner
Cc: Tetsuo Handa
Cc: sta...@kernel.org
Cc: linux...@kvack.org
Cc: linux-kernel@vg
ike
workloads to recover much later because we constantly pick and kill
processes which are not memory hogs. So, let's remove this whole
heuristic.
Signed-off-by: Shakeel Butt
Acked-by: Michal Hocko
Cc: Roman Gushchin
Cc: Andrew Morton
Cc: David Rientjes
Cc: Johannes Weiner
Cc: Tetsuo Handa
Fixes: 6b0c81b3be11 ("mm, oom: reduce dependency on tasklist_lock")
Signed-off-by: Shakeel Butt
Reviewed-by: Roman Gushchin
Acked-by: Michal Hocko
Cc: Andrew Morton
Cc: David Rientjes
Cc: Johannes Weiner
Cc: Tetsuo Handa
Cc: sta...@kernel.org
Cc: linux...@kvack.org
Cc: linux-kernel@vg
ike
workloads to recover much later because we constantly pick and kill
processes which are not memory hogs. So, let's remove this whole
heuristic.
Signed-off-by: Shakeel Butt
Acked-by: Michal Hocko
Cc: Roman Gushchin
Cc: Andrew Morton
Cc: David Rientjes
Cc: Johannes Weiner
Cc: Tetsuo Handa
On Mon, Jan 21, 2019 at 1:59 PM Shakeel Butt wrote:
>
> From the start of the git history of Linux, the kernel after selecting
> the worst process to be oom-killed, prefer to kill its child (if the
> child does not share mm with the parent). Later it was changed to prefer
> to k
On Mon, Jan 7, 2019 at 10:04 AM Manfred Spraul
wrote:
>
> On 1/3/19 11:18 PM, Shakeel Butt wrote:
> > Hi Manfred,
> >
> > On Sun, Dec 23, 2018 at 4:26 AM Manfred Spraul
> > wrote:
> >> Hello Dmitry,
> >>
> >> On 12/23/18 10:57 AM, Dmitry
On Tue, Jan 8, 2019 at 6:59 AM Michal Hocko wrote:
>
> On Wed 02-01-19 17:56:38, Shakeel Butt wrote:
> > If a memcg is over high limit, memory reclaim is scheduled to run on
> > return-to-userland. However it is assumed that the memcg is the current
> > process's memcg. Wi
ched stack is failed.
Fixes: 5eed6f1dff87 ("fork,memcg: fix crash in free_thread_stack on memcg
charge fail")
Signed-off-by: Shakeel Butt
Cc: Rik van Riel
Cc: Roman Gushchin
Cc: Michal Hocko
Cc: Johannes Weiner
Cc: Tejun Heo
Cc:
---
kernel/fork.c | 1 +
1 file changed, 1 inse
On Wed, Jan 2, 2019 at 12:07 PM Yang Shi wrote:
>
> We don't do page cache reparent anymore when offlining memcg, so update
> force empty related content accordingly.
>
> Cc: Michal Hocko
> Cc: Johannes Weiner
> Signed-off-by: Yang Shi
Reviewed-by: Shakeel Butt
> ---
On Wed, Jan 2, 2019 at 12:06 PM Yang Shi wrote:
>
> The typical usecase of force empty is to try to reclaim as much as
> possible memory before offlining a memcg. Since there should be no
> attached tasks to offlining memcg, the tasks anonymous pages would have
> already been freed or uncharged.
but functionally it will be the same. This should not matter as
memcg_charge_slab() is not in the hot path.
Signed-off-by: Shakeel Butt
---
fs/pipe.c | 3 +--
include/linux/memcontrol.h | 28
mm/memcontrol.c | 16
mm/page_alloc.c
, scheduling reclaim on return-to-userland for remote
memcgs will ignore the high reclaim altogether. So, punt the high
reclaim of remote memcgs to high_work.
Signed-off-by: Shakeel Butt
---
mm/memcontrol.c | 20
1 file changed, 12 insertions(+), 8 deletions(-)
diff --git a/mm
allocations,
we need to fix vmalloc.
Reported-by: syzbot+7713f3aa67be76b15...@syzkaller.appspotmail.com
Signed-off-by: Shakeel Butt
Cc: Florian Westphal
Cc: Michal Hocko
Cc: Kirill Tkhai
Cc: Pablo Neira Ayuso
Cc: Jozsef Kadlecsik
Cc: Roopa Prabhu
Cc: Nikolay Aleksandrov
Cc: Andrew Morton
Cc
but functionally it will be the same. This should not matter as
memcg_charge_slab() is not in the hot path.
Signed-off-by: Shakeel Butt
---
Changelog since v1:
- Fixed the build when CONFIG_MEMCG is not set
fs/pipe.c | 3 +--
include/linux/memcontrol.h | 37
On Thu, Jan 3, 2019 at 2:15 AM William Kucharski
wrote:
>
>
>
> > On Jan 2, 2019, at 8:14 PM, Shakeel Butt wrote:
> >
> > countersize = COUNTER_OFFSET(tmp.nentries) * nr_cpu_ids;
> > - newinfo = vmalloc(sizeof(*newinfo) + countersize);
> > +
On Thu, Jan 3, 2019 at 8:57 AM Yang Shi wrote:
>
>
>
> On 1/2/19 1:45 PM, Shakeel Butt wrote:
> > On Wed, Jan 2, 2019 at 12:06 PM Yang Shi wrote:
> >> The typical usecase of force empty is to try to reclaim as much as
> >> possible memory before of
On Mon, Dec 31, 2018 at 2:12 AM Michal Hocko wrote:
>
> On Sun 30-12-18 19:59:53, Shakeel Butt wrote:
> > On Sun, Dec 30, 2018 at 12:00 AM Michal Hocko wrote:
> > >
> > > On Sun 30-12-18 08:45:13, Michal Hocko wrote:
> > > > On Sat 29-12-18 11:34:29, Sh
Hi Manfred,
On Sun, Dec 23, 2018 at 4:26 AM Manfred Spraul wrote:
>
> Hello Dmitry,
>
> On 12/23/18 10:57 AM, Dmitry Vyukov wrote:
> >
> > I can reproduce this infinite memory consumption with the C program:
> >
On Fri, Jan 4, 2019 at 4:21 PM Yang Shi wrote:
>
> The typical usecase of force empty is to try to reclaim as much as
> possible memory before offlining a memcg. Since there should be no
> attached tasks to offlining memcg, the tasks anonymous pages would have
> already been freed or uncharged.
On Fri, Jan 4, 2019 at 4:21 PM Yang Shi wrote:
>
> We have some usecases which create and remove memcgs very frequently,
> and the tasks in the memcg may just access the files which are unlikely
> accessed by anyone else. So, we prefer force_empty the memcg before
> rmdir'ing it to reclaim the
is not a descendant of the memcg
needing high reclaim, punt the high reclaim to the work queue.
Signed-off-by: Shakeel Butt
---
Changelog since v1:
- Punt high reclaim of a memcg to work queue only if the recorded memcg
is not its descendant.
include/linux/sched.h | 3 +++
kernel/fork.c | 1
On Tue, Jan 8, 2019 at 8:01 PM Rik van Riel wrote:
>
> There is an imbalance between when slab_pre_alloc_hook calls
> memcg_kmem_get_cache and when slab_post_alloc_hook calls
> memcg_kmem_put_cache.
>
Can you explain how there is an imbalance? If the returned kmem cache
from
On Tue, Jan 8, 2019 at 9:36 PM Shakeel Butt wrote:
>
> On Tue, Jan 8, 2019 at 8:01 PM Rik van Riel wrote:
> >
> > There is an imbalance between when slab_pre_alloc_hook calls
> > memcg_kmem_get_cache and when slab_post_alloc_hook calls
> > memcg_kmem_put_cach
Hi Kirill,
On Wed, Jan 9, 2019 at 4:20 AM Kirill Tkhai wrote:
>
> On nodes without memory overcommit, it's a common situation
> when a memcg exceeds its limit and pages from the pagecache are
> shrunk on reclaim, while the node has a lot of free memory.
> Further access to the pages requires real
Hi Johannes,
On Wed, Jan 9, 2019 at 8:45 AM Johannes Weiner wrote:
>
> On Wed, Jan 09, 2019 at 03:20:18PM +0300, Kirill Tkhai wrote:
> > On nodes without memory overcommit, it's a common situation
> > when a memcg exceeds its limit and pages from the pagecache are
> > shrunk on reclaim, while the node
is not a descendant of the memcg
needing high reclaim, punt the high reclaim to the work queue.
Signed-off-by: Shakeel Butt
---
Changelog since v2:
- TIF_NOTIFY_RESUME can be set from places other than try_charge() in
which case current->memcg_high_reclaim will be null. Correctly han
On Thu, Jan 10, 2019 at 1:46 AM Kirill Tkhai wrote:
>
> Hi, Shakeel,
>
> On 09.01.2019 20:37, Shakeel Butt wrote:
> > Hi Kirill,
> >
> > On Wed, Jan 9, 2019 at 4:20 AM Kirill Tkhai wrote:
> >>
> >> On nodes without memory overcommit, it's commo
Hi Johannes,
On Fri, Jan 11, 2019 at 12:59 PM Johannes Weiner wrote:
>
> Hi Shakeel,
>
> On Thu, Jan 10, 2019 at 09:44:32AM -0800, Shakeel Butt wrote:
> > If a memcg is over high limit, memory reclaim is scheduled to run on
> > return-to-userland. However it i
On Sat, Jan 12, 2019 at 2:27 AM Anshuman Khandual
wrote:
>
> All architectures have been defining their own PGALLOC_GFP as (GFP_KERNEL |
> __GFP_ZERO) and using it for allocating page table pages. This causes some
> code duplication which can be easily avoided. GFP_KERNEL allocated and
> cleared
On Sat, Jan 12, 2019 at 7:50 AM Matthew Wilcox wrote:
>
> On Sat, Jan 12, 2019 at 02:49:29PM +0100, Christophe Leroy wrote:
> > As far as I can see,
> >
> > #define GFP_KERNEL_ACCOUNT (GFP_KERNEL | __GFP_ACCOUNT)
> >
> > So what's the difference between:
> >
> > (GFP_KERNEL_ACCOUNT | __GFP_ZERO)
__alloc_percpu_gfp() can be called from atomic context, so make
pcpu_get_pages use the gfp provided to the higher layer.
Signed-off-by: Shakeel Butt
---
mm/percpu-vm.c | 9 +
1 file changed, 5 insertions(+), 4 deletions(-)
diff --git a/mm/percpu-vm.c b/mm/percpu-vm.c
index
+7713f3aa67be76b15...@syzkaller.appspotmail.com
Signed-off-by: Shakeel Butt
---
net/bridge/netfilter/ebtables.c | 6 --
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/net/bridge/netfilter/ebtables.c b/net/bridge/netfilter/ebtables.c
index 491828713e0b..5e55cef0cec3 100644
--- a/net/bridge
On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko wrote:
>
> On Sat 29-12-18 10:52:15, Florian Westphal wrote:
> > Michal Hocko wrote:
> > > On Fri 28-12-18 17:55:24, Shakeel Butt wrote:
> > > > The [ip,ip6,arp]_tables use x_tables_info internally and the und
Hi Kirill,
On Sat, Dec 29, 2018 at 1:52 AM Kirill Tkhai wrote:
>
> Hi, Michal!
>
> On 29.12.2018 10:33, Michal Hocko wrote:
> > On Fri 28-12-18 17:55:24, Shakeel Butt wrote:
> >> The [ip,ip6,arp]_tables use x_tables_info internally and the underlying
> >> m
Hi Dennis,
On Sat, Dec 29, 2018 at 1:26 PM Dennis Zhou wrote:
>
> Hi Andrew,
>
> On Sat, Dec 29, 2018 at 01:03:52PM -0800, Andrew Morton wrote:
> > On Fri, 28 Dec 2018 17:31:47 -0800 Shakeel Butt wrote:
> >
> > > __alloc_percpu_gfp() can be cal
On Sun, Dec 30, 2018 at 12:00 AM Michal Hocko wrote:
>
> On Sun 30-12-18 08:45:13, Michal Hocko wrote:
> > On Sat 29-12-18 11:34:29, Shakeel Butt wrote:
> > > On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko wrote:
> > > >
> > > > On Sat 29-12-18 10:52
On Sat, Dec 29, 2018 at 11:45 PM Michal Hocko wrote:
>
> On Sat 29-12-18 11:34:29, Shakeel Butt wrote:
> > On Sat, Dec 29, 2018 at 2:06 AM Michal Hocko wrote:
> > >
> > > On Sat 29-12-18 10:52:15, Florian Westphal wrote:
> > > > Michal Hocko wrote:
&
memcgs on cgroup-v1. The results are:
Without the patch:
$ time ./read-root-stat-1000-times
real    0m1.663s
user    0m0.000s
sys     0m1.660s
With the patch:
$ time ./read-root-stat-1000-times
real    0m0.468s
user    0m0.000s
sys     0m0.467s
Signed-off-by: Shakeel Butt
---
mm/memcontrol.c
On Wed, Jul 25, 2018 at 4:26 AM Bruce Merry wrote:
>
> On 25 July 2018 at 00:46, Shakeel Butt wrote:
> > I ran a simple benchmark which reads the root_mem_cgroup's stat file
> > 1000 times in the presense of 2500 memcgs on cgroup-v1. The results are:
> >
> > Withou
kmemcg_deactivate_workfn
process_one_work
worker_thread
kthread
ret_from_fork+0x35/0x40
To fix this race, on root kmem cache destruction, mark the cache as
dying and flush the workqueue used for memcg kmem cache creation and
deactivation.
Signed-off-by: Shakeel Butt
---
Changelog
On Tue, May 29, 2018 at 1:31 AM, Michal Hocko wrote:
> On Mon 28-05-18 10:23:07, Shakeel Butt wrote:
>> On Mon, May 28, 2018 at 2:11 AM, Michal Hocko wrote:
>> Though is there a precedence where the broken feature is not fixed
>> because an alternative is available?
&g
On Thu, May 31, 2018 at 5:18 PM, Andrew Morton
wrote:
> On Tue, 29 May 2018 17:12:04 -0700 Shakeel Butt wrote:
>
>> The memcg kmem cache creation and deactivation (SLUB only) is
>> asynchronous. If a root kmem cache is destroyed whose memcg cache is in
>> the process of
The flag GFP_ATOMIC already contains __GFP_HIGH. There is no need to
explicitly OR in __GFP_HIGH again. So, just remove the unnecessary __GFP_HIGH.
Signed-off-by: Shakeel Butt
---
block/blk-ioc.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/block/blk-ioc.c b/block/blk-ioc.c
On Tue, Jul 3, 2018 at 12:13 PM Kirill Tkhai wrote:
>
> On 03.07.2018 20:58, Matthew Wilcox wrote:
> > On Tue, Jul 03, 2018 at 06:46:57PM +0300, Kirill Tkhai wrote:
> >> shrinker_idr now contains only memcg-aware shrinkers, so all bits from
> >> memcg map
> >> may be potentially populated. In
On Tue, Jul 3, 2018 at 12:25 PM Matthew Wilcox wrote:
>
> On Tue, Jul 03, 2018 at 12:19:35PM -0700, Shakeel Butt wrote:
> > On Tue, Jul 3, 2018 at 12:13 PM Kirill Tkhai wrote:
> > > > Do we really have so very many !memcg-aware shrinkers?
> > > >
> &g
Signed-off-by: Kirill Tkhai
Reviewed-by: Shakeel Butt
> ---
> mm/vmscan.c | 11 +++
> 1 file changed, 3 insertions(+), 8 deletions(-)
>
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 9918bfc1d2f9..636657213b9b 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -445,16
On Thu, Jul 19, 2018 at 3:43 AM Michal Hocko wrote:
>
> [CC Andrew]
>
> On Thu 19-07-18 18:06:47, Jing Xia wrote:
> > It was reported that a kernel crash happened in mem_cgroup_iter(),
> > which can be triggered if the legacy cgroup-v1 non-hierarchical
> > mode is used.
> >
> > Unable to handle
On Sun, Jun 17, 2018 at 2:57 PM Alexey Dobriyan wrote:
>
> commit 24074a35c5c975c94cd9691ae962855333aac47f
> ("proc: Make inline name size calculation automatic")
> started to put PDE allocations into kmalloc-256 which is unnecessary as
> ~40 character names are very rare.
>
> Put allocation back
On Sun, Jul 22, 2018 at 11:44 PM Michal Hocko wrote:
>
> On Thu 19-07-18 09:23:10, Shakeel Butt wrote:
> > On Thu, Jul 19, 2018 at 3:43 AM Michal Hocko wrote:
> > >
> > > [CC Andrew]
> > >
> > > On Thu 19-07-18 18:06:47, Jing Xia wrote:
>
includes RCU
callback and thus make sure all previous registered RCU callbacks
have completed as well.
Signed-off-by: Shakeel Butt
---
Changelog since v3:
- Handle the RCU callbacks for SLUB deactivation
Changelog since v2:
- Rewrote the patch and used workqueue flushing instead of refcount
On Wed, Sep 5, 2018 at 2:23 PM Roman Gushchin wrote:
>
> On Wed, Sep 05, 2018 at 01:51:52PM -0700, Andrew Morton wrote:
> > On Tue, 4 Sep 2018 15:47:07 -0700 Roman Gushchin wrote:
> >
> > > Commit 9092c71bb724 ("mm: use sc->priority for slab shrink targets")
> > > changed the way how the target
On Sat, Jul 14, 2018 at 1:32 AM Yafang Shao wrote:
>
> try_charge maybe executed in packet receive path, which is in interrupt
> context.
> In this situation, the 'current' is the interrupted task, which may has
> no relation to the rx softirq, So it is nonsense to use 'current'.
>
Have you
On Sat, Jul 14, 2018 at 7:10 PM Yafang Shao wrote:
>
> On Sat, Jul 14, 2018 at 11:38 PM, Shakeel Butt wrote:
> > On Sat, Jul 14, 2018 at 1:32 AM Yafang Shao wrote:
> >>
> >> try_charge maybe executed in packet receive path, which is in interrupt
> &
On Sat, Jul 14, 2018 at 10:26 PM Yafang Shao wrote:
>
> On Sun, Jul 15, 2018 at 12:25 PM, Shakeel Butt wrote:
> > On Sat, Jul 14, 2018 at 7:10 PM Yafang Shao wrote:
> >>
> >> On Sat, Jul 14, 2018 at 11:38 PM, Shakeel Butt wrote:
> >> > On Sat,
On Sun, Jul 15, 2018 at 1:02 AM Yafang Shao wrote:
>
> On Sun, Jul 15, 2018 at 2:34 PM, Shakeel Butt wrote:
> > On Sat, Jul 14, 2018 at 10:26 PM Yafang Shao wrote:
> >>
> >> On Sun, Jul 15, 2018 at 12:25 PM, Shakeel Butt wrote:
> >> > On Sat,
On Sun, Jul 15, 2018 at 6:50 PM Yafang Shao wrote:
>
> On Sun, Jul 15, 2018 at 11:04 PM, Shakeel Butt wrote:
> > On Sun, Jul 15, 2018 at 1:02 AM Yafang Shao wrote:
> >>
> >> On Sun, Jul 15, 2018 at 2:34 PM, Shakeel Butt wrote:
> >> > On Sat, Jul 14, 2
On Tue, May 22, 2018 at 3:07 AM Kirill Tkhai wrote:
>
> Hi,
>
> this patch series solves the problem with slow shrink_slab() occurring
> on the machines having many shrinkers and memory cgroups (i.e.,
> with many containers). The problem is complexity of shrink_slab()
> is O(n^2) and it grows too fast