On Wed, 16 Aug 2017, Roman Gushchin wrote:
> It's natural to expect that a container has its own sshd,
> "activity manager", or other software that can play with oom_score_adj.
> If it can override the upper cgroup-level settings, the whole delegation model
> is broken.
>
I don
On Thu, 17 Aug 2017, Roman Gushchin wrote:
> Hi David!
>
> Please, find an updated version of docs patch below.
>
Looks much better, thanks! I think the only pending issue is discussing
the relationship of memory.oom_kill_all_tasks with /proc/pid/oom_score_adj
== OOM_SCORE_ADJ_MIN.
when CONFIG_COMPACTION=n.
Signed-off-by: David Rientjes
---
include/linux/pageblock-flags.h | 11 +++
mm/compaction.c | 8 +++-
2 files changed, 18 insertions(+), 1 deletion(-)
diff --git a/include/linux/pageblock-flags.h b/include/linux/pageblock-flags.h
--- a/include/
simple solution that doesn't involve any additional
subsystems in pageblock skip manipulation.
Signed-off-by: David Rientjes
---
mm/compaction.c | 48 +---
1 file changed, 37 insertions(+), 11 deletions(-)
diff --git a/mm/compaction.c
so, or
that it is beneficial from attempting to isolate memory.
Use the pageblock skip hint to avoid rescanning pageblocks needlessly
until that information is reset.
Signed-off-by: David Rientjes
---
mm/compaction.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/mm
On Tue, 15 Aug 2017, Roman Gushchin wrote:
> > I'm curious about the decision made in this conditional and how
> > oom_kill_memcg_member() ignores task->signal->oom_score_adj. It means
> > that memory.oom_kill_all_tasks overrides /proc/pid/oom_score_adj if it
> > should otherwise be disabled.
On Tue, 15 Aug 2017, Roman Gushchin wrote:
> > > diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
> > > index dec5afdaa36d..22108f31e09d 100644
> > > --- a/Documentation/cgroup-v2.txt
> > > +++ b/Documentation/cgroup-v2.txt
> > > @@ -48,6 +48,7 @@ v1 is available under Docume
t simply add VM_FAULT_SIGBUS to the
> existing error code because all arch specific page fault handlers and
> g-u-p would have to learn a new error code combination.
>
> Reported-by: Tetsuo Handa
> Fixes: 3f70dc38cec2 ("mm: make sure that kthreads will not refault oom reaped memory")
> Cc: stable # 4.9+
> Signed-off-by: Michal Hocko
Acked-by: David Rientjes
After "mm: oom: let oom_reap_task and exit_mmap to run concurrently",
mmput_async() is no longer used. Remove it.
Cc: Andrea Arcangeli
Signed-off-by: David Rientjes
---
include/linux/sched/mm.h | 6 --
kernel/fork.c| 16
2 files changed, 22
he mm if
> MMF_OOM_SKIP is already set and in turn all memory is already freed
> and furthermore the mm data structures may already have been taken
> down by free_pgtables.
>
> Signed-off-by: Andrea Arcangeli
With your follow-up one liner to include linux/oom.h folded in:
Tested-by: David Rientjes
On Mon, 14 Aug 2017, Roman Gushchin wrote:
> diff --git a/Documentation/cgroup-v2.txt b/Documentation/cgroup-v2.txt
> index dec5afdaa36d..22108f31e09d 100644
> --- a/Documentation/cgroup-v2.txt
> +++ b/Documentation/cgroup-v2.txt
> @@ -48,6 +48,7 @@ v1 is available under Documentation/cgroup-v1/.
ger priority if they are
> populated with eligible tasks.
>
> The oom_priority value is compared within sibling cgroups.
>
> The root cgroup has the oom_priority 0, which cannot be changed.
>
> Signed-off-by: Roman Gushchin
> Cc: Michal Hocko
> Cc: Vladimir Davydov
&
On Mon, 14 Aug 2017, Roman Gushchin wrote:
> diff --git a/include/linux/oom.h b/include/linux/oom.h
> index 8a266e2be5a6..b7ec3bd441be 100644
> --- a/include/linux/oom.h
> +++ b/include/linux/oom.h
> @@ -39,6 +39,7 @@ struct oom_control {
> unsigned long totalpages;
> struct task_struc
m cgroup. We don't need to print
> the debug information for each task, as well as play
> with task selection (considering task's children),
> so we can't use the existing oom_kill_process().
>
> Signed-off-by: Roman Gushchin
> Cc: Michal Hocko
> Cc: Vladimi
> Fix the leak by adding the missing call to kobject_put() to
> sysfs_slab_remove_workfn().
>
> Signed-off-by: Vladimir Davydov
> Reported-and-tested-by: Andrei Vagin
> Acked-by: Tejun Heo
> Cc: Michal Hocko
> Cc: Johannes Weiner
> Cc: Christoph Lameter
> Cc: Pekka Enberg
> Cc: David Rientjes
> Cc: Joonsoo Kim
> Fixes: 3b7b314053d02 ("slub: make sysfs file removal asynchronous")
Acked-by: David Rientjes
On Wed, 26 Jul 2017, Roman Gushchin wrote:
> +Cgroup-aware OOM Killer
> +~~~~~~~~~~~~~~~~~~~~~~~
> +
> +Cgroup v2 memory controller implements a cgroup-aware OOM killer.
> +It means that it treats memory cgroups as first class OOM entities.
> +
> +Under OOM conditions the memory controller tries t
o priority based oom killing that we have done. I think this
kind of support is long overdue in the oom killer.
Comment inline.
> Signed-off-by: Roman Gushchin
> Cc: Michal Hocko
> Cc: Vladimir Davydov
> Cc: Johannes Weiner
> Cc: David Rientjes
> Cc: Tejun Heo
>
On Tue, 1 Aug 2017, Roman Gushchin wrote:
> > To the rest of the patch. I have to say I do not quite like how it is
> > implemented. I was hoping for something much simpler which would hook
> > into oom_evaluate_task. If a task belongs to a memcg with kill-all flag
> > then we would update the cum
On Tue, 18 Jul 2017, Dave Chinner wrote:
> > Thanks for looking into this, Dave!
> >
> > The number of GFP_NOFS allocations that build up the deferred counts can
> > be unbounded, however, so this can become excessive, and the oom killer
> > will not kill any processes in this context. Althoug
On Mon, 17 Jul 2017, Dave Chinner wrote:
> > This is a side effect of super_cache_count() returning the appropriate
> > count but super_cache_scan() refusing to do anything about it and
> > immediately terminating with SHRINK_STOP, mostly for GFP_NOFS allocations.
>
> Yup. Happens during things
Hi Al and everyone,
We're encountering an issue where the per-shrinker per-node deferred
counts grow excessively large for the superblock shrinker. This appears
to be long-standing behavior, so I'm reaching out to see if there are any
subtleties being overlooked since there is a reference to
On Wed, 12 Jul 2017, Roman Gushchin wrote:
> > It's a no-op if nobody sets up priorities or the system-wide sysctl is
> > disabled. Presumably, as in our model, the Activity Manager sets the
> > sysctl and is responsible for configuring the priorities if present. All
> > memcgs at the sibling
On Tue, 11 Jul 2017, Roman Gushchin wrote:
> > Yes, the original motivation was to limit killing to a single process, if
> > possible. To do that, we kill the process with the largest rss to free
> > the most memory and rely on the user to configure /proc/pid/oom_score_adj
> > if something els
On Tue, 11 Jul 2017, Michal Hocko wrote:
> This?
> ---
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 5dc0ff22d567..e155d1d8064f 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -470,11 +470,14 @@ static bool __oom_reap_task_mm(struct task_struct *tsk,
> struct mm_struct *mm)
> {
>
On Tue, 27 Jun 2017, Tetsuo Handa wrote:
> I wonder why you prefer timeout based approach. Your patch will after all
> set MMF_OOM_SKIP if operations between down_write() and up_write() took
> more than one second. lock_anon_vma_root() from unlink_anon_vmas() from
> free_pgtables() for example cal
On Mon, 26 Jun 2017, Michal Hocko wrote:
> diff --git a/mm/mmap.c b/mm/mmap.c
> index 3bd5ecd20d4d..253808e716dc 100644
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -2962,6 +2962,11 @@ void exit_mmap(struct mm_struct *mm)
> /* Use -1 here to ensure all VMAs in the mm are unmapped */
> unma
On Wed, 21 Jun 2017, Roman Gushchin wrote:
> Traditionally, the OOM killer is operating on a process level.
> Under oom conditions, it finds a process with the highest oom score
> and kills it.
>
> This behavior doesn't suit systems with many running
> containers. There are two main issue
and work toward moving arm64, and other architectures, toward
CONFIG_OPTIMIZE_INLINING behavior.
Reported-by: Sodagudi Prasad
Tested-by: Matthias Kaehlcke
Signed-off-by: David Rientjes
---
Resend of http://marc.info/?l=linux-kernel&m=149681501816319 for 4.12
inclusion.
Prasad, ple
On Wed, 21 Jun 2017, Tetsuo Handa wrote:
> Umm... So, you are pointing out that aborting select_bad_process() based on
> TIF_MEMDIE or MMF_OOM_SKIP is broken because victim threads can be removed
> from global task list or cgroup's task list. Then, the OOM killer will have
> to
> wait until all mm
On Tue, 20 Jun 2017, Mark Rutland wrote:
> As with my reply to David, my preference would be that we:
>
> 1) Align compiler-clang.h with the compiler-gcc.h inlining behaviour, so
>that things work by default.
>
> 2) Fix up the arm64 core code (and drivers for architected / common
>periph
On Sat, 17 Jun 2017, Tetsuo Handa wrote:
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 04c9143..cf1d331 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -470,38 +470,9 @@ static bool __oom_reap_task_mm(struct task_struct *tsk,
> struct mm_struct *mm)
> {
> struct mmu_gather t
On Mon, 19 Jun 2017, Sodagudi Prasad wrote:
> > > Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
> > > static inline functions") re-defining the 'inline' macro but
> > > __attribute__((always_inline)) is missing. Some compilers may
> > > not honor inline hint if always_inline
Reported-by: Larry Finger
Signed-off-by: David Rientjes
---
Note: Larry should be back as of June 17 to test if this fixes the
reported issue.
mm/khugepaged.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -65
On Mon, 19 Jun 2017, Kirill Tkhai wrote:
> This series implements killable version of down_read()
> similar to already existing down_write_killable() function.
> Patches [1-2/7] add arch-independent low-level primitives
> for the both rwsem types.
>
> Patches [3-6/7] add arch-dependent primitives
On Mon, 19 Jun 2017, Prasad Sodagudi wrote:
> Commit abb2ea7dfd82 ("compiler, clang: suppress warning for unused
> static inline functions") re-defining the 'inline' macro but
> __attribute__((always_inline)) is missing. Some compilers may
> not honor inline hint if always_inline attribute not th
On Fri, 16 Jun 2017, Michal Hocko wrote:
> I am sorry, but I have worked really hard to make the oom reaper a reliable
> way to make all the potential oom lockups go away. I do not want to
> reintroduce another potential lockup now.
Please show where this "potential lockup" ever existed in a bug report o
On Thu, 15 Jun 2017, Michal Hocko wrote:
> > Yes, quite a bit in testing.
> >
> > One oom kill shows the system to be oom:
> >
> > [22999.488705] Node 0 Normal free:90484kB min:90500kB ...
> > [22999.488711] Node 1 Normal free:91536kB min:91948kB ...
> >
> > followed up by one or more unnecessa
On Thu, 15 Jun 2017, Tetsuo Handa wrote:
> David is trying to avoid setting MMF_OOM_SKIP when the OOM reaper found that
> mm->users == 0.
Yes, because MMF_OOM_SKIP enables the oom killer to select another process
to kill and will do so without the original victim's mm being able to
undergo exit
On Thu, 15 Jun 2017, Michal Hocko wrote:
> > If mm->mm_users is not incremented because it is already zero by the oom
> > reaper, meaning the final refcount has been dropped, do not set
> > MMF_OOM_SKIP prematurely.
> >
> > __mmput() may not have had a chance to do exit_mmap() yet, so memory from
On Thu, 8 Jun 2017, Michal Hocko wrote:
> collapse_huge_page
> pte_offset_map
> kmap_atomic
> kmap_atomic_prot
> preempt_disable
> __collapse_huge_page_copy
> pte_unmap
> kunmap_atomic
> __kunmap_atomic
> preempt_enable
>
> I suspect, so cond_resched seem
On Mon, 12 Jun 2017, Michal Hocko wrote:
> > These are not soft lockups, these are need_resched warnings. We monitor
> > how long need_resched has been set and when a thread takes an excessive
> > amount of time to reschedule after it has been set. A loop of 512 pages
> > with ptl contention
lly requires no
references on mm->mm_users to do exit_mmap().
Without this, several processes can be oom killed unnecessarily and the
oom log can show an abundance of memory available if exit_mmap() is in
progress at the time the process is skipped.
Signed-off-by: David Rientjes
---
mm/oom_
On Sat, 10 Jun 2017, Michal Hocko wrote:
> > > I would just pull the cond_resched out of __collapse_huge_page_copy
> > > right after pte_unmap. But I am not really sure why this cond_resched is
> > > really needed because the changelog of the patch which adds it is quite
> > > terse on details.
>
On Thu, 8 Jun 2017, Michal Hocko wrote:
> I would just pull the cond_resched out of __collapse_huge_page_copy
> right after pte_unmap. But I am not really sure why this cond_resched is
> really needed because the changelog of the patch which adds it is quite
> terse on details.
I'm not sure what
A few hugetlb allocators loop while calling the page allocator and can
potentially prevent rescheduling if the page allocator slowpath is not
utilized.
Conditionally schedule when large numbers of hugepages can be allocated.
Signed-off-by: David Rientjes
---
Based on -mm only to prevent merge
On Wed, 7 Jun 2017, Mike Kravetz wrote:
> > @@ -2364,6 +2366,7 @@ static unsigned long set_max_huge_pages(struct hstate
> > *h, unsigned long count,
> > ret = alloc_fresh_gigantic_page(h, nodes_allowed);
> > else
> > ret = alloc_fresh_huge_page(
d register_node().
>
> [Test in Qemu by 4 hotpluggable nodes in x86-64 system]
>
> Signed-off-by: Dou Liyang
Acked-by: David Rientjes
On Wed, 7 Jun 2017, Vlastimil Babka wrote:
> >> Hmm I'd expect such spin lock to be reported together with mmap_sem in
> >> the debugging "locks held" message?
> >
> > My bisection of the problem is about half done. My latest good version is
> > commit
> > 7b8cd33 and the latest bad one is 2ea6
On Tue, 6 Jun 2017, Matthias Kaehlcke wrote:
> Unfortunately as is the patch doesn't work:
>
> include/linux/compiler-clang.h:20:9: error: 'inline' macro redefined
> [-Werror,-Wmacro-redefined]
> #define inline inline __attribute__((unused))
> ^
> include/linux/compiler-gcc.h:78:9: note:
inline' ends up overriding the definition in
compiler-gcc.h.
Simply annotate all inline functions as __attribute__((unused)). It's
necessary to suppress the warning for clang and is implicit with gcc.
Reported-by: Matthias Kaehlcke
Signed-off-by: David Rientjes
---
Matthias, please a
On Tue, 6 Jun 2017, Roman Gushchin wrote:
> Hi David!
>
> Thank you for sharing this!
>
> It's very interesting, and it looks like,
> it's not that far from what I've suggested.
>
> So we definitily need to come up with some common solution.
>
Hi Roman,
Yes, definitely. I could post a serie
out that suppressing the warnings avoids potentially complex
#ifdef directives, which also reduces LOC.
Suppress the warning for clang.
Signed-off-by: David Rientjes
---
This is a resend of my patch from http://marc.info/?t=14956596926
that did not seem to end very productively, but I
We use a heavily modified system and memcg oom killer and I'm wondering
if there is some opportunity for collaboration because we may have some
shared goals.
I can summarize how we currently use the oom killer at a high level so
that it is not overwhelming with implementation details and give some
On Fri, 2 Jun 2017, Andrew Morton wrote:
> On Mon, 1 May 2017 14:34:21 -0700 (PDT) David Rientjes
> wrote:
>
> > The purpose of the code that commit 623762517e23 ("revert 'mm: vmscan: do
> > not swap anon pages just because free+file is low'") reintro
On Wed, 31 May 2017, Doug Anderson wrote:
> > Again, I defer to maintainers like Andrew and Ingo who have to deal with
> > an enormous amount of patches on how they would like to handle it; I don't
> > think myself or anybody else who doesn't deal with a large number of
> > patches should be manda
ed for
backwards compatibility.
See the change to Documentation/cgroup-v1/memory.txt for full
specification.
Signed-off-by: David Rientjes
---
Documentation/cgroup-v1/memory.txt | 47 ++
mm/vmpressure.c| 122 -
2 files changed
On Wed, 24 May 2017, Doug Anderson wrote:
> * Matthias has been sending out individual patches that take each
> particular case into account to try to remove the warnings. In some
> cases this removes totally dead code. In other cases this adds
> __maybe_unused. ...and as a last resort it uses
On Thu, 25 May 2017, Konstantin Khlebnikov wrote:
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 04c9143a8625..dd30a045ef5b 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -876,6 +876,11 @@ static void oom_kill_process(struct oom_control *oc,
> const char *message)
> /* Get a
On Mon, 29 May 2017, Mike Rapoport wrote:
> Currently applications can explicitly enable or disable THP for a memory
> region using MADV_HUGEPAGE or MADV_NOHUGEPAGE. However, once either of
> these advises is used, the region will always have
> VM_HUGEPAGE/VM_NOHUGEPAGE flag set in vma->vm_flags.
out that suppressing the warnings avoids potentially complex
#ifdef directives, which also reduces LOC.
Suppress the warning for clang.
Signed-off-by: David Rientjes
---
include/linux/compiler-clang.h | 7 +++
1 file changed, 7 insertions(+)
diff --git a/include/linux/compiler-clang.h b
On Tue, 23 May 2017, Konstantin Khlebnikov wrote:
> This is worth addition. Let's call it "oom_victim" for short.
>
> It allows locating the leaky part if it is spread over sub-containers within
> a common limit.
> But it doesn't tell which limit caused this kill. For hierarchical limits this
> might
On Tue, 23 May 2017, Matthias Kaehlcke wrote:
> > diff --git a/include/linux/compiler-clang.h b/include/linux/compiler-clang.h
> > index de179993e039..e1895ce6fa1b 100644
> > --- a/include/linux/compiler-clang.h
> > +++ b/include/linux/compiler-clang.h
> > @@ -15,3 +15,8 @@
> > * with any versio
On Mon, 22 May 2017, Konstantin Khlebnikov wrote:
> Nope, they are different. I think we should rephrase the documentation somehow
>
> low - count of reclaims below low level
> high - count of post-allocation reclaims above high level
> max - count of direct reclaims
> oom - count of failed direct rec
On Mon, 22 May 2017, Andrew Morton wrote:
> > > Is clang not inlining kmalloc_large_node_hook() for some reason? I don't
> > > think this should ever warn on gcc.
> >
> > clang warns about unused static inline functions outside of header
> > files, in difference to gcc.
>
> I wish it wouldn't.
-Wunused-function]
>
Is clang not inlining kmalloc_large_node_hook() for some reason? I don't
think this should ever warn on gcc.
> Signed-off-by: Matthias Kaehlcke
Acked-by: David Rientjes
> ---
> mm/slub.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff
On Mon, 22 May 2017, Mike Snitzer wrote:
> > > The lvm2 was designed this way - it is broken, but there is not much that
> > > can be done about it - fixing this would mean major rewrite. The only
> > > thing we can do about it is to lower the deadlock probability with
> > > __GFP_HIGH (or PF_M
> slub_attributes which must be propagated and avoid that insane conversion
> to and from ASCII, but that's too large for a hot fix.
>
> Check at least the return value of the show() function, so calling store()
> with stale content is prevented.
>
> Reported-by: Steven Roste
We have encountered need_resched warnings in __collapse_huge_page_copy()
while doing {clear,copy}_user_highpage() over HPAGE_PMD_NR source pages.
mm->mmap_sem is held for write, but the iteration is well bounded.
Reschedule as needed.
Signed-off-by: David Rientjes
---
mm/khugepaged.c
On Tue, 9 May 2017, Andrew Morton wrote:
> > We've encountered zombies that are waiting for a thread to exit that are
> > looping in ep_poll() almost endlessly although there is a pending SIGKILL
> > as a result of a group exit.
> >
> > This happens because we always find ep_events_available() an
g for ep_events_available(), but there have been no
reports of delayed signal handling other than SIGKILL preventing zombies
from exiting that would be fixed by this.
Signed-off-by: David Rientjes
---
fs/eventpoll.c | 10 ++
1 file changed, 10 insertions(+)
diff --git a/fs/eventpoll.c b/fs/eventpoll.c
On Wed, 3 May 2017, Michal Hocko wrote:
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index 24efcc20af91..f3ec8760dc06 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -2113,16 +2113,14 @@ static void get_scan_count(struct lruvec *lruvec,
> struct mem_cgroup *memcg,
> u64 denominator = 0;
On Thu, 27 Apr 2017, zhongjiang wrote:
> From: zhong jiang
>
> Recently, I found the following issue, it will result in the panic.
>
> [ 168.739152] mmap1: Corrupted page table at address 7f3e6275a002
> [ 168.745039] PGD 61f4a1067
> [ 168.745040] PUD 61ab19067
> [ 168.747730] PMD 61fb8b067
On Tue, 2 May 2017, Michal Hocko wrote:
> I have already asked and my questions were ignored. So let me ask again
> and hopefully not get ignored this time. So why do we need a different
> criterion on anon pages than file pages?
The preference in get_scan_count() as already implemented is to recl
On Tue, 11 Apr 2017, Wei Yang wrote:
> On Mon, Apr 10, 2017 at 05:26:03PM -0700, David Rientjes wrote:
> >On Tue, 11 Apr 2017, Wei Yang wrote:
> >
> >> According to current code path, numa_nodes_parsed is already setup when
> >> numa_emulation() is cal
sufficient, fallback to balanced reclaim so the
file lru doesn't remain untouched.
Suggested-by: Minchan Kim
Signed-off-by: David Rientjes
---
to akpm: this issue has been possible since at least 3.15, so it's
probably not high priority for 4.12 but applies cleanly if it ca
}
> }
> }
>
Hi Minchan,
This looks good and it correctly biases against SCAN_ANON for my workload
that was thrashing the anon lrus. Feel free to use parts of my changelog
if you'd like.
Tested-by: David Rientjes
On Tue, 18 Apr 2017, Minchan Kim wrote:
> > The purpose of the code that commit 623762517e23 ("revert 'mm: vmscan: do
> > not swap anon pages just because free+file is low'") reintroduces is to
> > prefer swapping anonymous memory rather than thrashing the file lru.
> >
> > If all anonymous memory
system call madvise().
>
> Signed-off-by: Anshuman Khandual
Acked-by: David Rientjes
Looks like this depends on existing patches in -mm.
AFE_BY_RCU in order
> to avoid future instances of this sort of confusion.
>
> Signed-off-by: Paul E. McKenney
> Cc: Christoph Lameter
> Cc: Pekka Enberg
> Cc: David Rientjes
> Cc: Joonsoo Kim
> Cc: Andrew Morton
> Cc:
> Acked-by: Johannes Weiner
> Acked-by:
e lru doesn't remain
untouched.
Signed-off-by: David Rientjes
---
mm/vmscan.c | 41 +++--
1 file changed, 23 insertions(+), 18 deletions(-)
diff --git a/mm/vmscan.c b/mm/vmscan.c
--- a/mm/vmscan.c
+++ b/mm/vmscan.c
@@ -2186,26 +2186,31 @@ static void get
> open("/tmp/x", "w").close()
> os.unlink("/tmp/x")
> b = ipi_count()
> print "%d loops: %d => %d (+%d ipis)" % (n, a, b, b-a)
> echo(pid, "cgroup.procs")
> for i in range(n):
> os.rmdir(str(i))
>
> patched: 1 loops: 1069 => 1170 (+101 ipis)
> unpatched: 1 loops: 1192 => 48933 (+47741 ipis)
>
> Signed-off-by: Greg Thelen
Acked-by: David Rientjes
ch restructures numa_nodes_parsed from emulated nodes.
>
> Signed-off-by: Wei Yang
Acked-by: David Rientjes
although there's a small nit: NODE_MASK_NONE is only used for
initialization, this should be nodes_clear(numa_nodes_parsed) instead, but
that would be up to the x86 maintainers to allow
ad of re-finding it
when calling numa_emulation().
> This means we can get the physnode_mask directly from numa_nodes_parsed. At
> the same time, this patch correct the comment of these two functions.
>
> Signed-off-by: Wei Yang
Acked-by: David Rientjes
by re-ordering the code path.
>
> Signed-off-by: Wei Yang
Acked-by: David Rientjes
We got need_resched() warnings in swap_cgroup_swapoff() because
swap_cgroup_ctrl[type].length is particularly large.
Reschedule when needed.
Signed-off-by: David Rientjes
---
mm/swap_cgroup.c | 2 ++
1 file changed, 2 insertions(+)
diff --git a/mm/swap_cgroup.c b/mm/swap_cgroup.c
--- a/mm
et appropriately for
"defer+madvise".
Fixes: 21440d7eb904 ("mm, thp: add new defer+madvise defrag option")
Signed-off-by: David Rientjes
---
mm/huge_memory.c | 12 ++--
1 file changed, 6 insertions(+), 6 deletions(-)
diff --git a/mm/huge_memory.c b/mm/huge_
On Mon, 20 Mar 2017, Huang, Ying wrote:
> From: Huang Ying
>
> Now vzalloc() is used in swap code to allocate various data
> structures, such as swap cache, swap slots cache, cluster info, etc.
> Because the size may be too large on some systems, normal
> kzalloc() may fail. But using kz
On Fri, 17 Mar 2017, Michal Hocko wrote:
> > Does it really make sense to print any counters of that zone though?
> > Your follow up patch just suggests that we don't want some but what
> > about others?
> >
Managed and present pages needs to be emitted for userspace parsing of
memory hotplug,
On Fri, 17 Mar 2017, Huang, Ying wrote:
> From: Huang Ying
>
> Now vzalloc() is used in swap code to allocate various data
> structures, such as swap cache, swap slots cache, cluster info, etc.
> Because the size may be too large on some systems, normal
> kzalloc() may fail. But using kz
protection information above pcp stats since it
is relevant for all zones per vm.lowmem_reserve_ratio.
Signed-off-by: David Rientjes
---
mm/vmstat.c | 20 +---
1 file changed, 13 insertions(+), 7 deletions(-)
diff --git a/mm/vmstat.c b/mm/vmstat.c
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@
not done for unpopulated zones.
Signed-off-by: David Rientjes
---
v2: - s/bool populated/bool assert_populated/ per Anshuman
- add comment to zoneinfo_show() to describe why we care
mm/vmstat.c | 27 +--
1 file changed, 17 insertions(+), 10 deletions(-)
diff --git a/mm
On Fri, 3 Mar 2017, Anshuman Khandual wrote:
> > This patch shows statistics for non-populated zones in /proc/zoneinfo.
> > The zones exist and hold a spot in the vm.lowmem_reserve_ratio array.
> > Without this patch, it is not possible to determine which index in the
> > array controls which zone
not done for unpopulated zones.
Signed-off-by: David Rientjes
---
mm/vmstat.c | 22 +-
1 file changed, 13 insertions(+), 9 deletions(-)
diff --git a/mm/vmstat.c b/mm/vmstat.c
--- a/mm/vmstat.c
+++ b/mm/vmstat.c
@@ -1121,8 +1121,12 @@ static void frag_stop(struct seq_file *m
On Thu, 2 Feb 2017, Tobin C. Harding wrote:
> @@ -3696,8 +3695,8 @@ int handle_mm_fault(struct vm_area_struct *vma,
> unsigned long address,
> * VM_FAULT_OOM), there is no need to kill anything.
> * Just clean up the OOM state peacefully.
> */
On Wed, 25 Jan 2017, Anshuman Khandual wrote:
> But in the due course there might be other changes in number of VMAs of
> the process because of unmap() or merge() which could reduce the total
> number of VMAs and hence this condition may not exist afterwards. In
> that case EAGAIN still makes sen
madvise(2) may return ENOMEM if the advice acts on a vma that must be
split and creating the new vma will result in the process exceeding
/proc/sys/vm/max_map_count.
Specify this additional possibility.
Signed-off-by: David Rientjes
---
man2/madvise.2 | 7 ++-
1 file changed, 6 insertions
(for vmas, anon_vmas,
or mempolicies) cannot be allocated.
Encountering /proc/sys/vm/max_map_count is not a temporary failure,
however, so return ENOMEM to indicate this is a more serious issue. A
followup patch to the man page will specify this behavior.
Signed-off-by: David Rientjes
> higher looks safe and makes it obvious to both me and gcc that
> the initialization comes before the first use.
>
> Fixes: 74eaa4a97e8e ("mm: consolidate GFP_NOFAIL checks in the allocator
> slowpath")
> Signed-off-by: Arnd Bergmann
Acked-by: David Rientjes
On Fri, 20 Jan 2017, Vlastimil Babka wrote:
> Could we simplify both patches with something like this?
> Although the sizeof("null") is not the nicest thing, because it relies on
> knowledge
> that pointer() in lib/vsprintf.c uses this string. Maybe Rasmus has some
> better idea?
>
> Thanks,
>