From: NeilBrown
commit 8658452e4a588da603f6cb5ee2615deafcd82b71 upstream
PF_LESS_THROTTLE has a very specific use case: to avoid deadlocks
and live-locks while writing to the page cache in a loop-back
NFS mount situation.
It therefore makes sense to *only* set PF_LESS_THROTTLE in this
situation.
Integrate Neil Brown's nfs loopback mount patches from 3.16.
46dbf93 nfsd: Only set PF_LESS_THROTTLE when really needed.
309c169 SUNRPC: track whether a request is coming from a loop-back interface.
linux-yocto mailing list
linux-yocto@yoctoproj
From: NeilBrown
commit ef11ce24875a8a540adc185e7bce3d7d49c8296f upstream
If an incoming NFS request is coming from the local host, then
nfsd will need to perform some special handling. So detect that
possibility and make the source visible in rq_local.
Signed-off-by: NeilBrown
Signed-off-by:
From: Steven Rostedt
commit e9dd685ce81815811fb4da72e6ab10a694ac8468 upstream
As Peter Zijlstra told me, we have the following path:
do_exit()
  exit_itimers()
    itimer_delete()
      spin_lock_irqsave(&timer->it_lock, &flags);
      timer_delete_hook(timer);
        kc->timer_del(timer) := p
From: Emil Medve
commit af4459d3636790735fccd83f0337c8380a0a4cc2 upstream
Signed-off-by: Emil Medve
Cc: Ingo Molnar
Cc: Thomas Gleixner
Cc: "H. Peter Anvin"
Cc: Yinghai Lu
Signed-off-by: Andrew Morton
Signed-off-by: Linus Torvalds
Signed-off-by: Yang Shi
---
arch/x86/mm/numa.c | 6 +++--
From: Rik van Riel
commit 1662867a9b2574bfdb9d4e97186aa131218d7210 upstream
This function is supposed to return true if the new load imbalance is
worse than the old one. It didn't. I can only hope brown paper bags
are in style.
Now things converge much better on both the 4 node and 8 node systems.
From: Mel Gorman
commit 11de9927f9dd3cb0a0f18064fa4b6976fc37e79c upstream
Migration of misplaced transhuge pages uses page_add_new_anon_rmap() when
putting the page back as it avoided an atomic operation and added the new
page to the correct LRU. A side-effect is that the page gets marked
acti
From: Rik van Riel
commit 792568ec6a31ca560ca4d528782cbc6cd2cea8b0 upstream
The NUMA code is smart enough to distribute the memory of workloads
that span multiple NUMA nodes across those NUMA nodes.
However, it still has a pretty high scan rate for such workloads,
because any memory that is lef
From: Rik van Riel
commit 096aa33863a5e48de52d2ff30e0801b7487944f4 upstream
Affine wakeups have the potential to interfere with NUMA placement.
If a task wakes up too many other tasks, affine wakeups will get
disabled.
However, regardless of how many other tasks it wakes up, it gets
re-enabled
From: Jason Low
commit 2b4cfe64dee0d84506b951d81bf55d9891744d25 upstream
Also initialize the per-sd variables for newidle load balancing
in sd_numa_init().
Signed-off-by: Jason Low
Acked-by: morten.rasmus...@arm.com
Cc: daniel.lezc...@linaro.org
Cc: alex@linaro.org
Cc: pre...@linux.vnet.ib
From: Mike Galbraith
commit 156654f491dd8d52687a5fbe1637f472a52ce75b upstream
Bad idea on -rt:
[ 908.026136] [] rt_spin_lock_slowlock+0xaa/0x2c0
[ 908.026145] [] task_numa_free+0x31/0x130
[ 908.026151] [] finish_task_switch+0xce/0x100
[ 908.026156] [] thread_return+0x48/0x4ae
[ 908.026
From: Rik van Riel
commit e63da03639cc9e6e83b62e7ef8ffdbb92421416a upstream
Currently the NUMA balancing code only allows moving tasks between NUMA
nodes when the load on both nodes is in balance. This breaks down when
the load was imbalanced to begin with.
Allow tasks to be moved between NUMA
From: Mike Galbraith
commit 60e69eed85bb7b5198ef70643b5895c26ad76ef7 upstream
Sasha reported that lockdep claims that the following commit made
numa_group.lock interrupt unsafe:
156654f491dd ("sched/numa: Move task_numa_free() to __put_task_struct()")
While I don't see how that could be, gi
From: Srikar Dronamraju
commit 834a964a098e7726fc296d7cd8f65ed3eeedd412 upstream
LAST_CPUPID_MASK is calculated using LAST_CPUPID_WIDTH. However
LAST_CPUPID_WIDTH itself can be 0. (when LAST_CPUPID_NOT_IN_PAGE_FLAGS is
set). In such a case LAST_CPUPID_MASK turns out to be 0.
But with recent
From: Rik van Riel
commit b1ad065e65f56103db8b97edbd218a271ff5b1bb upstream
Update the migrate_improves/degrades_locality() functions with
knowledge of pseudo-interleaving.
Do not consider moving tasks around within the set of group's active
nodes as improving or degrading locality. Instead, le
From: Rik van Riel
commit 68d1b02a58f5d9f584c1fb2923ed60ec68cbbd9b upstream
Setting the numa_preferred_node for a task in task_numa_migrate
does nothing on a 2-node system. Either we migrate to the node
that already was our preferred node, or we stay where we were.
On a 4-node system, it can sl
From: Tang Chen
commit 8f28ed92d9314b98dc2033df770f5e6b85c5ffb7 upstream
In document numa_memory_policy.txt, the following examples for flag
MPOL_F_RELATIVE_NODES are incorrect.
For example, consider a task that is attached to a cpuset with
mems 2-5 that sets an Interleave polic
From: Rik van Riel
commit 5085e2a328849bdee6650b32d52c87c3788ab01c upstream
When tasks have not converged on their preferred nodes yet, we want
to retry fairly often, to make sure we do not migrate a task's memory
to an undesirable location, only to have to move it again later.
This patch reduc
From: Rik van Riel
commit a5338093bfb462256f70f3450c08f73e59543e26 upstream
The NUMA scanning code can end up iterating over many gigabytes of
unpopulated memory, especially in the case of a freshly started KVM
guest with lots of memory.
This results in the mmu notifier code being called even w
From: Rik van Riel
commit 88a9ab6e3dfb5b10168130c255c6102c925343ab upstream
Reorganize the order of ifs in change_pmd_range a little, in preparation
for the next patch.
[a...@linux-foundation.org: fix indenting, per David]
Signed-off-by: Rik van Riel
Cc: Peter Zijlstra
Cc: Andrea Arcangeli
R
From: Xishi Qiu
commit 92d585ef067da7a966d6ce78c601bd1562b62619 upstream
When doing socket hot remove, "node_devices[nid]" is set to NULL;
acpi_processor_remove()
  try_offline_node()
    unregister_one_node()
Then hot add a socket, but do not echo 1 > /sys/devices/system/cpu/
From: Rik van Riel
commit 50ec8a401fed6d246ab65e6011d61ac91c34af70 upstream
Track which nodes NUMA faults are triggered from, in other words
the CPUs on which the NUMA faults happened. This uses a similar
mechanism to what is used to track the memory involved in numa faults.
The next patches us
From: Rik van Riel
commit 10f39042711ba21773763f267b4943a2c66c8bef upstream
Use the active_nodes nodemask to make smarter decisions on NUMA migrations.
In order to maximize performance of workloads that do not fit in one NUMA
node, we want to satisfy the following criteria:
1) keep private m
From: Rik van Riel
commit 35664fd41e1c8cc4f0b89f6a51db5af39ba50640 upstream
The current code in task_numa_placement calculates the difference
between the old and the new value, but also temporarily stores half
of the old value in the per-process variables.
The NUMA balancing code looks at those
From: Rik van Riel
commit 7e2703e6099609adc93679c4d45cd6247f565971 upstream
Tracing the code that decides the active nodes has made it abundantly clear
that the naive implementation of the faults_from code has issues.
Specifically, the garbage collector in some workloads will access orders
of m
From: Mel Gorman
commit 1ad9f620c3a22fa800489455ce517c29e576934e upstream
Sasha reported the following bug using trinity
kernel BUG at mm/mprotect.c:149!
invalid opcode: [#1] PREEMPT SMP DEBUG_PAGEALLOC
Dumping ftrace buffer:
(ftrace buffer empty)
Modules linked in:
CPU: 20
From: Rik van Riel
commit 52bf84aa206cd2c2516dfa3e03b578edf8a3242f upstream
Excessive migration of pages can hurt the performance of workloads
that span multiple NUMA nodes. However, it turns out that the
p->numa_migrate_deferred knob is a really big hammer, which does
reduce migration rates, b
From: Rik van Riel
commit be1e4e760d940c14d119bffef5eb007dfdf29046 upstream
Cleanup suggested by Mel Gorman. Now the code contains some more
hints on what statistics go where.
Suggested-by: Mel Gorman
Signed-off-by: Rik van Riel
Acked-by: Mel Gorman
Signed-off-by: Peter Zijlstra
Cc: Chegu V
From: Rik van Riel
commit 20e07dea286a90f096a779706861472d296397c6 upstream
The numa_faults_cpu statistics are used to maintain an active_nodes nodemask
per numa_group. This allows us to be smarter about when to do numa migrations.
Signed-off-by: Rik van Riel
Acked-by: Mel Gorman
Signed-off-b
From: Rik van Riel
commit ff1df896aef8e0ec1556a5c44f424bd45bfa2cbe upstream
In order to get a more consistent naming scheme, making it clear
which fault statistics track memory locality, and which track
CPU locality, rename the memory fault statistics.
Suggested-by: Mel Gorman
Signed-off-by: R
From: Rik van Riel
commit 58b46da336a9312b2e21bb576d1c2c484dbf6257 upstream
We track both the node of the memory after a NUMA fault, and the node
of the CPU on which the fault happened. Rename the local variables in
task_numa_fault to make things more explicit.
Suggested-by: Mel Gorman
Signed-
Refresh kernel NUMA up to 3.16.
Primarily merged:
numa,sched,mm: pseudo-interleaving for automatic NUMA balancing
https://lkml.org/lkml/2014/1/27/459
patch 1 - 9
fix numa vs kvm scalability issue
https://lkml.org/lkml/2014/2/18/677
patch 12/13
sched,numa: reduce page migrations with pseudo-int