= &max_lock_depth,
- .maxlen = sizeof(int),
- .mode = 0644,
- .proc_handler = proc_dointvec,
- },
-#endif
#ifdef CONFIG_TREE_RCU
{
.procname = "panic_on_rcu_stall",
Acked-by: Waiman Long
runs. However, these tests may still fail
once in a while if the memory usage goes beyond the newly extended range.
Signed-off-by: Waiman Long
---
tools/testing/selftests/cgroup/test_memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests
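For reference, the tolerance being relaxed is applied through a small
helper in the selftest; a paraphrased sketch (helper name from
tools/testing/selftests/cgroup/test_memcontrol.c, body approximate):

#include <stdbool.h>
#include <stdlib.h>	/* labs() */

/*
 * Two values are treated as "close enough" if they differ by no more
 * than err percent of their combined magnitude. The patch widens the
 * err passed in by callers, not this helper itself.
 */
static bool values_close(long a, long b, int err)
{
	return labs(a - b) <= (a + b) / 100 * err;
}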
side of the expected ranges.
Suggested-by: Michal Koutný
Signed-off-by: Waiman Long
---
.../testing/selftests/cgroup/test_memcontrol.c | 18 ++
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c
b/tools/test
(with memory_recursiveprot enabled) and sporadically fails
its test_memcg_min sub-test. This patchset fixes the test_memcg_min
and test_memcg_low failures by adjusting the test_memcontrol selftest
accordingly.
Waiman Long (2):
selftests: memcg: Allow low event with no memory.low and memory_recursiveprot on
On 4/23/25 12:49 PM, Michal Koutný wrote:
On Tue, Apr 22, 2025 at 07:58:56PM -0400, Waiman Long wrote:
Am I correct to assume that the purpose of 1d09069f5313f ("selftests:
memcg: expect no low events in unprotected sibling") is to force a
failure in the test_memcg_low test to forc
On 4/16/25 5:25 AM, Michal Koutný wrote:
On Tue, Apr 15, 2025 at 05:04:14PM -0400, Waiman Long
wrote:
+ /*
+ * Child 2 has memory.low=0, but some low protection is still being
+ * distributed down from its parent with memory.low=50M if cgroup2
()
returns true. If we ever change mem_cgroup_below_min() in such a way
that it no longer skips the no usage case, we will have to add code to
explicitly skip it.
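Such an explicit skip in the shrink_node_memcgs() iteration might look
roughly like this (a hypothetical sketch using a direct page_counter
read; not a hunk from the patch):

	/*
	 * Hypothetical sketch: bail out early for a memcg with no usage,
	 * before the min/low protection checks, so an empty cgroup can
	 * never record spurious low events during reclaim.
	 */
	if (!page_counter_read(&memcg->memory))
		continue;	/* !usage: nothing to reclaim here */

	if (mem_cgroup_below_min(target_memcg, memcg))
		continue;	/* hard protection, skip */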
Suggested-by: Michal Koutný
Signed-off-by: Waiman Long
---
tools/testing/selftests/cgroup/test_memcontrol.c | 16 +++-
1 file
fails
its test_memcg_min sub-test. This patchset fixes the test_memcg_min
and test_memcg_low failures by adjusting the test_memcontrol selftest
accordingly.
Waiman Long (2):
selftests: memcg: Allow low event with no memory.low and
memory_recursiveprot on
selftests: memcg: Increase error tolerance of child memory.current
check in test_memcg_protection()
On 4/14/25 8:42 AM, Michal Koutný wrote:
On Sun, Apr 13, 2025 at 10:12:48PM -0400, Waiman Long
wrote:
2) memory.low is set to a non-zero value but the cgroup has no task in
it so that it has an effective low value of 0. Again it may have a
non-zero low event count if memory reclaim
and test_memcg_min sub-tests may still fail occasionally if the
memory.current values fall outside of the expected ranges.
Suggested-by: Johannes Weiner
Suggested-by: Michal Koutný
Signed-off-by: Waiman Long
---
mm/internal.h | 9 ++
es by skipping the !usage case in
shrink_node_memcgs() and adjusting the test_memcontrol selftest to fix
other causes of the test failures.
Waiman Long (2):
mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()
selftests: memcg: Increase error tolerance of child memory.current
check in test_memcg_protection()
On 4/8/25 6:22 PM, Roman Gushchin wrote:
On Sat, Apr 05, 2025 at 10:40:10PM -0400, Waiman Long wrote:
The test_memcg_protection() function is used for the test_memcg_min and
test_memcg_low sub-tests. This function generates a set of parent/child
cgroups like:
parent: memory.min/low = 50M
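For illustration, this kind of hierarchy is built with the cgroup
selftest helpers declared in cgroup_util.h; the child value below is a
placeholder, not the test's exact configuration:

	/* Sketch only: error handling shortened, child value is a placeholder. */
	char *parent = cg_name(root, "memcg_test");
	char *child = cg_name(parent, "child0");

	if (cg_create(parent) ||
	    cg_write(parent, "cgroup.subtree_control", "+memory") ||
	    cg_write(parent, "memory.min", "50M"))
		goto cleanup;

	if (cg_create(child) ||
	    cg_write(child, "memory.min", "25M"))	/* placeholder value */
		goto cleanup;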
On 4/11/25 1:22 PM, Michal Koutný wrote:
On Mon, Apr 07, 2025 at 12:23:16PM -0400, Waiman Long
wrote:
Child   Actual usage   Expected usage    %err
-----   ------------   --------------   ------
  1       16990208        22020096      -12.9%
  1       17252352        22020096
On 4/11/25 1:11 PM, Michal Koutný wrote:
Hello.
On Mon, Apr 07, 2025 at 12:23:15PM -0400, Waiman Long
wrote:
--- a/mm/memcontrol-v1.h
+++ b/mm/memcontrol-v1.h
@@ -22,8 +22,6 @@
iter != NULL; \
iter = mem_cgroup_iter(NULL, iter, NULL
On 4/7/25 11:25 AM, Michal Koutný wrote:
Hi Waiman.
On Sun, Apr 06, 2025 at 09:41:58PM -0400, Waiman Long
wrote:
...
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c
b/tools/testing/selftests/cgroup/test_memcontrol.c
index 16f5d74ae762..bab826b6b7b0 100644
--- a/tools/testing
Johannes Weiner
Signed-off-by: Waiman Long
---
mm/internal.h | 9 +
mm/memcontrol-v1.h | 2 --
mm/vmscan.c | 4
tools/testing/selftests/cgroup/test_memcontrol.c | 7 ++-
4 files
runs.
Signed-off-by: Waiman Long
---
tools/testing/selftests/cgroup/test_memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c
b/tools/testing/selftests/cgroup/test_memcontrol.c
index bab826b6b7b0..8f4f2479650e
oradically fails its test_memcg_min sub-test. This
patchset fixes the test_memcg_min and test_memcg_low failures by
skipping the !usage case in shrink_node_memcgs() and adjusting the
test_memcontrol selftest to fix other causes of the test failures.
Waiman Long (2):
mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()
On 4/7/25 10:24 AM, Johannes Weiner wrote:
On Sun, Apr 06, 2025 at 09:41:58PM -0400, Waiman Long wrote:
The test_memcontrol selftest consistently fails its test_memcg_low
sub-test because two of its test child cgroups which
have a memory.low of 0 or an effective memory.low of 0
ses of the test failures.
Note that I decided not to use the suggested mem_cgroup_usage() call as
it is a real function call defined in mm/memcontrol.c to be used mainly
by cgroup v1 code.
Waiman Long (2):
mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()
selftests: memcg: Increase error tolerance of child memory.current
check in test_memcg_protection()
patch applied, the test_memcg_low sub-test finishes successfully
in most cases. Though both test_memcg_low
and test_memcg_min sub-tests may still fail occasionally if the
memory.current values fall outside of the expected ranges.
Suggested-by: Johannes Weiner
Signed
in mm/memcontrol.c which is not
available if CONFIG_MEMCG isn't defined.
Waiman Long (2):
mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()
selftests: memcg: Increase error tolerance of child memory.current
check in test_memcg_protection()
mm/vmscan.c
test_memcg_min sub-tests may still fail occasionally if the
memory.current values fall outside of the expected ranges.
Suggested-by: Johannes Weiner
Signed-off-by: Waiman Long
---
mm/vmscan.c | 4
tools/testing/selftests/cgroup/test_memcontrol.c | 7
On 4/4/25 3:38 PM, Johannes Weiner wrote:
On Fri, Apr 04, 2025 at 02:55:35PM -0400, Waiman Long wrote:
On 4/4/25 2:13 PM, Johannes Weiner wrote:
* Waiman points out that the weirdness is seeing low events without
having a low configured. Eh, this isn't really true with recu
On 4/3/25 9:33 AM, Michal Koutný wrote:
On Sun, Mar 30, 2025 at 05:52:45PM -0400, Waiman Long
wrote:
The goto statement in sched_partition_write() is not needed. Remove
it and rename sched_partition_write()/sched_partition_show() to
cpuset_partition_write()/cpuset_partition_show().
Signed
On 4/4/25 2:13 PM, Johannes Weiner wrote:
On Fri, Apr 04, 2025 at 01:25:33PM -0400, Waiman Long wrote:
On 4/4/25 1:12 PM, Tejun Heo wrote:
Hello,
On Thu, Apr 03, 2025 at 09:24:34PM -0400, Waiman Long wrote:
...
The simple and naive fix of changing the operator to ">", however
On 4/4/25 2:26 PM, Michal Koutný wrote:
Hello Waiman.
On Thu, Apr 03, 2025 at 09:24:34PM -0400, Waiman Long
wrote:
1) memory.low is set to 0, but low events can still be triggered and
so the cgroup may have a non-zero low event count. I doubt users are
looking for that as they
',' can appear as part of
the expected values.
Signed-off-by: Waiman Long
---
.../selftests/cgroup/test_cpuset_prs.sh | 236 +-
1 file changed, 118 insertions(+), 118 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh
b/tools/testing/s
On 4/4/25 1:12 PM, Tejun Heo wrote:
Hello,
On Thu, Apr 03, 2025 at 09:24:34PM -0400, Waiman Long wrote:
...
The simple and naive fix of changing the operator to ">", however,
changes the memory reclaim behavior which can lead to other failures
as low events are needed to faci
e ("cpuset: Add new v2 cpuset.sched.partition flag")
Signed-off-by: Waiman Long
---
include/linux/cgroup-defs.h | 1 +
include/linux/cgroup.h | 2 +-
kernel/cgroup/cgroup.c | 6 ++
kernel/cgroup/cpuset.c | 20 +---
4 files changed, 25 insertions(+), 4 deletions(-)
cases above with low replaced by min.
Signed-off-by: Waiman Long
---
include/linux/memcontrol.h | 18 ++
1 file changed, 14 insertions(+), 4 deletions(-)
diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
index 53364526d877..4d4a1f159eaa 100644
--- a/include/li
runs.
Signed-off-by: Waiman Long
---
tools/testing/selftests/cgroup/test_memcontrol.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c
b/tools/testing/selftests/cgroup/test_memcontrol.c
index 16f5d74ae762..f442c0c3f5a7
On 4/3/25 9:33 AM, Michal Koutný wrote:
On Sun, Mar 30, 2025 at 05:52:43PM -0400, Waiman Long
wrote:
Currently, we don't allow the creation of a remote partition underneath
another local or remote partition. However, it is still possible to
create a new local partition with an exi
On 4/1/25 4:41 PM, Waiman Long wrote:
On 4/1/25 3:59 PM, Tejun Heo wrote:
Hello, Waiman.
On Mon, Mar 31, 2025 at 11:12:06PM -0400, Waiman Long wrote:
The problem is the RCU delay between the time a cgroup is killed and
is in a dying state and when the partition is deactivated when
On 4/1/25 3:59 PM, Tejun Heo wrote:
Hello, Waiman.
On Mon, Mar 31, 2025 at 11:12:06PM -0400, Waiman Long wrote:
The problem is the RCU delay between the time a cgroup is killed and is in a
dying state and when the partition is deactivated when cpuset_css_offline()
is called. That delay can
On 3/31/25 7:13 PM, Tejun Heo wrote:
Hello,
On Sun, Mar 30, 2025 at 05:52:39PM -0400, Waiman Long wrote:
...
One possible way to fix this is to iterate the dying cpusets as well and
avoid using the exclusive CPUs in those dying cpusets. However, this
can still cause random partition creation
Clean up the test_cpuset_prs.sh script and restructure some of the
functions so that a new test matrix with a different cgroup directory
structure can be added in the next patch.
Signed-off-by: Waiman Long
---
.../selftests/cgroup/test_cpuset_prs.sh | 257 +++---
1 file
The goto statement in sched_partition_write() is not needed. Remove
it and rename sched_partition_write()/sched_partition_show() to
cpuset_partition_write()/cpuset_partition_show().
Signed-off-by: Waiman Long
---
kernel/cgroup/cpuset.c | 15 ++-
1 file changed, 6 insertions(+), 9 deletions(-)
new set of remote partition tests REMOTE_TEST_MATRIX with another
cgroup directory structure more tailored for remote partition testing
to provide better code coverage.
Also add a few new test cases as well as adjust existing ones for
the original TEST_MATRIX.
Signed-off-by: Waiman Long
expected.
Signed-off-by: Waiman Long
---
kernel/cgroup/cpuset.c | 61 ++
1 file changed, 38 insertions(+), 23 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index f26f791e9323..aa529b2dbf56 100644
--- a/kernel/cgroup/cpuset.c
+++ b
partition root is being handled (updated instead of invalidated)
in update_cpumasks_hier().
Signed-off-by: Waiman Long
---
kernel/cgroup/cpuset.c | 258 ++
.../selftests/cgroup/test_cpuset_prs.sh | 4 +-
2 files changed, 143 insertions(+), 119 deletions
l. So forbid
that by making sure that the exclusive_cpus mask doesn't overlap with
subpartitions_cpus and invalidate the partition if that happens.
Signed-off-by: Waiman Long
---
kernel/cgroup/cpuset-internal.h | 1 +
kernel/cgroup/cpuset.c | 14 ++
2 files changed, 15
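A hypothetical sketch of the check being described (error code
illustrative):

	/*
	 * Hypothetical sketch: a local partition must not claim exclusive
	 * CPUs that have already been granted to remote partitions, which
	 * are tracked in subpartitions_cpus.
	 */
	if (cpumask_intersects(cs->exclusive_cpus, subpartitions_cpus))
		return PERR_NOTEXCL;	/* error code illustrative */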
rtition.
Fixes: 11e5f407b64a ("cgroup/cpuset: Keep track of CPUs in isolated partitions")
Signed-off-by: Waiman Long
---
kernel/cgroup/cpuset.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index 749994312d47..
error.
Fixes: 181c8e091aae ("cgroup/cpuset: Introduce remote partition")
Signed-off-by: Waiman Long
---
kernel/cgroup/cpuset.c | 29 -
1 file changed, 20 insertions(+), 9 deletions(-)
diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c
index a4
This patch series fixes a number of bugs in the cpuset partition code as
well as improving remote partition handling. The test_cpuset_prs.sh
script is also enhanced to allow more rigorous remote partition testing.
Waiman Long (10):
cgroup/cpuset: Fix race between newly created partition and dying
he way the boot time isolated CPUs are handled in
test_cpuset_prs.sh to make sure that those isolated CPUs are really
isolated instead of just skipping them in the tests.
Fixes: ccac8e8de99c ("cgroup/cpuset: Fix remote root partition creation
problem")
Signed-off-by: Waiman Long
---
ke
oc() behaviour default as
kasan_record_aux_stack().
[bigeasy: Dressed the diff as patch. ]
Reported-by: syzbot+39f85d612b7c20d8d...@syzkaller.appspotmail.com
Closes: https://lore.kernel.org/all/67275485.050a0220.3c8d68.0a37....@google.com
Acked-by: Waiman Long
Reviewed-by: Andrey Konovalov
Review
eric: introduce kasan_record_aux_stack_noalloc()")
there.
Right now task_work_add() is the only caller of
kasan_record_aux_stack(). So it essentially makes all its callers use the
noalloc version of kasan_record_aux_stack().
Acked-by: Waiman Long
include/linux/kasan.h |
On 6/16/24 10:05, syzbot wrote:
syzbot has bisected this issue to:
commit 21c38a3bd4ee3fb7337d013a638302fb5e5f9dc2
Author: Jesper Dangaard Brouer
Date: Wed May 1 14:04:11 2024 +
cgroup/rstat: add cgroup_rstat_cpu_lock helpers and tracepoints
bisection log: https://syzkaller.appspo
On 5/3/24 17:10, David Laight wrote:
From: Waiman Long
Sent: 03 May 2024 17:00
...
David,
Could you respin the series based on the latest upstream code?
I've just reapplied the patches to 'master' and they all apply
cleanly and diffing the new patches to the old ones gives
On 12/31/23 23:14, Waiman Long wrote:
On 12/31/23 16:55, David Laight wrote:
per_cpu_ptr() indexes __per_cpu_offset[] with the cpu number.
This requires the cpu number be 64bit.
However the value in osq_lock() comes from a 32bit xchg() and there
isn't a way of telling gcc the high bit
On 3/20/24 12:20, Steven Rostedt wrote:
From: Steven Rostedt (Google)
I'm debugging some latency issues on a Chromebook and the preemptirqsoff
tracer hit this:
# tracer: preemptirqsoff
#
# preemptirqsoff latency trace v1.1.5 on 5.15.148-21853-g165fd2387469-dirty
# -
_ptr(&osq_node, encoded_cpu_val - 1);
}
/*
You really like micro-optimization.
Anyway,
Reviewed-by: Waiman Long
return true;
- node->prev_cpu = prev_cpu;
+ node = this_cpu_ptr(&osq_node);
prev = decode_cpu(prev_cpu);
+ node->prev_cpu = prev_cpu;
node->locked = 0;
/*
Reviewed-by: Waiman Long
prev = decode_cpu(prev_cpu);
+ }
Just a minor nit. It is not that common in the kernel to add another
nesting level just to reduce the scope of the new_prev_cpu auto variable.
Anyway,
Reviewed-by: Waiman Long
ode->prev_cpu) -
1)))
return true;
/* unqueue */
@@ -201,6 +198,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
* it will wait in Step-A.
*/
+ WRITE_ONCE(next->prev_cpu, prev->cpu);
WRITE_ONCE(next->prev, prev);
WRITE_ONCE(prev->next, next);
Reviewed-by: Waiman Long
Reviewed-by: Waiman Long
rev;
Reviewed-by: Waiman Long
On 12/30/23 06:35, David Laight wrote:
From: Ingo Molnar
Sent: 30 December 2023 11:09
* Waiman Long wrote:
On 12/29/23 15:57, David Laight wrote:
this_cpu_ptr() is rather more expensive than raw_cpu_read() since
the latter can use an 'offset from register' (%gs for x86-64).
A
On 12/30/23 17:39, David Laight wrote:
From: Linus Torvalds
Sent: 30 December 2023 19:41
On Fri, 29 Dec 2023 at 12:52, David Laight wrote:
David Laight (5):
Move the definition of optimistic_spin_node into osq_lock.c
Clarify osq_wait_next()
I took these two as preparatory independent
On 12/29/23 22:13, Waiman Long wrote:
On 12/29/23 15:58, David Laight wrote:
The vcpu_is_preempted() test stops osq_lock() spinning if a virtual
cpu is no longer running.
Although patched out for bare-metal the code still needs the cpu number.
Reading this from 'prev->cpu' is
On 12/29/23 17:11, David Laight wrote:
osq_lock() starts by setting node->next to NULL and node->locked to 0.
Careful analysis shows that node->next is always NULL on entry.
node->locked is set non-zero by another cpu to force a wakeup.
This can only happen after the 'prev->next = node' assign
queue */
@@ -205,6 +202,7 @@ bool osq_lock(struct optimistic_spin_queue *lock)
* it will wait in Step-A.
*/
+ WRITE_ONCE(next->prev_cpu, prev->cpu - 1);
WRITE_ONCE(next->prev, prev);
WRITE_ONCE(prev->next, next);
Reviewed-by: Waiman Long
On 12/29/23 15:57, David Laight wrote:
this_cpu_ptr() is rather more expensive than raw_cpu_read() since
the latter can use an 'offset from register' (%gs for x86-64).
Add a 'self' field to 'struct optimistic_spin_node' that can be
read with raw_cpu_read() and initialised on first call.
Signed-off-
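A sketch of the idea under review; the field placement and the
node_self() helper are illustrative, not the posted patch:

struct optimistic_spin_node {
	struct optimistic_spin_node *next, *prev;
	struct optimistic_spin_node *self;	/* cached this_cpu_ptr(&osq_node) */
	int locked;
	int cpu;
};

static DEFINE_PER_CPU_SHARED_ALIGNED(struct optimistic_spin_node, osq_node);

/* Illustrative helper: cheap raw_cpu_read() lookup, lazily initialised. */
static struct optimistic_spin_node *node_self(void)
{
	struct optimistic_spin_node *node = raw_cpu_read(osq_node.self);

	if (unlikely(!node)) {			/* first use on this cpu */
		node = this_cpu_ptr(&osq_node);
		raw_cpu_write(osq_node.self, node);
	}
	return node;
}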
On 12/29/23 15:56, David Laight wrote:
osq_wait_next() is passed 'prev' from osq_lock() and NULL from osq_unlock()
but only needs the 'cpu' value to write to lock->tail.
Just pass prev->cpu or OSQ_UNLOCKED_VAL instead.
Also directly return NULL or 'next' instead of breaking the loop.
Should h
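Based on that description, the reworked helper might read roughly as
follows (an abridged sketch, not the posted diff):

static inline struct optimistic_spin_node *
osq_wait_next(struct optimistic_spin_queue *lock,
	      struct optimistic_spin_node *node, int old_cpu)
{
	int curr = encode_cpu(smp_processor_id());

	for (;;) {
		/*
		 * If we are the tail, restore old_cpu (prev->cpu or
		 * OSQ_UNLOCKED_VAL) and report an empty queue.
		 */
		if (atomic_read(&lock->tail) == curr &&
		    atomic_cmpxchg_acquire(&lock->tail, curr, old_cpu) == curr)
			return NULL;

		/* Otherwise wait for the successor and hand it back. */
		if (node->next) {
			struct optimistic_spin_node *next;

			next = xchg(&node->next, NULL);
			if (next)
				return next;
		}

		cpu_relax();
	}
}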
c".
After the fix, you can add
Acked-by: Waiman Long
On 4/13/22 02:01, Dan Williams wrote:
Changes since v1 [1]:
- Improve the clarity of the cover letter and changelogs of the
major patches (Patch2 and Patch12) (Pierre, Kevin, and Dave)
- Fix device_lock_interruptible() false negative deadlock detection
(Kevin)
- Fix off-by-one error in the
calls to __mod_objcg_state()
by more than 80%.
Signed-off-by: Waiman Long
Reviewed-by: Shakeel Butt
---
mm/memcontrol.c | 86 +++--
1 file changed, 83 insertions(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 7cd7187a017c
(cgroup v2).
Signed-off-by: Waiman Long
---
mm/memcontrol.c | 20 ++--
1 file changed, 14 insertions(+), 6 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index 292b4783b1a7..2f87d0b05092 100644
--- a/mm/memcontrol.c
+++ b/mm/memcontrol.c
@@ -3153,10 +3153,12
only modest.
Patch 4 helps in cgroup v2, but performs worse in cgroup v1 as
eliminating the irq_disable/irq_enable overhead seems to aggravate the
cacheline contention.
[1] https://lore.kernel.org/linux-mm/20210408193948.vfktg3azh2wrt56t@gabell/T/#u
[2] https://lore.kernel.org/lkml/20210114025151.
The mod_objcg_state() function is moved from mm/slab.h to mm/memcontrol.c
so that further optimization can be done to it in later patches without
exposing unnecessary details to other mm components.
Signed-off-by: Waiman Long
Acked-by: Johannes Weiner
---
mm/memcontrol.c | 13 +
mm
price to pay for better performance.
Signed-off-by: Waiman Long
Acked-by: Roman Gushchin
Reviewed-by: Shakeel Butt
---
mm/memcontrol.c | 94 +++--
1 file changed, 68 insertions(+), 26 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index
On 4/19/21 12:55 PM, Johannes Weiner wrote:
On Sun, Apr 18, 2021 at 08:00:31PM -0400, Waiman Long wrote:
Currently, the object stock structure caches either reclaimable vmstat
bytes or unreclaimable vmstat bytes, but not both at once. The
hit rate can be improved if both types of vmstat
On 4/19/21 12:38 PM, Johannes Weiner wrote:
On Sun, Apr 18, 2021 at 08:00:29PM -0400, Waiman Long wrote:
Before the new slab memory controller with per object byte charging,
charging and vmstat data update happen only when new slab pages are
allocated or freed. Now they are done with every
On 4/19/21 5:11 PM, Johannes Weiner wrote:
BTW, have you ever thought of moving the cgroup-v1 specific functions out
into a separate memcontrol-v1.c file just like kernel/cgroup/cgroup-v1.c?
I thought of that before, but memcontrol.c is a frequently changed file and
so a bit hard to do.
I hav
On 4/19/21 1:19 PM, Waiman Long wrote:
On 4/19/21 1:13 PM, Johannes Weiner wrote:
On Mon, Apr 19, 2021 at 12:18:29PM -0400, Waiman Long wrote:
On 4/19/21 11:21 AM, Waiman Long wrote:
On 4/19/21 11:14 AM, Johannes Weiner wrote:
On Sun, Apr 18, 2021 at 08:00:28PM -0400, Waiman Long wrote:
The
On 4/19/21 1:13 PM, Johannes Weiner wrote:
On Mon, Apr 19, 2021 at 12:18:29PM -0400, Waiman Long wrote:
On 4/19/21 11:21 AM, Waiman Long wrote:
On 4/19/21 11:14 AM, Johannes Weiner wrote:
On Sun, Apr 18, 2021 at 08:00:28PM -0400, Waiman Long wrote:
The mod_objcg_state() function is moved
On 4/19/21 11:21 AM, Waiman Long wrote:
On 4/19/21 11:14 AM, Johannes Weiner wrote:
On Sun, Apr 18, 2021 at 08:00:28PM -0400, Waiman Long wrote:
The mod_objcg_state() function is moved from mm/slab.h to
mm/memcontrol.c
so that further optimization can be done to it in later patches without
On 4/19/21 2:06 AM, Muchun Song wrote:
On Mon, Apr 19, 2021 at 8:01 AM Waiman Long wrote:
There are two issues with the current refill_obj_stock() code. First of
all, when nr_bytes reaches over PAGE_SIZE, it calls drain_obj_stock() to
atomically flush out remaining bytes to obj_cgroup, clear
On 4/19/21 11:14 AM, Johannes Weiner wrote:
On Sun, Apr 18, 2021 at 08:00:28PM -0400, Waiman Long wrote:
The mod_objcg_state() function is moved from mm/slab.h to mm/memcontrol.c
so that further optimization can be done to it in later patches without
exposing unnecessary details to other mm
On 4/19/21 11:00 AM, Shakeel Butt wrote:
On Sun, Apr 18, 2021 at 11:07 PM Muchun Song wrote:
On Mon, Apr 19, 2021 at 8:01 AM Waiman Long wrote:
There are two issues with the current refill_obj_stock() code. First of
all, when nr_bytes reaches over PAGE_SIZE, it calls drain_obj_stock() to
tup. However,
the miss rate for parallel kernel build remained about the same probably
because most of the touched kmemcache objects were reclaimable inodes
and dentries.
Signed-off-by: Waiman Long
---
mm/memcontrol.c | 79 +++--
1 file changed,
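The layout change being discussed amounts to something like the
following (field names approximate):

/*
 * Approximate sketch: keep separate cached byte counts for the two
 * slab vmstat types so that alternating between reclaimable and
 * unreclaimable objects no longer forces a flush on every switch.
 */
struct obj_stock {
	struct obj_cgroup *cached_objcg;
	struct pglist_data *cached_pgdat;
	unsigned int nr_bytes;
	int nr_slab_reclaimable_b;	/* NR_SLAB_RECLAIMABLE_B delta */
	int nr_slab_unreclaimable_b;	/* NR_SLAB_UNRECLAIMABLE_B delta */
};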
,
a new overfill flag is added to refill_obj_stock() which will be set
when called from obj_cgroup_charge().
Signed-off-by: Waiman Long
---
mm/memcontrol.c | 23 +--
1 file changed, 17 insertions(+), 6 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index
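A hypothetical sketch of that flag (helper names illustrative; objcg
switching and refcounting omitted):

static void refill_obj_stock(struct obj_cgroup *objcg,
			     unsigned int nr_bytes, bool overfill)
{
	struct obj_stock *stock = get_obj_stock();	/* illustrative */

	if (stock->cached_objcg != objcg)
		drain_obj_stock(stock);	/* flush bytes of the old objcg */

	stock->nr_bytes += nr_bytes;

	/*
	 * Only the uncharge path trims the stock; the charge path
	 * (overfill == true) may leave more than PAGE_SIZE cached.
	 */
	if (!overfill && stock->nr_bytes > PAGE_SIZE)
		drain_obj_stock(stock);
}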
as this code path isn't being exercised. The
large object test, however, sees a pretty good performance improvement
with this patch.
[1] https://lore.kernel.org/linux-mm/20210408193948.vfktg3azh2wrt56t@gabell/T/#u
[2] https://lore.kernel.org/lkml/20210114025151.GA22932@xsang-OptiPlex-9020/
calls to __mod_objcg_state()
by more than 80%.
Signed-off-by: Waiman Long
Reviewed-by: Shakeel Butt
---
mm/memcontrol.c | 64 ++---
1 file changed, 61 insertions(+), 3 deletions(-)
diff --git a/mm/memcontrol.c b/mm/memcontrol.c
index dc9032f28f2e
The mod_objcg_state() function is moved from mm/slab.h to mm/memcontrol.c
so that further optimization can be done to it in later patches without
exposing unnecessary details to other mm components.
Signed-off-by: Waiman Long
---
mm/memcontrol.c | 13 +
mm/slab.h | 16
printed if truncation may have
happened. The cgroup path name is provided for informational purposes
only, so occasional path name truncation should not be a big problem.
Fixes: efe25c2c7b3a ("sched: Reinstate group names in /proc/sched_debug")
Suggested-by: Peter Zijlstra
Signed-off-by:
On 4/15/21 3:40 PM, Johannes Weiner wrote:
On Thu, Apr 15, 2021 at 02:47:31PM -0400, Waiman Long wrote:
On 4/15/21 2:10 PM, Johannes Weiner wrote:
On Thu, Apr 15, 2021 at 12:35:45PM -0400, Waiman Long wrote:
On 4/15/21 12:30 PM, Johannes Weiner wrote:
On Tue, Apr 13, 2021 at 09:20:24PM -0400
On 4/15/21 2:53 PM, Johannes Weiner wrote:
On Thu, Apr 15, 2021 at 02:16:17PM -0400, Waiman Long wrote:
On 4/15/21 1:53 PM, Johannes Weiner wrote:
On Tue, Apr 13, 2021 at 09:20:27PM -0400, Waiman Long wrote:
Most kmem_cache_alloc() calls are from user context. With instrumentation
enabled
On 4/15/21 2:10 PM, Johannes Weiner wrote:
On Thu, Apr 15, 2021 at 12:35:45PM -0400, Waiman Long wrote:
On 4/15/21 12:30 PM, Johannes Weiner wrote:
On Tue, Apr 13, 2021 at 09:20:24PM -0400, Waiman Long wrote:
In memcg_slab_free_hook()/pcpu_memcg_free_hook(), obj_cgroup_uncharge()
is followed
On 4/15/21 1:53 PM, Johannes Weiner wrote:
On Tue, Apr 13, 2021 at 09:20:27PM -0400, Waiman Long wrote:
Most kmem_cache_alloc() calls are from user context. With instrumentation
enabled, the measured number of kmem_cache_alloc() calls from non-task
context was about 0.01% of the total.
The irq
- } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING,
+ atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
+ } while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
_QW_LOCKED) != _QW_WAITING);
unlock:
arch_spin_unlock(&lock->wait_lock);
Acked-by: Waiman Long
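For context, the slowpath around this hunk reads roughly as follows
after the fix (reconstructed and abridged; check the tree for the
authoritative version):

void queued_write_lock_slowpath(struct qrwlock *lock)
{
	int cnts;

	/* Put the writer into the wait queue */
	arch_spin_lock(&lock->wait_lock);

	/* Try to acquire the lock directly if no reader is present */
	if (!(cnts = atomic_read(&lock->cnts)) &&
	    atomic_try_cmpxchg_acquire(&lock->cnts, &cnts, _QW_LOCKED))
		goto unlock;

	/* Set the waiting flag to notify readers that a writer is pending */
	atomic_or(_QW_WAITING, &lock->cnts);

	/* When no more readers or writers, set the locked flag */
	do {
		atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING);
	} while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING,
					_QW_LOCKED) != _QW_WAITING);
unlock:
	arch_spin_unlock(&lock->wait_lock);
}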
On 4/15/21 1:10 PM, Matthew Wilcox wrote:
On Tue, Apr 13, 2021 at 09:20:22PM -0400, Waiman Long wrote:
With memory accounting disabled, the run time was 2.848s. With memory
accounting enabled, the run times with the application of various
patches in the patchset were:
Applied patches Run
On 4/15/21 12:50 PM, Johannes Weiner wrote:
On Tue, Apr 13, 2021 at 09:20:25PM -0400, Waiman Long wrote:
Before the new slab memory controller with per object byte charging,
charging and vmstat data update happen only when new slab pages are
allocated or freed. Now they are done with every
On 4/15/21 12:40 PM, Johannes Weiner wrote:
On Tue, Apr 13, 2021 at 09:20:23PM -0400, Waiman Long wrote:
The caller of mod_memcg_lruvec_state() has both memcg and lruvec readily
available. So both of them are now passed to mod_memcg_lruvec_state()
and __mod_memcg_lruvec_state(). The
On 4/15/21 12:45 PM, Will Deacon wrote:
With that in mind, it would probably be a good idea to eyeball the qspinlock
slowpath as well, as that uses both atomic_cond_read_acquire() and
atomic_try_cmpxchg_relaxed().
It seems plausible that the same thing could occur here in qspinlock: