Re: [PATCH 02/12] locking/rtmutex: Move max_lock_depth into rtmutex.c

2025-05-09 Thread Waiman Long
= &max_lock_depth, - .maxlen = sizeof(int), - .mode = 0644, - .proc_handler = proc_dointvec, - }, -#endif #ifdef CONFIG_TREE_RCU { .procname = "panic_on_rcu_stall", Acked-by: Waiman Long

[PATCH v8 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection()

2025-05-01 Thread Waiman Long
runs. However, these tests may still fail once in a while if the memory usage goes beyond the newly extended range. Signed-off-by: Waiman Long --- tools/testing/selftests/cgroup/test_memcontrol.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests

[PATCH v8 1/2] selftests: memcg: Allow low event with no memory.low and memory_recursiveprot on

2025-05-01 Thread Waiman Long
side of the expected ranges. Suggested-by: Michal Koutný Signed-off-by: Waiman Long --- .../testing/selftests/cgroup/test_memcontrol.c | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/test

[PATCH v8 0/2] memcg: Fix test_memcg_min/low test failures

2025-05-01 Thread Waiman Long
(with memory_recursiveprot enabled) and sporadically fails its test_memcg_min sub-test. This patchset fixes the test_memcg_min and test_memcg_low failures by adjusting the test_memcontrol selftest to fix these test failures. Waiman Long (2): selftests: memcg: Allow low event with no memory.lo

Re: [PATCH v7 1/2] selftests: memcg: Allow low event with no memory.low and memory_recursiveprot on

2025-04-23 Thread Waiman Long
On 4/23/25 12:49 PM, Michal Koutný wrote: On Tue, Apr 22, 2025 at 07:58:56PM -0400, Waiman Long wrote: Am I correct to assume that the purpose of 1d09069f5313f ("selftests: memcg: expect no low events in unprotected sibling") is to force a failure in the test_memcg_low test to forc

Re: [PATCH v7 1/2] selftests: memcg: Allow low event with no memory.low and memory_recursiveprot on

2025-04-20 Thread Waiman Long
On 4/16/25 5:25 AM, Michal Koutný wrote: On Tue, Apr 15, 2025 at 05:04:14PM -0400, Waiman Long wrote: + /* +* Child 2 has memory.low=0, but some low protection is still being +* distributed down from its parent with memory.low=50M if cgroup2

[PATCH v7 1/2] selftests: memcg: Allow low event with no memory.low and memory_recursiveprot on

2025-04-15 Thread Waiman Long
() returns true. If we ever change mem_cgroup_below_min() in such a way that it no longer skips the no usage case, we will have to add code to explicitly skip it. Suggested-by: Michal Koutný Signed-off-by: Waiman Long --- tools/testing/selftests/cgroup/test_memcontrol.c | 16 +++- 1 file

[PATCH v7 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection()

2025-04-15 Thread Waiman Long
runs. However, these tests may still fail once in a while if the memory usage goes beyond the newly extended range. Signed-off-by: Waiman Long --- tools/testing/selftests/cgroup/test_memcontrol.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests

[PATCH v7 0/2] memcg: Fix test_memcg_min/low test failures

2025-04-15 Thread Waiman Long
fails its test_memcg_min sub-test. This patchset fixes the test_memcg_min and test_memcg_low failures by adjusting the test_memcontrol selftest to fix these test failures. Waiman Long (2): selftests: memcg: Allow low event with no memory.low and memory_recursiveprot on selftests: memcg: Inc

Re: [PATCH v6 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()

2025-04-14 Thread Waiman Long
On 4/14/25 8:42 AM, Michal Koutný wrote: On Sun, Apr 13, 2025 at 10:12:48PM -0400, Waiman Long wrote: 2) memory.low is set to a non-zero value but the cgroup has no task in it so that it has an effective low value of 0. Again it may have a non-zero low event count if memory reclaim

[PATCH v6 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()

2025-04-13 Thread Waiman Long
and test_memcg_min sub-tests may still fail occasionally if the memory.current values fall outside of the expected ranges. Suggested-by: Johannes Weiner Suggested-by: Michal Koutný Signed-off-by: Waiman Long --- mm/internal.h| 9 ++

[PATCH v6 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection()

2025-04-13 Thread Waiman Long
runs. However, these tests may still fail once in a while if the memory usage goes beyond the newly extended range. Signed-off-by: Waiman Long --- tools/testing/selftests/cgroup/test_memcontrol.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests

[PATCH v6 0/2] memcg: Fix test_memcg_min/low test failures

2025-04-13 Thread Waiman Long
es by skipping the !usage case in shrink_node_memcgs() and adjust the test_memcontrol selftest to fix other causes of the test failures. Waiman Long (2): mm/vmscan: Skip memcg with !usage in shrink_node_memcgs() selftests: memcg: Increase error tolerance of child memory.current che

Re: [PATCH v3 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection()

2025-04-12 Thread Waiman Long
On 4/8/25 6:22 PM, Roman Gushchin wrote: On Sat, Apr 05, 2025 at 10:40:10PM -0400, Waiman Long wrote: The test_memcg_protection() function is used for the test_memcg_min and test_memcg_low sub-tests. This function generates a set of parent/child cgroups like: parent: memory.min/low = 50M

Re: [PATCH v5 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection()

2025-04-11 Thread Waiman Long
On 4/11/25 1:22 PM, Michal Koutný wrote: On Mon, Apr 07, 2025 at 12:23:16PM -0400, Waiman Long wrote:
Child  Actual usage  Expected usage  %err
1      16990208      22020096        -12.9%
1      17252352      22020096
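
A note on the %err column quoted in this snippet: the first row only works out to -12.9% if the difference is divided by the sum of the two values, (16990208 - 22020096) / (16990208 + 22020096), rather than by the expected value alone (which would give about -22.8%). The sketch below rests on that arithmetic and is not code taken from the patch; err_pct_of_sum() and within_tolerance() are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Signed error of `actual` against `expected`, as a percentage of their
 * sum (not of `expected` alone). Assumption: this is the convention
 * behind the quoted %err column; it matches the first row's -12.9%.
 */
static double err_pct_of_sum(long actual, long expected)
{
	assert(actual + expected > 0);
	return 100.0 * (double)(actual - expected) /
	       (double)(actual + expected);
}

/*
 * Hypothetical tolerance check in the same spirit:
 * accept when |a - b| is at most `err` percent of (a + b).
 */
static int within_tolerance(long a, long b, int err)
{
	return 100 * labs(a - b) <= (a + b) * (long)err;
}
```

Under this reading, the first quoted row would fail a 10% tolerance and pass a 15% one, which fits the subject line's "increase error tolerance", though the actual thresholds used by the patch are not visible in the snippet.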

Re: [PATCH v5 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()

2025-04-11 Thread Waiman Long
On 4/11/25 1:11 PM, Michal Koutný wrote: Hello. On Mon, Apr 07, 2025 at 12:23:15PM -0400, Waiman Long wrote: --- a/mm/memcontrol-v1.h +++ b/mm/memcontrol-v1.h @@ -22,8 +22,6 @@ iter != NULL; \ iter = mem_cgroup_iter(NULL, iter, NULL

Re: [PATCH v4 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()

2025-04-11 Thread Waiman Long
On 4/7/25 11:25 AM, Michal Koutný wrote: Hi Waiman. On Sun, Apr 06, 2025 at 09:41:58PM -0400, Waiman Long wrote: ... diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c index 16f5d74ae762..bab826b6b7b0 100644 --- a/tools/testing

[PATCH v5 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()

2025-04-09 Thread Waiman Long
Johannes Weiner Signed-off-by: Waiman Long --- mm/internal.h| 9 + mm/memcontrol-v1.h | 2 -- mm/vmscan.c | 4 tools/testing/selftests/cgroup/test_memcontrol.c | 7 ++- 4 files

[PATCH v5 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection()

2025-04-07 Thread Waiman Long
runs. Signed-off-by: Waiman Long --- tools/testing/selftests/cgroup/test_memcontrol.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c index bab826b6b7b0..8f4f2479650e

[PATCH v5 0/2] memcg: Fix test_memcg_min/low test failures

2025-04-07 Thread Waiman Long
oradically fails its test_memcg_min sub-test. This patchset fixes the test_memcg_min and test_memcg_low failures by skipping the !usage case in shrink_node_memcgs() and adjust the test_memcontrol selftest to fix other causes of the test failures. Waiman Long (2): mm/vmscan: Skip memcg with

Re: [PATCH v4 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()

2025-04-07 Thread Waiman Long
On 4/7/25 10:24 AM, Johannes Weiner wrote: On Sun, Apr 06, 2025 at 09:41:58PM -0400, Waiman Long wrote: The test_memcontrol selftest consistently fails its test_memcg_low sub-test due to the fact that two of its test child cgroups which have a memory.low of 0 or an effective memory.low of 0

[PATCH v4 0/2] memcg: Fix test_memcg_min/low test failures

2025-04-06 Thread Waiman Long
ses of the test failures. Note that I decided not to use the suggested mem_cgroup_usage() call as it is a real function call defined in mm/memcontrol.c to be used mainly by cgroup v1 code. Waiman Long (2): mm/vmscan: Skip memcg with !usage in shrink_node_memcgs() selftests: memcg: Increase err

[PATCH v4 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection()

2025-04-06 Thread Waiman Long
runs. Signed-off-by: Waiman Long --- tools/testing/selftests/cgroup/test_memcontrol.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c index bab826b6b7b0..8f4f2479650e

[PATCH v4 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()

2025-04-06 Thread Waiman Long
patch applied, the test_memcg_low sub-test finishes successfully without failure in most cases. Though both test_memcg_low and test_memcg_min sub-tests may still fail occasionally if the memory.current values fall outside of the expected ranges. Suggested-by: Johannes Weiner Signed

[PATCH v3 0/2] memcg: Fix test_memcg_min/low test failures

2025-04-05 Thread Waiman Long
in mm/memcontrol.c which is not available if CONFIG_MEMCG isn't defined. Waiman Long (2): mm/vmscan: Skip memcg with !usage in shrink_node_memcgs() selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection() mm/vmscan.c

[PATCH v3 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection()

2025-04-05 Thread Waiman Long
runs. Signed-off-by: Waiman Long --- tools/testing/selftests/cgroup/test_memcontrol.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c index bab826b6b7b0..8f4f2479650e

[PATCH v3 1/2] mm/vmscan: Skip memcg with !usage in shrink_node_memcgs()

2025-04-05 Thread Waiman Long
test_memcg_min sub-tests may still fail occasionally if the memory.current values fall outside of the expected ranges. Suggested-by: Johannes Weiner Signed-off-by: Waiman Long --- mm/vmscan.c | 4 tools/testing/selftests/cgroup/test_memcontrol.c | 7

Re: [PATCH v2 1/2] memcg: Don't generate low/min events if either low/min or elow/emin is 0

2025-04-05 Thread Waiman Long
On 4/4/25 3:38 PM, Johannes Weiner wrote: On Fri, Apr 04, 2025 at 02:55:35PM -0400, Waiman Long wrote: On 4/4/25 2:13 PM, Johannes Weiner wrote: * Waiman points out that the weirdness is seeing low events without having a low configured. Eh, this isn't really true with recu

Re: [PATCH 07/10] cgroup/cpuset: Remove unneeded goto in sched_partition_write() and rename it

2025-04-05 Thread Waiman Long
On 4/3/25 9:33 AM, Michal Koutný wrote: On Sun, Mar 30, 2025 at 05:52:45PM -0400, Waiman Long wrote: The goto statement in sched_partition_write() is not needed. Remove it and rename sched_partition_write()/sched_partition_show() to cpuset_partition_write()/cpuset_partition_show(). Signed

Re: [PATCH v2 1/2] memcg: Don't generate low/min events if either low/min or elow/emin is 0

2025-04-04 Thread Waiman Long
On 4/4/25 2:13 PM, Johannes Weiner wrote: On Fri, Apr 04, 2025 at 01:25:33PM -0400, Waiman Long wrote: On 4/4/25 1:12 PM, Tejun Heo wrote: Hello, On Thu, Apr 03, 2025 at 09:24:34PM -0400, Waiman Long wrote: ... The simple and naive fix of changing the operator to ">", however

Re: [PATCH v2 1/2] memcg: Don't generate low/min events if either low/min or elow/emin is 0

2025-04-04 Thread Waiman Long
On 4/4/25 2:26 PM, Michal Koutný wrote: Hello Waiman. On Thu, Apr 03, 2025 at 09:24:34PM -0400, Waiman Long wrote: 1) memory.low is set to 0, but low events can still be triggered and so the cgroup may have a non-zero low event count. I doubt users are looking for that as they

[PATCH 08/10] selftest/cgroup: Update test_cpuset_prs.sh to use | as effective CPUs and state separator

2025-04-04 Thread Waiman Long
',' can appear as part of the expected values. Signed-off-by: Waiman Long --- .../selftests/cgroup/test_cpuset_prs.sh | 236 +- 1 file changed, 118 insertions(+), 118 deletions(-) diff --git a/tools/testing/selftests/cgroup/test_cpuset_prs.sh b/tools/testing/s

Re: [PATCH v2 1/2] memcg: Don't generate low/min events if either low/min or elow/emin is 0

2025-04-04 Thread Waiman Long
On 4/4/25 1:12 PM, Tejun Heo wrote: Hello, On Thu, Apr 03, 2025 at 09:24:34PM -0400, Waiman Long wrote: ... The simple and naive fix of changing the operator to ">", however, changes the memory reclaim behavior which can lead to other failures as low events are needed to faci

[PATCH 01/10] cgroup/cpuset: Fix race between newly created partition and dying one

2025-04-04 Thread Waiman Long
e ("cpuset: Add new v2 cpuset.sched.partition flag") Signed-off-by: Waiman Long --- include/linux/cgroup-defs.h | 1 + include/linux/cgroup.h | 2 +- kernel/cgroup/cgroup.c | 6 ++ kernel/cgroup/cpuset.c | 20 +--- 4 files changed, 25 insertions(+), 4 d

[PATCH v2 1/2] memcg: Don't generate low/min events if either low/min or elow/emin is 0

2025-04-03 Thread Waiman Long
cases above with low replaced by min. Signed-off-by: Waiman Long --- include/linux/memcontrol.h | 18 ++ 1 file changed, 14 insertions(+), 4 deletions(-) diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h index 53364526d877..4d4a1f159eaa 100644 --- a/include/li

[PATCH v2 2/2] selftests: memcg: Increase error tolerance of child memory.current check in test_memcg_protection()

2025-04-03 Thread Waiman Long
runs. Signed-off-by: Waiman Long --- tools/testing/selftests/cgroup/test_memcontrol.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/cgroup/test_memcontrol.c b/tools/testing/selftests/cgroup/test_memcontrol.c index 16f5d74ae762..f442c0c3f5a7

Re: [PATCH 05/10] cgroup/cpuset: Don't allow creation of local partition over a remote one

2025-04-03 Thread Waiman Long
On 4/3/25 9:33 AM, Michal Koutný wrote: On Sun, Mar 30, 2025 at 05:52:43PM -0400, Waiman Long wrote: Currently, we don't allow the creation of a remote partition underneath another local or remote partition. However, it is currently possible to create a new local partition with an exi

Re: [PATCH 01/10] cgroup/cpuset: Fix race between newly created partition and dying one

2025-04-01 Thread Waiman Long
On 4/1/25 4:41 PM, Waiman Long wrote: On 4/1/25 3:59 PM, Tejun Heo wrote: Hello, Waiman. On Mon, Mar 31, 2025 at 11:12:06PM -0400, Waiman Long wrote: The problem is the RCU delay between the time a cgroup is killed and is in a dying state and when the partition is deactivated when

Re: [PATCH 01/10] cgroup/cpuset: Fix race between newly created partition and dying one

2025-04-01 Thread Waiman Long
On 4/1/25 3:59 PM, Tejun Heo wrote: Hello, Waiman. On Mon, Mar 31, 2025 at 11:12:06PM -0400, Waiman Long wrote: The problem is the RCU delay between the time a cgroup is killed and is in a dying state and when the partition is deactivated when cpuset_css_offline() is called. That delay can

Re: [PATCH 01/10] cgroup/cpuset: Fix race between newly created partition and dying one

2025-03-31 Thread Waiman Long
On 3/31/25 7:13 PM, Tejun Heo wrote: Hello, On Sun, Mar 30, 2025 at 05:52:39PM -0400, Waiman Long wrote: ... One possible way to fix this is to iterate the dying cpusets as well and avoid using the exclusive CPUs in those dying cpusets. However, this can still cause random partition creation

[PATCH 09/10] selftest/cgroup: Clean up and restructure test_cpuset_prs.sh

2025-03-30 Thread Waiman Long
Cleaning up the test_cpuset_prs.sh script and restructure some of the functions so that a new test matrix with a different cgroup directory structure can be added in the next patch. Signed-off-by: Waiman Long --- .../selftests/cgroup/test_cpuset_prs.sh | 257 +++--- 1 file

[PATCH 07/10] cgroup/cpuset: Remove unneeded goto in sched_partition_write() and rename it

2025-03-30 Thread Waiman Long
The goto statement in sched_partition_write() is not needed. Remove it and rename sched_partition_write()/sched_partition_show() to cpuset_partition_write()/cpuset_partition_show(). Signed-off-by: Waiman Long --- kernel/cgroup/cpuset.c | 15 ++- 1 file changed, 6 insertions(+), 9

[PATCH 10/10] selftest/cgroup: Add a remote partition transition test to test_cpuset_prs.sh

2025-03-30 Thread Waiman Long
new set of remote partition tests REMOTE_TEST_MATRIX with another cgroup directory structure more tailored for remote partition testing to provide better code coverage. Also add a few new test cases as well as adjusting existing ones for the original TEST_MATRIX. Signed-off-by: Waiman Long

[PATCH 06/10] cgroup/cpuset: Code cleanup and comment update

2025-03-30 Thread Waiman Long
expected. Signed-off-by: Waiman Long --- kernel/cgroup/cpuset.c | 61 ++ 1 file changed, 38 insertions(+), 23 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index f26f791e9323..aa529b2dbf56 100644 --- a/kernel/cgroup/cpuset.c +++ b

[PATCH 04/10] cgroup/cpuset: Remove remote_partition_check() & make update_cpumasks_hier() handle remote partition

2025-03-30 Thread Waiman Long
partition root is being handled (updated instead of invalidation) in update_cpumasks_hier(). Signed-off-by: Waiman Long --- kernel/cgroup/cpuset.c| 258 ++ .../selftests/cgroup/test_cpuset_prs.sh | 4 +- 2 files changed, 143 insertions(+), 119 deletions

[PATCH 05/10] cgroup/cpuset: Don't allow creation of local partition over a remote one

2025-03-30 Thread Waiman Long
l. So forbid that by making sure that exclusive_cpus mask doesn't overlap with subpartitions_cpus and invalidate the partition if that happens. Signed-off-by: Waiman Long --- kernel/cgroup/cpuset-internal.h | 1 + kernel/cgroup/cpuset.c | 14 ++ 2 files changed, 15

[PATCH 02/10] cgroup/cpuset: Fix incorrect isolated_cpus update in update_parent_effective_cpumask()

2025-03-30 Thread Waiman Long
rtition. Fixes: 11e5f407b64a ("cgroup/cpuset: Keep track of CPUs in isolated partitions") Signed-off-by: Waiman Long --- kernel/cgroup/cpuset.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index 749994312d47..

[PATCH 03/10] cgroup/cpuset: Fix error handling in remote_partition_disable()

2025-03-30 Thread Waiman Long
error. Fixes: 181c8e091aae ("cgroup/cpuset: Introduce remote partition") Signed-off-by: Waiman Long --- kernel/cgroup/cpuset.c | 29 - 1 file changed, 20 insertions(+), 9 deletions(-) diff --git a/kernel/cgroup/cpuset.c b/kernel/cgroup/cpuset.c index a4

[PATCH 00/10] cgroup/cpuset: Miscellaneous partition bug fixes and enhancements

2025-03-30 Thread Waiman Long
This patch series fixes a number of bugs in the cpuset partition code as well as improvement in remote partition handling. The test_cpuset_prs.sh is also enhanced to allow more vigorous remote partition testing. Waiman Long (10): cgroup/cpuset: Fix race between newly created partition and dying

[PATCH] cgroup/cpuset: Prevent leakage of isolated CPUs into sched domains

2024-12-05 Thread Waiman Long
he way the boot time isolated CPUs are handled in test_cpuset_prs.sh to make sure that those isolated CPUs are really isolated instead of just skipping them in the tests. Fixes: ccac8e8de99c ("cgroup/cpuset: Fix remote root partition creation problem") Signed-off-by: Waiman Long --- ke

Re: [PATCH v2] kasan: Make kasan_record_aux_stack_noalloc() the default behaviour

2024-11-22 Thread Waiman Long
oc() behaviour default as kasan_record_aux_stack(). [bigeasy: Dressed the diff as patch. ] Reported-by: syzbot+39f85d612b7c20d8d...@syzkaller.appspotmail.com Closes: https://lore.kernel.org/all/67275485.050a0220.3c8d68.0a37....@google.com Acked-by: Waiman Long Reviewed-by: Andrey Konovalov Review

Re: [PATCH] kasan: Remove kasan_record_aux_stack_noalloc().

2024-11-19 Thread Waiman Long
eric: introduce kasan_record_aux_stack_noalloc()") there. Right now task_work_add() is the only caller of kasan_record_aux_stack(). So it essentially make all its callers use the noalloc version of kasan_record_aux_stack(). Acked-by: Waiman Long include/linux/kasan.h |

Re: [syzbot] [mm?] possible deadlock in __mmap_lock_do_trace_start_locking

2024-06-16 Thread Waiman Long
On 6/16/24 10:05, syzbot wrote: syzbot has bisected this issue to: commit 21c38a3bd4ee3fb7337d013a638302fb5e5f9dc2 Author: Jesper Dangaard Brouer Date: Wed May 1 14:04:11 2024 + cgroup/rstat: add cgroup_rstat_cpu_lock helpers and tracepoints bisection log: https://syzkaller.appspo

Re: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().

2024-05-03 Thread Waiman Long
On 5/3/24 17:10, David Laight wrote: From: Waiman Long Sent: 03 May 2024 17:00 ... David, Could you respin the series based on the latest upstream code? I've just reapplied the patches to 'master' and they all apply cleanly and diffing the new patches to the old ones gives

Re: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().

2024-05-03 Thread Waiman Long
On 12/31/23 23:14, Waiman Long wrote: On 12/31/23 16:55, David Laight wrote: per_cpu_ptr() indexes __per_cpu_offset[] with the cpu number. This requires the cpu number be 64bit. However the value is osq_lock() comes from a 32bit xchg() and there isn't a way of telling gcc the high bit

Re: [RFC][PATCH] tracing: Introduce restart_critical_timings()

2024-03-20 Thread Waiman Long
On 3/20/24 12:20, Steven Rostedt wrote: From: Steven Rostedt (Google) I'm debugging some latency issues on a Chromebook and the preemptirqsoff tracer hit this: # tracer: preemptirqsoff # # preemptirqsoff latency trace v1.1.5 on 5.15.148-21853-g165fd2387469-dirty # -

Re: [PATCH next v2 5/5] locking/osq_lock: Optimise decode_cpu() and per_cpu_ptr().

2023-12-31 Thread Waiman Long
_ptr(&osq_node, encoded_cpu_val - 1); } /* You really like micro-optimization. Anyway, Reviewed-by: Waiman Long

Re: [PATCH next v2 4/5] locking/osq_lock: Avoid writing to node->next in the osq_lock() fast path.

2023-12-31 Thread Waiman Long
return true; - node->prev_cpu = prev_cpu; + node = this_cpu_ptr(&osq_node); prev = decode_cpu(prev_cpu); + node->prev_cpu = prev_cpu; node->locked = 0; /* Reviewed-by: Waiman Long

Re: [PATCH next v2 3/5] locking/osq_lock: Use node->prev_cpu instead of saving node->prev.

2023-12-31 Thread Waiman Long
prev = decode_cpu(prev_cpu); + } Just a minor nit. It is not that common in the kernel to add another nesting level just to reduce the scope of new_prev_cpu auto variable. Anyway, Reviewed-by: Waiman Long

Re: [PATCH next v2 2/5] locking/osq_lock: Optimise the vcpu_is_preempted() check.

2023-12-31 Thread Waiman Long
ode->prev_cpu) - 1))) return true; /* unqueue */ @@ -201,6 +198,7 @@ bool osq_lock(struct optimistic_spin_queue *lock) * it will wait in Step-A. */ + WRITE_ONCE(next->prev_cpu, prev->cpu); WRITE_ONCE(next->prev, prev); WRITE_ONCE(prev->next, next); Reviewed-by: Waiman Long

Re: [PATCH next v2 1/5] locking/osq_lock: Defer clearing node->locked until the slow osq_lock() path.

2023-12-31 Thread Waiman Long
rev; Reviewed-by: Waiman Long

Re: [PATCH next 4/5] locking/osq_lock: Optimise per-cpu data accesses.

2023-12-30 Thread Waiman Long
On 12/30/23 06:35, David Laight wrote: From: Ingo Molnar Sent: 30 December 2023 11:09 * Waiman Long wrote: On 12/29/23 15:57, David Laight wrote: this_cpu_ptr() is rather more expensive than raw_cpu_read() since the latter can use an 'offset from register' (%gs for x86-64). A

Re: [PATCH next 0/5] locking/osq_lock: Optimisations to osq_lock code

2023-12-30 Thread Waiman Long
On 12/30/23 17:39, David Laight wrote: From: Linus Torvalds Sent: 30 December 2023 19:41 On Fri, 29 Dec 2023 at 12:52, David Laight wrote: David Laight (5): Move the definition of optimistic_spin_node into osf_lock.c Clarify osq_wait_next() I took these two as preparatory independent

Re: [PATCH next 5/5] locking/osq_lock: Optimise vcpu_is_preempted() check.

2023-12-30 Thread Waiman Long
On 12/29/23 22:13, Waiman Long wrote: On 12/29/23 15:58, David Laight wrote: The vcpu_is_preempted() test stops osq_lock() spinning if a virtual cpu is no longer running. Although patched out for bare-metal the code still needs the cpu number. Reading this from 'prev->cpu' is

Re: [PATCH next 2/5] locking/osq_lock: Avoid dirtying the local cpu's 'node' in the osq_lock() fast path.

2023-12-29 Thread Waiman Long
On 12/29/23 17:11, David Laight wrote: osq_lock() starts by setting node->next to NULL and node->locked to 0. Careful analysis shows that node->next is always NULL on entry. node->locked is set non-zero by another cpu to force a wakeup. This can only happen after the 'prev->next = node' assign

Re: [PATCH next 5/5] locking/osq_lock: Optimise vcpu_is_preempted() check.

2023-12-29 Thread Waiman Long
queue */ @@ -205,6 +202,7 @@ bool osq_lock(struct optimistic_spin_queue *lock) * it will wait in Step-A. */ + WRITE_ONCE(next->prev_cpu, prev->cpu - 1); WRITE_ONCE(next->prev, prev); WRITE_ONCE(prev->next, next); Reviewed-by: Waiman Long

Re: [PATCH next 4/5] locking/osq_lock: Optimise per-cpu data accesses.

2023-12-29 Thread Waiman Long
On 12/29/23 15:57, David Laight wrote: this_cpu_ptr() is rather more expensive than raw_cpu_read() since the latter can use an 'offset from register' (%gs for x86-64). Add a 'self' field to 'struct optimistic_spin_node' that can be read with raw_cpu_read(), initialise on first call. Signed-off-

Re: [PATCH next 3/5] locking/osq_lock: Clarify osq_wait_next()

2023-12-29 Thread Waiman Long
On 12/29/23 15:56, David Laight wrote: osq_wait_next() is passed 'prev' from osq_lock() and NULL from osq_unlock() but only needs the 'cpu' value to write to lock->tail. Just pass prev->cpu or OSQ_UNLOCKED_VAL instead. Also directly return NULL or 'next' instead of breaking the loop. Should h

Re: [PATCH next 1/5] locking/osq_lock: Move the definition of optimistic_spin_node into osf_lock.c

2023-12-29 Thread Waiman Long
c". After the fix, you can add Acked-by: Waiman Long

Re: [PATCH v2 00/12] device-core: Enable device_lock() lockdep validation

2022-04-13 Thread Waiman Long
On 4/13/22 02:01, Dan Williams wrote: Changes since v1 [1]: - Improve the clarity of the cover letter and changelogs of the major patches (Patch2 and Patch12) (Pierre, Kevin, and Dave) - Fix device_lock_interruptible() false negative deadlock detection (Kevin) - Fix off-by-one error in the

[PATCH-next v5 2/4] mm/memcg: Cache vmstat data in percpu memcg_stock_pcp

2021-04-20 Thread Waiman Long
calls to __mod_objcg_state() by more than 80%. Signed-off-by: Waiman Long Reviewed-by: Shakeel Butt --- mm/memcontrol.c | 86 +++-- 1 file changed, 83 insertions(+), 3 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 7cd7187a017c

[PATCH-next v5 3/4] mm/memcg: Improve refill_obj_stock() performance

2021-04-20 Thread Waiman Long
(cgroup v2). Signed-off-by: Waiman Long --- mm/memcontrol.c | 20 ++-- 1 file changed, 14 insertions(+), 6 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index 292b4783b1a7..2f87d0b05092 100644 --- a/mm/memcontrol.c +++ b/mm/memcontrol.c @@ -3153,10 +3153,12

[PATCH-next v5 0/4] mm/memcg: Reduce kmemcache memory accounting overhead

2021-04-20 Thread Waiman Long
only modest. Patch 4 helps in cgroup v2, but performs worse in cgroup v1 as eliminating the irq_disable/irq_enable overhead seems to aggravate the cacheline contention. [1] https://lore.kernel.org/linux-mm/20210408193948.vfktg3azh2wrt56t@gabell/T/#u [2] https://lore.kernel.org/lkml/20210114025151.

[PATCH-next v5 1/4] mm/memcg: Move mod_objcg_state() to memcontrol.c

2021-04-20 Thread Waiman Long
The mod_objcg_state() function is moved from mm/slab.h to mm/memcontrol.c so that further optimization can be done to it in later patches without exposing unnecessary details to other mm components. Signed-off-by: Waiman Long Acked-by: Johannes Weiner --- mm/memcontrol.c | 13 + mm

[PATCH-next v5 4/4] mm/memcg: Optimize user context object stock access

2021-04-20 Thread Waiman Long
price to pay for better performance. Signed-off-by: Waiman Long Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt --- mm/memcontrol.c | 94 +++-- 1 file changed, 68 insertions(+), 26 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index

Re: [PATCH v4 4/5] mm/memcg: Save both reclaimable & unreclaimable bytes in object stock

2021-04-20 Thread Waiman Long
On 4/19/21 12:55 PM, Johannes Weiner wrote: On Sun, Apr 18, 2021 at 08:00:31PM -0400, Waiman Long wrote: Currently, the object stock structure caches either reclaimable vmstat bytes or unreclaimable vmstat bytes in its object stock structure. The hit rate can be improved if both types of vmstat

Re: [PATCH v4 2/5] mm/memcg: Cache vmstat data in percpu memcg_stock_pcp

2021-04-19 Thread Waiman Long
On 4/19/21 12:38 PM, Johannes Weiner wrote: On Sun, Apr 18, 2021 at 08:00:29PM -0400, Waiman Long wrote: Before the new slab memory controller with per object byte charging, charging and vmstat data update happen only when new slab pages are allocated or freed. Now they are done with every

Re: [PATCH v4 1/5] mm/memcg: Move mod_objcg_state() to memcontrol.c

2021-04-19 Thread Waiman Long
On 4/19/21 5:11 PM, Johannes Weiner wrote: BTW, have you ever thought of moving the cgroup-v1 specific functions out into a separate memcontrol-v1.c file just like kernel/cgroup/cgroup-v1.c? I thought of that before, but memcontrol.c is a frequently changed file and so a bit hard to do. I hav

Re: [PATCH v4 1/5] mm/memcg: Move mod_objcg_state() to memcontrol.c

2021-04-19 Thread Waiman Long
On 4/19/21 1:19 PM, Waiman Long wrote: On 4/19/21 1:13 PM, Johannes Weiner wrote: On Mon, Apr 19, 2021 at 12:18:29PM -0400, Waiman Long wrote: On 4/19/21 11:21 AM, Waiman Long wrote: On 4/19/21 11:14 AM, Johannes Weiner wrote: On Sun, Apr 18, 2021 at 08:00:28PM -0400, Waiman Long wrote: The

Re: [PATCH v4 1/5] mm/memcg: Move mod_objcg_state() to memcontrol.c

2021-04-19 Thread Waiman Long
On 4/19/21 1:13 PM, Johannes Weiner wrote: On Mon, Apr 19, 2021 at 12:18:29PM -0400, Waiman Long wrote: On 4/19/21 11:21 AM, Waiman Long wrote: On 4/19/21 11:14 AM, Johannes Weiner wrote: On Sun, Apr 18, 2021 at 08:00:28PM -0400, Waiman Long wrote: The mod_objcg_state() function is moved

Re: [PATCH v4 1/5] mm/memcg: Move mod_objcg_state() to memcontrol.c

2021-04-19 Thread Waiman Long
On 4/19/21 11:21 AM, Waiman Long wrote: On 4/19/21 11:14 AM, Johannes Weiner wrote: On Sun, Apr 18, 2021 at 08:00:28PM -0400, Waiman Long wrote: The mod_objcg_state() function is moved from mm/slab.h to mm/memcontrol.c so that further optimization can be done to it in later patches without

Re: [External] [PATCH v4 5/5] mm/memcg: Improve refill_obj_stock() performance

2021-04-19 Thread Waiman Long
On 4/19/21 2:06 AM, Muchun Song wrote: On Mon, Apr 19, 2021 at 8:01 AM Waiman Long wrote: There are two issues with the current refill_obj_stock() code. First of all, when nr_bytes reaches over PAGE_SIZE, it calls drain_obj_stock() to atomically flush out remaining bytes to obj_cgroup, clear

Re: [PATCH v4 1/5] mm/memcg: Move mod_objcg_state() to memcontrol.c

2021-04-19 Thread Waiman Long
On 4/19/21 11:14 AM, Johannes Weiner wrote: On Sun, Apr 18, 2021 at 08:00:28PM -0400, Waiman Long wrote: The mod_objcg_state() function is moved from mm/slab.h to mm/memcontrol.c so that further optimization can be done to it in later patches without exposing unnecessary details to other mm

Re: [External] [PATCH v4 5/5] mm/memcg: Improve refill_obj_stock() performance

2021-04-19 Thread Waiman Long
On 4/19/21 11:00 AM, Shakeel Butt wrote: On Sun, Apr 18, 2021 at 11:07 PM Muchun Song wrote: On Mon, Apr 19, 2021 at 8:01 AM Waiman Long wrote: There are two issues with the current refill_obj_stock() code. First of all, when nr_bytes reaches over PAGE_SIZE, it calls drain_obj_stock() to

[PATCH v4 4/5] mm/memcg: Save both reclaimable & unreclaimable bytes in object stock

2021-04-18 Thread Waiman Long
tup. However, the miss rate for parallel kernel build remained about the same probably because most of the touched kmemcache objects were reclaimable inodes and dentries. Signed-off-by: Waiman Long --- mm/memcontrol.c | 79 +++-- 1 file changed,

[PATCH v4 5/5] mm/memcg: Improve refill_obj_stock() performance

2021-04-18 Thread Waiman Long
, a new overfill flag is added to refill_obj_stock() which will be set when called from obj_cgroup_charge(). Signed-off-by: Waiman Long --- mm/memcontrol.c | 23 +-- 1 file changed, 17 insertions(+), 6 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index

[PATCH v4 0/5] mm/memcg: Reduce kmemcache memory accounting overhead

2021-04-18 Thread Waiman Long
as this code path isn't being exercised. The large object test, however, sees a pretty good performance improvement with this patch. [1] https://lore.kernel.org/linux-mm/20210408193948.vfktg3azh2wrt56t@gabell/T/#u [2] https://lore.kernel.org/lkml/20210114025151.GA22932@xsang-OptiPlex-9020/

[PATCH v4 3/5] mm/memcg: Optimize user context object stock access

2021-04-18 Thread Waiman Long
price to pay for better performance. Signed-off-by: Waiman Long Acked-by: Roman Gushchin Reviewed-by: Shakeel Butt --- mm/memcontrol.c | 94 +++-- 1 file changed, 68 insertions(+), 26 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index

[PATCH v4 2/5] mm/memcg: Cache vmstat data in percpu memcg_stock_pcp

2021-04-18 Thread Waiman Long
calls to __mod_objcg_state() by more than 80%. Signed-off-by: Waiman Long Reviewed-by: Shakeel Butt --- mm/memcontrol.c | 64 ++--- 1 file changed, 61 insertions(+), 3 deletions(-) diff --git a/mm/memcontrol.c b/mm/memcontrol.c index dc9032f28f2e

[PATCH v4 1/5] mm/memcg: Move mod_objcg_state() to memcontrol.c

2021-04-18 Thread Waiman Long
The mod_objcg_state() function is moved from mm/slab.h to mm/memcontrol.c so that further optimization can be done to it in later patches without exposing unnecessary details to other mm components. Signed-off-by: Waiman Long --- mm/memcontrol.c | 13 + mm/slab.h | 16

[PATCH v5] sched/debug: Use sched_debug_lock to serialize use of cgroup_path[] only

2021-04-15 Thread Waiman Long
printed if truncation may have happened. The cgroup path name is provided for informational purposes only, so occasional path name truncation should not be a big problem. Fixes: efe25c2c7b3a ("sched: Reinstate group names in /proc/sched_debug") Suggested-by: Peter Zijlstra Signed-off-by:

Re: [PATCH v3 2/5] mm/memcg: Introduce obj_cgroup_uncharge_mod_state()

2021-04-15 Thread Waiman Long
On 4/15/21 3:40 PM, Johannes Weiner wrote: On Thu, Apr 15, 2021 at 02:47:31PM -0400, Waiman Long wrote: On 4/15/21 2:10 PM, Johannes Weiner wrote: On Thu, Apr 15, 2021 at 12:35:45PM -0400, Waiman Long wrote: On 4/15/21 12:30 PM, Johannes Weiner wrote: On Tue, Apr 13, 2021 at 09:20:24PM -0400

Re: [PATCH v3 5/5] mm/memcg: Optimize user context object stock access

2021-04-15 Thread Waiman Long
On 4/15/21 2:53 PM, Johannes Weiner wrote: On Thu, Apr 15, 2021 at 02:16:17PM -0400, Waiman Long wrote: On 4/15/21 1:53 PM, Johannes Weiner wrote: On Tue, Apr 13, 2021 at 09:20:27PM -0400, Waiman Long wrote: Most kmem_cache_alloc() calls are from user context. With instrumentation enabled

Re: [PATCH v3 2/5] mm/memcg: Introduce obj_cgroup_uncharge_mod_state()

2021-04-15 Thread Waiman Long
On 4/15/21 2:10 PM, Johannes Weiner wrote: On Thu, Apr 15, 2021 at 12:35:45PM -0400, Waiman Long wrote: On 4/15/21 12:30 PM, Johannes Weiner wrote: On Tue, Apr 13, 2021 at 09:20:24PM -0400, Waiman Long wrote: In memcg_slab_free_hook()/pcpu_memcg_free_hook(), obj_cgroup_uncharge() is followed

Re: [PATCH v3 5/5] mm/memcg: Optimize user context object stock access

2021-04-15 Thread Waiman Long
On 4/15/21 1:53 PM, Johannes Weiner wrote: On Tue, Apr 13, 2021 at 09:20:27PM -0400, Waiman Long wrote: Most kmem_cache_alloc() calls are from user context. With instrumentation enabled, the measured amount of kmem_cache_alloc() calls from non-task context was about 0.01% of the total. The irq

Re: [PATCH v2] locking/qrwlock: Fix ordering in queued_write_lock_slowpath

2021-04-15 Thread Waiman Long
- } while (atomic_cmpxchg_relaxed(&lock->cnts, _QW_WAITING, + atomic_cond_read_relaxed(&lock->cnts, VAL == _QW_WAITING); + } while (atomic_cmpxchg_acquire(&lock->cnts, _QW_WAITING, _QW_LOCKED) != _QW_WAITING); unlock: arch_spin_unlock(&lock->wait_lock); Acked-by: Waiman Long

Re: [PATCH v3 0/5] mm/memcg: Reduce kmemcache memory accounting overhead

2021-04-15 Thread Waiman Long
On 4/15/21 1:10 PM, Matthew Wilcox wrote: On Tue, Apr 13, 2021 at 09:20:22PM -0400, Waiman Long wrote: With memory accounting disabled, the run time was 2.848s. With memory accounting enabled, the run times with the application of various patches in the patchset were: Applied patches Run

Re: [PATCH v3 3/5] mm/memcg: Cache vmstat data in percpu memcg_stock_pcp

2021-04-15 Thread Waiman Long
On 4/15/21 12:50 PM, Johannes Weiner wrote: On Tue, Apr 13, 2021 at 09:20:25PM -0400, Waiman Long wrote: Before the new slab memory controller with per object byte charging, charging and vmstat data update happen only when new slab pages are allocated or freed. Now they are done with every

Re: [PATCH v3 1/5] mm/memcg: Pass both memcg and lruvec to mod_memcg_lruvec_state()

2021-04-15 Thread Waiman Long
On 4/15/21 12:40 PM, Johannes Weiner wrote: On Tue, Apr 13, 2021 at 09:20:23PM -0400, Waiman Long wrote: The caller of mod_memcg_lruvec_state() has both memcg and lruvec readily available. So both of them are now passed to mod_memcg_lruvec_state() and __mod_memcg_lruvec_state(). The

Re: [PATCH] locking/qrwlock: Fix ordering in queued_write_lock_slowpath

2021-04-15 Thread Waiman Long
On 4/15/21 12:45 PM, Will Deacon wrote: With that in mind, it would probably be a good idea to eyeball the qspinlock slowpath as well, as that uses both atomic_cond_read_acquire() and atomic_try_cmpxchg_relaxed(). It seems plausible that the same thing could occur here in qspinlock: