The rss_stat trace event allows userspace tools, like Perfetto [1],
to inspect per-process RSS metric changes over time.

The curr field was introduced to rss_stat in commit e4dcad204d3a
("rss_stat: add support to detect RSS updates of external mm").
Its intent is to indicate whether the RSS update is for the
mm_struct of the current execution context; it is set to false
when operating on a remote mm_struct (e.g., via kswapd or a
direct reclaimer).

However, an issue arises when a kernel thread temporarily adopts
a user process's mm_struct. Kernel threads do not have their own
mm_struct and normally have current->mm set to NULL. To operate
on user memory, they can "borrow" a memory context using
kthread_use_mm(), which sets current->mm to the user process's mm.

This can be observed, for example, in the USB Function Filesystem
(FFS) driver. The ffs_user_copy_worker() function handles AIO
completions and uses kthread_use_mm() to copy data to a user-space
buffer.
If a page fault occurs during this copy, the fault handler executes
in the kthread's context.

At this point, current is the kthread, but current->mm points to the
user process's mm. Since the rss_stat event (from the page fault)
is for that same mm, the condition current->mm == mm becomes true,
causing curr to be incorrectly set to true when the trace event is
emitted.

This is misleading because it suggests the mm belongs to the kthread,
confusing userspace tools that track per-process RSS changes and
corrupting their mm_id-to-process association.

Fix this by ensuring curr is always false when the trace event is
emitted from a kthread context by checking for the PF_KTHREAD flag.
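
For illustration only, the old and new conditions can be modelled in
a small userspace sketch. The struct definitions and the helper names
rss_stat_curr_old() and rss_stat_curr_new() are hypothetical stand-ins
and do not exist in the kernel; PF_KTHREAD is assumed to carry the
value defined in include/linux/sched.h.

```c
#include <stdbool.h>

/* Assumed to mirror the kernel's definition; illustrative only. */
#define PF_KTHREAD 0x00200000

/* Hypothetical stand-ins for the kernel types, reduced to what is needed. */
struct mm_struct {
	int unused;
};

struct task_struct {
	unsigned int flags;
	struct mm_struct *mm;
};

/* Old condition: true for any task whose ->mm matches, kthreads included. */
static bool rss_stat_curr_old(const struct task_struct *cur,
			      const struct mm_struct *mm)
{
	return cur->mm == mm;
}

/* New condition: a kthread that borrowed the mm never reports curr. */
static bool rss_stat_curr_new(const struct task_struct *cur,
			      const struct mm_struct *mm)
{
	return cur->mm == mm && !(cur->flags & PF_KTHREAD);
}
```

With a task that has PF_KTHREAD set and ->mm pointing at a user mm,
the old helper returns true while the new one returns false; for the
owning user task both return true.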

[1] https://perfetto.dev/

Fixes: e4dcad204d3a ("rss_stat: add support to detect RSS updates of external mm")
Cc: Andrew Morton <[email protected]>
Cc: "David Hildenbrand (Arm)" <[email protected]>
Cc: Joel Fernandes <[email protected]>
Cc: Lorenzo Stoakes <[email protected]>
Cc: Minchan Kim <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Suren Baghdasaryan <[email protected]>
Signed-off-by: Kalesh Singh <[email protected]>
---
 include/trace/events/kmem.h | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index 7f93e754da5c..cd7920c81f85 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -440,7 +440,13 @@ TRACE_EVENT(rss_stat,
 
        TP_fast_assign(
                __entry->mm_id = mm_ptr_to_hash(mm);
-               __entry->curr = !!(current->mm == mm);
+               /*
+                * curr is true if the mm matches the current task's mm_struct.
+                * Since kthreads (PF_KTHREAD) have no mm_struct of their own
+                * but can borrow one via kthread_use_mm(), we must filter them
+                * out to avoid incorrectly attributing the RSS update to them.
+                */
+               __entry->curr = current->mm == mm && !(current->flags & PF_KTHREAD);
                __entry->member = member;
                __entry->size = (percpu_counter_sum_positive(&mm->rss_stat[member])
                                                            << PAGE_SHIFT);

base-commit: 8bf22c33e7a172fbc72464f4cc484d23a6b412ba
-- 
2.53.0.371.g1d285c8824-goog

