On Fri, 29 Aug 2025 08:26:04 -0400
Steven Rostedt <rost...@goodmis.org> wrote:

> 
> [ Adding arm64 maintainers ]
> 
> On Fri, 29 Aug 2025 16:29:07 +0800
> Luo Gengkun <luogeng...@huaweicloud.com> wrote:
> 
> > On 2025/8/20 1:50, Steven Rostedt wrote:
> > > On Tue, 19 Aug 2025 10:51:52 +0000
> > > Luo Gengkun <luogeng...@huaweicloud.com> wrote:
> > >  
> > >> Both tracing_mark_write and tracing_mark_raw_write call
> > >> __copy_from_user_inatomic with preemption disabled. But in some cases,
> > >> __copy_from_user_inatomic may trigger a page fault and subtly end up
> > >> calling schedule(). And if the task is migrated to another CPU, the
> > >> following warning will
> > > Wait! What?
> > >
> > > __copy_from_user_inatomic() is allowed to be called from atomic context.
> > > Hence the name. How the hell can it sleep? If it does, it's totally
> > > broken!
> > >
> > > Now, I'm not against using nofault() as it is better named, but I want to
> > > know why you are suggesting this change. Did you actually trigger a bug 
> > > here?  
> > 
> > Yes, I triggered this bug on arm64.
> 
> And I still think this is an arm64 bug.

I think it could be.

> > >  
> > >> be triggered:
> > >>          if (RB_WARN_ON(cpu_buffer,
> > >>                         !local_read(&cpu_buffer->committing)))
> > >>
> > >> An example can illustrate this issue:

You've missed an important part.

> > >>
> > >> process flow                                             CPU
> > >> ---------------------------------------------------------------------
> > >>
> > >> tracing_mark_raw_write():                                cpu:0
> > >>     ...
> > >>     ring_buffer_lock_reserve():                          cpu:0
> > >>        ...

        preempt_disable_notrace(); --> this is re-enabled by
                                       ring_buffer_unlock_commit()

> > >>        cpu = raw_smp_processor_id()                      cpu:0
> > >>        cpu_buffer = buffer->buffers[cpu]                 cpu:0
> > >>        ...
> > >>     ...
> > >>     __copy_from_user_inatomic():                         cpu:0

So this is called under preempt-disabled.

> > >>        ...
> > >>        # page fault
> > >>        do_mem_abort():                                   cpu:0  
> > > Sounds to me that arm64 __copy_from_user_inatomic() may be broken.
> > >  
> > >>           ...
> > >>           # Call schedule
> > >>           schedule()                                     cpu:0

If this does not check the preempt flag, it is a problem.
Maybe arm64 needs to do fixup and abort instead of do_mem_abort()?
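If I read include/linux/uaccess.h correctly, faulthandler_disabled() is
defined as (pagefault_disabled() || in_atomic()), and the arch fault handler
is expected to take the exception-fixup path when it is set, roughly like
this sketch (generic convention from memory, not actual arm64 code):

	/* sketch of the check an arch fault handler should make */
	if (faulthandler_disabled() || !current->mm)
		goto no_context;	/* exception fixup only, never schedule() */
	/* only past this point may handle_mm_fault() sleep */

Note that in_atomic() depends on preempt_count, so on a kernel without
CONFIG_PREEMPT_COUNT a bare preempt_disable() may not be visible to this
check; if so, that would explain why an explicit pagefault_disable() matters
here.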

> > >>   ...
> > >>     # the task schedule to cpu1
> > >>     __buffer_unlock_commit():                            cpu:1
> > >>        ...
> > >>        ring_buffer_unlock_commit():                      cpu:1
> > >>   ...
> > >>   cpu = raw_smp_processor_id()                   cpu:1
> > >>   cpu_buffer = buffer->buffers[cpu]              cpu:1

        preempt_enable_notrace(); <-- here we enable preempt again.

> > >>
> > >> As shown above, the process reads the CPU id twice and the two values
> > >> do not match.
> > >>
> > >> To fix this problem, use copy_from_user_nofault instead of
> > >> __copy_from_user_inatomic, as the former performs an 'access_ok' check
> > >> before copying.
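
For reference, copy_from_user_nofault() (mm/maccess.c) looks roughly like
this, from memory (details vary by kernel version). Besides the access_ok()
check, it also wraps the copy in pagefault_disable(), so a fault can only
take the exception-fixup path instead of sleeping in the fault handler:

	long copy_from_user_nofault(void *dst, const void __user *src,
				    size_t size)
	{
		long ret = -EFAULT;

		if (!access_ok(src, size))
			return ret;

		pagefault_disable();
		ret = __copy_from_user_inatomic(dst, src, size);
		pagefault_enable();

		if (ret)
			return -EFAULT;
		return 0;
	}
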
> > >>
> > >> Fixes: 656c7f0d2d2b ("tracing: Replace kmap with copy_from_user() in 
> > >> trace_marker writing")  
> > > The above commit was introduced in 2016. copy_from_user_nofault() was
> > > introduced in 2020. I don't think this would be the fix for that kernel.
> > >
> > > So no, I'm not taking this patch. If you see __copy_from_user_inatomic()
> > > sleeping, its users are not the issue. That function is.


BTW, the biggest difference between __copy_from_user() and
__copy_from_user_inatomic() is `might_fault()` and `should_fail_usercopy()`.
The latter is fault injection, so we can ignore it. But since
`might_fault()` is NOT in __copy_from_user_inatomic(), it is designed
not to cause a fault, as Steve said?
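
For comparison, my understanding of the two helpers in
include/linux/uaccess.h, as a rough sketch (instrumentation and annotations
differ across versions):

	static __always_inline unsigned long
	__copy_from_user(void *to, const void __user *from, unsigned long n)
	{
		might_fault();		/* may sleep to resolve the fault */
		if (should_fail_usercopy())
			return n;
		return raw_copy_from_user(to, from, n);
	}

	/* No might_fault() here: the caller is expected to have called
	 * pagefault_disable(), so a fault only does exception fixup and
	 * the return value is the number of bytes left uncopied. */
	static __always_inline unsigned long
	__copy_from_user_inatomic(void *to, const void __user *from,
				  unsigned long n)
	{
		return raw_copy_from_user(to, from, n);
	}
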

Thank you,

-- 
Masami Hiramatsu (Google) <mhira...@kernel.org>
