On Tue, 2 Sep 2025 11:47:32 +0800 Luo Gengkun <luogeng...@huaweicloud.com> wrote:
> 
> On 2025/9/1 23:56, Masami Hiramatsu (Google) wrote:
> > On Fri, 29 Aug 2025 08:26:04 -0400
> > Steven Rostedt <rost...@goodmis.org> wrote:
> > 
> >> [ Adding arm64 maintainers ]
> >>
> >> On Fri, 29 Aug 2025 16:29:07 +0800
> >> Luo Gengkun <luogeng...@huaweicloud.com> wrote:
> >>
> >>> On 2025/8/20 1:50, Steven Rostedt wrote:
> >>>> On Tue, 19 Aug 2025 10:51:52 +0000
> >>>> Luo Gengkun <luogeng...@huaweicloud.com> wrote:
> >>>>
> >>>>> Both tracing_mark_write and tracing_mark_raw_write call
> >>>>> __copy_from_user_inatomic during preempt_disable. But in some cases,
> >>>>> __copy_from_user_inatomic may trigger a page fault and subtly call
> >>>>> schedule(). And if the task is migrated to another cpu, the following
> >>>>> warning will
> >>>> Wait! What?
> >>>>
> >>>> __copy_from_user_inatomic() is allowed to be called from atomic
> >>>> context. Hence the name it has. How the hell can it sleep? If it does,
> >>>> it's totally broken!
> >>>>
> >>>> Now, I'm not against using nofault() as it is better named, but I want
> >>>> to know why you are suggesting this change. Did you actually trigger a
> >>>> bug here?
> >>> Yes, I triggered this bug on arm64.
> >> And I still think this is an arm64 bug.
> > I think it could be.
> > 
> >>>>
> >>>>> be triggered:
> >>>>>     if (RB_WARN_ON(cpu_buffer,
> >>>>>                    !local_read(&cpu_buffer->committing)))
> >>>>>
> >>>>> An example can illustrate this issue:
> > You've missed an important part.
> > 
> >>>>> process flow                                          CPU
> >>>>> ---------------------------------------------------------------------
> >>>>>
> >>>>> tracing_mark_raw_write():                             cpu:0
> >>>>>   ...
> >>>>>   ring_buffer_lock_reserve():                         cpu:0
> >>>>>     ...
> >         preempt_disable_notrace();  --> this is unlocked by
> >         ring_buffer_unlock_commit()
> > 
> >>>>>     cpu = raw_smp_processor_id()                      cpu:0
> >>>>>     cpu_buffer = buffer->buffers[cpu]                 cpu:0
> >>>>>     ...
> >>>>>     ...
> >>>>>     __copy_from_user_inatomic():                      cpu:0
> > 
> > So this is called under preempt-disabled.
> > 
> >>>>>     ...
> >>>>>     # page fault
> >>>>>     do_mem_abort():                                   cpu:0
> >>>> Sounds to me that arm64 __copy_from_user_inatomic() may be broken.
> >>>>
> >>>>>     ...
> >>>>>     # Call schedule
> >>>>>     schedule()                                        cpu:0
> > 
> > If this does not check the preempt flag, it is a problem.
> > Maybe arm64 needs to do fixup and abort instead of do_mem_abort()?
> 
> My kernel was built without CONFIG_PREEMPT_COUNT, so preempt_disable()
> does nothing more than act as a barrier. In this case, it can pass the
> check in schedule(). Perhaps this is another issue?

OK, I got it. Indeed, in that case, we have no way to check whether this
happens inside a preempt critical section.

Anyway, as discussed here, __copy_from_user_inatomic() is an internal
function, so I'm also OK with this patch.

Reviewed-by: Masami Hiramatsu (Google) <mhira...@kernel.org>

BTW, currently we just write a fault message if __copy_from_user_*()
hits a fault, but I think we can retry with a normal __copy_from_user()
into a kernel buffer and copy that into the ring buffer as a slow path.

Thank you,

-- 
Masami Hiramatsu (Google) <mhira...@kernel.org>
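
[Editor's note: for context on why switching to the nofault variant avoids
the bug even on a !CONFIG_PREEMPT_COUNT kernel: copy_from_user_nofault()
wraps the inatomic copy in pagefault_disable()/pagefault_enable(), which
bump current->pagefault_disabled, a per-task counter that exists regardless
of CONFIG_PREEMPT_COUNT. The fault handler's faulthandler_disabled() check
then sees the fault as non-sleepable and takes the exception-table fixup
path instead of calling schedule(). A rough sketch of the shape of
copy_from_user_nofault() (see mm/maccess.c; simplified for illustration,
not the exact kernel source):

    #include <linux/uaccess.h>

    long copy_from_user_nofault(void *dst, const void __user *src,
                                size_t size)
    {
            long ret;

            if (!access_ok(src, size))
                    return -EFAULT;

            /*
             * Bumps current->pagefault_disabled, which is independent
             * of preempt_count(), so faulthandler_disabled() is true
             * even on !CONFIG_PREEMPT_COUNT kernels.
             */
            pagefault_disable();
            ret = __copy_from_user_inatomic(dst, src, size);
            pagefault_enable();

            if (ret)
                    return -EFAULT;
            return 0;
    }
]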
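[Editor's note: a hypothetical sketch of the slow path suggested in the
"BTW" paragraph above; the function name, buffer size, and structure are
illustrative only, not an actual patch. The idea is to do the faulting
copy_from_user() into a kernel buffer while preemption is still enabled,
then reserve and fill the ring buffer event atomically:

    #include <linux/ring_buffer.h>
    #include <linux/uaccess.h>

    static ssize_t mark_write_slowpath(struct trace_buffer *buffer,
                                       const char __user *ubuf, size_t cnt)
    {
            char kbuf[128];     /* marker writes are small */
            struct ring_buffer_event *event;

            if (cnt > sizeof(kbuf))
                    cnt = sizeof(kbuf);

            /* May fault and sleep: we are not atomic yet. */
            if (copy_from_user(kbuf, ubuf, cnt))
                    return -EFAULT;

            /* Preemption is disabled from here until the commit. */
            event = ring_buffer_lock_reserve(buffer, cnt);
            if (!event)
                    return -EBADF;

            memcpy(ring_buffer_event_data(event), kbuf, cnt);

            /* Single-argument form in recent kernels. */
            ring_buffer_unlock_commit(buffer);
            return cnt;
    }
]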