On 3/9/26 12:05, Saket Kumar Bhaskar wrote:
> On Thu, Mar 05, 2026 at 01:45:44PM +0100, Viktor Malik wrote:
>> On 3/3/26 15:58, Saket Kumar Bhaskar wrote:
>>> On Fri, Feb 27, 2026 at 09:25:02AM +0100, Viktor Malik wrote:
>>>> It may happen that mm is already released, which leads to kernel panic.
>>>> This adds the NULL check for current->mm, similarly to 20afc60f892d
>>>> ("x86, perf: Check that current->mm is alive before getting user
>>>> callchain").
>>>>
>>>> I was getting this panic when running a profiling BPF program
>>>> (profile.py from bcc-tools):
>>>>
>>>> [26215.051935] Kernel attempted to read user page (588) - exploit
>>>> attempt? (uid: 0)
>>>> [26215.051950] BUG: Kernel NULL pointer dereference on read at
>>>> 0x00000588
>>>> [26215.051952] Faulting instruction address: 0xc00000000020fac0
>>>> [26215.051957] Oops: Kernel access of bad area, sig: 11 [#1]
>>>> [...]
>>>> [26215.052049] Call Trace:
>>>> [26215.052050] [c000000061da6d30] [c00000000020fc10]
>>>> perf_callchain_user_64+0x2d0/0x490 (unreliable)
>>>> [26215.052054] [c000000061da6dc0] [c00000000020f92c]
>>>> perf_callchain_user+0x1c/0x30
>>>> [26215.052057] [c000000061da6de0] [c0000000005ab2a0]
>>>> get_perf_callchain+0x100/0x360
>>>> [26215.052063] [c000000061da6e70] [c000000000573bc8]
>>>> bpf_get_stackid+0x88/0xf0
>>>> [26215.052067] [c000000061da6ea0] [c008000000042258]
>>>> bpf_prog_16d4ab9ab662f669_do_perf_event+0xf8/0x274
>>>> [...]
>>>>
>>>> Fixes: 20002ded4d93 ("perf_counter: powerpc: Add callchain support")
>>>> Signed-off-by: Viktor Malik <[email protected]>
>>>> ---
>>>> arch/powerpc/perf/callchain_32.c | 3 +++
>>>> arch/powerpc/perf/callchain_64.c | 3 +++
>>>> 2 files changed, 6 insertions(+)
>>>>
>>>> diff --git a/arch/powerpc/perf/callchain_32.c
>>>> b/arch/powerpc/perf/callchain_32.c
>>>> index ddcc2d8aa64a..b46e21679566 100644
>>>> --- a/arch/powerpc/perf/callchain_32.c
>>>> +++ b/arch/powerpc/perf/callchain_32.c
>>>> @@ -144,6 +144,9 @@ void perf_callchain_user_32(struct
>>>> perf_callchain_entry_ctx *entry,
>>>> sp = regs->gpr[1];
>>>> perf_callchain_store(entry, next_ip);
>>>>
>>>> + if (!current->mm)
>>>> + return;
>>>> +
>>>> while (entry->nr < entry->max_stack) {
>>>> fp = (unsigned int __user *) (unsigned long) sp;
>>>> if (invalid_user_sp(sp) || read_user_stack_32(fp, &next_sp))
>>>> diff --git a/arch/powerpc/perf/callchain_64.c
>>>> b/arch/powerpc/perf/callchain_64.c
>>>> index 115d1c105e8a..eaaadd6fa81b 100644
>>>> --- a/arch/powerpc/perf/callchain_64.c
>>>> +++ b/arch/powerpc/perf/callchain_64.c
>>>> @@ -79,6 +79,9 @@ void perf_callchain_user_64(struct
>>>> perf_callchain_entry_ctx *entry,
>>>> sp = regs->gpr[1];
>>>> perf_callchain_store(entry, next_ip);
>>>>
>>>> + if (!current->mm)
>>>> + return;
>>>> +
>>>> while (entry->nr < entry->max_stack) {
>>>> fp = (unsigned long __user *) sp;
>>>> if (invalid_user_sp(sp) || read_user_stack_64(fp, &next_sp))
>>>> --
>>>> 2.53.0
>>>>
>>> Sorry, I missed adding cc list for the last conversation so adding this for
>>> reference:
>>>
>>>> Wouldn't be good if we check this in perf_callchain_user() as it will
>>>> cover both cases.
>>>
>>> to which Viktor replied:
>>> I considered it but in that case, we'd also miss the top-level stack
>>> frame (the perf_callchain_store call above). Other arches include it so
>>> I followed the behavior for powerpc.
>>>
>>> Viktor, agreed with your first point. I have another concern:
>>>
>>> I was hitting this issue with stacktrace_build_id_nmi in bpf and
>>> applied this patch
>>> https://lore.kernel.org/bpf/[email protected]/T/#mf901967ebe77506f1bd6e3d876c2a85824d9519d
>>>
>>> Wondering if the above generic fix is working do we need to add this
>>> check in powerpc specific code?
>>
>> I tried to apply that patch series but, unfortunately, keep getting the
>> panic when running the BCC profile tool.
>>
>> Also, looking at the patch, it seems that it would only solve the issue
>> when perf_callchain_user is called from a BPF context, however, I assume
>> that it may be called from other contexts, too.
>>
>> Since perf_callchain_user_{32,64} are dereferencing current->mm while
>> walking the stack, I think that an explicit protection against
>> current->mm being NULL makes sense here, even in the presence of the
>> above patch. Especially since other arches have it, too.
>>
>> Viktor
>>
> Ok that looks convincing then, another thing is that, how about moving
> perf_callchain_store
> to perf_callchain_user and checking current->mm == NULL there for both
> perf_callchain_user_32/64.
> next_ip, lr and sp can be passed to perf_callchain_user_32/64.
Yeah, that should be possible. I'll send v2.
Viktor
>
> Thanks,
> Saket
>