On Mon, Jun 10, 2024 at 04:06:32PM GMT, Steven Rostedt wrote:
> On Mon, 10 Jun 2024 14:08:16 +0530
> Naveen N Rao <nav...@kernel.org> wrote:
> 
> > On 32-bit powerpc, gcc generates a three instruction sequence for
> > function profiling:
> >     mflr    r0
> >     stw     r0, 4(r1)
> >     bl      _mcount
> > 
> > On kernel boot, the call to _mcount() is nop-ed out, to be patched back
> > in when ftrace is actually enabled. The 'stw' instruction therefore is
> > not necessary unless ftrace is enabled. Nop it out during ftrace init.
> > 
> > When ftrace is enabled, we want the 'stw' so that stack unwinding works
> > properly. Perform the same within the ftrace handler, similar to 64-bit
> > powerpc.
> > 
> > For 64-bit powerpc, early versions of gcc used to emit a three
> > instruction sequence for function profiling (with -mprofile-kernel) with
> > a 'std' instruction to mimic the 'stw' above. Address that scenario also
> > by nop-ing out the 'std' instruction during ftrace init.
> > 
> > Signed-off-by: Naveen N Rao <nav...@kernel.org>
> 
> Isn't there still the race that there's a preemption between the:
> 
>       stw     r0, 4(r1)
> and
>       bl      _mcount
> 
> And if this breaks stack unwinding, couldn't this cause an issue for live
> kernel patching?
> 
> I know it's very unlikely, but in theory, I think the race exists.

I *think* you are assuming that we will be patching back the 'stw' 
instruction here? So, there could be an issue if a cpu has executed the 
nop instead of 'stw' and then sees the call to _mcount().

But, we don't patch back the 'stw' instruction. That is instead done as 
part of ftrace_caller(), along with setting up an additional stack frame 
to ensure reliable stack unwinding. Commit 41a506ef71eb 
("powerpc/ftrace: Create a dummy stackframe to fix stack unwind") has 
more details.

The primary motivation for this patch is to address differences in the 
function profile sequence with various toolchains. Since commit 
0f71dcfb4aef ("powerpc/ftrace: Add support for 
-fpatchable-function-entry"), we use the same two-instruction profile 
sequence across 32-bit and 64-bit powerpc:
        mflr    r0
        bl      ftrace_caller

This has also been true on 64-bit powerpc with -mprofile-kernel, except 
the very early versions of gcc that supported that option (gcc v5).

On 32-bit powerpc, we used to use the three instruction sequence before 
support for -fpatchable-function-entry was introduced.

In this patch, we move all toolchain variants to use the two-instruction 
sequence for consistency.


Thanks,
Naveen

Reply via email to