Naveen N. Rao wrote:
With -mprofile-kernel, gcc emits 'mflr r0', followed by 'bl _mcount' to
enable function tracing and profiling. So far, with dynamic ftrace, we
used to only patch out the branch to _mcount(). However, mflr is
executed by the branch unit that can only execute one per cycle on
POWER9 and shared with branches, so it would be nice to avoid it where
possible.
We cannot simply nop out the mflr either. When enabling function
tracing, there can be a race if tracing is enabled when some thread was
interrupted after executing a nop'ed out mflr. In this case, the thread
would execute the now-patched-in branch to _mcount() without having
executed the preceding mflr.
To solve this, we now enable function tracing in 2 steps: patch in the
mflr instruction, use 'smp_call_function(isync);
synchronize_rcu_tasks()' to ensure all existing threads make progress,
and then patch in the branch to _mcount(). We override
ftrace_replace_code() with a powerpc64 variant for this purpose.
Suggested-by: Nicholas Piggin <npig...@gmail.com>
Reviewed-by: Nicholas Piggin <npig...@gmail.com>
Signed-off-by: Naveen N. Rao <naveen.n....@linux.vnet.ibm.com>
---
arch/powerpc/kernel/trace/ftrace.c | 258 ++++++++++++++++++++++++++---
1 file changed, 236 insertions(+), 22 deletions(-)
I missed adding a comment here to explain the changes. As discussed in
the previous series, I think we are ok with this patch from a CMODX
perspective. For smp_call_function(), I decided to have it included in
this patch since we know that we need it here for sure. I am not
entirely sure we want to do that in patch_instruction() since ftrace
doesn't seem to need it elsewhere. As Nick Piggin pointed out, we may
want to have users of patch_instruction() (kprobes) add the necessary
synchronization.
Thanks,
Naveen