On Thu, 21 Dec 2023 10:46:08 +0000 Christophe Leroy <christophe.le...@csgroup.eu> wrote:

> > To enable ftrace, the nop at function entry is changed to an
> > unconditional branch to 'tramp'. The call to ftrace_caller() may be
> > updated to ftrace_regs_caller() depending on the registered ftrace ops.
> > On 64-bit powerpc, we additionally change the instruction at 'tramp' to
> > 'mflr r0' from an unconditional branch back to func+4. This is so that
> > functions entered through the GEP can skip the function profile sequence
> > unless ftrace is enabled.
> >
> > With the context_switch microbenchmark on a P9 machine, there is a
> > performance improvement of ~6% with this patch applied, going from 650k
> > context switches to 690k context switches without ftrace enabled. With
> > ftrace enabled, the performance was similar at 86k context switches.
>
> Wondering how significant that context_switch microbenchmark is.
>
> I ran it on both mpc885 and mpc8321 and I'm a bit puzzled by some of the
> results:
> # ./context_switch --no-fp
> Using threads with yield on cpus 0/0 touching FP:no altivec:no vector:no
> vdso:no
>
> On 885, I get the following results before and after your patch.
>
> CONFIG_FTRACE not selected : 44,9k
> CONFIG_FTRACE selected, before : 32,8k
> CONFIG_FTRACE selected, after : 33,6k
>
> All this is with CONFIG_INIT_STACK_ALL_ZERO which is the default. But
> when I select CONFIG_INIT_STACK_NONE, the CONFIG_FTRACE not selected
> result is only 34,4.
>
> On 8321:
>
> CONFIG_FTRACE not selected : 100,3k
> CONFIG_FTRACE selected, before : 72,5k
> CONFIG_FTRACE selected, after : 116k
>
> So the results look odd to me.

BTW, CONFIG_FTRACE just enables the tracing system (I would like to change
that to CONFIG_TRACING, but not sure if I can without breaking .configs all
over the place).

The nops for ftrace are enabled with CONFIG_FUNCTION_TRACER.

-- Steve