On 23/05/2023 at 11:31, Naveen N Rao wrote:
> Christophe Leroy wrote:
>>
>> That's better, but still more time than the original implementation:
>>
>> +20% to activate the function tracer (was +40% with your RFC)
>> +21% to activate the nop tracer (was +24% with your RFC)
>>
>> perf record (without strict kernel rwx):
>>
>>  17.75%  echo  [kernel.kallsyms]  [k] ftrace_check_record
>>   9.76%  echo  [kernel.kallsyms]  [k] ftrace_replace_code
>>   6.53%  echo  [kernel.kallsyms]  [k] patch_instruction
>>   5.21%  echo  [kernel.kallsyms]  [k] __ftrace_hash_rec_update
>>   4.26%  echo  [kernel.kallsyms]  [k] ftrace_get_addr_curr
>>   4.18%  echo  [kernel.kallsyms]  [k] ftrace_get_call_inst.isra.0
>>   3.45%  echo  [kernel.kallsyms]  [k] ftrace_get_addr_new
>>   3.08%  echo  [kernel.kallsyms]  [k] function_trace_call
>>   2.20%  echo  [kernel.kallsyms]  [k] __rb_reserve_next.constprop.0
>>   2.05%  echo  [kernel.kallsyms]  [k] copy_page
>>   1.91%  echo  [kernel.kallsyms]  [k] ftrace_create_branch_inst.constprop.0
>>   1.83%  echo  [kernel.kallsyms]  [k] ftrace_rec_iter_next
>>   1.83%  echo  [kernel.kallsyms]  [k] rb_commit
>>   1.69%  echo  [kernel.kallsyms]  [k] ring_buffer_lock_reserve
>>   1.54%  echo  [kernel.kallsyms]  [k] trace_function
>>   1.39%  echo  [kernel.kallsyms]  [k] __call_rcu_common.constprop.0
>>   1.25%  echo  ld-2.23.so        [.] do_lookup_x
>>   1.17%  echo  [kernel.kallsyms]  [k] ftrace_rec_iter_record
>>   1.03%  echo  [kernel.kallsyms]  [k] unmap_page_range
>>   0.95%  echo  [kernel.kallsyms]  [k] flush_dcache_icache_page
>>   0.95%  echo  [kernel.kallsyms]  [k] ftrace_lookup_ip
>
> Ok, I simplified this further, and this is as close to the previous fast
> path as we can get (applies atop the original RFC). The only difference
> left is the ftrace_rec iterator.
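For reference, the overhead figures quoted above can presumably be reproduced along these lines: profile the write that switches the current tracer under perf record. This is only a sketch, not the exact commands used in the thread; the tracefs mount points tried below and the `result` variable are assumptions, and it needs root on a kernel built with ftrace:

```shell
#!/bin/sh
# Sketch: profile activation of the function tracer, as in the
# perf output quoted above. Tries the two usual tracefs locations.
T=/sys/kernel/tracing
[ -w "$T/current_tracer" ] || T=/sys/kernel/debug/tracing
if [ -w "$T/current_tracer" ]; then
    # Record the cost of switching tracers (the "echo" samples above).
    perf record -o perf.data -- sh -c "echo function > $T/current_tracer"
    # Restore the nop tracer afterwards.
    echo nop > "$T/current_tracer"
    perf report -i perf.data --stdio | head -25
    result=measured
else
    echo "tracefs not writable; skipping (needs root and ftrace support)"
    result=skipped
fi
```

Comparing `perf report` output with and without a patch series applied is then a matter of running the same command on each kernel.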
That's not better; it is in fact slightly worse (by less than 1%). I will try to investigate why.

Christophe