On Mon, 23 Feb 2026 16:51:12 -0500
Andrey Grodzovsky <[email protected]> wrote:

> When ftrace_lookup_symbols() is called with a single symbol (cnt == 1),
> use kallsyms_lookup_name() for O(log N) binary search instead of the
> full linear scan via kallsyms_on_each_symbol().

So this patch looks like it should go through the tracing tree, not bpf.

> 
> ftrace_lookup_symbols() was designed for batch resolution of many
> symbols in a single pass.  For large cnt this is efficient: a single
> O(N) walk over all symbols with O(log cnt) binary search into the
> sorted input array.  But for cnt == 1 it still decompresses all ~200K
> kernel symbols only to match one.
> 
> kallsyms_lookup_name() uses the sorted kallsyms index and needs only
> ~17 decompressions for a single lookup.
> 
> This is the common path for kprobe.session with exact function names,
> where libbpf sends one symbol per BPF_LINK_CREATE syscall.
> 
> If binary lookup fails (duplicate symbol names where the first match
> is not ftrace-instrumented, or module symbols), the function falls
> through to the existing linear scan path.
> 
> Before (cnt=1, 50 kprobe.session programs):
>   Attach: 858 ms  (kallsyms_expand_symbol 25% of CPU)
> 
> After:
>   Attach:  52 ms  (16x faster)
> 
> Signed-off-by: Andrey Grodzovsky <[email protected]>
> ---
>  kernel/trace/ftrace.c | 28 ++++++++++++++++++++++++++++
>  1 file changed, 28 insertions(+)
> 
> diff --git a/kernel/trace/ftrace.c b/kernel/trace/ftrace.c
> index 827fb9a0bf0d..bfd7670669c2 100644
> --- a/kernel/trace/ftrace.c
> +++ b/kernel/trace/ftrace.c
> @@ -9263,6 +9263,19 @@ static int kallsyms_callback(void *data, const char *name, unsigned long addr)
>   * @addrs array, which needs to be big enough to store at least @cnt
>   * addresses.
>   *
> + * For a single symbol (cnt == 1), uses kallsyms_lookup_name() which
> + * performs an O(log N) binary search via the sorted kallsyms index.
> + * This avoids the full O(N) linear scan over all kernel symbols that
> + * the multi-symbol path requires.
> + *
> + * For multiple symbols, uses a single-pass linear scan via
> + * kallsyms_on_each_symbol() with binary search into the sorted input
> + * array.

The above is fine.

>  While individual lookups are O(log N), doing K lookups
> + * totals O(K * log N) which loses to a single sequential O(N) pass
> + * at scale due to cache-friendly memory access patterns of the linear
> + * walk.  Empirical testing shows the linear scan is faster for batch
> + * lookups even well below 10K symbols.

The above is unneeded as a comment in the code; it belongs in the
change log.

-- Steve

> + *
>   * Returns: 0 if all provided symbols are found, -ESRCH otherwise.
>   */
>  int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *addrs)
> @@ -9270,6 +9283,21 @@ int ftrace_lookup_symbols(const char **sorted_syms, size_t cnt, unsigned long *a
>       struct kallsyms_data args;
>       int found_all;
>  
> +     /* Fast path: single symbol uses O(log N) binary search */
> +     if (cnt == 1) {
> +             addrs[0] = kallsyms_lookup_name(sorted_syms[0]);
> +             if (addrs[0])
> +                     addrs[0] = ftrace_location(addrs[0]);
> +             if (addrs[0])
> +                     return 0;
> +             /*
> +              * Binary lookup can fail for duplicate symbol names
> +              * where the first match is not ftrace-instrumented,
> +              * or for module symbols.  Retry with linear scan.
> +              */
> +     }
> +
> +     /* Batch path: single-pass O(N) linear scan */
>       memset(addrs, 0, sizeof(*addrs) * cnt);
>       args.addrs = addrs;
>       args.syms = sorted_syms;

