On Mon, Feb 23, 2026 at 03:18:17PM +0000, Puranjay Mohan wrote:
> On Sat, Feb 21, 2026 at 12:15 PM Conor Dooley <[email protected]> wrote:
> >
> > Hey,
> >
> > On Tue, Apr 08, 2025 at 02:08:34AM +0800, Andy Chiu wrote:
> > > From: Puranjay Mohan <[email protected]>
> > >
> > > This patch enables support for DYNAMIC_FTRACE_WITH_CALL_OPS on RISC-V.
> > > This allows each ftrace callsite to provide an ftrace_ops to the common
> > > ftrace trampoline, allowing each callsite to invoke distinct tracer
> > > functions without the need to fall back to list processing or to
> > > allocate custom trampolines for each callsite. This significantly speeds
> > > up cases where multiple distinct trace functions are used and callsites
> > > are mostly traced by a single tracer.
> > >
> > > The idea and most of the implementation is taken from the ARM64's
> > > implementation of the same feature. The idea is to place a pointer to
> > > the ftrace_ops as a literal at a fixed offset from the function entry
> > > point, which can be recovered by the common ftrace trampoline.
> > >
> > > We use -fpatchable-function-entry to reserve 8 bytes above the function
> > > entry by emitting 2 4 byte or 4 2 byte  nops depending on the presence of
> > > CONFIG_RISCV_ISA_C. These 8 bytes are patched at runtime with a pointer
> > > to the associated ftrace_ops for that callsite. Functions are aligned to
> > > 8 bytes to make sure that the accesses to this literal are atomic.
> > >
> > > This approach allows for directly invoking ftrace_ops::func even for
> > > ftrace_ops which are dynamically-allocated (or part of a module),
> > > without going via ftrace_ops_list_func.
> > >
> > > We've benchamrked this with the ftrace_ops sample module on Spacemit K1
> > > Jupiter:
> > >
> > > Without this patch:
> > >
> > > baseline (Linux rivos 6.14.0-09584-g7d06015d936c #3 SMP Sat Mar 29
> > > +-----------------------+-----------------+----------------------------+
> > > |  Number of tracers    | Total time (ns) | Per-call average time      |
> > > |-----------------------+-----------------+----------------------------|
> > > | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        0 |          0 |        1357958 |          13 |             - |
> > > |        0 |          1 |        1302375 |          13 |             - |
> > > |        0 |          2 |        1302375 |          13 |             - |
> > > |        0 |         10 |        1379084 |          13 |             - |
> > > |        0 |        100 |        1302458 |          13 |             - |
> > > |        0 |        200 |        1302333 |          13 |             - |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        1 |          0 |       13677833 |         136 |           123 |
> > > |        1 |          1 |       18500916 |         185 |           172 |
> > > |        1 |          2 |       22856459 |         228 |           215 |
> > > |        1 |         10 |       58824709 |         588 |           575 |
> > > |        1 |        100 |      505141584 |        5051 |          5038 |
> > > |        1 |        200 |     1580473126 |       15804 |         15791 |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        1 |          0 |       13561000 |         135 |           122 |
> > > |        2 |          0 |       19707292 |         197 |           184 |
> > > |       10 |          0 |       67774750 |         677 |           664 |
> > > |      100 |          0 |      714123125 |        7141 |          7128 |
> > > |      200 |          0 |     1918065668 |       19180 |         19167 |
> > > +----------+------------+-----------------+------------+---------------+
> > >
> > > Note: per-call overhead is estimated relative to the baseline case with
> > > 0 relevant tracers and 0 irrelevant tracers.
> > >
> > > With this patch:
> > >
> > > v4-rc4 (Linux rivos 6.14.0-09598-gd75747611c93 #4 SMP Sat Mar 29
> > > +-----------------------+-----------------+----------------------------+
> > > |  Number of tracers    | Total time (ns) | Per-call average time      |
> > > |-----------------------+-----------------+----------------------------|
> > > | Relevant | Irrelevant |    100000 calls | Total (ns) | Overhead (ns) |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        0 |          0 |         1459917 |         14 |             - |
> > > |        0 |          1 |         1408000 |         14 |             - |
> > > |        0 |          2 |         1383792 |         13 |             - |
> > > |        0 |         10 |         1430709 |         14 |             - |
> > > |        0 |        100 |         1383791 |         13 |             - |
> > > |        0 |        200 |         1383750 |         13 |             - |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        1 |          0 |         5238041 |         52 |            38 |
> > > |        1 |          1 |         5228542 |         52 |            38 |
> > > |        1 |          2 |         5325917 |         53 |            40 |
> > > |        1 |         10 |         5299667 |         52 |            38 |
> > > |        1 |        100 |         5245250 |         52 |            39 |
> > > |        1 |        200 |         5238459 |         52 |            39 |
> > > |----------+------------+-----------------+------------+---------------|
> > > |        1 |          0 |         5239083 |         52 |            38 |
> > > |        2 |          0 |        19449417 |        194 |           181 |
> > > |       10 |          0 |        67718584 |        677 |           663 |
> > > |      100 |          0 |       709840708 |       7098 |          7085 |
> > > |      200 |          0 |      2203580626 |      22035 |         22022 |
> > > +----------+------------+-----------------+------------+---------------+
> > >
> > > Note: per-call overhead is estimated relative to the baseline case with
> > > 0 relevant tracers and 0 irrelevant tracers.
> > >
> > > As can be seen from the above:
> > >
> > >  a) Whenever there is a single relevant tracer function associated with a
> > >     tracee, the overhead of invoking the tracer is constant, and does not
> > >     scale with the number of tracers which are *not* associated with that
> > >     tracee.
> > >
> > >  b) The overhead for a single relevant tracer has dropped to ~1/3 of the
> > >     overhead prior to this series (from 122ns to 38ns). This is largely
> > >     due to permitting calls to dynamically-allocated ftrace_ops without
> > >     going through ftrace_ops_list_func.
> > >
> > > Signed-off-by: Puranjay Mohan <[email protected]>
> > >
> > > [update kconfig, asm, refactor]
> > >
> > > Signed-off-by: Andy Chiu <[email protected]>
> > > Tested-by: Björn Töpel <[email protected]>
> >
> > I bisected a boot failure to this commit [c217157bcd1df ("riscv:
> > Implement HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS")] yesterday, that appears
> > to be affecting all LLVM versions that I currently have installed. From
> > some initial testing of Kconfig options, it looks like the issue is
> > CFI_CLANG related because when I disable CFI_CLANG things work once
> > more. Since this option depends on !CFI_CLANG, but is def_bool y, I
> > modified Kconfig to force disable it at all times and tested
> > !DYNAMIC_FTRACE_WITH_CALL_OPS && !CFG_CLANG, which did boot.
> >
> > I dunno anything about what's going on in this patch, but so little in
> > it relates to having DYNAMIC_FTRACE_WITH_CALL_OPS, that I was able to
> > figure out that the problem is -fpatchable-function-entry=8,4
> >
> 
> DYNAMIC_FTRACE_WITH_CALL_OPS can't work together with CFI_CLANG.
> 
> arm64 has:
> 
> select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS \
> if (DYNAMIC_FTRACE_WITH_ARGS && !CFI && \
>    (CC_IS_CLANG || !CC_OPTIMIZE_FOR_SIZE))
> 
> would need something similar for riscv if not already done.


I think you've misunderstood my email. We already have:

        select HAVE_DYNAMIC_FTRACE_WITH_CALL_OPS if (DYNAMIC_FTRACE_WITH_ARGS 
&& !CFI)

The problem is that the patch broke using CFI_CLANG, due to the
fpatchable-function-entry change.

Cheers,
Conor.

Attachment: signature.asc
Description: PGP signature

Reply via email to