On Wed, Apr 3, 2024 at 6:02 AM Masami Hiramatsu <mhira...@kernel.org> wrote: > > On Tue, 02 Apr 2024 15:02:41 +0200 > Björn Töpel <bj...@kernel.org> wrote: > > > Puranjay Mohan <puranja...@gmail.com> writes: > > > > > This commit replaces riscv's support for FTRACE_WITH_REGS with support > > > for FTRACE_WITH_ARGS. This is required for the ongoing effort to stop > > > relying on stop_machine() for RISCV's implementation of ftrace. > > > > > > The main relevant benefit that this change will bring for the above > > > use-case is that now we don't have separate ftrace_caller and > > > ftrace_regs_caller trampolines. This will allow the callsite to call > > > ftrace_caller by modifying a single instruction. Now the callsite can > > > do something similar to: > > > > > > When not tracing: | When tracing: > > > > > > func: func: > > > auipc t0, ftrace_caller_top auipc t0, ftrace_caller_top > > > nop <=========<Enable/Disable>=========> jalr t0, > > > ftrace_caller_bottom > > > [...] [...] > > > > > > The above assumes that we are dropping the support of calling a direct > > > trampoline from the callsite. We need to drop this as the callsite can't > > > change the target address to call, it can only enable/disable a call to > > > a preset target (ftrace_caller in the above diagram). > > > > > > Currently, ftrace_regs_caller saves all CPU registers in the format of > > > struct pt_regs and allows the tracer to modify them. We don't need to > > > save all of the CPU registers because at function entry only a subset of > > > pt_regs is live: > > > > > > |----------+----------+---------------------------------------------| > > > | Register | ABI Name | Description | > > > |----------+----------+---------------------------------------------| > > > | x1 | ra | Return address for traced function | > > > | x2 | sp | Stack pointer | > > > | x5 | t0 | Return address for ftrace_caller trampoline | > > > | x8 | s0/fp | Frame pointer | > > > | x10-11 | a0-1 | Function arguments/return values | > > > | x12-17 | a2-7 | Function arguments | > > > |----------+----------+---------------------------------------------| > > > > > > See RISCV calling convention[1] for the above table. > > > > > > Saving just the live registers decreases the amount of stack space > > > required from 288 Bytes to 112 Bytes. > > > > > > Basic testing was done with this on the VisionFive 2 development board. > > > > > > Note: > > > - Moving from REGS to ARGS will mean that RISCV will stop supporting > > > KPROBES_ON_FTRACE as it requires full pt_regs to be saved. > > > > ...and FPROBES, but Masami is (still?) working on a series to address > > that [1]. > > Yes, even with this patch, FPROBE can be worked with my series. > So I'm OK for this change. > > Thank you, > > > > > My perspective; This is following the work Mark et al has done for > > arm64, and it does make sense for RISC-V as well. I'm in favor having > > this change as part of the bigger call_ops/text patch change for RISC-V. > > > > [...] > > > > > diff --git a/arch/riscv/include/asm/ftrace.h > > > b/arch/riscv/include/asm/ftrace.h > > > index 1276d7d9ca8b..1e95bf4ded6c 100644 > > > --- a/arch/riscv/include/asm/ftrace.h > > > +++ b/arch/riscv/include/asm/ftrace.h > > > @@ -124,20 +124,87 @@ struct dyn_ftrace; > > > int ftrace_init_nop(struct module *mod, struct dyn_ftrace *rec); > > > #define ftrace_init_nop ftrace_init_nop > > > > > > -#ifdef CONFIG_DYNAMIC_FTRACE_WITH_REGS > > > +#ifdef CONFIG_DYNAMIC_FTRACE_WITH_ARGS > > > +#define arch_ftrace_get_regs(regs) NULL > > > struct ftrace_ops; > > > -struct ftrace_regs; > > > +struct ftrace_regs { > > > + unsigned long epc; > > > + unsigned long ra; > > > + unsigned long sp; > > > + unsigned long s0; > > > + unsigned long t1; > > > + union { > > > + unsigned long args[8]; > > > + struct { > > > + unsigned long a0; > > > + unsigned long a1; > > > + unsigned long a2; > > > + unsigned long a3; > > > + unsigned long a4; > > > + unsigned long a5; > > > + unsigned long a6; > > > + unsigned long a7; > > > + }; > > > + }; > > > +}; > > > + > > > +static __always_inline unsigned long > > > +ftrace_regs_get_instruction_pointer(const struct ftrace_regs *fregs) > > > +{ > > > + return fregs->epc; > > > +} > > > + > > > +static __always_inline void > > > +ftrace_regs_set_instruction_pointer(struct ftrace_regs *fregs, > > > + unsigned long pc) > > > +{ > > > + fregs->epc = pc; > > > +} > > > + > > > +static __always_inline unsigned long > > > +ftrace_regs_get_stack_pointer(const struct ftrace_regs *fregs) > > > +{ > > > + return fregs->sp; > > > +} > > > + > > > +static __always_inline unsigned long > > > +ftrace_regs_get_argument(struct ftrace_regs *fregs, unsigned int n) > > > +{ > > > + if (n < 8) > > > + return fregs->args[n]; > > > + return 0; > > > +} > > > + > > > +static __always_inline unsigned long > > > +ftrace_regs_get_return_value(const struct ftrace_regs *fregs) > > > +{ > > > + return fregs->a0; > > > +} > > > + > > > +static __always_inline void > > > +ftrace_regs_set_return_value(struct ftrace_regs *fregs, > > > + unsigned long ret) > > > +{ > > > + fregs->a0 = ret; > > > +} > > > + > > > +static __always_inline void > > > +ftrace_override_function_with_return(struct ftrace_regs *fregs) > > > +{ > > > + fregs->epc = fregs->ra; > > > +} > > > > Style/nit: All above; Try to use the full 100 chars, and keep the > > function name return value on the same line for grepability. > > > > > > Björn > > > > [1] > > https://lore.kernel.org/all/170887410337.564249.6360118840946697039.stgit@devnote2/ > > > -- > Masami Hiramatsu (Google) <mhira...@kernel.org>
I will send it without RFC and will fix the problems pointed out by Bjorn. Thanks, Puranjay