Hi all,
On Mon, 4 Jan 2021 at 13:35, Louis Wang <liang26...@gmail.com> wrote:
>
> From: "louis.wang1" <louis.wa...@unisoc.com>
>
> Enabling function_graph tracer on ARM causes kernel panic, because the
> function graph tracer updates the "return address" of a function in order
> to insert a trace callback on function exit, it saves the function's
> original return address in a return trace stack, but cpu_suspend() may not
> return through the normal return path.
>
> cpu_suspend() will resume directly via the cpu_resume path, but the return
> trace stack has been set-up by the subfunctions of cpu_suspend(), which
> makes the "return address" inconsistent with cpu_suspend().
>
> This patch refers to Commit de818bd4522c40ea02a81b387d2fa86f989c9623
> ("arm64: kernel: pause/unpause function graph tracer in cpu_suspend()"),
> fixes the issue by pausing/resuming the function graph tracer on the thread
> executing cpu_suspend(), so that the function graph tracer state is kept
> consistent across functions that enter power down states and never return
> by effectively disabling graph tracer while they are executing.
>
> Signed-off-by: louis.wang1 <louis.wa...@unisoc.com>
> ---
>  arch/arm/kernel/suspend.c | 19 ++++++++++++++++++-
>  1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/arch/arm/kernel/suspend.c b/arch/arm/kernel/suspend.c
> index 24bd205..43f0a3e 100644
> --- a/arch/arm/kernel/suspend.c
> +++ b/arch/arm/kernel/suspend.c
> @@ -1,4 +1,5 @@
>  // SPDX-License-Identifier: GPL-2.0
> +#include <linux/ftrace.h>
>  #include <linux/init.h>
>  #include <linux/slab.h>
>  #include <linux/mm_types.h>
> @@ -26,12 +27,22 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
>  		return -EINVAL;
>
>  	/*
> +	 * Function graph tracer state gets inconsistent when the kernel
> +	 * calls functions that never return (aka suspend finishers) hence
> +	 * disable graph tracing during their execution.
> +	 */
> +	pause_graph_tracing();
> +
> +	/*
>  	 * Provide a temporary page table with an identity mapping for
>  	 * the MMU-enable code, required for resuming. On successful
>  	 * resume (indicated by a zero return code), we need to switch
>  	 * back to the correct page tables.
>  	 */
>  	ret = __cpu_suspend(arg, fn, __mpidr);
> +
> +	unpause_graph_tracing();
> +
>  	if (ret == 0) {
>  		cpu_switch_mm(mm->pgd, mm);
>  		local_flush_bp_all();
> @@ -45,7 +56,13 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
>  int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
>  {
>  	u32 __mpidr = cpu_logical_map(smp_processor_id());
> -	return __cpu_suspend(arg, fn, __mpidr);
> +	int ret;
> +
> +	pause_graph_tracing();
> +	ret = __cpu_suspend(arg, fn, __mpidr);
> +	unpause_graph_tracing();
> +
> +	return ret;
>  }
>  #define idmap_pgd NULL
>  #endif
> --
> 2.7.4
>

The ftrace function_graph tracer always causes a kernel panic on my
multi-CPU ARM device. I found a fix for the same problem on ARM64 and
based the patch above on it. I was wondering why this ARM64 bugfix has
never been applied upstream to ARM. Has anyone hit a similar problem, and
can you share any information?

Thanks.