On Thu, Feb 11, 2021 at 03:33:52PM +0000, Vincenzo Frascino wrote:
> When MTE async mode is enabled TFSR_EL1 contains the accumulative
> asynchronous tag check faults for EL1 and EL0.
> 
> During the suspend/resume operations the firmware might perform some
> operations that could change the state of the register resulting in
> a spurious tag check fault report.
> 
> Report asynchronous tag faults before suspend and clear the TFSR_EL1
> register after resume to prevent this to happen.
> 
> Cc: Catalin Marinas <catalin.mari...@arm.com>
> Cc: Will Deacon <w...@kernel.org>
> Cc: Lorenzo Pieralisi <lorenzo.pieral...@arm.com>
> Signed-off-by: Vincenzo Frascino <vincenzo.frasc...@arm.com>
> ---
>  arch/arm64/include/asm/mte.h |  4 ++++
>  arch/arm64/kernel/mte.c      | 20 ++++++++++++++++++++
>  arch/arm64/kernel/suspend.c  |  3 +++
>  3 files changed, 27 insertions(+)
> 
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index 43169b978cd3..33e88a470357 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -41,6 +41,7 @@ void mte_sync_tags(pte_t *ptep, pte_t pte);
>  void mte_copy_page_tags(void *kto, const void *kfrom);
>  void flush_mte_state(void);
>  void mte_thread_switch(struct task_struct *next);
> +void mte_suspend_enter(void);
>  void mte_suspend_exit(void);
>  long set_mte_ctrl(struct task_struct *task, unsigned long arg);
>  long get_mte_ctrl(struct task_struct *task);
> @@ -66,6 +67,9 @@ static inline void flush_mte_state(void)
>  static inline void mte_thread_switch(struct task_struct *next)
>  {
>  }
> +static inline void mte_suspend_enter(void)
> +{
> +}
>  static inline void mte_suspend_exit(void)
>  {
>  }
> diff --git a/arch/arm64/kernel/mte.c b/arch/arm64/kernel/mte.c
> index f5aa5bea6dfe..de905102245a 100644
> --- a/arch/arm64/kernel/mte.c
> +++ b/arch/arm64/kernel/mte.c
> @@ -258,12 +258,32 @@ void mte_thread_switch(struct task_struct *next)
>          mte_check_tfsr_el1();
>  }
>  
> +void mte_suspend_enter(void)
> +{
> +        if (!system_supports_mte())
> +                return;
> +
> +        /*
> +         * The barriers are required to guarantee that the indirect writes
> +         * to TFSR_EL1 are synchronized before we report the state.
> +         */
> +        dsb(nsh);
> +        isb();
> +
> +        /* Report SYS_TFSR_EL1 before suspend entry */
> +        mte_check_tfsr_el1();
> +}
> +
>  void mte_suspend_exit(void)
>  {
>          if (!system_supports_mte())
>                  return;
>  
>          update_gcr_el1_excl(gcr_kernel_excl);
> +
> +        /* Clear SYS_TFSR_EL1 after suspend exit */
> +        write_sysreg_s(0, SYS_TFSR_EL1);
AFAICS it is not needed, it is done already in __cpu_setup() (that is
called by cpu_resume on return from cpu_suspend() from firmware).

However, I have a question. We are relying on the context switch to set
SCTLR_EL1.TCF0, right? If that's the case, till the thread resuming from
low power switches context we are running with SCTLR_EL1.TCF0 not
reflecting the actual value.

Just making sure that I understand it correctly. I need to check the
resume from suspend-to-RAM path; it is something that came up with the
perf save/restore already in the past.

Lorenzo

> +
>  }
>  
>  long set_mte_ctrl(struct task_struct *task, unsigned long arg)
> diff --git a/arch/arm64/kernel/suspend.c b/arch/arm64/kernel/suspend.c
> index a67b37a7a47e..25a02926ad88 100644
> --- a/arch/arm64/kernel/suspend.c
> +++ b/arch/arm64/kernel/suspend.c
> @@ -91,6 +91,9 @@ int cpu_suspend(unsigned long arg, int (*fn)(unsigned long))
>          unsigned long flags;
>          struct sleep_stack_data state;
>  
> +        /* Report any MTE async fault before going to suspend */
> +        mte_suspend_enter();
> +
>          /*
>           * From this point debug exceptions are disabled to prevent
>           * updates to mdscr register (saved and restored along with
> -- 
> 2.30.0
> 
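
For context on the SCTLR_EL1.TCF0 point above: the field mirrors the
per-task tag check fault mode that userspace requests via
prctl(PR_SET_TAGGED_ADDR_CTRL, ...), and the kernel applies that setting
when the task is scheduled in. A minimal userspace sketch of enabling
asynchronous tag checking, following the kernel's MTE documentation
(constant values are those in include/uapi/linux/prctl.h and are only
defined here in case older headers lack them; error handling trimmed):

#include <stdio.h>
#include <sys/prctl.h>

/* Fallback definitions if the installed headers predate MTE support. */
#ifndef PR_SET_TAGGED_ADDR_CTRL
#define PR_SET_TAGGED_ADDR_CTRL 55
#define PR_TAGGED_ADDR_ENABLE   (1UL << 0)
#endif
#ifndef PR_MTE_TCF_SHIFT
#define PR_MTE_TCF_SHIFT        1
#define PR_MTE_TCF_ASYNC        (2UL << PR_MTE_TCF_SHIFT)
#define PR_MTE_TAG_SHIFT        3
#endif

int main(void)
{
        /*
         * Request asynchronous tag check faults (accumulated in the TFSR
         * registers) and include tags 1..15 in random tag generation.
         */
        if (prctl(PR_SET_TAGGED_ADDR_CTRL,
                  PR_TAGGED_ADDR_ENABLE | PR_MTE_TCF_ASYNC |
                  (0xfffe << PR_MTE_TAG_SHIFT),
                  0, 0, 0)) {
                perror("prctl(PR_SET_TAGGED_ADDR_CTRL)");
                return 1;
        }
        return 0;
}

This per-task mode is what SCTLR_EL1.TCF0 has to reflect while the task
runs, hence the question about the window between resume and the next
context switch.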