On Fri, Jan 15, 2021 at 12:00:43PM +0000, Vincenzo Frascino wrote:
> mte_assign_mem_tag_range() is called on production KASAN HW hot
> paths. It makes sense to optimize it in an attempt to reduce the
> overhead.
> 
> Optimize mte_assign_mem_tag_range() based on the indications provided at
> [1].
... what exactly is the optimization? I /think/ you're just trying to
have it inlined, but you should mention that explicitly.

> 
> [1]
> https://lore.kernel.org/r/caaehk+wco+j7d1_t89dg+jjrplk3x9rsgfkxjgd0zcufjqt...@mail.gmail.com/
> 
> Cc: Catalin Marinas <catalin.mari...@arm.com>
> Cc: Will Deacon <w...@kernel.org>
> Signed-off-by: Vincenzo Frascino <vincenzo.frasc...@arm.com>
> ---
>  arch/arm64/include/asm/mte.h | 26 +++++++++++++++++++++++++-
>  arch/arm64/lib/mte.S         | 15 ---------------
>  2 files changed, 25 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index 1a715963d909..9730f2b07b79 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -49,7 +49,31 @@ long get_mte_ctrl(struct task_struct *task);
>  int mte_ptrace_copy_tags(struct task_struct *child, long request,
>  			 unsigned long addr, unsigned long data);
>  
> -void mte_assign_mem_tag_range(void *addr, size_t size);
> +static inline void mte_assign_mem_tag_range(void *addr, size_t size)
> +{
> +	u64 _addr = (u64)addr;
> +	u64 _end = _addr + size;
> +
> +	/*
> +	 * This function must be invoked from an MTE enabled context.
> +	 *
> +	 * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
> +	 * size must be non-zero and MTE_GRANULE_SIZE aligned.
> +	 */
> +	do {
> +		/*
> +		 * 'asm volatile' is required to prevent the compiler to move
> +		 * the statement outside of the loop.
> +		 */
> +		asm volatile(__MTE_PREAMBLE "stg %0, [%0]"
> +			     :
> +			     : "r" (_addr)
> +			     : "memory");
> +
> +		_addr += MTE_GRANULE_SIZE;
> +	} while (_addr < _end);

Is there any chance that this can be used for the last bytes of the
virtual address space? If that is possible, the termination check might
need to change to `_addr != _end`, since `_end` wraps to zero and the
loop will terminate early in that case (see the sketch after the quoted
patch below).

> +}

What does the code generation look like for this, relative to the
assembly version?

Thanks,
Mark.

> +
> 
>  #else /* CONFIG_ARM64_MTE */
>  
> diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
> index 9e1a12e10053..a0a650451510 100644
> --- a/arch/arm64/lib/mte.S
> +++ b/arch/arm64/lib/mte.S
> @@ -150,18 +150,3 @@ SYM_FUNC_START(mte_restore_page_tags)
>  	ret
>  SYM_FUNC_END(mte_restore_page_tags)
>  
> -/*
> - * Assign allocation tags for a region of memory based on the pointer tag
> - * x0 - source pointer
> - * x1 - size
> - *
> - * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
> - * size must be non-zero and MTE_GRANULE_SIZE aligned.
> - */
> -SYM_FUNC_START(mte_assign_mem_tag_range)
> -1:	stg	x0, [x0]
> -	add	x0, x0, #MTE_GRANULE_SIZE
> -	subs	x1, x1, #MTE_GRANULE_SIZE
> -	b.gt	1b
> -	ret
> -SYM_FUNC_END(mte_assign_mem_tag_range)
> -- 
> 2.30.0
> 
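To make the loop-termination concern above concrete, here is a minimal
user-space sketch (not kernel code): it only counts iterations in place
of the STG stores, assumes the arm64 MTE granule size of 16 bytes, and
uses made-up helper names that exist purely for illustration.

/*
 * Minimal user-space sketch of the termination concern above; it
 * counts loop iterations in place of the STG stores and assumes
 * MTE_GRANULE_SIZE == 16. Illustrative only, not kernel code.
 */
#include <stdint.h>
#include <stdio.h>

#define MTE_GRANULE_SIZE	16UL

/* Granules visited with the proposed '_addr < _end' check. */
static unsigned long granules_lt(uint64_t addr, uint64_t size)
{
	uint64_t _addr = addr;
	uint64_t _end = addr + size;	/* wraps to 0 at the top of the address space */
	unsigned long n = 0;

	do {
		n++;			/* stands in for the STG on _addr */
		_addr += MTE_GRANULE_SIZE;
	} while (_addr < _end);

	return n;
}

/* Same loop with an '_addr != _end' check. */
static unsigned long granules_ne(uint64_t addr, uint64_t size)
{
	uint64_t _addr = addr;
	uint64_t _end = addr + size;
	unsigned long n = 0;

	do {
		n++;
		_addr += MTE_GRANULE_SIZE;
	} while (_addr != _end);

	return n;
}

int main(void)
{
	/* A 64-byte, granule-aligned region ending at the top of the address space. */
	uint64_t addr = (uint64_t)-64;
	uint64_t size = 64;

	printf("'< _end':  %lu granule(s)\n", granules_lt(addr, size));	/* prints 1 */
	printf("'!= _end': %lu granule(s)\n", granules_ne(addr, size));	/* prints 4 */
	return 0;
}

For comparison, the removed assembly version counts the size down with
'subs x1, x1, #MTE_GRANULE_SIZE' and branches on the resulting flags, so
its termination does not depend on an address comparison at all.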