On Fri, Jan 15, 2021 at 12:00:43PM +0000, Vincenzo Frascino wrote:
> mte_assign_mem_tag_range() is called on production KASAN HW hot
> paths. It makes sense to optimize it in an attempt to reduce the
> overhead.
> 
> Optimize mte_assign_mem_tag_range() based on the indications provided at
> [1].

... what exactly is the optimization?

I /think/ you're just trying to have it inlined, but you should mention
that explicitly.

> 
> [1] https://lore.kernel.org/r/caaehk+wco+j7d1_t89dg+jjrplk3x9rsgfkxjgd0zcufjqt...@mail.gmail.com/
> 
> Cc: Catalin Marinas <catalin.mari...@arm.com>
> Cc: Will Deacon <w...@kernel.org>
> Signed-off-by: Vincenzo Frascino <vincenzo.frasc...@arm.com>
> ---
>  arch/arm64/include/asm/mte.h | 26 +++++++++++++++++++++++++-
>  arch/arm64/lib/mte.S         | 15 ---------------
>  2 files changed, 25 insertions(+), 16 deletions(-)
> 
> diff --git a/arch/arm64/include/asm/mte.h b/arch/arm64/include/asm/mte.h
> index 1a715963d909..9730f2b07b79 100644
> --- a/arch/arm64/include/asm/mte.h
> +++ b/arch/arm64/include/asm/mte.h
> @@ -49,7 +49,31 @@ long get_mte_ctrl(struct task_struct *task);
>  int mte_ptrace_copy_tags(struct task_struct *child, long request,
>                        unsigned long addr, unsigned long data);
>  
> -void mte_assign_mem_tag_range(void *addr, size_t size);
> +static inline void mte_assign_mem_tag_range(void *addr, size_t size)
> +{
> +     u64 _addr = (u64)addr;
> +     u64 _end = _addr + size;
> +
> +     /*
> +      * This function must be invoked from an MTE enabled context.
> +      *
> +      * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
> +      * size must be non-zero and MTE_GRANULE_SIZE aligned.
> +      */
> +     do {
> +             /*
> +              * 'asm volatile' is required to prevent the compiler to move
> +              * the statement outside of the loop.
> +              */
> +             asm volatile(__MTE_PREAMBLE "stg %0, [%0]"
> +                          :
> +                          : "r" (_addr)
> +                          : "memory");
> +
> +             _addr += MTE_GRANULE_SIZE;
> +     } while (_addr < _end);

Is there any chance that this can be used for the last bytes of the
virtual address space? If so, `_addr + size` wraps around to 0, so
`_addr < _end` is false after the first granule and the loop terminates
early. In that case the condition might need to change to
`_addr != _end`.
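
To illustrate the wraparound concern, here is a minimal userspace sketch.
It is not kernel code: GRANULE stands in for MTE_GRANULE_SIZE (assumed to
be 16 here), the helper names are made up for illustration, and a counter
replaces the stg instruction, so it only models how many granules each
loop condition would visit.

```c
#include <assert.h>
#include <stdint.h>

#define GRANULE 16u	/* stand-in for MTE_GRANULE_SIZE (assumption) */

/* Same loop shape as the patch: continue while addr < end. */
static unsigned int granules_lt(uint64_t addr, uint64_t size)
{
	uint64_t end = addr + size;	/* wraps to 0 at the top of the address space */
	unsigned int n = 0;

	do {
		n++;			/* where stg would tag one granule */
		addr += GRANULE;
	} while (addr < end);

	return n;
}

/* Alternative condition: continue while addr != end. */
static unsigned int granules_ne(uint64_t addr, uint64_t size)
{
	uint64_t end = addr + size;
	unsigned int n = 0;

	do {
		n++;
		addr += GRANULE;
	} while (addr != end);

	return n;
}

/* Hypothetical check helper: compares both conditions on two ranges. */
static void check_loop_termination(void)
{
	/* The last four granules of a 64-bit address space. */
	uint64_t last = UINT64_MAX - 4 * GRANULE + 1;

	/* An ordinary range: both conditions visit all four granules. */
	assert(granules_lt(0x1000, 4 * GRANULE) == 4);
	assert(granules_ne(0x1000, 4 * GRANULE) == 4);

	/* At the top of the address space, '<' stops after one granule. */
	assert(granules_lt(last, 4 * GRANULE) == 1);
	assert(granules_ne(last, 4 * GRANULE) == 4);
}
```

Note that the `!=` form relies on the documented precondition that addr
and size are both granule aligned and size is non-zero; with an unaligned
size it would step past `_end` and not terminate.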

> +}

What does the code generation look like for this, relative to the
assembly version?

Thanks,
Mark.

> +
>  
>  #else /* CONFIG_ARM64_MTE */
>  
> diff --git a/arch/arm64/lib/mte.S b/arch/arm64/lib/mte.S
> index 9e1a12e10053..a0a650451510 100644
> --- a/arch/arm64/lib/mte.S
> +++ b/arch/arm64/lib/mte.S
> @@ -150,18 +150,3 @@ SYM_FUNC_START(mte_restore_page_tags)
>       ret
>  SYM_FUNC_END(mte_restore_page_tags)
>  
> -/*
> - * Assign allocation tags for a region of memory based on the pointer tag
> - *   x0 - source pointer
> - *   x1 - size
> - *
> - * Note: The address must be non-NULL and MTE_GRANULE_SIZE aligned and
> - * size must be non-zero and MTE_GRANULE_SIZE aligned.
> - */
> -SYM_FUNC_START(mte_assign_mem_tag_range)
> -1:   stg     x0, [x0]
> -     add     x0, x0, #MTE_GRANULE_SIZE
> -     subs    x1, x1, #MTE_GRANULE_SIZE
> -     b.gt    1b
> -     ret
> -SYM_FUNC_END(mte_assign_mem_tag_range)
> -- 
> 2.30.0
> 
