Hi Christophe,

On Fri, Oct 17, 2025 at 12:21:06PM +0200, Christophe Leroy wrote:
> Masked user access avoids the address/size verification by access_ok().
> Allthough its main purpose is to skip the speculation in the
> verification of user address and size hence avoid the need of spec
> mitigation, it also has the advantage of reducing the amount of
> instructions required so it even benefits to platforms that don't
> need speculation mitigation, especially when the size of the copy is
> not know at build time.
> 
> So implement masked user access on powerpc. The only requirement is
> to have memory gap that faults between the top user space and the
> real start of kernel area.
> 
> On 64 bits platforms the address space is divided that way:
> 
>       0xffffffffffffffff      +------------------+
>                               |                  |
>                               |   kernel space   |
>                               |                  |
>       0xc000000000000000      +------------------+  <== PAGE_OFFSET
>                               |//////////////////|
>                               |//////////////////|
>       0x8000000000000000      |//////////////////|
>                               |//////////////////|
>                               |//////////////////|
>       0x0010000000000000      +------------------+  <== TASK_SIZE_MAX
>                               |                  |
>                               |    user space    |
>                               |                  |
>       0x0000000000000000      +------------------+
> 
> Kernel is always above 0x8000000000000000 and user always
> below, with a gap in-between. It leads to a 3 instructions sequence:
> 
>   20: 7c 69 fe 76     sradi   r9,r3,63
>   24: 7c 69 48 78     andc    r9,r3,r9
>   28: 79 23 00 4c     rldimi  r3,r9,0,1
> 

Actually there is an even simpler (more obvious) sequence:

sradi r9,r3,63
srdi r9,r9,1
andc r3,r3,r9

(the second instruction could also be clrldi r9,r9,1)

which translates back to C as:

[snipped]
> +static inline void __user *mask_user_address_simple(const void __user *ptr)
> +{
> +     unsigned long addr = (unsigned long)ptr;
> +     unsigned long sh = BITS_PER_LONG - 1;
> +     unsigned long mask = (unsigned long)((long)addr >> sh);
> +
> +     addr = ((addr & ~mask) & ((1UL << sh) - 1)) | ((mask & 1UL) << sh);
> +
> +     return (void __user *)addr;
> +}
> +

either (srdi):
        unsigned long mask = ((unsigned long)((long)addr >> sh)) >> 1;
or (clrldi):
        unsigned long mask = (unsigned long)(((long)addr >> sh) & LONG_MAX);

followed by:
        return (void __user *)(addr & ~mask);

The result is the same, but I find it easier to read, and it may be
easier for the compiler to handle than recognizing an rl?imi instruction.
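
The equivalence can be checked with a small userspace sketch (not kernel
code). It assumes a 64-bit layout, emulated with uint64_t/int64_t, and an
arithmetic right shift on signed values, which is what GCC and Clang
provide. The names mask_orig, mask_srdi, mask_clrldi and self_check are
made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SH 63	/* stands in for BITS_PER_LONG - 1 on a 64-bit build */

/* Formula from the quoted patch (the one compiled to rldimi). */
static uint64_t mask_orig(uint64_t addr)
{
	/* Assumes arithmetic shift: all ones if bit 63 is set, else 0. */
	uint64_t mask = (uint64_t)((int64_t)addr >> SH);

	return ((addr & ~mask) & ((UINT64_C(1) << SH) - 1)) | ((mask & 1) << SH);
}

/* srdi variant: logical shift right by one clears the top mask bit. */
static uint64_t mask_srdi(uint64_t addr)
{
	uint64_t mask = (uint64_t)((int64_t)addr >> SH) >> 1;

	return addr & ~mask;
}

/* clrldi variant: clear the top mask bit with an AND instead. */
static uint64_t mask_clrldi(uint64_t addr)
{
	uint64_t mask = (uint64_t)(((int64_t)addr >> SH) & INT64_MAX);

	return addr & ~mask;
}

/* Compare the three variants on a few boundary addresses. */
static void self_check(void)
{
	static const uint64_t samples[] = {
		UINT64_C(0x0000000000000000),
		UINT64_C(0x0000000000001000),
		UINT64_C(0x000fffffffffffff),	/* just below TASK_SIZE_MAX */
		UINT64_C(0x0010000000000000),	/* TASK_SIZE_MAX */
		UINT64_C(0x7fffffffffffffff),
		UINT64_C(0x8000000000000000),
		UINT64_C(0xc000000000000000),	/* PAGE_OFFSET */
		UINT64_C(0xffffffffffffffff),
	};
	size_t i;

	for (i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
		uint64_t want = mask_orig(samples[i]);

		assert(mask_srdi(samples[i]) == want);
		assert(mask_clrldi(samples[i]) == want);
	}
}
```

Any address with the top bit clear comes back unchanged, and any address
with the top bit set collapses to 0x8000000000000000, which lies in the
gap shown in the diagram above.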

Cheers,
Gabriel

 

