[PATCH v6 14/25] powerpc: Provide do_ppc64_personality helper
Avoid duplication in a future patch that will define the ppc64_personality
syscall handler in terms of the SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE
macros, by extracting the common body of ppc64_personality into a helper
function.

Signed-off-by: Rohan McLure
Reviewed-by: Nicholas Piggin
---
V3: New commit.
V5: Remove 'inline'.
---
 arch/powerpc/kernel/syscalls.c | 6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index 9830957498b0..135a0b9108d5 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -75,7 +75,7 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
 }
 
 #ifdef CONFIG_PPC64
-long ppc64_personality(unsigned long personality)
+static long do_ppc64_personality(unsigned long personality)
 {
 	long ret;
 
@@ -87,6 +87,10 @@ long ppc64_personality(unsigned long personality)
 		ret = (ret & ~PER_MASK) | PER_LINUX;
 	return ret;
 }
+
+long ppc64_personality(unsigned long personality)
+{
+	return do_ppc64_personality(personality);
+}
 #endif
 
 long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low,
-- 
2.34.1
[PATCH v6 13/25] powerpc: Remove direct call to mmap2 syscall handlers
Syscall handlers should not be invoked internally by their symbol names,
as these symbols are defined by the architecture-defined SYSCALL_DEFINE
macro. Move the compatibility syscall definition for mmap2 to
syscalls.c, so that all mmap implementations can share a helper
function.

Remove 'inline' on the static mmap helper.

Signed-off-by: Rohan McLure
Reviewed-by: Nicholas Piggin
---
V2: Move mmap2 compat implementation to asm/kernel/syscalls.c.
V4: Move to be applied before syscall wrapper introduced.
V5: Remove 'inline' in helper.
---
 arch/powerpc/kernel/sys_ppc32.c |  9 ---------
 arch/powerpc/kernel/syscalls.c  | 17 ++++++++++++++---
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index d961634976d8..776ae7565fc5 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -25,7 +25,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -48,14 +47,6 @@
 #include
 #include
 
-unsigned long compat_sys_mmap2(unsigned long addr, size_t len,
-			       unsigned long prot, unsigned long flags,
-			       unsigned long fd, unsigned long pgoff)
-{
-	/* This should remain 12 even if PAGE_SIZE changes */
-	return sys_mmap(addr, len, prot, flags, fd, pgoff << 12);
-}
-
 compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count,
 				  u32 reg6, u32 pos1, u32 pos2)
 {
diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index a04c97faa21a..9830957498b0 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -36,9 +36,9 @@
 #include
 #include
 
-static inline long do_mmap2(unsigned long addr, size_t len,
-			unsigned long prot, unsigned long flags,
-			unsigned long fd, unsigned long off, int shift)
+static long do_mmap2(unsigned long addr, size_t len,
+		     unsigned long prot, unsigned long flags,
+		     unsigned long fd, unsigned long off, int shift)
 {
 	if (!arch_validate_prot(prot, addr))
 		return -EINVAL;
@@ -56,6 +56,17 @@ SYSCALL_DEFINE6(mmap2,
 		unsigned long, addr, size_t, len,
 	return do_mmap2(addr, len, prot, flags, fd, pgoff, PAGE_SHIFT-12);
 }
 
+#ifdef CONFIG_COMPAT
+COMPAT_SYSCALL_DEFINE6(mmap2,
+		       unsigned long, addr, size_t, len,
+		       unsigned long, prot, unsigned long, flags,
+		       unsigned long, fd, unsigned long, pgoff)
+{
+	/* This should remain 12 even if PAGE_SIZE changes */
+	return do_mmap2(addr, len, prot, flags, fd, pgoff << 12, PAGE_SHIFT-12);
+}
+#endif
+
 SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
 		unsigned long, prot, unsigned long, flags,
 		unsigned long, fd, off_t, offset)
-- 
2.34.1
[PATCH v6 12/25] powerpc: Remove direct call to personality syscall handler
Syscall handlers should not be invoked internally by their symbol names,
as these symbols are defined by the architecture-defined SYSCALL_DEFINE
macro. Fortunately, in the case of ppc64_personality, its call to
sys_personality can be replaced with an invocation of the equivalent
ksys_personality inline helper.

Signed-off-by: Rohan McLure
Reviewed-by: Nicholas Piggin
---
V2: Use inline helper to deduplicate bodies in compat/regular
implementations.
V4: Move to be applied before syscall wrapper.
---
 arch/powerpc/kernel/syscalls.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index 34e1ae88e15b..a04c97faa21a 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -71,7 +71,7 @@ long ppc64_personality(unsigned long personality)
 	if (personality(current->personality) == PER_LINUX32
 	    && personality(personality) == PER_LINUX)
 		personality = (personality & ~PER_MASK) | PER_LINUX32;
-	ret = sys_personality(personality);
+	ret = ksys_personality(personality);
 	if (personality(ret) == PER_LINUX32)
 		ret = (ret & ~PER_MASK) | PER_LINUX;
 	return ret;
-- 
2.34.1
[PATCH v6 11/25] powerpc/32: Remove powerpc select specialisation
Syscall #82 has been implemented for 32-bit platforms in a unique way on
powerpc systems. This hack in effect guesses whether the caller is
expecting new select semantics or old select semantics, based on the
value of the first parameter. In new select, this parameter represents
the length of a user-memory array of file descriptors, and in old select
it is a pointer to an arguments structure. The heuristic simply
interprets sufficiently large values of its first parameter as being a
call to old select.

The following is a discussion on how this syscall should be handled.

Link: https://lore.kernel.org/lkml/13737de5-0eb7-e881-9af0-163b0d29a...@csgroup.eu/

As discussed in this thread, the existence of such a hack suggests that
whatever powerpc binaries may predate glibc would most likely have made
use of the old select semantics. x86 and arm64 both implement this
syscall with old select semantics.

Remove the powerpc implementation, and update syscall.tbl to emit a
reference to sys_old_select and compat_sys_old_select for 32-bit
binaries, in keeping with how other architectures support syscall #82.

Signed-off-by: Rohan McLure
Reviewed-by: Nicholas Piggin
---
V2: Remove arch-specific select handler.
V3: Remove ppc_old_select prototype. Move to earlier in patch series.
V5: Use compat_sys_old_select on 64-bit systems.
---
 arch/powerpc/include/asm/syscalls.h           |  2 --
 arch/powerpc/kernel/syscalls.c                | 17 -----------------
 arch/powerpc/kernel/syscalls/syscall.tbl      |  2 +-
 .../arch/powerpc/entry/syscalls/syscall.tbl   |  2 +-
 4 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h
index 960b3871db72..20cbd29b1228 100644
--- a/arch/powerpc/include/asm/syscalls.h
+++ b/arch/powerpc/include/asm/syscalls.h
@@ -30,8 +30,6 @@ long sys_mmap2(unsigned long addr, size_t len,
 	       unsigned long fd, unsigned long pgoff);
 long ppc64_personality(unsigned long personality);
 long sys_rtas(struct rtas_args __user *uargs);
-int ppc_select(int n, fd_set __user *inp, fd_set __user *outp,
-	       fd_set __user *exp, struct __kernel_old_timeval __user *tvp);
 long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low,
 		      u32 len_high, u32 len_low);
 
diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index abc3fbb3c490..34e1ae88e15b 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -63,23 +63,6 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
 	return do_mmap2(addr, len, prot, flags, fd, offset, PAGE_SHIFT);
 }
 
-#ifdef CONFIG_PPC32
-/*
- * Due to some executables calling the wrong select we sometimes
- * get wrong args.  This determines how the args are being passed
- * (a single ptr to them all args passed) then calls
- * sys_select() with the appropriate args. -- Cort
- */
-int
-ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct __kernel_old_timeval __user *tvp)
-{
-	if ((unsigned long)n >= 4096)
-		return sys_old_select((void __user *)n);
-
-	return sys_select(n, inp, outp, exp, tvp);
-}
-#endif
-
 #ifdef CONFIG_PPC64
 long ppc64_personality(unsigned long personality)
 {
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 2600b4237292..64f27cbbdd2c 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -110,7 +110,7 @@
 79	common	settimeofday	sys_settimeofday	compat_sys_settimeofday
 80	common	getgroups	sys_getgroups
 81	common	setgroups	sys_setgroups
-82	32	select		ppc_select		sys_ni_syscall
+82	32	select		sys_old_select		compat_sys_old_select
 82	64	select		sys_ni_syscall
 82	spu	select		sys_ni_syscall
 83	common	symlink		sys_symlink
diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
index 2600b4237292..64f27cbbdd2c 100644
--- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
@@ -110,7 +110,7 @@
 79	common	settimeofday	sys_settimeofday	compat_sys_settimeofday
 80	common	getgroups	sys_getgroups
 81	common	setgroups	sys_setgroups
-82	32	select		ppc_select		sys_ni_syscall
+82	32	select		sys_old_select		compat_sys_old_select
 82	64	sel
[PATCH v6 10/25] powerpc: Use generic fallocate compatibility syscall
The powerpc fallocate compat syscall handler is identical to the generic
implementation provided by commit 59c10c52f573 ("riscv: compat: syscall:
Add compat_sys_call_table implementation"), and as such can be removed
in favour of the generic implementation.

A future patch series will replace more architecture-defined syscall
handlers with generic implementations, dependent on introducing generic
implementations that are compatible with powerpc's and arm's parameter
reorderings.

Reported-by: Arnd Bergmann
Signed-off-by: Rohan McLure
Reviewed-by: Arnd Bergmann
---
V2: Remove arch-specific fallocate handler.
V3: Remove generic fallocate prototype. Move to beginning of.
V5: Remove implementation as well which I somehow failed to do.
    Replace local BE compat_arg_u64 with generic.
---
 arch/powerpc/include/asm/syscalls.h | 2 --
 arch/powerpc/include/asm/unistd.h   | 1 +
 arch/powerpc/kernel/sys_ppc32.c     | 7 -------
 3 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h
index 16b668515d15..960b3871db72 100644
--- a/arch/powerpc/include/asm/syscalls.h
+++ b/arch/powerpc/include/asm/syscalls.h
@@ -51,8 +51,6 @@ compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u3
 int compat_sys_truncate64(const char __user *path, u32 reg4,
 			  unsigned long len1, unsigned long len2);
 
-long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, u32 len1, u32 len2);
-
 int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1,
 			   unsigned long len2);
 
diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index b1129b4ef57d..659a996c75aa 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -45,6 +45,7 @@
 #define __ARCH_WANT_SYS_UTIME
 #define __ARCH_WANT_SYS_NEWFSTATAT
 #define __ARCH_WANT_COMPAT_STAT
+#define __ARCH_WANT_COMPAT_FALLOCATE
 #define __ARCH_WANT_COMPAT_SYS_SENDFILE
 #endif
 #define __ARCH_WANT_SYS_FORK
diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index ba363328da2b..d961634976d8 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -79,13 +79,6 @@ int compat_sys_truncate64(const char __user * path, u32 reg4,
 	return ksys_truncate(path, merge_64(len1, len2));
 }
 
-long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2,
-			  u32 len1, u32 len2)
-{
-	return ksys_fallocate(fd, mode, merge_64(offset1, offset2),
-			      merge_64(len1, len2));
-}
-
 int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1,
 			   unsigned long len2)
 {
-- 
2.34.1
[PATCH v6 09/25] asm-generic: compat: Support BE for long long args in 32-bit ABIs
32-bit ABIs support passing 64-bit integers by registers via argument
translation. Commit 59c10c52f573 ("riscv: compat: syscall: Add
compat_sys_call_table implementation") implements the compat_arg_u64
macro for efficiently defining little-endian compatibility syscalls.
Architectures supporting big endianness may benefit from the reciprocal
argument translation, but are welcome also to implement their own.

Signed-off-by: Rohan McLure
Reviewed-by: Nicholas Piggin
Reviewed-by: Arnd Bergmann
---
V5: New patch.
---
 include/asm-generic/compat.h | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/compat.h b/include/asm-generic/compat.h
index d06308a2a7a8..aeb257ad3d1a 100644
--- a/include/asm-generic/compat.h
+++ b/include/asm-generic/compat.h
@@ -14,12 +14,17 @@
 #define COMPAT_OFF_T_MAX	0x7fffffff
 #endif
 
-#if !defined(compat_arg_u64) && !defined(CONFIG_CPU_BIG_ENDIAN)
+#ifndef compat_arg_u64
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define compat_arg_u64(name)		u32  name##_hi, u32  name##_lo
+#define compat_arg_u64_dual(name)	u32, name##_hi, u32, name##_lo
+#else
 #define compat_arg_u64(name)		u32  name##_lo, u32  name##_hi
 #define compat_arg_u64_dual(name)	u32, name##_lo, u32, name##_hi
+#endif
 #define compat_arg_u64_glue(name)	(((u64)name##_lo & 0xffffffffUL) | \
 					 ((u64)name##_hi << 32))
-#endif
+#endif /* compat_arg_u64 */
 
 /* These types are common across all compat ABIs */
 typedef u32 compat_size_t;
-- 
2.34.1
[PATCH v6 08/25] powerpc: Fix fallocate and fadvise64_64 compat parameter combination
As reported[1] by Arnd, the arch-specific fadvise64_64 and fallocate
compatibility handlers assume parameters are passed with the 32-bit
big-endian ABI. This affects the assignment of odd-even parameter pairs
to the high or low words of a 64-bit syscall parameter.

Fix the fadvise64_64 and fallocate compat handlers to correctly swap the
upper and lower 32 bits, conditioned on endianness.

A future patch will replace the arch-specific compat fallocate with an
asm-generic implementation. This patch is intended for ease of
back-port.

[1]: https://lore.kernel.org/all/be29926f-226e-48dc-871a-e29a54e80...@www.fastmail.com/

Fixes: 57f48b4b74e7 ("powerpc/compat_sys: swap hi/lo parts of 64-bit syscall args in LE mode")
Reported-by: Arnd Bergmann
Signed-off-by: Rohan McLure
Reviewed-by: Arnd Bergmann
Reviewed-by: Nicholas Piggin
---
V5: New patch.
---
 arch/powerpc/include/asm/syscalls.h | 12 ++++++++++++
 arch/powerpc/kernel/sys_ppc32.c     | 14 +-------------
 arch/powerpc/kernel/syscalls.c      |  4 ++--
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h
index 21c2faaa2957..16b668515d15 100644
--- a/arch/powerpc/include/asm/syscalls.h
+++ b/arch/powerpc/include/asm/syscalls.h
@@ -8,6 +8,18 @@
 #include
 #include
 
+/*
+ * long long munging:
+ * The 32 bit ABI passes long longs in an odd even register pair.
+ * High and low parts are swapped depending on endian mode,
+ * so define a macro (similar to mips linux32) to handle that.
+ */
+#ifdef __LITTLE_ENDIAN__
+#define merge_64(low, high) ((u64)high << 32) | low
+#else
+#define merge_64(high, low) ((u64)high << 32) | low
+#endif
+
 struct rtas_args;
 
 long sys_mmap(unsigned long addr, size_t len,
diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index f4edcc9489fb..ba363328da2b 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -56,18 +56,6 @@ unsigned long compat_sys_mmap2(unsigned long addr, size_t len,
 	return sys_mmap(addr, len, prot, flags, fd, pgoff << 12);
 }
 
-/*
- * long long munging:
- * The 32 bit ABI passes long longs in an odd even register pair.
- * High and low parts are swapped depending on endian mode,
- * so define a macro (similar to mips linux32) to handle that.
- */
-#ifdef __LITTLE_ENDIAN__
-#define merge_64(low, high) ((u64)high << 32) | low
-#else
-#define merge_64(high, low) ((u64)high << 32) | low
-#endif
-
 compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count,
 				  u32 reg6, u32 pos1, u32 pos2)
 {
@@ -94,7 +82,7 @@ int compat_sys_truncate64(const char __user * path, u32 reg4,
 long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2,
 			  u32 len1, u32 len2)
 {
-	return ksys_fallocate(fd, mode, ((loff_t)offset1 << 32) | offset2,
+	return ksys_fallocate(fd, mode, merge_64(offset1, offset2),
 			      merge_64(len1, len2));
 }
 
diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index fc999140bc27..abc3fbb3c490 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -98,8 +98,8 @@ long ppc64_personality(unsigned long personality)
 long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low,
 		      u32 len_high, u32 len_low)
 {
-	return ksys_fadvise64_64(fd, (u64)offset_high << 32 | offset_low,
-				 (u64)len_high << 32 | len_low, advice);
+	return ksys_fadvise64_64(fd, merge_64(offset_high, offset_low),
+				 merge_64(len_high, len_low), advice);
 }
 
 SYSCALL_DEFINE0(switch_endian)
-- 
2.34.1
[PATCH v6 07/25] powerpc/64s: Fix comment on interrupt handler prologue
Interrupt handlers on 64s systems will often need to save register state
from the interrupted process to make space for loading special purpose
registers or for internal state.

Fix a comment documenting a common code path macro in the beginning of
interrupt handlers, where r10 is saved to the PACA to afford space for
the value of the CFAR. The comment is currently written as if r10-r12
are saved to the PACA, but in fact only r10 is saved, with r11-r12 saved
much later. The distance in code between these saves has grown over the
many revisions of this macro. Fix this by signalling with a comment
where r11-r12 are saved to the PACA.

Signed-off-by: Rohan McLure
Reported-by: Nicholas Piggin
Reviewed-by: Nicholas Piggin
---
V2: Given its own commit.
V3: Annotate r11-r12 save locations with comment.
---
 arch/powerpc/kernel/exceptions-64s.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 3d0dc133a9ae..a3b51441b039 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -281,7 +281,7 @@ BEGIN_FTR_SECTION
 	mfspr	r9,SPRN_PPR
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	HMT_MEDIUM
-	std	r10,IAREA+EX_R10(r13)		/* save r10 - r12 */
+	std	r10,IAREA+EX_R10(r13)		/* save r10 */
 	.if ICFAR
 BEGIN_FTR_SECTION
 	mfspr	r10,SPRN_CFAR
@@ -321,7 +321,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	mfctr	r10
 	std	r10,IAREA+EX_CTR(r13)
 	mfcr	r9
-	std	r11,IAREA+EX_R11(r13)
+	std	r11,IAREA+EX_R11(r13)		/* save r11 - r12 */
 	std	r12,IAREA+EX_R12(r13)
 
 /*
-- 
2.34.1
[PATCH v6 06/25] powerpc/64e: Clarify register saves and clears with {SAVE,ZEROIZE}_GPRS
The common interrupt handler prologue macro and the bad_stack
trampolines include consecutive sequences of register saves, and some
register clears. Neaten such instances by expanding use of the SAVE_GPRS
macro and employing the ZEROIZE_GPR macro when appropriate. Also
simplify an invocation of SAVE_GPRS targeting all non-volatile registers
to SAVE_NVGPRS.

Signed-off-by: Rohan McLure
Reported-by: Nicholas Piggin
Reviewed-by: Nicholas Piggin
---
V4: New commit.
V6: Split REST_GPRS(0, 1, r1) to remove dependence on macro
    implementation ordering the r0 restore before the r1 restore.
---
 arch/powerpc/kernel/exceptions-64e.S | 28 ++++++++++------------------
 1 file changed, 12 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 67dc4e3179a0..f1d714acc945 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -216,17 +216,15 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
 	mtlr	r10
 	mtcr	r11
 
-	ld	r10,GPR10(r1)
-	ld	r11,GPR11(r1)
-	ld	r12,GPR12(r1)
+	REST_GPRS(10, 12, r1)
 	mtspr	\scratch,r0
 
 	std	r10,\paca_ex+EX_R10(r13);
 	std	r11,\paca_ex+EX_R11(r13);
 	ld	r10,_NIP(r1)
 	ld	r11,_MSR(r1)
-	ld	r0,GPR0(r1)
-	ld	r1,GPR1(r1)
+	REST_GPR(0, r1)
+	REST_GPR(1, r1)
 	mtspr	\srr0,r10
 	mtspr	\srr1,r11
 	ld	r10,\paca_ex+EX_R10(r13)
@@ -372,16 +370,15 @@ ret_from_mc_except:
 /* Core exception code for all exceptions except TLB misses. */
 #define EXCEPTION_COMMON_LVL(n, scratch, excf)				    \
 exc_##n##_common:							    \
-	std	r0,GPR0(r1);		/* save r0 in stackframe */	    \
-	std	r2,GPR2(r1);		/* save r2 in stackframe */	    \
-	SAVE_GPRS(3, 9, r1);		/* save r3 - r9 in stackframe */    \
+	SAVE_GPR(0, r1);		/* save r0 in stackframe */	    \
+	SAVE_GPRS(2, 9, r1);		/* save r2 - r9 in stackframe */    \
 	std	r10,_NIP(r1);		/* save SRR0 to stackframe */	    \
 	std	r11,_MSR(r1);		/* save SRR1 to stackframe */	    \
 	beq	2f;			/* if from kernel mode */	    \
2:	ld	r3,excf+EX_R10(r13);	/* get back r10 */		    \
 	ld	r4,excf+EX_R11(r13);	/* get back r11 */		    \
 	mfspr	r5,scratch;		/* get back r13 */		    \
-	std	r12,GPR12(r1);		/* save r12 in stackframe */	    \
+	SAVE_GPR(12, r1);		/* save r12 in stackframe */	    \
 	ld	r2,PACATOC(r13);	/* get kernel TOC into r2 */	    \
 	mflr	r6;			/* save LR in stackframe */	    \
 	mfctr	r7;			/* save CTR in stackframe */	    \
@@ -390,7 +387,7 @@ exc_##n##_common:					\
 	lwz	r10,excf+EX_CR(r13);	/* load orig CR back from PACA */   \
 	lbz	r11,PACAIRQSOFTMASK(r13); /* get current IRQ softe */	    \
 	ld	r12,exception_marker@toc(r2);				    \
-	li	r0,0;							    \
+	ZEROIZE_GPR(0);							    \
 	std	r3,GPR10(r1);		/* save r10 to stackframe */	    \
 	std	r4,GPR11(r1);		/* save r11 to stackframe */	    \
 	std	r5,GPR13(r1);		/* save it to stackframe */	    \
@@ -1056,15 +1053,14 @@ bad_stack_book3e:
 	mfspr	r11,SPRN_ESR
 	std	r10,_DEAR(r1)
 	std	r11,_ESR(r1)
-	std	r0,GPR0(r1);		/* save r0 in stackframe */	    \
-	std	r2,GPR2(r1);		/* save r2 in stackframe */	    \
-	SAVE_GPRS(3, 9, r1);		/* save r3 - r9 in stackframe */    \
+	SAVE_GPR(0, r1);		/* save r0 in stackframe */	    \
+	SAVE_GPRS(2, 9, r1);		/* save r2 - r9 in stackframe */    \
 	ld	r3,PACA_EXGEN+EX_R10(r13);/* get back r10 */		    \
 	ld	r4,PACA_EXGEN+EX_R11(r13);/* get back r11 */		    \
 	mfspr	r5,SPRN_SPRG_GEN_SCRATCH;/* get back r13 XXX can be wrong */ \
 	std	r3,GPR10(r1);		/* save r10 to stackframe */	    \
 	std	r4,GPR11(r1);		/* save r11 to stackframe */	    \
-	std	r12,GPR12(r1);		/* save r12 in stackframe */	    \
+	SAVE_GPR(12, r1);		/* save r12 in stackframe */	    \
 	std	r5,GPR13(r1);		/* save it to stackframe */	    \
 	mflr	r10
 	mfctr	r11
@@ -1072,12 +1068,12 @@ bad_stack_book3e:
 	std	r10
[PATCH v6 05/25] powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.S
Restoring the register state of the interrupted thread involves issuing
a large number of predictable loads to the kernel stack frame. Issue the
REST_GPR{,S} macros to clearly signal when this is happening, and bunch
together restores at the end of the interrupt handler where the saved
value is not consumed earlier in the handler code.

Signed-off-by: Rohan McLure
Reported-by: Christophe Leroy
Reviewed-by: Nicholas Piggin
---
V3: New patch.
V4: Minimise restores in the unrecoverable window between restoring
    SRR0/1 and return from interrupt.
---
 arch/powerpc/kernel/entry_32.S | 33 +++++++++++++--------------------
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 44dfce9a60c5..e4b694cebc44 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -68,7 +68,7 @@ prepare_transfer_to_handler:
 	lwz	r9,_MSR(r11)		/* if sleeping, clear MSR.EE */
 	rlwinm	r9,r9,0,~MSR_EE
 	lwz	r12,_LINK(r11)		/* and return to address in LR */
-	lwz	r2, GPR2(r11)
+	REST_GPR(2, r11)
 	b	fast_exception_return
 _ASM_NOKPROBE_SYMBOL(prepare_transfer_to_handler)
 #endif /* CONFIG_PPC_BOOK3S_32 || CONFIG_E500 */
@@ -144,7 +144,7 @@ ret_from_syscall:
 	lwz	r7,_NIP(r1)
 	lwz	r8,_MSR(r1)
 	cmpwi	r3,0
-	lwz	r3,GPR3(r1)
+	REST_GPR(3, r1)
 syscall_exit_finish:
 	mtspr	SPRN_SRR0,r7
 	mtspr	SPRN_SRR1,r8
@@ -152,8 +152,8 @@ syscall_exit_finish:
 	bne	3f
 	mtcr	r5
 
-1:	lwz	r2,GPR2(r1)
-	lwz	r1,GPR1(r1)
+1:	REST_GPR(2, r1)
+	REST_GPR(1, r1)
 	rfi
 #ifdef CONFIG_40x
 	b .	/* Prevent prefetch past rfi */
@@ -165,10 +165,8 @@ syscall_exit_finish:
 	REST_NVGPRS(r1)
 	mtctr	r4
 	mtxer	r5
-	lwz	r0,GPR0(r1)
-	lwz	r3,GPR3(r1)
-	REST_GPRS(4, 11, r1)
-	lwz	r12,GPR12(r1)
+	REST_GPR(0, r1)
+	REST_GPRS(3, 12, r1)
 	b	1b
 
 #ifdef CONFIG_44x
@@ -260,9 +258,8 @@ fast_exception_return:
 	beq	3f			/* if not, we've got problems */
 #endif
 
-2:	REST_GPRS(3, 6, r11)
-	lwz	r10,_CCR(r11)
-	REST_GPRS(1, 2, r11)
+2:	lwz	r10,_CCR(r11)
+	REST_GPRS(1, 6, r11)
 	mtcr	r10
 	lwz	r10,_LINK(r11)
 	mtlr	r10
@@ -277,7 +274,7 @@ fast_exception_return:
 	mtspr	SPRN_SRR0,r12
 	REST_GPR(9, r11)
 	REST_GPR(12, r11)
-	lwz	r11,GPR11(r11)
+	REST_GPR(11, r11)
 	rfi
#ifdef CONFIG_40x
 	b .	/* Prevent prefetch past rfi */
@@ -454,9 +451,8 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return)
 	lwz	r3,_MSR(r1);						\
 	andi.	r3,r3,MSR_PR;						\
 	bne	interrupt_return;					\
-	lwz	r0,GPR0(r1);						\
-	lwz	r2,GPR2(r1);						\
-	REST_GPRS(3, 8, r1);						\
+	REST_GPR(0, r1);						\
+	REST_GPRS(2, 8, r1);						\
 	lwz	r10,_XER(r1);						\
 	lwz	r11,_CTR(r1);						\
 	mtspr	SPRN_XER,r10;						\
@@ -475,11 +471,8 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return)
 	lwz	r12,_MSR(r1);						\
 	mtspr	exc_lvl_srr0,r11;					\
 	mtspr	exc_lvl_srr1,r12;					\
-	lwz	r9,GPR9(r1);						\
-	lwz	r12,GPR12(r1);						\
-	lwz	r10,GPR10(r1);						\
-	lwz	r11,GPR11(r1);						\
-	lwz	r1,GPR1(r1);						\
+	REST_GPRS(9, 12, r1);						\
+	REST_GPR(1, r1);						\
 	exc_lvl_rfi;							\
 	b	.;		/* prevent prefetch past exc_lvl_rfi */
-- 
2.34.1
[PATCH v6 04/25] powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlers
Use the convenience macros for saving/clearing/restoring gprs in keeping
with syscall calling conventions. The plural variants of these macros
can store a range of registers for concision.

This works well when the user gpr value we are hoping to save is still
live. In the syscall interrupt handlers, user register state is
sometimes juggled between registers. Hold off from issuing the SAVE_GPR
macro for applicable neighbouring lines to highlight the delicate
register save logic.

Signed-off-by: Rohan McLure
Reviewed-by: Nicholas Piggin
---
V2: Update summary.
V3: Update summary regarding exclusions for the SAVE_GPR macro.
    Acknowledge new name for ZEROIZE_GPR{,S} macros.
V5: Move to beginning of series.
---
 arch/powerpc/kernel/interrupt_64.S | 43 ++++++------------------------
 1 file changed, 9 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
index 71d2d9497283..7d92a7a54727 100644
--- a/arch/powerpc/kernel/interrupt_64.S
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -71,12 +71,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name)
 	mfcr	r12
 	li	r11,0
 	/* Can we avoid saving r3-r8 in common case? */
-	std	r3,GPR3(r1)
-	std	r4,GPR4(r1)
-	std	r5,GPR5(r1)
-	std	r6,GPR6(r1)
-	std	r7,GPR7(r1)
-	std	r8,GPR8(r1)
+	SAVE_GPRS(3, 8, r1)
 	/* Zero r9-r12, this should only be required when restoring all GPRs */
 	std	r11,GPR9(r1)
 	std	r11,GPR10(r1)
@@ -149,17 +144,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	/* Could zero these as per ABI, but we may consider a stricter ABI
 	 * which preserves these if libc implementations can benefit, so
	 * restore them for now until further measurement is done. */
-	ld	r0,GPR0(r1)
-	ld	r4,GPR4(r1)
-	ld	r5,GPR5(r1)
-	ld	r6,GPR6(r1)
-	ld	r7,GPR7(r1)
-	ld	r8,GPR8(r1)
+	REST_GPR(0, r1)
+	REST_GPRS(4, 8, r1)
 	/* Zero volatile regs that may contain sensitive kernel data */
-	li	r9,0
-	li	r10,0
-	li	r11,0
-	li	r12,0
+	ZEROIZE_GPRS(9, 12)
 	mtspr	SPRN_XER,r0
 
 	/*
@@ -182,7 +170,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	ld	r5,_XER(r1)
 
 	REST_NVGPRS(r1)
-	ld	r0,GPR0(r1)
+	REST_GPR(0, r1)
 	mtcr	r2
 	mtctr	r3
 	mtlr	r4
@@ -250,12 +238,7 @@ END_BTB_FLUSH_SECTION
 	mfcr	r12
 	li	r11,0
 	/* Can we avoid saving r3-r8 in common case? */
-	std	r3,GPR3(r1)
-	std	r4,GPR4(r1)
-	std	r5,GPR5(r1)
-	std	r6,GPR6(r1)
-	std	r7,GPR7(r1)
-	std	r8,GPR8(r1)
+	SAVE_GPRS(3, 8, r1)
 	/* Zero r9-r12, this should only be required when restoring all GPRs */
 	std	r11,GPR9(r1)
 	std	r11,GPR10(r1)
@@ -345,16 +328,8 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)
 	cmpdi	r3,0
 	bne	.Lsyscall_restore_regs
 	/* Zero volatile regs that may contain sensitive kernel data */
-	li	r0,0
-	li	r4,0
-	li	r5,0
-	li	r6,0
-	li	r7,0
-	li	r8,0
-	li	r9,0
-	li	r10,0
-	li	r11,0
-	li	r12,0
+	ZEROIZE_GPR(0)
+	ZEROIZE_GPRS(4, 12)
 	mtctr	r0
 	mtspr	SPRN_XER,r0
 .Lsyscall_restore_regs_cont:
@@ -380,7 +355,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	REST_NVGPRS(r1)
 	mtctr	r3
 	mtspr	SPRN_XER,r4
-	ld	r0,GPR0(r1)
+	REST_GPR(0, r1)
 	REST_GPRS(4, 12, r1)
 	b	.Lsyscall_restore_regs_cont
 .Lsyscall_rst_end:
-- 
2.34.1
[PATCH v6 03/25] powerpc: Add ZEROIZE_GPRS macros for register clears
Provide register zeroing macros, following the same convention as
existing register stack save/restore macros, to be used in a later
change to concisely zero a sequence of consecutive gprs.

The resulting macros are called ZEROIZE_GPRS and ZEROIZE_NVGPRS, keeping
with the naming of the accompanying restore and save macros, and the
usage of zeroize to describe this operation elsewhere in the kernel.

Signed-off-by: Rohan McLure
Reviewed-by: Nicholas Piggin
---
V2: Change 'ZERO' usage in naming to 'NULLIFY', a more obvious verb.
V3: Change 'NULLIFY' usage in naming to 'ZEROIZE', which has precedent
    in the kernel and explicitly specifies that we are zeroing.
V4: Update commit message to use zeroize.
V5: The reason for the patch is to add zeroize macros. Move that to the
    first paragraph of the patch description.
---
 arch/powerpc/include/asm/ppc_asm.h | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 83c02f5a7f2a..b95689ada59c 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -33,6 +33,20 @@
 	.endr
 .endm
 
+/*
+ * This expands to a sequence of register clears for regs start to
+ * end inclusive, of the form:
+ *
+ *   li rN, 0
+ */
+.macro ZEROIZE_REGS start, end
+	.Lreg=\start
+	.rept (\end - \start + 1)
+	li	.Lreg, 0
+	.Lreg=.Lreg+1
+	.endr
+.endm
+
 /*
  * Macros for storing registers into and loading registers from
  * exception frames.
@@ -49,6 +63,14 @@
 #define	REST_NVGPRS(base)		REST_GPRS(13, 31, base)
 #endif
 
+#define	ZEROIZE_GPRS(start, end)	ZEROIZE_REGS start, end
+#ifdef __powerpc64__
+#define	ZEROIZE_NVGPRS()		ZEROIZE_GPRS(14, 31)
+#else
+#define	ZEROIZE_NVGPRS()		ZEROIZE_GPRS(13, 31)
+#endif
+#define	ZEROIZE_GPR(n)			ZEROIZE_GPRS(n, n)
+
 #define SAVE_GPR(n, base)		SAVE_GPRS(n, n, base)
 #define REST_GPR(n, base)		REST_GPRS(n, n, base)
 
-- 
2.34.1
[PATCH v6 02/25] powerpc: Save caller r3 prior to system_call_exception
This reverts commit 8875f47b7681 ("powerpc/syscall: Save r3 in regs->orig_r3"). Save caller's original r3 state to the kernel stackframe before entering system_call_exception. This allows for user registers to be cleared by the time system_call_exception is entered, reducing the influence of user registers on speculation within the kernel. Prior to this commit, orig_r3 was saved at the beginning of system_call_exception. Instead, save orig_r3 while the user value is still live in r3. Also replicate this early save in 32-bit. A similar save was removed in commit 6f76a01173cc ("powerpc/syscall: implement system call entry/exit logic in C for PPC32") when 32-bit adopted system_call_exception. Revert its removal of orig_r3 saves. Signed-off-by: Rohan McLure Reviewed-by: Nicholas Piggin --- V3: New commit. V5: New commit message, as we do more than just revert 8875f47b7681. --- arch/powerpc/kernel/entry_32.S | 1 + arch/powerpc/kernel/interrupt_64.S | 2 ++ arch/powerpc/kernel/syscall.c | 1 - 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index 1d599df6f169..44dfce9a60c5 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -101,6 +101,7 @@ __kuep_unlock: .globl transfer_to_syscall transfer_to_syscall: + stw r3, ORIG_GPR3(r1) stw r11, GPR1(r1) stw r11, 0(r1) mflr r12 diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S index ce25b28cf418..71d2d9497283 100644 --- a/arch/powerpc/kernel/interrupt_64.S +++ b/arch/powerpc/kernel/interrupt_64.S @@ -91,6 +91,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name) li r11,\trapnr std r11,_TRAP(r1) std r12,_CCR(r1) + std r3,ORIG_GPR3(r1) addi r10,r1,STACK_FRAME_OVERHEAD ld r11,exception_marker@toc(r2) std r11,-16(r10) /* "regshere" marker */ @@ -275,6 +276,7 @@ END_BTB_FLUSH_SECTION std r10,_LINK(r1) std r11,_TRAP(r1) std r12,_CCR(r1) + std r3,ORIG_GPR3(r1) addi r10,r1,STACK_FRAME_OVERHEAD ld
r11,exception_marker@toc(r2) std r11,-16(r10) /* "regshere" marker */ diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c index 81ace9e8b72b..64102a64fd84 100644 --- a/arch/powerpc/kernel/syscall.c +++ b/arch/powerpc/kernel/syscall.c @@ -25,7 +25,6 @@ notrace long system_call_exception(long r3, long r4, long r5, kuap_lock(); add_random_kstack_offset(); - regs->orig_gpr3 = r3; if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG)) BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED); -- 2.34.1
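The reordering this patch makes can be modelled in plain C. This is an illustrative sketch only (the names below are hypothetical stand-ins; the real save is the `std r3,ORIG_GPR3(r1)` instruction shown in the diff, and the real clearing covers more registers):

```c
#include <assert.h>

/* Toy register file: only the fields this sketch needs. */
struct regs_sketch {
	unsigned long gpr3;      /* live user r3 at syscall entry */
	unsigned long orig_gpr3; /* stack slot the entry code fills */
};

/* Model of the reordering: the entry stub saves the live user r3 into
 * orig_gpr3 *before* user registers are cleared, so
 * system_call_exception no longer needs to (and could no longer
 * correctly) do the save itself. */
static void entry_save_then_clear(struct regs_sketch *regs)
{
	regs->orig_gpr3 = regs->gpr3; /* models std r3,ORIG_GPR3(r1) */
	regs->gpr3 = 0;               /* models the register clearing */
}
```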
[PATCH v6 01/25] powerpc: Remove asmlinkage from syscall handler definitions
The asmlinkage macro has no special meaning in powerpc, and prior to this patch is used sporadically on some syscall handler definitions. On architectures that do not define asmlinkage, it resolves to extern "C" for C++ compilers and a nop otherwise. The current invocations of asmlinkage provide far from complete support for C++ toolchains, and so the macro serves no purpose in powerpc. Remove all invocations of asmlinkage in arch/powerpc. These incidentally only occur in syscall definitions and prototypes. Signed-off-by: Rohan McLure Reviewed-by: Nicholas Piggin Reviewed-by: Andrew Donnellan --- V3: new patch --- arch/powerpc/include/asm/syscalls.h | 16 arch/powerpc/kernel/sys_ppc32.c | 8 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index a2b13e55254f..21c2faaa2957 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -10,14 +10,14 @@ struct rtas_args; -asmlinkage long sys_mmap(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, off_t offset); -asmlinkage long sys_mmap2(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, unsigned long pgoff); -asmlinkage long ppc64_personality(unsigned long personality); -asmlinkage long sys_rtas(struct rtas_args __user *uargs); +long sys_mmap(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, off_t offset); +long sys_mmap2(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, unsigned long pgoff); +long ppc64_personality(unsigned long personality); +long sys_rtas(struct rtas_args __user *uargs); int ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct __kernel_old_timeval __user *tvp); long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, diff --git 
a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c index 16ff0399a257..f4edcc9489fb 100644 --- a/arch/powerpc/kernel/sys_ppc32.c +++ b/arch/powerpc/kernel/sys_ppc32.c @@ -85,20 +85,20 @@ compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u3 return ksys_readahead(fd, merge_64(offset1, offset2), count); } -asmlinkage int compat_sys_truncate64(const char __user * path, u32 reg4, +int compat_sys_truncate64(const char __user * path, u32 reg4, unsigned long len1, unsigned long len2) { return ksys_truncate(path, merge_64(len1, len2)); } -asmlinkage long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, +long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, u32 len1, u32 len2) { return ksys_fallocate(fd, mode, ((loff_t)offset1 << 32) | offset2, merge_64(len1, len2)); } -asmlinkage int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, +int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, unsigned long len2) { return ksys_ftruncate(fd, merge_64(len1, len2)); @@ -111,7 +111,7 @@ long ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, advice); } -asmlinkage long compat_sys_sync_file_range2(int fd, unsigned int flags, +long compat_sys_sync_file_range2(int fd, unsigned int flags, unsigned offset1, unsigned offset2, unsigned nbytes1, unsigned nbytes2) { -- 2.34.1
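For context, the generic fallback for asmlinkage behaves roughly as below: meaningful only to a C++ compiler, a no-op for C, which is why powerpc can drop it outright. This is a simplified model of the fallback described in the commit message, not the exact kernel definition:

```c
#include <assert.h>

/* Simplified model of the generic asmlinkage fallback: under C++ it
 * forces C linkage; under plain C it expands to nothing, so removing
 * it from powerpc's prototypes changes nothing for the supported
 * toolchains. */
#ifdef __cplusplus
#define asmlinkage extern "C"
#else
#define asmlinkage
#endif

/* Hypothetical handler: with the macro removed (or expanded to
 * nothing), this is just an ordinary C function. */
asmlinkage long sys_example(long arg)
{
	return arg * 2;
}
```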
[PATCH v6 00/25] powerpc: Syscall wrapper and register clearing
V5 available here: Link: https://lore.kernel.org/all/20220916053300.786330-2-rmcl...@linux.ibm.com/T/ Implement a syscall wrapper, causing arguments to handlers to be passed via a struct pt_regs on the stack. The syscall wrapper is implemented for all platforms other than the Cell processor, from which SPUs expect the ability to directly call syscall handler symbols with the regular in-register calling convention. Adopting syscall wrappers requires redefinition of architecture-specific syscalls and compatibility syscalls to use the SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE macros, as well as removal of direct references to the emitted syscall-handler symbols from within the kernel. This work led to the following modernisations of powerpc's syscall handlers: - Replace syscall 82 semantics with sys_old_select and remove ppc_select handler, which features direct calls to both sys_old_select and sys_select. - Use a generic fallocate compatibility syscall Replace asm implementation of syscall table with C implementation for more compile-time checks. Many compatibility syscalls are candidates to be removed in favour of generically defined handlers, but exhibit different parameter orderings and numberings due to 32-bit ABI support for 64-bit parameters. The parameter reorderings are however consistent with arm. A future patch series will serve to modernise syscalls by providing generic implementations featuring these reorderings. The design of this syscall wrapper is very similar to the s390, x86 and arm64 implementations. See also Commit 4378a7d4be30 (arm64: implement syscall wrappers). The motivation for this change is that it allows for the clearing of register state when entering the kernel via interrupt handlers on 64-bit servers. This serves to reduce the influence of values in registers carried over from the interrupted process, e.g. syscall parameters from user space, or user state at the site of a pagefault.
All values in registers are saved and zeroized at the entry to an interrupt handler and restored afterward. While this may sound like a heavy-weight mitigation, many gprs are already saved and restored on handling of an interrupt, and the mmap_bench benchmark on a Power 9 guest, which repeatedly invokes the pagefault handler, suggests at most ~0.8% regression in performance. Realistic workloads are not constantly producing interrupts, and so this does not indicate realistic slowdown. Using wrapped syscalls yields a performance improvement of ~5.6% on the null_syscall benchmark on pseries guests, by removing the need for system_call_exception to allocate its own stack frame. This amortises the additional costs of saving and restoring non-volatile registers (register clearing is cheap on superscalar platforms), and so the final mitigation actually yields a net performance improvement of ~0.6% on the null_syscall benchmark. The clearing of general purpose registers on interrupts other than syscalls is enabled by default only on Book3E 64-bit systems (where the mitigation is inexpensive), but available to other 64-bit systems via the INTERRUPT_SANITIZE_REGISTERS Kconfig option. This mitigation is optional, as the speculation influence of interrupts is likely less than that of syscalls. Patch Changelog: - Clears and restores of NVGPRS that depend on either SANITIZE or !SANITIZE have convenience macros. - Split syscall wrapper patch with change to system_call_exception calling convention. - In 64e, exchange misleading REST_GPRS(0, 1, r1) to clearly restore r0 first and avoid clobbering the stack pointer. - Remove extraneous clears of the high-order words of syscall parameters, which are now redundant thanks to use of COMPAT_SYSCALL_DEFINE everywhere.
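The wrapper scheme the cover letter describes can be sketched in plain C: the emitted entry symbol receives a pt_regs pointer and unpacks the arguments itself, so handlers never take live user registers as parameters and those registers can safely be zeroized at entry. All names below are hypothetical stand-ins for illustration, not the kernel's actual macros or types:

```c
#include <assert.h>

/* Toy pt_regs: only the argument GPRs matter for this sketch.
 * On powerpc, syscall arguments arrive in r3-r8. */
struct pt_regs_sketch {
	unsigned long gpr[9]; /* gpr[3]..gpr[8] carry syscall args */
};

/* The handler body, as a SYSCALL_DEFINE-style macro would generate it
 * (placeholder logic, hypothetical name). */
static long do_example_syscall(unsigned long addr, unsigned long len)
{
	return (long)(addr + len);
}

/* The wrapper symbol: takes regs and unpacks the arguments. Because
 * the handler never sees live user registers, the entry code is free
 * to zeroize them before calling in. */
static long sys_example_wrapper(const struct pt_regs_sketch *regs)
{
	return do_example_syscall(regs->gpr[3], regs->gpr[4]);
}
```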
Rohan McLure (25): powerpc: Remove asmlinkage from syscall handler definitions powerpc: Save caller r3 prior to system_call_exception powerpc: Add ZEROIZE_GPRS macros for register clears powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlers powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.S powerpc/64e: Clarify register saves and clears with {SAVE,ZEROIZE}_GPRS powerpc/64s: Fix comment on interrupt handler prologue powerpc: Fix fallocate and fadvise64_64 compat parameter combination asm-generic: compat: Support BE for long long args in 32-bit ABIs powerpc: Use generic fallocate compatibility syscall powerpc/32: Remove powerpc select specialisation powerpc: Remove direct call to personality syscall handler powerpc: Remove direct call to mmap2 syscall handlers powerpc: Provide do_ppc64_personality helper powerpc: Adopt SYSCALL_DEFINE for arch-specific syscall handlers powerpc: Include all arch-specific syscall prototypes powerpc: Enable compile-time check for syscall handlers powerpc: Use common syscall handler type powerpc: Remove high-order word clearing on compat syscall entry powerpc: Change system_call_exception calling conve
Re: [PATCH 20/23] powerpc/64s: Clear/restore caller gprs in syscall interrupt/return
> On 20 Sep 2022, at 2:54 pm, Rohan McLure wrote: > >> On 20 Sep 2022, at 12:03 pm, Nicholas Piggin wrote: >> >> On Fri Sep 16, 2022 at 3:32 PM AEST, Rohan McLure wrote: >>> Clear user state in gprs (assign to zero) to reduce the influence of user >>> registers on speculation within kernel syscall handlers. Clears occur >>> at the very beginning of the sc and scv 0 interrupt handlers, with >>> restores occurring following the execution of the syscall handler. >>> >>> Signed-off-by: Rohan McLure >>> --- >>> V1 -> V2: Update summary >>> V2 -> V3: Remove erroneous summary paragraph on syscall_exit_prepare >>> V3 -> V4: Use ZEROIZE instead of NULLIFY. Clear r0 also. >>> V4 -> V5: Move to end of patch series. >>> --- >> >> I think it looks okay. I'll have to take a better look with the series >> applied. > > > Your comments alerted me to the fact that general interrupt and syscalls > share their exit code in interrupt_return and its derivatives. Meaning > that disabling INTERRUPT_SANITIZE_REGISTERS also reverts restores of NVGPRS > to being optional, which makes it possible to clobber NVGPRS and then not > restore them. The cleanest way forward I believe is going to be to cause > INTERRUPT_SANITIZE_REGISTERS to perform sanitisation on all interrupt sources > rather than continuing with syscalls as their own special case. I’ll put > this out in a v6 soon. I think I managed to confuse myself here. Syscall handlers directly RFID, while other interrupt sources share a common exit path. 
>> >>> arch/powerpc/kernel/interrupt_64.S | 9 ++--- >>> 1 file changed, 6 insertions(+), 3 deletions(-) >>> >>> diff --git a/arch/powerpc/kernel/interrupt_64.S >>> b/arch/powerpc/kernel/interrupt_64.S >>> index 16a1b44088e7..40147558e1a6 100644 >>> --- a/arch/powerpc/kernel/interrupt_64.S >>> +++ b/arch/powerpc/kernel/interrupt_64.S >>> @@ -70,7 +70,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name) >>> ld r2,PACATOC(r13) >>> mfcrr12 >>> li r11,0 >>> - /* Can we avoid saving r3-r8 in common case? */ >>> + /* Save syscall parameters in r3-r8 */ >> >> These two comment changes could go in your system_call_exception API >> change patch though. >> >> Thanks, >> Nick >> >>> SAVE_GPRS(3, 8, r1) >>> /* Zero r9-r12, this should only be required when restoring all GPRs */ >>> std r11,GPR9(r1) >>> @@ -110,6 +110,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >>> * Zero user registers to prevent influencing speculative execution >>> * state of kernel code. >>> */ >>> + ZEROIZE_GPR(0) >>> ZEROIZE_GPRS(5, 12) >>> ZEROIZE_NVGPRS() >>> bl system_call_exception >>> @@ -140,6 +141,7 @@ BEGIN_FTR_SECTION >>> HMT_MEDIUM_LOW >>> END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >>> >>> + REST_NVGPRS(r1) >>> cmpdi r3,0 >>> bne .Lsyscall_vectored_\name\()_restore_regs >>> >>> @@ -243,7 +245,7 @@ END_BTB_FLUSH_SECTION >>> ld r2,PACATOC(r13) >>> mfcrr12 >>> li r11,0 >>> - /* Can we avoid saving r3-r8 in common case? */ >>> + /* Save syscall parameters in r3-r8 */ >>> SAVE_GPRS(3, 8, r1) >>> /* Zero r9-r12, this should only be required when restoring all GPRs */ >>> std r11,GPR9(r1) >>> @@ -295,6 +297,7 @@ END_BTB_FLUSH_SECTION >>> * Zero user registers to prevent influencing speculative execution >>> * state of kernel code. >>> */ >>> + ZEROIZE_GPR(0) >>> ZEROIZE_GPRS(5, 12) >>> ZEROIZE_NVGPRS() >>> bl system_call_exception >>> @@ -337,6 +340,7 @@ BEGIN_FTR_SECTION >>> stdcx. 
r0,0,r1 /* to clear the reservation */ >>> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) >>> >>> + REST_NVGPRS(r1) >>> cmpdi r3,0 >>> bne .Lsyscall_restore_regs >>> /* Zero volatile regs that may contain sensitive kernel data */ >>> @@ -364,7 +368,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >>> .Lsyscall_restore_regs: >>> ld r3,_CTR(r1) >>> ld r4,_XER(r1) >>> - REST_NVGPRS(r1) >>> mtctr r3 >>> mtspr SPRN_XER,r4 >>> REST_GPR(0, r1) >>> -- >>> 2.34.1
Re: [PATCH 19/23] powerpc: Provide syscall wrapper
> On 20 Sep 2022, at 11:59 am, Nicholas Piggin wrote: > > On Fri Sep 16, 2022 at 3:32 PM AEST, Rohan McLure wrote: >> Implement syscall wrapper as per s390, x86, arm64. When enabled >> cause handlers to accept parameters from a stack frame rather than >> from user scratch register state. This allows for user registers to be >> safely cleared in order to reduce caller influence on speculation >> within syscall routine. The wrapper is a macro that emits syscall >> handler symbols that call into the target handler, obtaining its >> parameters from a struct pt_regs on the stack. >> >> As registers are already saved to the stack prior to calling >> system_call_exception, it appears that this function is executed more >> efficiently with the new stack-pointer convention than with parameters >> passed by registers, avoiding the allocation of a stack frame for this >> method. On a 32-bit system, we see >20% performance increases on the >> null_syscall microbenchmark, and on a Power 8 the performance gains >> amortise the cost of clearing and restoring registers which is >> implemented at the end of this series, seeing final result of ~5.6% >> performance improvement on null_syscall. >> >> Syscalls are wrapped in this fashion on all platforms except for the >> Cell processor as this commit does not provide SPU support. This can be >> quickly fixed in a successive patch, but requires spu_sys_callback to >> allocate a pt_regs structure to satisfy the wrapped calling convention. >> >> Co-developed-by: Andrew Donnellan >> Signed-off-by: Andrew Donnellan >> Signed-off-by: Rohan McLure >> --- >> V1 -> V2: Generate prototypes for symbols produced by the wrapper. >> V2 -> V3: Rebased to remove conflict with 1547db7d1f44 >> ("powerpc: Move system_call_exception() to syscall.c"). Also remove copy >> from gpr3 save slot on stackframe to orig_r3's slot. Fix whitespace with >> preprocessor defines in system_call_exception. 
>> V4 -> V5: Move systbl.c syscall wrapper support to this patch. Swap >> calling convention for system_call_exception to be (®s, r0) >> --- >> arch/powerpc/Kconfig | 1 + >> arch/powerpc/include/asm/interrupt.h | 3 +- >> arch/powerpc/include/asm/syscall.h | 4 + >> arch/powerpc/include/asm/syscall_wrapper.h | 84 >> arch/powerpc/include/asm/syscalls.h| 30 ++- >> arch/powerpc/kernel/entry_32.S | 6 +- >> arch/powerpc/kernel/interrupt_64.S | 28 +-- >> arch/powerpc/kernel/syscall.c | 31 +++- > > Ah, this is where it went :) > > I wouldn't normally mind except the previous patch might break the > build because it uses the same name, will it? > > This *could* be two patches, one to change system_call_exception API, > the next to add the wrapper. > >> arch/powerpc/kernel/systbl.c | 8 ++ >> arch/powerpc/kernel/vdso.c | 2 + >> 10 files changed, 164 insertions(+), 33 deletions(-) >> >> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig >> index 4c466acdc70d..ef6c83e79c9b 100644 >> --- a/arch/powerpc/Kconfig >> +++ b/arch/powerpc/Kconfig >> @@ -137,6 +137,7 @@ config PPC >> select ARCH_HAS_STRICT_KERNEL_RWX if (PPC_BOOK3S || PPC_8xx || >> 40x) && !HIBERNATION >> select ARCH_HAS_STRICT_KERNEL_RWX if FSL_BOOKE && !HIBERNATION && >> !RANDOMIZE_BASE >> select ARCH_HAS_STRICT_MODULE_RWX if ARCH_HAS_STRICT_KERNEL_RWX >> +select ARCH_HAS_SYSCALL_WRAPPER if !SPU_BASE >> select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST >> select ARCH_HAS_UACCESS_FLUSHCACHE >> select ARCH_HAS_UBSAN_SANITIZE_ALL >> diff --git a/arch/powerpc/include/asm/interrupt.h >> b/arch/powerpc/include/asm/interrupt.h >> index 8069dbc4b8d1..48eec9cd1429 100644 >> --- a/arch/powerpc/include/asm/interrupt.h >> +++ b/arch/powerpc/include/asm/interrupt.h >> @@ -665,8 +665,7 @@ static inline void >> interrupt_cond_local_irq_enable(struct pt_regs *regs) >> local_irq_enable(); >> } >> >> -long system_call_exception(long r3, long r4, long r5, long r6, long r7, >> long r8, >> - unsigned long r0, struct 
pt_regs *regs); >> +long system_call_exception(struct pt_regs *regs, unsigned long r0); >> notrace unsigned long syscall_exit_prepare(unsigned long r3, struct pt_regs >> *regs, long scv); >> notrace unsigned long interrupt_exit_user_p
Re: [PATCH 20/23] powerpc/64s: Clear/restore caller gprs in syscall interrupt/return
> On 20 Sep 2022, at 12:03 pm, Nicholas Piggin wrote: > > On Fri Sep 16, 2022 at 3:32 PM AEST, Rohan McLure wrote: >> Clear user state in gprs (assign to zero) to reduce the influence of user >> registers on speculation within kernel syscall handlers. Clears occur >> at the very beginning of the sc and scv 0 interrupt handlers, with >> restores occurring following the execution of the syscall handler. >> >> Signed-off-by: Rohan McLure >> --- >> V1 -> V2: Update summary >> V2 -> V3: Remove erroneous summary paragraph on syscall_exit_prepare >> V3 -> V4: Use ZEROIZE instead of NULLIFY. Clear r0 also. >> V4 -> V5: Move to end of patch series. >> --- > > I think it looks okay. I'll have to take a better look with the series > applied. Your comments alerted me to the fact that general interrupt and syscalls share their exit code in interrupt_return and its derivatives. Meaning that disabling INTERRUPT_SANITIZE_REGISTERS also reverts restores of NVGPRS to being optional, which makes it possible to clobber NVGPRS and then not restore them. The cleanest way forward I believe is going to be to cause INTERRUPT_SANITIZE_REGISTERS to perform sanitisation on all interrupt sources rather than continuing with syscalls as their own special case. I’ll put this out in a v6 soon. > >> arch/powerpc/kernel/interrupt_64.S | 9 ++--- >> 1 file changed, 6 insertions(+), 3 deletions(-) >> >> diff --git a/arch/powerpc/kernel/interrupt_64.S >> b/arch/powerpc/kernel/interrupt_64.S >> index 16a1b44088e7..40147558e1a6 100644 >> --- a/arch/powerpc/kernel/interrupt_64.S >> +++ b/arch/powerpc/kernel/interrupt_64.S >> @@ -70,7 +70,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name) >> ld r2,PACATOC(r13) >> mfcr r12 >> li r11,0 >> -/* Can we avoid saving r3-r8 in common case? */ >> +/* Save syscall parameters in r3-r8 */
> > Thanks, > Nick > >> SAVE_GPRS(3, 8, r1) >> /* Zero r9-r12, this should only be required when restoring all GPRs */ >> std r11,GPR9(r1) >> @@ -110,6 +110,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >> * Zero user registers to prevent influencing speculative execution >> * state of kernel code. >> */ >> +ZEROIZE_GPR(0) >> ZEROIZE_GPRS(5, 12) >> ZEROIZE_NVGPRS() >> bl system_call_exception >> @@ -140,6 +141,7 @@ BEGIN_FTR_SECTION >> HMT_MEDIUM_LOW >> END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >> >> +REST_NVGPRS(r1) >> cmpdi r3,0 >> bne .Lsyscall_vectored_\name\()_restore_regs >> >> @@ -243,7 +245,7 @@ END_BTB_FLUSH_SECTION >> ld r2,PACATOC(r13) >> mfcrr12 >> li r11,0 >> -/* Can we avoid saving r3-r8 in common case? */ >> +/* Save syscall parameters in r3-r8 */ >> SAVE_GPRS(3, 8, r1) >> /* Zero r9-r12, this should only be required when restoring all GPRs */ >> std r11,GPR9(r1) >> @@ -295,6 +297,7 @@ END_BTB_FLUSH_SECTION >> * Zero user registers to prevent influencing speculative execution >> * state of kernel code. >> */ >> +ZEROIZE_GPR(0) >> ZEROIZE_GPRS(5, 12) >> ZEROIZE_NVGPRS() >> bl system_call_exception >> @@ -337,6 +340,7 @@ BEGIN_FTR_SECTION >> stdcx. r0,0,r1 /* to clear the reservation */ >> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) >> >> +REST_NVGPRS(r1) >> cmpdi r3,0 >> bne .Lsyscall_restore_regs >> /* Zero volatile regs that may contain sensitive kernel data */ >> @@ -364,7 +368,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >> .Lsyscall_restore_regs: >> ld r3,_CTR(r1) >> ld r4,_XER(r1) >> -REST_NVGPRS(r1) >> mtctr r3 >> mtspr SPRN_XER,r4 >> REST_GPR(0, r1) >> -- >> 2.34.1
Re: [PATCH 00/23] powerpc: Syscall wrapper and register clearing
> On 16 Sep 2022, at 3:32 pm, Rohan McLure wrote: > > V4 available here: > > Link: > https://lore.kernel.org/all/20220824020548.62625-1-rmcl...@linux.ibm.com/ > > Implement a syscall wrapper, causing arguments to handlers to be passed > via a struct pt_regs on the stack. The syscall wrapper is implemented > for all platforms other than the Cell processor, from which SPUs expect > the ability to directly call syscall handler symbols with the regular > in-register calling convention. > > Adopting syscall wrappers requires redefinition of architecture-specific > syscalls and compatibility syscalls to use the SYSCALL_DEFINE and > COMPAT_SYSCALL_DEFINE macros, as well as removal of direct references to > the emitted syscall-handler symbols from within the kernel. This work > led to the following modernisations of powerpc's syscall handlers: > > - Replace syscall 82 semantics with sys_old_select and remove > ppc_select handler, which features direct calls to both sys_old_select > and sys_select. > - Use a generic fallocate compatibility syscall > > Replace asm implementation of syscall table with C implementation for > more compile-time checks. > > Many compatibility syscalls are candidates to be removed in favour of > generically defined handlers, but exhibit different parameter orderings > and numberings due to 32-bit ABI support for 64-bit parameters. The > parameter reorderings are however consistent with arm. A future patch > series will serve to modernise syscalls by providing generic > implementations featuring these reorderings. > > The design of this syscall wrapper is very similar to the s390, x86 and arm64 > implementations. See also Commit 4378a7d4be30 (arm64: implement syscall > wrappers). > The motivation for this change is that it allows for the clearing of > register state when entering the kernel via interrupt handlers > on 64-bit servers. This serves to reduce the influence of values in > registers carried over from the interrupted process, e.g. 
syscall > parameters from user space, or user state at the site of a pagefault. > All values in registers are saved and zeroized at the entry to an > interrupt handler and restored afterward. While this may sound like a > heavy-weight mitigation, many gprs are already saved and restored on > handling of an interrupt, and the mmap_bench benchmark on a Power 9 guest, > which repeatedly invokes the pagefault handler, suggests at most ~0.8% > regression in performance. Realistic workloads are not constantly > producing interrupts, and so this does not indicate realistic slowdown. > > Using wrapped syscalls yields a performance improvement of ~5.6% on > the null_syscall benchmark on pseries guests, by removing the need for > system_call_exception to allocate its own stack frame. This amortises > the additional costs of saving and restoring non-volatile registers > (register clearing is cheap on superscalar platforms), and so the > final mitigation actually yields a net performance improvement of ~0.6% > on the null_syscall benchmark. > > The clearing of general purpose registers on interrupts other than > syscalls is enabled by default only on Book3E 64-bit systems (where the > mitigation is inexpensive), but available to other 64-bit systems via > the INTERRUPT_SANITIZE_REGISTERS Kconfig option. This mitigation is > optional, as the speculation influence of interrupts is likely less than > that of syscalls. > > Patch Changelog: > > - Format orig_r3 handling as its own patch rather than just a revert. > - Provide asm-generic BE implementation of long-long munging syscall > compatibility arguments. > - Syscall #82 now refers to generic sys_old_select or > compat_sys_old_select. > - Drop 'inline' on static helper functions for mmap, personality. > - Remove arch-specific sys fallocate implementation that was meant to > have been removed in V2. > - Remove references to syscall wrapper until it is introduced. 
> - Rearrange patch series so the last five patches are syscall wrapper > > syscall register clears > interrupt register clears. > - Whether non-syscall interrupts should clear registers is now > configurable by INTERRUPT_SANITIZE_REGISTERS. > > Rohan McLure (23): > powerpc: Remove asmlinkage from syscall handler definitions > powerpc: Save caller r3 prior to system_call_exception > powerpc: Add ZEROIZE_GPRS macros for register clears > powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlers > powerpc/32: Clarify interrupt restores with REST_GPR macro in >entry_32.S > powerpc/64e: Clarify register saves and clears with >{SAVE,ZEROIZE}_GPRS > powerpc/64s: Fix comment on interrupt handler prologue > powerpc: F
Re: [PATCH 15/23] powerpc: Adopt SYSCALL_DEFINE for arch-specific syscall handlers
> On 16 Sep 2022, at 3:32 pm, Rohan McLure wrote: > > Arch-specific implementations of syscall handlers are currently used > over generic implementations for the following reasons: > > 1. Semantics unique to powerpc > 2. Compatibility syscalls require 'argument padding' to comply with > 64-bit argument convention in the ELF32 ABI. > 3. Parameter types or order is different in other architectures. > > These syscall handlers have been defined prior to this patch series > without invoking the SYSCALL_DEFINE or COMPAT_SYSCALL_DEFINE macros with > custom input and output types. We remove every such direct definition in > favour of the aforementioned macros. > > Also update syscalls.tbl in order to refer to the symbol names generated > by each of these macros. Since ppc64_personality can be called by both > 64 bit and 32 bit binaries through compatibility, we must generate > both compat_sys_ and sys_ symbols for this handler. > > As an aside: > A number of architectures including arm and powerpc agree on an > alternative argument order and numbering for most of these arch-specific > handlers. A future patch series may allow for asm/unistd.h to signal > through its defines that a generic implementation of these syscall > handlers with the correct calling convention be emitted, through the > __ARCH_WANT_COMPAT_SYS_... convention. > > Signed-off-by: Rohan McLure > --- > V1 -> V2: All syscall handlers wrapped by this macro. > V2 -> V3: Move creation of do_ppc64_personality helper to prior patch. > V3 -> V4: Fix parenthesis alignment. Don't emit sys_*** symbols. > V4 -> V5: Use 'aside' in the asm-generic rant in commit message. 
> --- > arch/powerpc/include/asm/syscalls.h | 10 ++--- > arch/powerpc/kernel/sys_ppc32.c | 38 +++--- > arch/powerpc/kernel/syscalls.c | 17 ++-- > arch/powerpc/kernel/syscalls/syscall.tbl | 22 +- > .../arch/powerpc/entry/syscalls/syscall.tbl | 22 +- > 5 files changed, 64 insertions(+), 45 deletions(-) > > diff --git a/arch/powerpc/include/asm/syscalls.h > b/arch/powerpc/include/asm/syscalls.h > index 20cbd29b1228..525d2aa0c8ca 100644 > --- a/arch/powerpc/include/asm/syscalls.h > +++ b/arch/powerpc/include/asm/syscalls.h > @@ -28,10 +28,10 @@ long sys_mmap(unsigned long addr, size_t len, > long sys_mmap2(unsigned long addr, size_t len, > unsigned long prot, unsigned long flags, > unsigned long fd, unsigned long pgoff); > -long ppc64_personality(unsigned long personality); > +long sys_ppc64_personality(unsigned long personality); > long sys_rtas(struct rtas_args __user *uargs); > -long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, > - u32 len_high, u32 len_low); > +long sys_ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 > offset_low, > + u32 len_high, u32 len_low); > > #ifdef CONFIG_COMPAT > unsigned long compat_sys_mmap2(unsigned long addr, size_t len, > @@ -52,8 +52,8 @@ int compat_sys_truncate64(const char __user *path, u32 reg4, > int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, > unsigned long len2); > > -long ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, > - size_t len, int advice); > +long compat_sys_ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, > + size_t len, int advice); > > long compat_sys_sync_file_range2(int fd, unsigned int flags, >unsigned int offset1, unsigned int offset2, > diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c > index 776ae7565fc5..dcc3c9fd4cfd 100644 > --- a/arch/powerpc/kernel/sys_ppc32.c > +++ b/arch/powerpc/kernel/sys_ppc32.c > @@ -47,45 +47,55 @@ > #include > #include > > -compat_ssize_t compat_sys_pread64(unsigned 
int fd, char __user *ubuf, > compat_size_t count, > - u32 reg6, u32 pos1, u32 pos2) > +COMPAT_SYSCALL_DEFINE6(ppc_pread64, > +unsigned int, fd, > +char __user *, ubuf, compat_size_t, count, > +u32, reg6, u32, pos1, u32, pos2) > { > return ksys_pread64(fd, ubuf, count, merge_64(pos1, pos2)); > } > > -compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, > compat_size_t count, > - u32 reg6, u32 pos1, u32 pos2) > +COMPAT_SYSCALL_DEFINE6(ppc_pwrite64, > +unsigned int, fd, > +const char __user *, ubuf, compat_size_t, count, > +u32, reg6, u32, p
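The merge_64 calls in the quoted diff reassemble a 64-bit argument from the two 32-bit registers the ELF32 ABI splits it across. A standalone sketch of that idea (the real merge_64 lives in powerpc's headers; this illustrative version only shows the bit manipulation):

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative stand-in for powerpc's merge_64(): under the 32-bit
 * ABI a 64-bit argument arrives as two 32-bit halves in consecutive
 * registers (high word first on big-endian powerpc), which the compat
 * handler must stitch back together. */
static uint64_t merge_64_sketch(uint32_t high, uint32_t low)
{
	return ((uint64_t)high << 32) | low;
}
```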
[PATCH 00/23] powerpc: Syscall wrapper and register clearing
V4 available here: Link: https://lore.kernel.org/all/20220824020548.62625-1-rmcl...@linux.ibm.com/ Implement a syscall wrapper, causing arguments to handlers to be passed via a struct pt_regs on the stack. The syscall wrapper is implemented for all platforms other than the Cell processor, from which SPUs expect the ability to directly call syscall handler symbols with the regular in-register calling convention. Adopting syscall wrappers requires redefinition of architecture-specific syscalls and compatibility syscalls to use the SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE macros, as well as removal of direct references to the emitted syscall-handler symbols from within the kernel. This work led to the following modernisations of powerpc's syscall handlers: - Replace syscall 82 semantics with sys_old_select and remove ppc_select handler, which features direct calls to both sys_old_select and sys_select. - Use a generic fallocate compatibility syscall Replace asm implementation of syscall table with C implementation for more compile-time checks. Many compatibility syscalls are candidates to be removed in favour of generically defined handlers, but exhibit different parameter orderings and numberings due to 32-bit ABI support for 64-bit parameters. The parameter reorderings are however consistent with arm. A future patch series will serve to modernise syscalls by providing generic implementations featuring these reorderings. The design of this syscall wrapper is very similar to the s390, x86 and arm64 implementations. See also Commit 4378a7d4be30 (arm64: implement syscall wrappers). The motivation for this change is that it allows for the clearing of register state when entering the kernel via interrupt handlers on 64-bit servers. This serves to reduce the influence of values in registers carried over from the interrupted process, e.g. syscall parameters from user space, or user state at the site of a pagefault. 
All values in registers are saved and zeroized at the entry to an interrupt handler and restored afterward. While this may sound like a heavy-weight mitigation, many gprs are already saved and restored on handling of an interrupt, and the mmap_bench benchmark on a Power 9 guest, which repeatedly invokes the pagefault handler, suggests at most ~0.8% regression in performance. Realistic workloads are not constantly producing interrupts, and so this does not indicate realistic slowdown. Using wrapped syscalls yields a performance improvement of ~5.6% on the null_syscall benchmark on pseries guests, by removing the need for system_call_exception to allocate its own stack frame. This amortises the additional costs of saving and restoring non-volatile registers (register clearing is cheap on superscalar platforms), and so the final mitigation actually yields a net performance improvement of ~0.6% on the null_syscall benchmark. The clearing of general purpose registers on interrupts other than syscalls is enabled by default only on Book3E 64-bit systems (where the mitigation is inexpensive), but available to other 64-bit systems via the INTERRUPT_SANITIZE_REGISTERS Kconfig option. This mitigation is optional, as the speculation influence of interrupts is likely less than that of syscalls. Patch Changelog: - Format orig_r3 handling as its own patch rather than just a revert. - Provide asm-generic BE implementation of long-long munging syscall compatibility arguments. - Syscall #82 now refers to generic sys_old_select or compat_sys_old_select. - Drop 'inline' on static helper functions for mmap, personality. - Remove arch-specific sys fallocate implementation that was meant to have been removed in V2. - Remove references to syscall wrapper until it is introduced. - Rearrange patch series so the last five patches are syscall wrapper > syscall register clears > interrupt register clears. 
 - Whether non-syscall interrupts should clear registers is now configurable by INTERRUPT_SANITIZE_REGISTERS.

Rohan McLure (23):
  powerpc: Remove asmlinkage from syscall handler definitions
  powerpc: Save caller r3 prior to system_call_exception
  powerpc: Add ZEROIZE_GPRS macros for register clears
  powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlers
  powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.S
  powerpc/64e: Clarify register saves and clears with {SAVE,ZEROIZE}_GPRS
  powerpc/64s: Fix comment on interrupt handler prologue
  powerpc: Fix fallocate and fadvise64_64 compat parameter combination
  asm-generic: compat: Support BE for long long args in 32-bit ABIs
  powerpc: Use generic fallocate compatibility syscall
  powerpc/32: Remove powerpc select specialisation
  powerpc: Remove direct call to personality syscall handler
  powerpc: Remove direct call to mmap2 syscall handlers
  powerpc: Provide do_ppc64_personality helper
  powerpc: Adopt SYSCALL_DEFINE for arch-specific syscall handlers
  powerpc: Incl
[PATCH 12/23] powerpc: Remove direct call to personality syscall handler
Syscall handlers should not be invoked internally by their symbol names, as these symbols are defined by the architecture-defined SYSCALL_DEFINE macro. Fortunately, in the case of ppc64_personality, its call to sys_personality can be replaced with an invocation of the equivalent ksys_personality inline helper in <linux/syscalls.h>.

Signed-off-by: Rohan McLure
Reviewed-by: Nicholas Piggin
---
V1 -> V2: Use inline helper to deduplicate bodies in compat/regular implementations.
V3 -> V4: Move to be applied before syscall wrapper.
---
 arch/powerpc/kernel/syscalls.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index 34e1ae88e15b..a04c97faa21a 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -71,7 +71,7 @@ long ppc64_personality(unsigned long personality)
 	if (personality(current->personality) == PER_LINUX32 &&
 	    personality(personality) == PER_LINUX)
 		personality = (personality & ~PER_MASK) | PER_LINUX32;
-	ret = sys_personality(personality);
+	ret = ksys_personality(personality);
 	if (personality(ret) == PER_LINUX32)
 		ret = (ret & ~PER_MASK) | PER_LINUX;
 	return ret;
--
2.34.1
[PATCH 20/23] powerpc/64s: Clear/restore caller gprs in syscall interrupt/return
Clear user state in GPRs (assign to zero) to reduce the influence of user registers on speculation within kernel syscall handlers. Clears occur at the very beginning of the sc and scv 0 interrupt handlers, with restores occurring following the execution of the syscall handler.

Signed-off-by: Rohan McLure
---
V1 -> V2: Update summary
V2 -> V3: Remove erroneous summary paragraph on syscall_exit_prepare
V3 -> V4: Use ZEROIZE instead of NULLIFY. Clear r0 also.
V4 -> V5: Move to end of patch series.
---
 arch/powerpc/kernel/interrupt_64.S | 9 ++---
 1 file changed, 6 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
index 16a1b44088e7..40147558e1a6 100644
--- a/arch/powerpc/kernel/interrupt_64.S
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -70,7 +70,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name)
 	ld	r2,PACATOC(r13)
 	mfcr	r12
 	li	r11,0
-	/* Can we avoid saving r3-r8 in common case? */
+	/* Save syscall parameters in r3-r8 */
 	SAVE_GPRS(3, 8, r1)
 	/* Zero r9-r12, this should only be required when restoring all GPRs */
 	std	r11,GPR9(r1)
@@ -110,6 +110,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	 * Zero user registers to prevent influencing speculative execution
 	 * state of kernel code.
 	 */
+	ZEROIZE_GPR(0)
 	ZEROIZE_GPRS(5, 12)
 	ZEROIZE_NVGPRS()
 	bl	system_call_exception
@@ -140,6 +141,7 @@ BEGIN_FTR_SECTION
 	HMT_MEDIUM_LOW
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)

+	REST_NVGPRS(r1)
 	cmpdi	r3,0
 	bne	.Lsyscall_vectored_\name\()_restore_regs
@@ -243,7 +245,7 @@ END_BTB_FLUSH_SECTION
 	ld	r2,PACATOC(r13)
 	mfcr	r12
 	li	r11,0
-	/* Can we avoid saving r3-r8 in common case? */
+	/* Save syscall parameters in r3-r8 */
 	SAVE_GPRS(3, 8, r1)
 	/* Zero r9-r12, this should only be required when restoring all GPRs */
 	std	r11,GPR9(r1)
@@ -295,6 +297,7 @@ END_BTB_FLUSH_SECTION
 	 * Zero user registers to prevent influencing speculative execution
 	 * state of kernel code.
 	 */
+	ZEROIZE_GPR(0)
 	ZEROIZE_GPRS(5, 12)
 	ZEROIZE_NVGPRS()
 	bl	system_call_exception
@@ -337,6 +340,7 @@ BEGIN_FTR_SECTION
 	stdcx.	r0,0,r1			/* to clear the reservation */
END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS)

+	REST_NVGPRS(r1)
 	cmpdi	r3,0
 	bne	.Lsyscall_restore_regs
 	/* Zero volatile regs that may contain sensitive kernel data */
@@ -364,7 +368,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
.Lsyscall_restore_regs:
 	ld	r3,_CTR(r1)
 	ld	r4,_XER(r1)
-	REST_NVGPRS(r1)
 	mtctr	r3
 	mtspr	SPRN_XER,r4
 	REST_GPR(0, r1)
--
2.34.1
[PATCH 11/23] powerpc/32: Remove powerpc select specialisation
Syscall #82 has been implemented for 32-bit platforms in a unique way on powerpc systems. This hack in effect guesses, based on the first parameter, whether the caller expects new select semantics or old select semantics. In new select, this parameter represents the length of a user-memory array of file descriptors, while in old select it is a pointer to an arguments structure. The heuristic simply interprets sufficiently large values of the first parameter as indicating a call to old select.

The following is a discussion on how this syscall should be handled.

Link: https://lore.kernel.org/lkml/13737de5-0eb7-e881-9af0-163b0d29a...@csgroup.eu/

As discussed in this thread, the existence of such a hack suggests that whatever powerpc binaries may predate glibc would most likely have made use of the old select semantics. x86 and arm64 both implement this syscall with old select semantics.

Remove the powerpc implementation, and update syscall.tbl to emit references to sys_old_select and compat_sys_old_select for 32-bit binaries, in keeping with how other architectures support syscall #82.

Signed-off-by: Rohan McLure
Reviewed-by: Nicholas Piggin
---
V1 -> V2: Remove arch-specific select handler
V2 -> V3: Remove ppc_old_select prototype in <asm/syscalls.h>. Move to earlier in patch series
V4 -> V5: Use compat_sys_old_select on 64-bit systems.
---
 arch/powerpc/include/asm/syscalls.h           |  2 --
 arch/powerpc/kernel/syscalls.c                | 17 -
 arch/powerpc/kernel/syscalls/syscall.tbl      |  2 +-
 .../arch/powerpc/entry/syscalls/syscall.tbl   |  2 +-
 4 files changed, 2 insertions(+), 21 deletions(-)

diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h
index 960b3871db72..20cbd29b1228 100644
--- a/arch/powerpc/include/asm/syscalls.h
+++ b/arch/powerpc/include/asm/syscalls.h
@@ -30,8 +30,6 @@ long sys_mmap2(unsigned long addr, size_t len,
 	       unsigned long fd, unsigned long pgoff);
 long ppc64_personality(unsigned long personality);
 long sys_rtas(struct rtas_args __user *uargs);
-int ppc_select(int n, fd_set __user *inp, fd_set __user *outp,
-	       fd_set __user *exp, struct __kernel_old_timeval __user *tvp);
 long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low,
 		      u32 len_high, u32 len_low);

diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index abc3fbb3c490..34e1ae88e15b 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -63,23 +63,6 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
 	return do_mmap2(addr, len, prot, flags, fd, offset, PAGE_SHIFT);
 }

-#ifdef CONFIG_PPC32
-/*
- * Due to some executables calling the wrong select we sometimes
- * get wrong args.  This determines how the args are being passed
- * (a single ptr to them all args passed) then calls
- * sys_select() with the appropriate args. -- Cort
- */
-int
-ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct __kernel_old_timeval __user *tvp)
-{
-	if ((unsigned long)n >= 4096)
-		return sys_old_select((void __user *)n);
-
-	return sys_select(n, inp, outp, exp, tvp);
-}
-#endif
-
 #ifdef CONFIG_PPC64
 long ppc64_personality(unsigned long personality)
 {
diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl
index 2600b4237292..64f27cbbdd2c 100644
--- a/arch/powerpc/kernel/syscalls/syscall.tbl
+++ b/arch/powerpc/kernel/syscalls/syscall.tbl
@@ -110,7 +110,7 @@
 79	common	settimeofday	sys_settimeofday		compat_sys_settimeofday
 80	common	getgroups	sys_getgroups
 81	common	setgroups	sys_setgroups
-82	32	select		ppc_select			sys_ni_syscall
+82	32	select		sys_old_select			compat_sys_old_select
 82	64	select		sys_ni_syscall
 82	spu	select		sys_ni_syscall
 83	common	symlink		sys_symlink
diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
index 2600b4237292..64f27cbbdd2c 100644
--- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
+++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl
@@ -110,7 +110,7 @@
 79	common	settimeofday	sys_settimeofday		compat_sys_settimeofday
 80	common	getgroups	sys_getgroups
 81	common	setgroups	sys_setgroups
-82	32	select		ppc_select			sys_ni_syscall
+82	32	select		sys_old_select			compat_sys_old_select
 82	64
[PATCH 07/23] powerpc/64s: Fix comment on interrupt handler prologue
Interrupt handlers on 64s systems will often need to save register state from the interrupted process to make space for loading special purpose registers or for internal state.

Fix a comment documenting a common code path macro in the beginning of interrupt handlers where r10 is saved to the PACA to afford space for the value of the CFAR. The comment is currently written as if r10-r12 are saved to the PACA, but in fact only r10 is saved, with r11-r12 saved much later. The distance in code between these saves has grown over the many revisions of this macro. Fix this by signalling with a comment where r11-r12 are saved to the PACA.

Signed-off-by: Rohan McLure
Reported-by: Nicholas Piggin
---
V1 -> V2: Given its own commit
V2 -> V3: Annotate r11-r12 save locations with comment.
---
 arch/powerpc/kernel/exceptions-64s.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 3d0dc133a9ae..a3b51441b039 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -281,7 +281,7 @@ BEGIN_FTR_SECTION
 	mfspr	r9,SPRN_PPR
END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	HMT_MEDIUM
-	std	r10,IAREA+EX_R10(r13)		/* save r10 - r12 */
+	std	r10,IAREA+EX_R10(r13)		/* save r10 */
 	.if ICFAR
BEGIN_FTR_SECTION
 	mfspr	r10,SPRN_CFAR
@@ -321,7 +321,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	mfctr	r10
 	std	r10,IAREA+EX_CTR(r13)
 	mfcr	r9
-	std	r11,IAREA+EX_R11(r13)
+	std	r11,IAREA+EX_R11(r13)		/* save r11 - r12 */
 	std	r12,IAREA+EX_R12(r13)

 	/*
--
2.34.1
[PATCH 15/23] powerpc: Adopt SYSCALL_DEFINE for arch-specific syscall handlers
Arch-specific implementations of syscall handlers are currently used over generic implementations for the following reasons:

1. Semantics unique to powerpc
2. Compatibility syscalls require 'argument padding' to comply with the 64-bit argument convention in the ELF32 ABI.
3. Parameter types or order differ from those of other architectures.

These syscall handlers have been defined prior to this patch series without invoking the SYSCALL_DEFINE or COMPAT_SYSCALL_DEFINE macros with custom input and output types. We remove every such direct definition in favour of the aforementioned macros. Also update syscall.tbl to refer to the symbol names generated by each of these macros.

Since ppc64_personality can be called by both 64-bit and 32-bit binaries through compatibility, we must generate both compat_sys_ and sys_ symbols for this handler.

As an aside: a number of architectures, including arm and powerpc, agree on an alternative argument order and numbering for most of these arch-specific handlers. A future patch series may allow asm/unistd.h to signal through its defines that a generic implementation of these syscall handlers with the correct calling convention be emitted, through the __ARCH_WANT_COMPAT_SYS_... convention.

Signed-off-by: Rohan McLure
---
V1 -> V2: All syscall handlers wrapped by this macro.
V2 -> V3: Move creation of do_ppc64_personality helper to prior patch.
V3 -> V4: Fix parenthesis alignment. Don't emit sys_*** symbols.
V4 -> V5: Use 'aside' in the asm-generic rant in commit message.
---
 arch/powerpc/include/asm/syscalls.h           | 10 ++---
 arch/powerpc/kernel/sys_ppc32.c               | 38 +++---
 arch/powerpc/kernel/syscalls.c                | 17 ++--
 arch/powerpc/kernel/syscalls/syscall.tbl      | 22 +-
 .../arch/powerpc/entry/syscalls/syscall.tbl   | 22 +-
 5 files changed, 64 insertions(+), 45 deletions(-)

diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h
index 20cbd29b1228..525d2aa0c8ca 100644
--- a/arch/powerpc/include/asm/syscalls.h
+++ b/arch/powerpc/include/asm/syscalls.h
@@ -28,10 +28,10 @@ long sys_mmap(unsigned long addr, size_t len,
 long sys_mmap2(unsigned long addr, size_t len,
 	       unsigned long prot, unsigned long flags,
 	       unsigned long fd, unsigned long pgoff);
-long ppc64_personality(unsigned long personality);
+long sys_ppc64_personality(unsigned long personality);
 long sys_rtas(struct rtas_args __user *uargs);
-long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low,
-		      u32 len_high, u32 len_low);
+long sys_ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low,
+			  u32 len_high, u32 len_low);

 #ifdef CONFIG_COMPAT
 unsigned long compat_sys_mmap2(unsigned long addr, size_t len,
@@ -52,8 +52,8 @@ int compat_sys_truncate64(const char __user *path, u32 reg4,
 int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1,
 			   unsigned long len2);

-long ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2,
-		     size_t len, int advice);
+long compat_sys_ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2,
+				size_t len, int advice);

 long compat_sys_sync_file_range2(int fd, unsigned int flags,
 				 unsigned int offset1, unsigned int offset2,
diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index 776ae7565fc5..dcc3c9fd4cfd 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -47,45 +47,55 @@
 #include
 #include

-compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count,
-				  u32 reg6, u32 pos1, u32 pos2)
+COMPAT_SYSCALL_DEFINE6(ppc_pread64,
+		       unsigned int, fd,
+		       char __user *, ubuf, compat_size_t, count,
+		       u32, reg6, u32, pos1, u32, pos2)
 {
 	return ksys_pread64(fd, ubuf, count, merge_64(pos1, pos2));
 }

-compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, compat_size_t count,
-				   u32 reg6, u32 pos1, u32 pos2)
+COMPAT_SYSCALL_DEFINE6(ppc_pwrite64,
+		       unsigned int, fd,
+		       const char __user *, ubuf, compat_size_t, count,
+		       u32, reg6, u32, pos1, u32, pos2)
 {
 	return ksys_pwrite64(fd, ubuf, count, merge_64(pos1, pos2));
 }

-compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u32 count)
+COMPAT_SYSCALL_DEFINE5(ppc_readahead,
+		       int, fd, u32, r4,
+		       u32, offset1, u32, offset2, u32, count)
 {
 	return ksys_readahead(fd, merge_64(offset1, offset2), count);
 }

-int compat_sys_truncate64(const char __user * path, u32 reg4,
-
[PATCH 08/23] powerpc: Fix fallocate and fadvise64_64 compat parameter combination
As reported[1] by Arnd, the arch-specific fadvise64_64 and fallocate compatibility handlers assume parameters are passed with the 32-bit big-endian ABI. This affects the assignment of odd-even parameter pairs to the high or low words of a 64-bit syscall parameter. Fix the fadvise64_64 and fallocate compat handlers to correctly swap the upper/lower 32 bits conditioned on endianness.

A future patch will replace the arch-specific compat fallocate with an asm-generic implementation. This patch is intended for ease of back-port.

[1]: https://lore.kernel.org/all/be29926f-226e-48dc-871a-e29a54e80...@www.fastmail.com/

Fixes: 57f48b4b74e7 ("powerpc/compat_sys: swap hi/lo parts of 64-bit syscall args in LE mode")
Reported-by: Arnd Bergmann
Signed-off-by: Rohan McLure
---
V4 -> V5: New patch.
---
 arch/powerpc/include/asm/syscalls.h | 12
 arch/powerpc/kernel/sys_ppc32.c     | 14 +-
 arch/powerpc/kernel/syscalls.c      |  4 ++--
 3 files changed, 15 insertions(+), 15 deletions(-)

diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h
index 21c2faaa2957..16b668515d15 100644
--- a/arch/powerpc/include/asm/syscalls.h
+++ b/arch/powerpc/include/asm/syscalls.h
@@ -8,6 +8,18 @@
 #include
 #include

+/*
+ * long long munging:
+ * The 32 bit ABI passes long longs in an odd even register pair.
+ * High and low parts are swapped depending on endian mode,
+ * so define a macro (similar to mips linux32) to handle that.
+ */
+#ifdef __LITTLE_ENDIAN__
+#define merge_64(low, high) ((u64)high << 32) | low
+#else
+#define merge_64(high, low) ((u64)high << 32) | low
+#endif
+
 struct rtas_args;

 long sys_mmap(unsigned long addr, size_t len,
diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index f4edcc9489fb..ba363328da2b 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -56,18 +56,6 @@ unsigned long compat_sys_mmap2(unsigned long addr, size_t len,
 	return sys_mmap(addr, len, prot, flags, fd, pgoff << 12);
 }

-/*
- * long long munging:
- * The 32 bit ABI passes long longs in an odd even register pair.
- * High and low parts are swapped depending on endian mode,
- * so define a macro (similar to mips linux32) to handle that.
- */
-#ifdef __LITTLE_ENDIAN__
-#define merge_64(low, high) ((u64)high << 32) | low
-#else
-#define merge_64(high, low) ((u64)high << 32) | low
-#endif
-
 compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count,
 				  u32 reg6, u32 pos1, u32 pos2)
 {
@@ -94,7 +82,7 @@ int compat_sys_truncate64(const char __user * path, u32 reg4,
 long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2,
 			  u32 len1, u32 len2)
 {
-	return ksys_fallocate(fd, mode, ((loff_t)offset1 << 32) | offset2,
+	return ksys_fallocate(fd, mode, merge_64(offset1, offset2),
 			      merge_64(len1, len2));
 }

diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index fc999140bc27..abc3fbb3c490 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -98,8 +98,8 @@ long ppc64_personality(unsigned long personality)
 long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low,
 		      u32 len_high, u32 len_low)
 {
-	return ksys_fadvise64_64(fd, (u64)offset_high << 32 | offset_low,
-				 (u64)len_high << 32 | len_low, advice);
+	return ksys_fadvise64_64(fd, merge_64(offset_high, offset_low),
+				 merge_64(len_high, len_low), advice);
 }

 SYSCALL_DEFINE0(switch_endian)
--
2.34.1
[PATCH 17/23] powerpc: Enable compile-time check for syscall handlers
The table of syscall handlers and registered compatibility syscall handlers has in the past been produced using assembly, with function references resolved at link time. This moves link-time errors to compile-time, by rewriting systbl.S in C, and including the linux/syscalls.h, linux/compat.h and asm/syscalls.h headers for prototypes.

Reported-by: Arnd Bergmann
Signed-off-by: Rohan McLure
Reported-by: Nicholas Piggin
---
V1 -> V2: New patch.
V4 -> V5: For this patch only, represent handler function pointers as unsigned long. Remove reference to syscall wrappers. Use asm/syscalls.h which implies asm/syscall.h
---
 arch/powerpc/kernel/{systbl.S => systbl.c} | 28
 1 file changed, 11 insertions(+), 17 deletions(-)

diff --git a/arch/powerpc/kernel/systbl.S b/arch/powerpc/kernel/systbl.c
similarity index 61%
rename from arch/powerpc/kernel/systbl.S
rename to arch/powerpc/kernel/systbl.c
index 6c1db3b6de2d..ce52bd2ec292 100644
--- a/arch/powerpc/kernel/systbl.S
+++ b/arch/powerpc/kernel/systbl.c
@@ -10,32 +10,26 @@
 * PPC64 updates by Dave Engebretsen (engeb...@us.ibm.com)
 */

-#include
+#include
+#include
+#include
+#include

-.section .rodata,"a"
+#define __SYSCALL_WITH_COMPAT(nr, entry, compat) __SYSCALL(nr, entry)
+#define __SYSCALL(nr, entry) [nr] = (unsigned long) &entry,

-#ifdef CONFIG_PPC64
-	.p2align	3
-#define __SYSCALL(nr, entry)	.8byte entry
-#else
-	.p2align	2
-#define __SYSCALL(nr, entry)	.long entry
-#endif
-
-#define __SYSCALL_WITH_COMPAT(nr, native, compat)	__SYSCALL(nr, native)
-.globl sys_call_table
-sys_call_table:
+const unsigned long sys_call_table[] = {
 #ifdef CONFIG_PPC64
 #include
 #else
 #include
 #endif
+};

 #ifdef CONFIG_COMPAT
 #undef __SYSCALL_WITH_COMPAT
 #define __SYSCALL_WITH_COMPAT(nr, native, compat)	__SYSCALL(nr, compat)
-.globl compat_sys_call_table
-compat_sys_call_table:
-#define compat_sys_sigsuspend	sys_sigsuspend
+const unsigned long compat_sys_call_table[] = {
 #include
-#endif
+};
+#endif /* CONFIG_COMPAT */
--
2.34.1
[PATCH 10/23] powerpc: Use generic fallocate compatibility syscall
The powerpc fallocate compat syscall handler is identical to the generic implementation provided by commit 59c10c52f573f ("riscv: compat: syscall: Add compat_sys_call_table implementation"), and as such can be removed in favour of the generic implementation.

A future patch series will replace more architecture-defined syscall handlers with generic implementations, dependent on introducing generic implementations that are compatible with powerpc and arm's parameter reorderings.

Reported-by: Arnd Bergmann
Signed-off-by: Rohan McLure
---
V1 -> V2: Remove arch-specific fallocate handler.
V2 -> V3: Remove generic fallocate prototype. Move to beginning of series.
V4 -> V5: Remove implementation as well which I somehow failed to do. Replace local BE compat_arg_u64 with generic.
---
 arch/powerpc/include/asm/syscalls.h | 2 --
 arch/powerpc/include/asm/unistd.h   | 1 +
 arch/powerpc/kernel/sys_ppc32.c     | 7 ---
 3 files changed, 1 insertion(+), 9 deletions(-)

diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h
index 16b668515d15..960b3871db72 100644
--- a/arch/powerpc/include/asm/syscalls.h
+++ b/arch/powerpc/include/asm/syscalls.h
@@ -51,8 +51,6 @@ compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u3
 int compat_sys_truncate64(const char __user *path, u32 reg4,
 			  unsigned long len1, unsigned long len2);

-long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, u32 len1, u32 len2);
-
 int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1,
 			   unsigned long len2);

diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h
index b1129b4ef57d..659a996c75aa 100644
--- a/arch/powerpc/include/asm/unistd.h
+++ b/arch/powerpc/include/asm/unistd.h
@@ -45,6 +45,7 @@
 #define __ARCH_WANT_SYS_UTIME
 #define __ARCH_WANT_SYS_NEWFSTATAT
 #define __ARCH_WANT_COMPAT_STAT
+#define __ARCH_WANT_COMPAT_FALLOCATE
 #define __ARCH_WANT_COMPAT_SYS_SENDFILE
 #endif
 #define __ARCH_WANT_SYS_FORK
diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index ba363328da2b..d961634976d8 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -79,13 +79,6 @@ int compat_sys_truncate64(const char __user * path, u32 reg4,
 	return ksys_truncate(path, merge_64(len1, len2));
 }

-long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2,
-			  u32 len1, u32 len2)
-{
-	return ksys_fallocate(fd, mode, merge_64(offset1, offset2),
-			      merge_64(len1, len2));
-}
-
 int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1,
 			   unsigned long len2)
 {
--
2.34.1
[PATCH 22/23] powerpc/64s: Clear gprs on interrupt routine entry in Book3S
Zero GPRs r0, r2-r11, r14-r31 on entry into the kernel for all other interrupt sources to limit the influence of user-space values in potential speculation gadgets. The remaining GPRs are overwritten by entry macros to interrupt handlers, irrespective of whether or not a given handler consumes these register values.

Prior to this commit, r14-r31 are restored on a per-interrupt basis at exit, but now they are always restored. Remove explicit REST_NVGPRS invocations as non-volatiles must now always be restored. 32-bit systems do not clear user registers on interrupt, and continue to depend on the return value of interrupt_exit_user_prepare to determine whether or not to restore non-volatiles.

The mmap_bench benchmark in selftests should rapidly invoke pagefaults. We see a ~0.8% performance regression with this mitigation, but this indicates the worst-case performance due to heavier-weight interrupt handlers. This mitigation is disabled by default, but can be enabled with CONFIG_INTERRUPT_SANITIZE_REGISTERS.

Signed-off-by: Rohan McLure
---
V1 -> V2: Add benchmark data
V2 -> V3: Use ZEROIZE_GPR{,S} macro renames, clarify interrupt_exit_user_prepare changes in summary.
V4 -> V5: Configurable now with INTERRUPT_SANITIZE_REGISTERS. Zero r12 (containing MSR) from common macro on per-interrupt basis with IOPTION.
---
 arch/powerpc/kernel/exceptions-64s.S | 37 --
 arch/powerpc/kernel/interrupt_64.S   | 10 +++
 2 files changed, 45 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index a3b51441b039..be5e72caada1 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -111,6 +111,7 @@ name:
 #define ISTACK		.L_ISTACK_\name\()	/* Set regular kernel stack */
 #define __ISTACK(name)	.L_ISTACK_ ## name
 #define IKUAP		.L_IKUAP_\name\()	/* Do KUAP lock */
+#define IMSR_R12	.L_IMSR_R12_\name\()	/* Assumes MSR saved to r12 */

 #define INT_DEFINE_BEGIN(n)						\
.macro int_define_ ## n name
@@ -176,6 +177,9 @@ do_define_int n
 	.ifndef IKUAP
 		IKUAP=1
 	.endif
+	.ifndef IMSR_R12
+		IMSR_R12=0
+	.endif
.endm

 /*
@@ -502,6 +506,7 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real, text)
 	std	r10,0(r1)		/* make stack chain pointer	*/
 	std	r0,GPR0(r1)		/* save r0 in stackframe	*/
 	std	r10,GPR1(r1)		/* save r1 in stackframe	*/
+	ZEROIZE_GPR(0)

 	/* Mark our [H]SRRs valid for return */
 	li	r10,1
@@ -544,8 +549,16 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	std	r9,GPR11(r1)
 	std	r10,GPR12(r1)
 	std	r11,GPR13(r1)
+	.if !IMSR_R12
+	ZEROIZE_GPRS(9, 12)
+	.else
+	ZEROIZE_GPRS(9, 11)
+	.endif

 	SAVE_NVGPRS(r1)
+#ifdef CONFIG_INTERRUPT_SANITIZE_REGISTERS
+	ZEROIZE_NVGPRS()
+#endif

 	.if IDAR
 	.if IISIDE
@@ -577,8 +590,8 @@ BEGIN_FTR_SECTION
END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	ld	r10,IAREA+EX_CTR(r13)
 	std	r10,_CTR(r1)
-	std	r2,GPR2(r1)		/* save r2 in stackframe	*/
-	SAVE_GPRS(3, 8, r1)		/* save r3 - r8 in stackframe	*/
+	SAVE_GPRS(2, 8, r1)		/* save r2 - r8 in stackframe	*/
+	ZEROIZE_GPRS(2, 8)
 	mflr	r9			/* Get LR, later save to stack	*/
 	ld	r2,PACATOC(r13)		/* get kernel TOC into r2	*/
 	std	r9,_LINK(r1)
@@ -696,6 +709,9 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	mtlr	r9
 	ld	r9,_CCR(r1)
 	mtcr	r9
+#ifdef CONFIG_INTERRUPT_SANITIZE_REGISTERS
+	REST_NVGPRS(r1)
+#endif
 	REST_GPRS(2, 13, r1)
 	REST_GPR(0, r1)
 	/* restore original r1. */
@@ -1368,11 +1384,13 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX)
 	b	interrupt_return_srr

1:	bl	do_break
+#ifndef CONFIG_INTERRUPT_SANITIZE_REGISTERS
 	/*
 	 * do_break() may have changed the NV GPRS while handling a breakpoint.
 	 * If so, we need to restore them with their updated values.
 	 */
 	REST_NVGPRS(r1)
+#endif
 	b	interrupt_return_srr
@@ -1598,7 +1616,9 @@ EXC_COMMON_BEGIN(alignment_common)
 	GEN_COMMON alignment
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	alignment_exception
+#ifndef CONFIG_INTERRUPT_SANITIZE_REGISTERS
 	REST_NVGPRS(r1) /* instruction emulation may change GPRs */
+#endif
 	b	interrupt_return_srr
@@ -1708,7 +1728,9 @@ EXC_COMMON_BEGIN(program_check_common)
.Ldo_program_check:
 	addi	r3,r1,STACK_FRAME_OVERHEAD
 	bl	program_check_exception
+#ifndef CONFIG_INTERRUPT_SANITIZE_REGISTERS
 	REST_NVGPRS(r1) /* instruction emulation may change GPRs */
+#endif
[PATCH 21/23] powerpc/64: Add INTERRUPT_SANITIZE_REGISTERS Kconfig
Add a Kconfig option for enabling clearing of registers on arrival in an interrupt handler. This reduces the speculation influence of registers on kernel internals. The option will be consumed by 64-bit systems that feature speculation and wish to implement this mitigation. This patch only introduces the Kconfig option, no actual mitigations.

The primary overhead of this mitigation lies in an increased number of registers that must be saved and restored by interrupt handlers on Book3S systems. Enable by default on Book3E systems, which prior to this patch eagerly save and restore register state, meaning that the mitigation when implemented will have minimal overhead.

Signed-off-by: Rohan McLure
---
V4 -> V5: New patch
---
 arch/powerpc/Kconfig | 9 +
 1 file changed, 9 insertions(+)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index ef6c83e79c9b..a643ebd83349 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -528,6 +528,15 @@ config HOTPLUG_CPU

 	  Say N if you are unsure.

+config INTERRUPT_SANITIZE_REGISTERS
+	bool "Clear gprs on interrupt arrival"
+	depends on PPC64 && ARCH_HAS_SYSCALL_WRAPPER
+	default PPC_BOOK3E_64
+	help
+	  Reduce the influence of user register state on interrupt handlers and
+	  syscalls through clearing user state from registers before handling
+	  the exception.
+
 config PPC_QUEUED_SPINLOCKS
 	bool "Queued spinlocks" if EXPERT
 	depends on SMP
--
2.34.1
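Assuming the Kconfig text above is merged as-is, a 64-bit Book3S build (where the option does not default on) would opt in with a config fragment like:

```
# .config fragment: opt in to GPR sanitisation on interrupt entry.
# Requires PPC64 and ARCH_HAS_SYSCALL_WRAPPER per the 'depends on'.
CONFIG_INTERRUPT_SANITIZE_REGISTERS=y
```

Book3E 64-bit builds get the option by default via `default PPC_BOOK3E_64`, matching their existing eager save/restore behaviour.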
[PATCH 05/23] powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.S
Restoring the register state of the interrupted thread involves issuing a large number of predictable loads from the kernel stack frame. Use the REST_GPR{,S} macros to clearly signal when this is happening, and bunch together restores at the end of the interrupt handler where the saved value is not consumed earlier in the handler code.

Signed-off-by: Rohan McLure
Reported-by: Christophe Leroy
---
V2 -> V3: New patch.
V3 -> V4: Minimise restores in the unrecoverable window between restoring SRR0/1 and return from interrupt.
---
 arch/powerpc/kernel/entry_32.S | 33 +---
 1 file changed, 13 insertions(+), 20 deletions(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 44dfce9a60c5..e4b694cebc44 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -68,7 +68,7 @@ prepare_transfer_to_handler:
 	lwz	r9,_MSR(r11)		/* if sleeping, clear MSR.EE */
 	rlwinm	r9,r9,0,~MSR_EE
 	lwz	r12,_LINK(r11)		/* and return to address in LR */
-	lwz	r2, GPR2(r11)
+	REST_GPR(2, r11)
 	b	fast_exception_return
_ASM_NOKPROBE_SYMBOL(prepare_transfer_to_handler)
 #endif /* CONFIG_PPC_BOOK3S_32 || CONFIG_E500 */
@@ -144,7 +144,7 @@ ret_from_syscall:
 	lwz	r7,_NIP(r1)
 	lwz	r8,_MSR(r1)
 	cmpwi	r3,0
-	lwz	r3,GPR3(r1)
+	REST_GPR(3, r1)
syscall_exit_finish:
 	mtspr	SPRN_SRR0,r7
 	mtspr	SPRN_SRR1,r8
@@ -152,8 +152,8 @@ syscall_exit_finish:
 	bne	3f
 	mtcr	r5

-1:	lwz	r2,GPR2(r1)
-	lwz	r1,GPR1(r1)
+1:	REST_GPR(2, r1)
+	REST_GPR(1, r1)
 	rfi
 #ifdef CONFIG_40x
 	b .	/* Prevent prefetch past rfi */
@@ -165,10 +165,8 @@ syscall_exit_finish:
 	REST_NVGPRS(r1)
 	mtctr	r4
 	mtxer	r5
-	lwz	r0,GPR0(r1)
-	lwz	r3,GPR3(r1)
-	REST_GPRS(4, 11, r1)
-	lwz	r12,GPR12(r1)
+	REST_GPR(0, r1)
+	REST_GPRS(3, 12, r1)
 	b	1b

 #ifdef CONFIG_44x
@@ -260,9 +258,8 @@ fast_exception_return:
 	beq	3f			/* if not, we've got problems */
 #endif

-2:	REST_GPRS(3, 6, r11)
-	lwz	r10,_CCR(r11)
-	REST_GPRS(1, 2, r11)
+2:	lwz	r10,_CCR(r11)
+	REST_GPRS(1, 6, r11)
 	mtcr	r10
 	lwz	r10,_LINK(r11)
 	mtlr	r10
@@ -277,7 +274,7 @@ fast_exception_return:
 	mtspr	SPRN_SRR0,r12
 	REST_GPR(9, r11)
 	REST_GPR(12, r11)
-	lwz	r11,GPR11(r11)
+	REST_GPR(11, r11)
 	rfi
 #ifdef CONFIG_40x
 	b .	/* Prevent prefetch past rfi */
@@ -454,9 +451,8 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return)
 	lwz	r3,_MSR(r1);						\
 	andi.	r3,r3,MSR_PR;						\
 	bne	interrupt_return;					\
-	lwz	r0,GPR0(r1);						\
-	lwz	r2,GPR2(r1);						\
-	REST_GPRS(3, 8, r1);						\
+	REST_GPR(0, r1);						\
+	REST_GPRS(2, 8, r1);						\
 	lwz	r10,_XER(r1);						\
 	lwz	r11,_CTR(r1);						\
 	mtspr	SPRN_XER,r10;						\
@@ -475,11 +471,8 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return)
 	lwz	r12,_MSR(r1);						\
 	mtspr	exc_lvl_srr0,r11;					\
 	mtspr	exc_lvl_srr1,r12;					\
-	lwz	r9,GPR9(r1);						\
-	lwz	r12,GPR12(r1);						\
-	lwz	r10,GPR10(r1);						\
-	lwz	r11,GPR11(r1);						\
-	lwz	r1,GPR1(r1);						\
+	REST_GPRS(9, 12, r1);						\
+	REST_GPR(1, r1);						\
 	exc_lvl_rfi;							\
 	b	.;		/* prevent prefetch past exc_lvl_rfi */
--
2.34.1
[PATCH 02/23] powerpc: Save caller r3 prior to system_call_exception
This reverts commit 8875f47b7681 ("powerpc/syscall: Save r3 in regs->orig_r3 "). Save caller's original r3 state to the kernel stackframe before entering system_call_exception. This allows for user registers to be cleared by the time system_call_exception is entered, reducing the influence of user registers on speculation within the kernel. Prior to this commit, orig_r3 was saved at the beginning of system_call_exception. Instead, save orig_r3 while the user value is still live in r3. Also replicate this early save in 32-bit. A similar save was removed in commit 6f76a01173cc ("powerpc/syscall: implement system call entry/exit logic in C for PPC32") when 32-bit adopted system_call_exception. Revert its removal of orig_r3 saves. Signed-off-by: Rohan McLure Reviewed-by: Nicholas Piggin --- V2 -> V3: New commit. V4 -> V5: New commit message, as we do more than just revert 8875f47b7681. --- arch/powerpc/kernel/entry_32.S | 1 + arch/powerpc/kernel/interrupt_64.S | 2 ++ arch/powerpc/kernel/syscall.c | 1 - 3 files changed, 3 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index 1d599df6f169..44dfce9a60c5 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -101,6 +101,7 @@ __kuep_unlock: .globl transfer_to_syscall transfer_to_syscall: + stw r3, ORIG_GPR3(r1) stw r11, GPR1(r1) stw r11, 0(r1) mflrr12 diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S index ce25b28cf418..71d2d9497283 100644 --- a/arch/powerpc/kernel/interrupt_64.S +++ b/arch/powerpc/kernel/interrupt_64.S @@ -91,6 +91,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name) li r11,\trapnr std r11,_TRAP(r1) std r12,_CCR(r1) + std r3,ORIG_GPR3(r1) addir10,r1,STACK_FRAME_OVERHEAD ld r11,exception_marker@toc(r2) std r11,-16(r10)/* "regshere" marker */ @@ -275,6 +276,7 @@ END_BTB_FLUSH_SECTION std r10,_LINK(r1) std r11,_TRAP(r1) std r12,_CCR(r1) + std r3,ORIG_GPR3(r1) 
addir10,r1,STACK_FRAME_OVERHEAD ld r11,exception_marker@toc(r2) std r11,-16(r10)/* "regshere" marker */ diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c index 81ace9e8b72b..64102a64fd84 100644 --- a/arch/powerpc/kernel/syscall.c +++ b/arch/powerpc/kernel/syscall.c @@ -25,7 +25,6 @@ notrace long system_call_exception(long r3, long r4, long r5, kuap_lock(); add_random_kstack_offset(); - regs->orig_gpr3 = r3; if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG)) BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED); -- 2.34.1
[PATCH 04/23] powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlers
Use the convenience macros for saving/clearing/restoring gprs in keeping with syscall calling conventions. The plural variants of these macros can store a range of registers for concision.

This works well when the user gpr value we are hoping to save is still live. In the syscall interrupt handlers, user register state is sometimes juggled between registers. Hold off from issuing the SAVE_GPR macro for applicable neighbouring lines to highlight the delicate register save logic.

Signed-off-by: Rohan McLure
Reviewed-by: Nicholas Piggin
---
V1 -> V2: Update summary
V2 -> V3: Update summary regarding exclusions for the SAVE_GPR macro.
Acknowledge new name for ZEROIZE_GPR{,S} macros.
V4 -> V5: Move to beginning of series
---
 arch/powerpc/kernel/interrupt_64.S | 43 ++--
 1 file changed, 9 insertions(+), 34 deletions(-)

diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
index 71d2d9497283..7d92a7a54727 100644
--- a/arch/powerpc/kernel/interrupt_64.S
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -71,12 +71,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name)
 	mfcr	r12
 	li	r11,0
 	/* Can we avoid saving r3-r8 in common case? */
-	std	r3,GPR3(r1)
-	std	r4,GPR4(r1)
-	std	r5,GPR5(r1)
-	std	r6,GPR6(r1)
-	std	r7,GPR7(r1)
-	std	r8,GPR8(r1)
+	SAVE_GPRS(3, 8, r1)
 	/* Zero r9-r12, this should only be required when restoring all GPRs */
 	std	r11,GPR9(r1)
 	std	r11,GPR10(r1)
@@ -149,17 +144,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	/* Could zero these as per ABI, but we may consider a stricter ABI
 	 * which preserves these if libc implementations can benefit, so
	 * restore them for now until further measurement is done.
*/ - ld r0,GPR0(r1) - ld r4,GPR4(r1) - ld r5,GPR5(r1) - ld r6,GPR6(r1) - ld r7,GPR7(r1) - ld r8,GPR8(r1) + REST_GPR(0, r1) + REST_GPRS(4, 8, r1) /* Zero volatile regs that may contain sensitive kernel data */ - li r9,0 - li r10,0 - li r11,0 - li r12,0 + ZEROIZE_GPRS(9, 12) mtspr SPRN_XER,r0 /* @@ -182,7 +170,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r5,_XER(r1) REST_NVGPRS(r1) - ld r0,GPR0(r1) + REST_GPR(0, r1) mtcrr2 mtctr r3 mtlrr4 @@ -250,12 +238,7 @@ END_BTB_FLUSH_SECTION mfcrr12 li r11,0 /* Can we avoid saving r3-r8 in common case? */ - std r3,GPR3(r1) - std r4,GPR4(r1) - std r5,GPR5(r1) - std r6,GPR6(r1) - std r7,GPR7(r1) - std r8,GPR8(r1) + SAVE_GPRS(3, 8, r1) /* Zero r9-r12, this should only be required when restoring all GPRs */ std r11,GPR9(r1) std r11,GPR10(r1) @@ -345,16 +328,8 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) cmpdi r3,0 bne .Lsyscall_restore_regs /* Zero volatile regs that may contain sensitive kernel data */ - li r0,0 - li r4,0 - li r5,0 - li r6,0 - li r7,0 - li r8,0 - li r9,0 - li r10,0 - li r11,0 - li r12,0 + ZEROIZE_GPR(0) + ZEROIZE_GPRS(4, 12) mtctr r0 mtspr SPRN_XER,r0 .Lsyscall_restore_regs_cont: @@ -380,7 +355,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) REST_NVGPRS(r1) mtctr r3 mtspr SPRN_XER,r4 - ld r0,GPR0(r1) + REST_GPR(0, r1) REST_GPRS(4, 12, r1) b .Lsyscall_restore_regs_cont .Lsyscall_rst_end: -- 2.34.1
[PATCH 18/23] powerpc: Use common syscall handler type
Cause syscall handlers to be typed as follows when called indirectly throughout the kernel. This is to allow for better type checking. typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long); Since both 32 and 64-bit abis allow for at least the first six machine-word length parameters to a function to be passed by registers, even handlers which admit fewer than six parameters may be viewed as having the above type. Coercing syscalls to syscall_fn requires a cast to void* to avoid -Wcast-function-type. Fixup comparisons in VDSO to avoid pointer-integer comparison. Introduce explicit cast on systems with SPUs. Signed-off-by: Rohan McLure --- V1 -> V2: New patch. V2 -> V3: Remove unnecessary cast from const syscall_fn to syscall_fn V4 -> V5: Update patch description. --- arch/powerpc/include/asm/syscall.h | 7 +-- arch/powerpc/include/asm/syscalls.h | 1 + arch/powerpc/kernel/systbl.c| 6 +++--- arch/powerpc/kernel/vdso.c | 4 ++-- arch/powerpc/platforms/cell/spu_callbacks.c | 6 +++--- 5 files changed, 14 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h index 25fc8ad9a27a..d2a8dfd5de33 100644 --- a/arch/powerpc/include/asm/syscall.h +++ b/arch/powerpc/include/asm/syscall.h @@ -14,9 +14,12 @@ #include #include +typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, + unsigned long, unsigned long, unsigned long); + /* ftrace syscalls requires exporting the sys_call_table */ -extern const unsigned long sys_call_table[]; -extern const unsigned long compat_sys_call_table[]; +extern const syscall_fn sys_call_table[]; +extern const syscall_fn compat_sys_call_table[]; static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs) { diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 5d106acf7906..cc87168d6ecb 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ 
b/arch/powerpc/include/asm/syscalls.h @@ -8,6 +8,7 @@ #include #include +#include #ifdef CONFIG_PPC64 #include #endif diff --git a/arch/powerpc/kernel/systbl.c b/arch/powerpc/kernel/systbl.c index ce52bd2ec292..e5d419822b4e 100644 --- a/arch/powerpc/kernel/systbl.c +++ b/arch/powerpc/kernel/systbl.c @@ -16,9 +16,9 @@ #include #define __SYSCALL_WITH_COMPAT(nr, entry, compat) __SYSCALL(nr, entry) -#define __SYSCALL(nr, entry) [nr] = (unsigned long) &entry, +#define __SYSCALL(nr, entry) [nr] = (void *) entry, -const unsigned long sys_call_table[] = { +const syscall_fn sys_call_table[] = { #ifdef CONFIG_PPC64 #include #else @@ -29,7 +29,7 @@ const unsigned long sys_call_table[] = { #ifdef CONFIG_COMPAT #undef __SYSCALL_WITH_COMPAT #define __SYSCALL_WITH_COMPAT(nr, native, compat) __SYSCALL(nr, compat) -const unsigned long compat_sys_call_table[] = { +const syscall_fn compat_sys_call_table[] = { #include }; #endif /* CONFIG_COMPAT */ diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index bf9574ec26ce..fcca06d200d3 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -304,10 +304,10 @@ static void __init vdso_setup_syscall_map(void) unsigned int i; for (i = 0; i < NR_syscalls; i++) { - if (sys_call_table[i] != (unsigned long)&sys_ni_syscall) + if (sys_call_table[i] != (void *)&sys_ni_syscall) vdso_data->syscall_map[i >> 5] |= 0x8000UL >> (i & 0x1f); if (IS_ENABLED(CONFIG_COMPAT) && - compat_sys_call_table[i] != (unsigned long)&sys_ni_syscall) + compat_sys_call_table[i] != (void *)&sys_ni_syscall) vdso_data->compat_syscall_map[i >> 5] |= 0x8000UL >> (i & 0x1f); } } diff --git a/arch/powerpc/platforms/cell/spu_callbacks.c b/arch/powerpc/platforms/cell/spu_callbacks.c index fe0d8797a00a..e780c14c5733 100644 --- a/arch/powerpc/platforms/cell/spu_callbacks.c +++ b/arch/powerpc/platforms/cell/spu_callbacks.c @@ -34,15 +34,15 @@ * mbind, mq_open, ipc, ... 
*/ -static void *spu_syscall_table[] = { +static const syscall_fn spu_syscall_table[] = { #define __SYSCALL_WITH_COMPAT(nr, entry, compat) __SYSCALL(nr, entry) -#define __SYSCALL(nr, entry) [nr] = entry, +#define __SYSCALL(nr, entry) [nr] = (void *) entry, #include }; long spu_sys_callback(struct spu_syscall_block *s) { - long (*syscall)(u64 a1, u64 a2, u64 a3, u64 a4, u64 a5, u64 a6); + syscall_fn syscall; if (s->nr_ret >= ARRAY_SIZE(spu_syscall_table)) { pr_debug("%s: invalid syscall #%lld", __func__, s->nr_ret); -- 2.34.1
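The table-of-typed-function-pointers scheme is easy to demonstrate outside the kernel. The sketch below is standalone demo code (the handler names are invented, not kernel symbols): handlers of differing arity are stored as six-argument function pointers and always called with six arguments, relying on the same register-passing ABI property the commit message cites; the cast through void * sidesteps -Wcast-function-type exactly as the patch does in systbl.c.

```c
#include <stddef.h>

typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long,
			   unsigned long, unsigned long, unsigned long);

/* Handlers of differing arity; surplus argument registers are ignored. */
static long demo_getpid(void)          { return 42; }
static long demo_dup(unsigned long fd) { return (long)fd; }

/* Cast via void * to silence -Wcast-function-type */
static const syscall_fn demo_table[] = {
	(syscall_fn)(void *)demo_getpid,
	(syscall_fn)(void *)demo_dup,
};

static long demo_dispatch(size_t nr, unsigned long a0, unsigned long a1,
			  unsigned long a2, unsigned long a3,
			  unsigned long a4, unsigned long a5)
{
	/* All six argument slots are always populated by the caller */
	return demo_table[nr](a0, a1, a2, a3, a4, a5);
}
```

Strictly speaking the mismatched-arity call is outside ISO C, which is why it only works on ABIs with the register-passing property described above.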
[PATCH 19/23] powerpc: Provide syscall wrapper
Implement syscall wrapper as per s390, x86, arm64. When enabled cause handlers to accept parameters from a stack frame rather than from user scratch register state. This allows for user registers to be safely cleared in order to reduce caller influence on speculation within syscall routine. The wrapper is a macro that emits syscall handler symbols that call into the target handler, obtaining its parameters from a struct pt_regs on the stack. As registers are already saved to the stack prior to calling system_call_exception, it appears that this function is executed more efficiently with the new stack-pointer convention than with parameters passed by registers, avoiding the allocation of a stack frame for this method. On a 32-bit system, we see >20% performance increases on the null_syscall microbenchmark, and on a Power 8 the performance gains amortise the cost of clearing and restoring registers which is implemented at the end of this series, seeing final result of ~5.6% performance improvement on null_syscall. Syscalls are wrapped in this fashion on all platforms except for the Cell processor as this commit does not provide SPU support. This can be quickly fixed in a successive patch, but requires spu_sys_callback to allocate a pt_regs structure to satisfy the wrapped calling convention. Co-developed-by: Andrew Donnellan Signed-off-by: Andrew Donnellan Signed-off-by: Rohan McLure --- V1 -> V2: Generate prototypes for symbols produced by the wrapper. V2 -> V3: Rebased to remove conflict with 1547db7d1f44 ("powerpc: Move system_call_exception() to syscall.c"). Also remove copy from gpr3 save slot on stackframe to orig_r3's slot. Fix whitespace with preprocessor defines in system_call_exception. V4 -> V5: Move systbl.c syscall wrapper support to this patch. 
Swap calling convention for system_call_exception to be (&regs, r0)
---
 arch/powerpc/Kconfig | 1 +
 arch/powerpc/include/asm/interrupt.h | 3 +-
 arch/powerpc/include/asm/syscall.h | 4 +
 arch/powerpc/include/asm/syscall_wrapper.h | 84
 arch/powerpc/include/asm/syscalls.h | 30 ++-
 arch/powerpc/kernel/entry_32.S | 6 +-
 arch/powerpc/kernel/interrupt_64.S | 28 +--
 arch/powerpc/kernel/syscall.c | 31 +++-
 arch/powerpc/kernel/systbl.c | 8 ++
 arch/powerpc/kernel/vdso.c | 2 +
 10 files changed, 164 insertions(+), 33 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 4c466acdc70d..ef6c83e79c9b 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -137,6 +137,7 @@ config PPC
 	select ARCH_HAS_STRICT_KERNEL_RWX	if (PPC_BOOK3S || PPC_8xx || 40x) && !HIBERNATION
 	select ARCH_HAS_STRICT_KERNEL_RWX	if FSL_BOOKE && !HIBERNATION && !RANDOMIZE_BASE
 	select ARCH_HAS_STRICT_MODULE_RWX	if ARCH_HAS_STRICT_KERNEL_RWX
+	select ARCH_HAS_SYSCALL_WRAPPER		if !SPU_BASE
 	select ARCH_HAS_TICK_BROADCAST		if GENERIC_CLOCKEVENTS_BROADCAST
 	select ARCH_HAS_UACCESS_FLUSHCACHE
 	select ARCH_HAS_UBSAN_SANITIZE_ALL
diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h
index 8069dbc4b8d1..48eec9cd1429 100644
--- a/arch/powerpc/include/asm/interrupt.h
+++ b/arch/powerpc/include/asm/interrupt.h
@@ -665,8 +665,7 @@ static inline void interrupt_cond_local_irq_enable(struct pt_regs *regs)
 		local_irq_enable();
 }

-long system_call_exception(long r3, long r4, long r5, long r6, long r7, long r8,
-			   unsigned long r0, struct pt_regs *regs);
+long system_call_exception(struct pt_regs *regs, unsigned long r0);
 notrace unsigned long syscall_exit_prepare(unsigned long r3, struct pt_regs *regs, long scv);
 notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs);
 notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs);
diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h
index d2a8dfd5de33..3dd36c5e334a
100644 --- a/arch/powerpc/include/asm/syscall.h +++ b/arch/powerpc/include/asm/syscall.h @@ -14,8 +14,12 @@ #include #include +#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER +typedef long (*syscall_fn)(const struct pt_regs *); +#else typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long); +#endif /* ftrace syscalls requires exporting the sys_call_table */ extern const syscall_fn sys_call_table[]; diff --git a/arch/powerpc/include/asm/syscall_wrapper.h b/arch/powerpc/include/asm/syscall_wrapper.h new file mode 100644 index ..91bcfa40f740 --- /dev/null +++ b/arch/powerpc/include/asm/syscall_wrapper.h @@ -0,0 +1,84 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * syscall_wrapper.h - powerpc specific wrappers to syscall definitions + * + * Base
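The wrapper idea described above can be modelled in plain C. The following is a toy sketch (the struct, macro, and names are invented for illustration, not the kernel's actual SYSCALL_DEFINEx machinery): the emitted symbol takes only a pointer to the saved register frame and unpacks its arguments itself, so the real handler body never consumes live user register state.

```c
/* Toy register frame; on powerpc r3 carries the first syscall argument */
struct demo_pt_regs { unsigned long gpr[13]; };

/*
 * Emit a frame-taking stub (__se_*) around a one-argument handler body
 * (__do_*), mirroring the stack-frame calling convention of the patch.
 */
#define DEMO_SYSCALL_DEFINE1(name, t1)					\
	static long __do_##name(t1 arg1);				\
	static long __se_##name(const struct demo_pt_regs *regs)	\
	{								\
		return __do_##name((t1)regs->gpr[3]);			\
	}								\
	static long __do_##name(t1 arg1)

DEMO_SYSCALL_DEFINE1(brk, unsigned long)
{
	return (long)(arg1 + 0x1000);	/* stand-in handler body */
}
```

A dispatch table would then store __se_brk, matching the `long (*)(const struct pt_regs *)` form of syscall_fn introduced by this patch.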
[PATCH 09/23] asm-generic: compat: Support BE for long long args in 32-bit ABIs
32-bit ABIs support passing 64-bit integers by registers via argument translation. Commit 59c10c52f573 ("riscv: compat: syscall: Add compat_sys_call_table implementation") implements the compat_arg_u64 macro for efficiently defining little endian compatibility syscalls. Architectures supporting big endianness may benefit from reciprocal argument translation, but are also welcome to implement their own.

Signed-off-by: Rohan McLure
---
V4 -> V5: New patch.
---
 include/asm-generic/compat.h | 9 +++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/include/asm-generic/compat.h b/include/asm-generic/compat.h
index d06308a2a7a8..aeb257ad3d1a 100644
--- a/include/asm-generic/compat.h
+++ b/include/asm-generic/compat.h
@@ -14,12 +14,17 @@
 #define COMPAT_OFF_T_MAX	0x7fffffff
 #endif

-#if !defined(compat_arg_u64) && !defined(CONFIG_CPU_BIG_ENDIAN)
+#ifndef compat_arg_u64
+#ifdef CONFIG_CPU_BIG_ENDIAN
+#define compat_arg_u64(name)		u32  name##_hi, u32  name##_lo
+#define compat_arg_u64_dual(name)	u32, name##_hi, u32, name##_lo
+#else
 #define compat_arg_u64(name)		u32  name##_lo, u32  name##_hi
 #define compat_arg_u64_dual(name)	u32, name##_lo, u32, name##_hi
+#endif
 #define compat_arg_u64_glue(name)	(((u64)name##_lo & 0xffffffffUL) | \
 					 ((u64)name##_hi << 32))
-#endif
+#endif /* compat_arg_u64 */

 /* These types are common across all compat ABIs */
 typedef u32 compat_size_t;
--
2.34.1
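A standalone model of what these macros achieve (plain C with stdint types standing in for the kernel's u32/u64; the handler names are invented): a 32-bit ABI delivers a u64 in two register-sized halves, only the parameter order differs by endianness, and the glue expression is order-agnostic because it refers to the halves by name rather than by position.

```c
#include <stdint.h>

/* Model of compat_arg_u64_glue(): recombine the named halves */
#define compat_arg_u64_glue(name) (((uint64_t)name##_lo & 0xffffffffULL) | \
				   ((uint64_t)name##_hi << 32))

/* Big-endian parameter order: high word arrives first */
static uint64_t demo_be_handler(uint32_t off_hi, uint32_t off_lo)
{
	return compat_arg_u64_glue(off);
}

/* Little-endian parameter order: low word arrives first */
static uint64_t demo_le_handler(uint32_t off_lo, uint32_t off_hi)
{
	return compat_arg_u64_glue(off);
}
```

Both handlers reconstruct the same 64-bit value from the same pair of halves, which is the point of keeping the glue macro shared between the two branches.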
[PATCH 14/23] powerpc: Provide do_ppc64_personality helper
Avoid duplication in future patch that will define the ppc64_personality syscall handler in terms of the SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE macros, by extracting the common body of ppc64_personality into a helper function. Signed-off-by: Rohan McLure Reviewed-by: Nicholas Piggin --- V2 -> V3: New commit. V4 -> V5: Remove 'inline'. --- arch/powerpc/kernel/syscalls.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c index 9830957498b0..135a0b9108d5 100644 --- a/arch/powerpc/kernel/syscalls.c +++ b/arch/powerpc/kernel/syscalls.c @@ -75,7 +75,7 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len, } #ifdef CONFIG_PPC64 -long ppc64_personality(unsigned long personality) +static long do_ppc64_personality(unsigned long personality) { long ret; @@ -87,6 +87,10 @@ long ppc64_personality(unsigned long personality) ret = (ret & ~PER_MASK) | PER_LINUX; return ret; } +long ppc64_personality(unsigned long personality) +{ + return do_ppc64_personality(personality); +} #endif long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, -- 2.34.1
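The personality translation the helper performs can be modelled outside the kernel. In this sketch the constants mirror linux/personality.h, while current_personality and the stub syscall are invented stand-ins for current->personality and sys_personality:

```c
#define PER_MASK	0x00ff
#define PER_LINUX	0x0000
#define PER_LINUX32	0x0008

#define personality(pers)	((pers) & PER_MASK)

static unsigned int current_personality = PER_LINUX32;

/* Stand-in for sys_personality(): returns the old value, sets the new */
static long stub_sys_personality(unsigned long p)
{
	unsigned long old = current_personality;

	if (p != 0xffffffff)
		current_personality = (unsigned int)p;
	return (long)old;
}

/* Model of do_ppc64_personality(): a 64-bit kernel reports PER_LINUX
 * to tasks that are really PER_LINUX32, translating in both directions. */
static long demo_do_ppc64_personality(unsigned long p)
{
	long ret;

	if (personality(current_personality) == PER_LINUX32 &&
	    personality(p) == PER_LINUX)
		p = (p & ~PER_MASK) | PER_LINUX32;
	ret = stub_sys_personality(p);
	if (personality(ret) == PER_LINUX32)
		ret = (ret & ~PER_MASK) | PER_LINUX;
	return ret;
}
```

Extracting this body into a helper lets the later SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE entry points share it without one calling the other's emitted symbol.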
[PATCH 06/23] powerpc/64e: Clarify register saves and clears with {SAVE,ZEROIZE}_GPRS
The common interrupt handler prologue macro and the bad_stack trampolines include consecutive sequences of register saves, and some register clears. Neaten such instances by expanding use of the SAVE_GPRS macro and employing the ZEROIZE_GPR macro when appropriate.

Also simplify an invocation of SAVE_GPRS targeting all non-volatile registers to SAVE_NVGPRS.

Signed-off-by: Rohan McLure
Reported-by: Nicholas Piggin
---
V3 -> V4: New commit.
---
 arch/powerpc/kernel/exceptions-64e.S | 27 +++---
 1 file changed, 11 insertions(+), 16 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 67dc4e3179a0..48c640ca425d 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -216,17 +216,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV)
 	mtlr	r10
 	mtcr	r11
-	ld	r10,GPR10(r1)
-	ld	r11,GPR11(r1)
-	ld	r12,GPR12(r1)
+	REST_GPRS(10, 12, r1)
 	mtspr	\scratch,r0
 	std	r10,\paca_ex+EX_R10(r13);
 	std	r11,\paca_ex+EX_R11(r13);
 	ld	r10,_NIP(r1)
 	ld	r11,_MSR(r1)
-	ld	r0,GPR0(r1)
-	ld	r1,GPR1(r1)
+	REST_GPRS(0, 1, r1)
 	mtspr	\srr0,r10
 	mtspr	\srr1,r11
 	ld	r10,\paca_ex+EX_R10(r13)
@@ -372,16 +369,15 @@ ret_from_mc_except:
 /* Core exception code for all exceptions except TLB misses.
*/ #define EXCEPTION_COMMON_LVL(n, scratch, excf) \ exc_##n##_common: \ - std r0,GPR0(r1);/* save r0 in stackframe */ \ - std r2,GPR2(r1);/* save r2 in stackframe */ \ - SAVE_GPRS(3, 9, r1);/* save r3 - r9 in stackframe */\ + SAVE_GPR(0, r1);/* save r0 in stackframe */ \ + SAVE_GPRS(2, 9, r1);/* save r2 - r9 in stackframe */\ std r10,_NIP(r1); /* save SRR0 to stackframe */ \ std r11,_MSR(r1); /* save SRR1 to stackframe */ \ beq 2f; /* if from kernel mode */ \ 2: ld r3,excf+EX_R10(r13);/* get back r10 */ \ ld r4,excf+EX_R11(r13);/* get back r11 */ \ mfspr r5,scratch; /* get back r13 */ \ - std r12,GPR12(r1); /* save r12 in stackframe */\ + SAVE_GPR(12, r1); /* save r12 in stackframe */\ ld r2,PACATOC(r13);/* get kernel TOC into r2 */\ mflrr6; /* save LR in stackframe */ \ mfctr r7; /* save CTR in stackframe */\ @@ -390,7 +386,7 @@ exc_##n##_common: \ lwz r10,excf+EX_CR(r13);/* load orig CR back from PACA */ \ lbz r11,PACAIRQSOFTMASK(r13); /* get current IRQ softe */ \ ld r12,exception_marker@toc(r2); \ - li r0,0; \ + ZEROIZE_GPR(0); \ std r3,GPR10(r1); /* save r10 to stackframe */\ std r4,GPR11(r1); /* save r11 to stackframe */\ std r5,GPR13(r1); /* save it to stackframe */ \ @@ -1056,15 +1052,14 @@ bad_stack_book3e: mfspr r11,SPRN_ESR std r10,_DEAR(r1) std r11,_ESR(r1) - std r0,GPR0(r1);/* save r0 in stackframe */ \ - std r2,GPR2(r1);/* save r2 in stackframe */ \ - SAVE_GPRS(3, 9, r1);/* save r3 - r9 in stackframe */\ + SAVE_GPR(0, r1);/* save r0 in stackframe */ \ + SAVE_GPRS(2, 9, r1);/* save r2 - r9 in stackframe */\ ld r3,PACA_EXGEN+EX_R10(r13);/* get back r10 */\ ld r4,PACA_EXGEN+EX_R11(r13);/* get back r11 */\ mfspr r5,SPRN_SPRG_GEN_SCRATCH;/* get back r13 XXX can be wrong */ \ std r3,GPR10(r1); /* save r10 to stackframe */\ std r4,GPR11(r1); /* save r11 to stackframe */\ - std r12,GPR12(r1); /* save r12 in stackframe */\ + SAVE_GPR(12, r1); /* save r12 in stackframe */\ std r5,GPR13(r1); /* save it to stackframe */ \ mflrr10 mfctr r11 @@ -1072,12 +1067,12 @@ 
bad_stack_book3e: std r10,_LINK(r1) std r11,_CTR(r1) std r12,_XER(r1) - SAVE_GPRS(14, 31, r1) + SAVE_NVGPRS(r1) lhz r12,PACA_TRAP_SAVE(
[PATCH 13/23] powerpc: Remove direct call to mmap2 syscall handlers
Syscall handlers should not be invoked internally by their symbol names, as these symbols are defined by the architecture-defined SYSCALL_DEFINE macro. Move the compatibility syscall definition for mmap2 to syscalls.c, so that all mmap implementations can share a helper function.

Remove 'inline' on static mmap helper.

Signed-off-by: Rohan McLure
Reviewed-by: Nicholas Piggin
---
V1 -> V2: Move mmap2 compat implementation to asm/kernel/syscalls.c.
V3 -> V4: Move to be applied before syscall wrapper introduced.
V4 -> V5: Remove 'inline' in helper.
---
 arch/powerpc/kernel/sys_ppc32.c | 9 -
 arch/powerpc/kernel/syscalls.c | 17 ++---
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index d961634976d8..776ae7565fc5 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -25,7 +25,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -48,14 +47,6 @@
 #include
 #include

-unsigned long compat_sys_mmap2(unsigned long addr, size_t len,
-			       unsigned long prot, unsigned long flags,
-			       unsigned long fd, unsigned long pgoff)
-{
-	/* This should remain 12 even if PAGE_SIZE changes */
-	return sys_mmap(addr, len, prot, flags, fd, pgoff << 12);
-}
-
 compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count,
 				  u32 reg6, u32 pos1, u32 pos2)
 {
diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index a04c97faa21a..9830957498b0 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -36,9 +36,9 @@
 #include
 #include

-static inline long do_mmap2(unsigned long addr, size_t len,
-			    unsigned long prot, unsigned long flags,
-			    unsigned long fd, unsigned long off, int shift)
+static long do_mmap2(unsigned long addr, size_t len,
+		     unsigned long prot, unsigned long flags,
+		     unsigned long fd, unsigned long off, int shift)
 {
 	if (!arch_validate_prot(prot, addr))
 		return -EINVAL;
@@ -56,6 +56,17 @@
SYSCALL_DEFINE6(mmap2, unsigned long, addr, size_t, len,
 	return do_mmap2(addr, len, prot, flags, fd, pgoff, PAGE_SHIFT-12);
 }

+#ifdef CONFIG_COMPAT
+COMPAT_SYSCALL_DEFINE6(mmap2,
+		       unsigned long, addr, size_t, len,
+		       unsigned long, prot, unsigned long, flags,
+		       unsigned long, fd, unsigned long, pgoff)
+{
+	/* This should remain 12 even if PAGE_SIZE changes */
+	return do_mmap2(addr, len, prot, flags, fd, pgoff << 12, PAGE_SHIFT);
+}
+#endif
+
 SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len,
 		unsigned long, prot, unsigned long, flags,
 		unsigned long, fd, off_t, offset)
--
2.34.1
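The offset handling shared by the mmap entry points can be sketched in isolation. This is demo code (names invented; PAGE_SHIFT fixed at 16 here so the conversion is visible):

```c
#define DEMO_PAGE_SHIFT 16	/* pretend 64K pages */

/* Model of do_mmap2(): reject offsets not aligned to PAGE_SIZE, then
 * convert the caller's offset into PAGE_SIZE units. */
static long demo_do_mmap2(unsigned long off, int shift)
{
	if (shift) {
		if (off & ((1UL << shift) - 1))
			return -22;		/* -EINVAL */
		off >>= shift;
	}
	return (long)off;	/* stand-in for ksys_mmap_pgoff(..., off) */
}

/* mmap2: userspace supplies the offset in 4K units whatever PAGE_SIZE is,
 * so only the residual shift to PAGE_SIZE units is needed */
static long demo_mmap2(unsigned long pgoff_4k)
{
	return demo_do_mmap2(pgoff_4k, DEMO_PAGE_SHIFT - 12);
}

/* mmap: userspace supplies a byte offset */
static long demo_mmap(unsigned long offset)
{
	return demo_do_mmap2(offset, DEMO_PAGE_SHIFT);
}
```

Sharing one helper keeps the alignment check and the unit conversion in a single place for the native, legacy, and compat entry points.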
[PATCH 01/23] powerpc: Remove asmlinkage from syscall handler definitions
The asmlinkage macro has no special meaning in powerpc, and prior to this patch is used sporadically on some syscall handler definitions. On architectures that do not define asmlinkage, it resolves to extern "C" for C++ compilers and a nop otherwise. The current invocations of asmlinkage provide far from complete support for C++ toolchains, and so the macro serves no purpose in powerpc. Remove all invocations of asmlinkage in arch/powerpc. These incidentally only occur in syscall definitions and prototypes. Signed-off-by: Rohan McLure Reviewed-by: Nicholas Piggin Reviewed-by: Andrew Donnellan --- V2 -> V3: new patch --- arch/powerpc/include/asm/syscalls.h | 16 arch/powerpc/kernel/sys_ppc32.c | 8 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index a2b13e55254f..21c2faaa2957 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -10,14 +10,14 @@ struct rtas_args; -asmlinkage long sys_mmap(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, off_t offset); -asmlinkage long sys_mmap2(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, unsigned long pgoff); -asmlinkage long ppc64_personality(unsigned long personality); -asmlinkage long sys_rtas(struct rtas_args __user *uargs); +long sys_mmap(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, off_t offset); +long sys_mmap2(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, unsigned long pgoff); +long ppc64_personality(unsigned long personality); +long sys_rtas(struct rtas_args __user *uargs); int ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct __kernel_old_timeval __user *tvp); long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, diff --git 
a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c index 16ff0399a257..f4edcc9489fb 100644 --- a/arch/powerpc/kernel/sys_ppc32.c +++ b/arch/powerpc/kernel/sys_ppc32.c @@ -85,20 +85,20 @@ compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u3 return ksys_readahead(fd, merge_64(offset1, offset2), count); } -asmlinkage int compat_sys_truncate64(const char __user * path, u32 reg4, +int compat_sys_truncate64(const char __user * path, u32 reg4, unsigned long len1, unsigned long len2) { return ksys_truncate(path, merge_64(len1, len2)); } -asmlinkage long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, +long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, u32 len1, u32 len2) { return ksys_fallocate(fd, mode, ((loff_t)offset1 << 32) | offset2, merge_64(len1, len2)); } -asmlinkage int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, +int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, unsigned long len2) { return ksys_ftruncate(fd, merge_64(len1, len2)); @@ -111,7 +111,7 @@ long ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, advice); } -asmlinkage long compat_sys_sync_file_range2(int fd, unsigned int flags, +long compat_sys_sync_file_range2(int fd, unsigned int flags, unsigned offset1, unsigned offset2, unsigned nbytes1, unsigned nbytes2) { -- 2.34.1
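The generic fallback being described looks roughly like the following (paraphrasing include/linux/linkage.h, not quoting it exactly): with a C compiler, asmlinkage expands to nothing, so removing it changes nothing for the kernel's toolchain.

```c
#ifdef __cplusplus
#define CPP_ASMLINKAGE extern "C"
#else
#define CPP_ASMLINKAGE
#endif

#ifndef asmlinkage
#define asmlinkage CPP_ASMLINKAGE
#endif

/* Under C this declaration is identical to `long demo_handler(long x)` */
asmlinkage long demo_handler(long x)
{
	return x + 1;
}
```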
[PATCH 03/23] powerpc: Add ZEROIZE_GPRS macros for register clears
Provide register zeroing macros, following the same convention as existing register stack save/restore macros, to be used in later change to concisely zero a sequence of consecutive gprs. The resulting macros are called ZEROIZE_GPRS and ZEROIZE_NVGPRS, keeping with the naming of the accompanying restore and save macros, and usage of zeroize to describe this operation elsewhere in the kernel. Signed-off-by: Rohan McLure Reviewed-by: Nicholas Piggin --- V1 -> V2: Change 'ZERO' usage in naming to 'NULLIFY', a more obvious verb V2 -> V3: Change 'NULLIFY' usage in naming to 'ZEROIZE', which has precedent in kernel and explicitly specifies that we are zeroing. V3 -> V4: Update commit message to use zeroize. V4 -> V5: The reason for the patch is to add zeroize macros. Move that to first paragraph in patch description. --- arch/powerpc/include/asm/ppc_asm.h | 22 ++ 1 file changed, 22 insertions(+) diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h index 83c02f5a7f2a..b95689ada59c 100644 --- a/arch/powerpc/include/asm/ppc_asm.h +++ b/arch/powerpc/include/asm/ppc_asm.h @@ -33,6 +33,20 @@ .endr .endm +/* + * This expands to a sequence of register clears for regs start to end + * inclusive, of the form: + * + * li rN, 0 + */ +.macro ZEROIZE_REGS start, end + .Lreg=\start + .rept (\end - \start + 1) + li .Lreg, 0 + .Lreg=.Lreg+1 + .endr +.endm + /* * Macros for storing registers into and loading registers from * exception frames. @@ -49,6 +63,14 @@ #define REST_NVGPRS(base) REST_GPRS(13, 31, base) #endif +#defineZEROIZE_GPRS(start, end)ZEROIZE_REGS start, end +#ifdef __powerpc64__ +#defineZEROIZE_NVGPRS()ZEROIZE_GPRS(14, 31) +#else +#defineZEROIZE_NVGPRS()ZEROIZE_GPRS(13, 31) +#endif +#defineZEROIZE_GPR(n) ZEROIZE_GPRS(n, n) + #define SAVE_GPR(n, base) SAVE_GPRS(n, n, base) #define REST_GPR(n, base) REST_GPRS(n, n, base) -- 2.34.1
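As a concrete illustration (a hand expansion, not assembler output — the actual macro refers to registers by bare number via the .Lreg symbol), ZEROIZE_GPRS(9, 12) expands through ZEROIZE_REGS to:

```asm
	li	r9, 0
	li	r10, 0
	li	r11, 0
	li	r12, 0
```

This mirrors how SAVE_GPRS/REST_GPRS already name a register range, so a reader can see at a glance exactly which registers are being cleared.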
[PATCH 16/23] powerpc: Include all arch-specific syscall prototypes
Forward declare all syscall handler prototypes where a generic prototype is not provided in either linux/syscalls.h or linux/compat.h in asm/syscalls.h. This is required for compile-time type-checking for syscall handlers, which is implemented later in this series. 32-bit compatibility syscall handlers are expressed in terms of types in ppc32.h. Expose this header globally. Signed-off-by: Rohan McLure --- V1 -> V2: Explicitly include prototypes. V2 -> V3: Remove extraneous #include and ppc_fallocate prototype. Rename header. V4 -> V5: Clean. Elaborate comment on long long munging. Remove prototype hiding conditional on SYSCALL_WRAPPER. --- arch/powerpc/include/asm/syscalls.h | 97 ++ .../ppc32.h => include/asm/syscalls_32.h}| 0 arch/powerpc/kernel/signal_32.c | 2 +- arch/powerpc/perf/callchain_32.c | 2 +- 4 files changed, 77 insertions(+), 24 deletions(-) diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 525d2aa0c8ca..5d106acf7906 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -8,6 +8,14 @@ #include #include +#ifdef CONFIG_PPC64 +#include +#endif +#include +#include + +struct rtas_args; + /* * long long munging: * The 32 bit ABI passes long longs in an odd even register pair. 
@@ -20,44 +28,89 @@ #define merge_64(high, low) ((u64)high << 32) | low #endif -struct rtas_args; +long sys_ni_syscall(void); + +/* + * PowerPC architecture-specific syscalls + */ + +long sys_rtas(struct rtas_args __user *uargs); + +#ifdef CONFIG_PPC64 +long sys_ppc64_personality(unsigned long personality); +#ifdef CONFIG_COMPAT +long compat_sys_ppc64_personality(unsigned long personality); +#endif /* CONFIG_COMPAT */ +#endif /* CONFIG_PPC64 */ +long sys_swapcontext(struct ucontext __user *old_ctx, +struct ucontext __user *new_ctx, long ctx_size); long sys_mmap(unsigned long addr, size_t len, unsigned long prot, unsigned long flags, unsigned long fd, off_t offset); long sys_mmap2(unsigned long addr, size_t len, unsigned long prot, unsigned long flags, unsigned long fd, unsigned long pgoff); -long sys_ppc64_personality(unsigned long personality); -long sys_rtas(struct rtas_args __user *uargs); -long sys_ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, - u32 len_high, u32 len_low); +long sys_switch_endian(void); -#ifdef CONFIG_COMPAT -unsigned long compat_sys_mmap2(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, unsigned long pgoff); - -compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count, - u32 reg6, u32 pos1, u32 pos2); +#ifdef CONFIG_PPC32 +long sys_sigreturn(void); +long sys_debug_setcontext(struct ucontext __user *ctx, int ndbg, + struct sig_dbg_op __user *dbg); +#endif -compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, compat_size_t count, - u32 reg6, u32 pos1, u32 pos2); +long sys_rt_sigreturn(void); -compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u32 count); +long sys_subpage_prot(unsigned long addr, + unsigned long len, u32 __user *map); -int compat_sys_truncate64(const char __user *path, u32 reg4, - unsigned long len1, unsigned long len2); +#ifdef CONFIG_COMPAT +long compat_sys_swapcontext(struct 
ucontext32 __user *old_ctx, + struct ucontext32 __user *new_ctx, + int ctx_size); +long compat_sys_old_getrlimit(unsigned int resource, + struct compat_rlimit __user *rlim); +long compat_sys_sigreturn(void); +long compat_sys_rt_sigreturn(void); +#endif /* CONFIG_COMPAT */ -int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, - unsigned long len2); +/* + * Architecture specific signatures required by long long munging: + * The 32 bit ABI passes long longs in an odd even register pair. + * The following signatures provide a machine long parameter for + * each register that will be supplied. The implementation is + * responsible for combining parameter pairs. + */ +#ifdef CONFIG_COMPAT +long compat_sys_mmap2(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, unsigned long pgoff); +long compat_sys_ppc_pread64(unsigned int fd, + char __user *ubuf, compat_size_t count, + u32 reg6, u32 pos1, u32 pos2); +long compat_sys_ppc_pwrite64(unsigned int fd, +const char __user *ubuf, c
[PATCH 23/23] powerpc/64e: Clear gprs on interrupt routine entry on Book3E
Zero GPRS r14-r31 on entry into the kernel for interrupt sources to
limit influence of user-space values in potential speculation gadgets.
Prior to this commit, all other GPRS are reassigned during the common
prologue to interrupt handlers and so need not be zeroised explicitly.

This may be done safely, without loss of register state prior to the
interrupt, as the common prologue saves the initial values of
non-volatiles, which are unconditionally restored in interrupt_64.S.
Mitigation defaults to enabled by INTERRUPT_SANITIZE_REGISTERS.

Signed-off-by: Rohan McLure
---
V3 -> V4: New patch.
V4 -> V5: Depend on Kconfig option. Remove ZEROIZE_NVGPRS on bad kernel
stack handler.
---
 arch/powerpc/kernel/exceptions-64e.S | 8 +++++++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 48c640ca425d..61748769ea29 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -365,6 +365,11 @@ ret_from_mc_except:
 	std	r14,PACA_EXMC+EX_R14(r13);				\
 	std	r15,PACA_EXMC+EX_R15(r13)
 
+#ifdef CONFIG_INTERRUPT_SANITIZE_REGISTERS
+#define SANITIZE_NVGPRS		ZEROIZE_NVGPRS()
+#else
+#define SANITIZE_NVGPRS
+#endif
 
 /* Core exception code for all exceptions except TLB misses. */
 #define EXCEPTION_COMMON_LVL(n, scratch, excf)				\
@@ -401,7 +406,8 @@ exc_##n##_common:					\
 	std	r12,STACK_FRAME_OVERHEAD-16(r1); /* mark the frame */	\
 	std	r3,_TRAP(r1);		/* set trap number */		\
 	std	r0,RESULT(r1);		/* clear regs->result */	\
-	SAVE_NVGPRS(r1);
+	SAVE_NVGPRS(r1);						\
+	SANITIZE_NVGPRS;	/* minimise speculation influence */

 #define EXCEPTION_COMMON(n) \
 	EXCEPTION_COMMON_LVL(n, SPRN_SPRG_GEN_SCRATCH, PACA_EXGEN)
--
2.34.1
Re: [PATCH v4 19/20] powerpc/64s: Clear gprs on interrupt routine entry in Book3S
> On 12 Sep 2022, at 10:15 pm, Nicholas Piggin wrote: > > On Wed Aug 24, 2022 at 12:05 PM AEST, Rohan McLure wrote: >> Zero GPRS r0, r2-r11, r14-r31, on entry into the kernel for all >> other interrupt sources to limit influence of user-space values >> in potential speculation gadgets. The remaining gprs are overwritten by >> entry macros to interrupt handlers, irrespective of whether or not a >> given handler consumes these register values. >> >> Prior to this commit, r14-r31 are restored on a per-interrupt basis at >> exit, but now they are always restored. Remove explicit REST_NVGPRS >> invocations as non-volatiles must now always be restored. 32-bit systems >> do not clear user registers on interrupt, and continue to depend on the >> return value of interrupt_exit_user_prepare to determine whether or not >> to restore non-volatiles. >> >> The mmap_bench benchmark in selftests should rapidly invoke pagefaults. >> See ~0.8% performance regression with this mitigation, but this >> indicates the worst-case performance due to heavier-weight interrupt >> handlers. > > Ow, my heart :( > > Are we not keeping a CONFIG option to rid ourselves of this vile > performance robbing thing? Are we getting rid of the whole > _TIF_RESTOREALL thing too, or does PPC32 want to keep it? I see no reason not to include a CONFIG option for this mitigation here other than simplicity. Any suggestions for a name? I’m thinking PPC64_SANITIZE_INTERRUPTS. Defaults on Book3E_64, optional on Book3S_64. >> >> Signed-off-by: Rohan McLure >> --- >> V1 -> V2: Add benchmark data >> V2 -> V3: Use ZEROIZE_GPR{,S} macro renames, clarify >> interrupt_exit_user_prepare changes in summary. 
>> --- >> arch/powerpc/kernel/exceptions-64s.S | 21 - >> arch/powerpc/kernel/interrupt_64.S | 9 ++--- >> 2 files changed, 10 insertions(+), 20 deletions(-) >> >> diff --git a/arch/powerpc/kernel/exceptions-64s.S >> b/arch/powerpc/kernel/exceptions-64s.S >> index a3b51441b039..038e42fb2182 100644 >> --- a/arch/powerpc/kernel/exceptions-64s.S >> +++ b/arch/powerpc/kernel/exceptions-64s.S >> @@ -502,6 +502,7 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real, text) >> std r10,0(r1) /* make stack chain pointer */ >> std r0,GPR0(r1) /* save r0 in stackframe*/ >> std r10,GPR1(r1)/* save r1 in stackframe*/ >> +ZEROIZE_GPR(0) >> >> /* Mark our [H]SRRs valid for return */ >> li r10,1 >> @@ -538,14 +539,18 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >> ld r10,IAREA+EX_R10(r13) >> std r9,GPR9(r1) >> std r10,GPR10(r1) >> +ZEROIZE_GPRS(9, 10) > > You use 9/10 right afterwards, this'd have to move down to where > you zero r11 at least. > >> ld r9,IAREA+EX_R11(r13)/* move r11 - r13 to stackframe */ >> ld r10,IAREA+EX_R12(r13) >> ld r11,IAREA+EX_R13(r13) >> std r9,GPR11(r1) >> std r10,GPR12(r1) >> std r11,GPR13(r1) >> +/* keep r12 ([H]SRR1/MSR), r13 (PACA) for interrupt routine */ >> +ZEROIZE_GPR(11) > > Kernel always has to keep r13 so no need to comment that. Keeping r11, > is that for those annoying fp_unavailable etc handlers? > > There's probably not much a user can do with this, given they're set > from the MSR. Use can influence some bits of its MSR though. So long > as we're being paranoid, you could add an IOPTION to retain r11 only for > the handlers that need it, or have them load it from MSR and zero it > here. Good suggestion. Presume you’re referring to r12 here. I might go the IOPTION route. 
> > Thanks, > Nick > >> >> SAVE_NVGPRS(r1) >> +ZEROIZE_NVGPRS() >> >> .if IDAR >> .if IISIDE >> @@ -577,8 +582,8 @@ BEGIN_FTR_SECTION >> END_FTR_SECTION_IFSET(CPU_FTR_CFAR) >> ld r10,IAREA+EX_CTR(r13) >> std r10,_CTR(r1) >> -std r2,GPR2(r1) /* save r2 in stackframe*/ >> -SAVE_GPRS(3, 8, r1) /* save r3 - r8 in stackframe */ >> +SAVE_GPRS(2, 8, r1) /* save r2 - r8 in stackframe */ >> +ZEROIZE_GPRS(2, 8) >> mflrr9 /* Get LR, later save to stack */ >> ld r2,PACATOC(r13) /* get kernel TOC into r2 */ >> std r9,_LINK(r1) >> @@ -696,6 +701,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR) >> mtlrr9 >> ld
Re: [PATCH v4 11/20] powerpc: Add ZEROIZE_GPRS macros for register clears
> On 12 Sep 2022, at 9:09 pm, Nicholas Piggin wrote:
> 
> On Wed Aug 24, 2022 at 12:05 PM AEST, Rohan McLure wrote:
>> Macros for restoring and saving registers to and from the stack exist.
>> Provide macros with the same interface for clearing a range of gprs by
>> setting each register's value in that range to zero.
> 
> Can I bikeshed this a bit more and ask if you would change the order?
> 
> Saving and restoring macros are incidental, and are neither the change
> nor the reason for it. I think the changelog reads better if you state
> the change (or the problem) up front.
> 
> Provide register zeroing macros ... follow the same convention as
> existing register stack save/restore macros ... will be used in later
> change to zero registers.

Thanks for the suggestion, that should read better.

> 
> Or not, if you disagree.
> 
> Reviewed-by: Nicholas Piggin
> 
>> The resulting macros are called ZEROIZE_GPRS and ZEROIZE_NVGPRS, keeping
>> with the naming of the accompanying restore and save macros, and usage
>> of zeroize to describe this operation elsewhere in the kernel.
>>
>> Signed-off-by: Rohan McLure
>> ---
>> V1 -> V2: Change 'ZERO' usage in naming to 'NULLIFY', a more obvious verb
>> V2 -> V3: Change 'NULLIFY' usage in naming to 'ZEROIZE', which has
>> precedent in kernel and explicitly specifies that we are zeroing.
>> V3 -> V4: Update commit message to use zeroize.
>> --- >> arch/powerpc/include/asm/ppc_asm.h | 22 ++ >> 1 file changed, 22 insertions(+) >> >> diff --git a/arch/powerpc/include/asm/ppc_asm.h >> b/arch/powerpc/include/asm/ppc_asm.h >> index 83c02f5a7f2a..b95689ada59c 100644 >> --- a/arch/powerpc/include/asm/ppc_asm.h >> +++ b/arch/powerpc/include/asm/ppc_asm.h >> @@ -33,6 +33,20 @@ >> .endr >> .endm >> >> +/* >> + * This expands to a sequence of register clears for regs start to end >> + * inclusive, of the form: >> + * >> + * li rN, 0 >> + */ >> +.macro ZEROIZE_REGS start, end >> +.Lreg=\start >> +.rept (\end - \start + 1) >> +li .Lreg, 0 >> +.Lreg=.Lreg+1 >> +.endr >> +.endm >> + >> /* >> * Macros for storing registers into and loading registers from >> * exception frames. >> @@ -49,6 +63,14 @@ >> #define REST_NVGPRS(base)REST_GPRS(13, 31, base) >> #endif >> >> +#define ZEROIZE_GPRS(start, end)ZEROIZE_REGS start, end >> +#ifdef __powerpc64__ >> +#define ZEROIZE_NVGPRS()ZEROIZE_GPRS(14, 31) >> +#else >> +#define ZEROIZE_NVGPRS()ZEROIZE_GPRS(13, 31) >> +#endif >> +#define ZEROIZE_GPR(n) ZEROIZE_GPRS(n, n) >> + >> #define SAVE_GPR(n, base)SAVE_GPRS(n, n, base) >> #define REST_GPR(n, base)REST_GPRS(n, n, base) >> >> -- >> 2.34.1 >
Re: [PATCH v4 10/20] powerpc: Use common syscall handler type
> On 12 Sep 2022, at 8:56 pm, Nicholas Piggin wrote: > > On Wed Aug 24, 2022 at 12:05 PM AEST, Rohan McLure wrote: >> Cause syscall handlers to be typed as follows when called indirectly >> throughout the kernel. >> >> typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, >> unsigned long, unsigned long, unsigned long); > > The point is... better type checking? > >> >> Since both 32 and 64-bit abis allow for at least the first six >> machine-word length parameters to a function to be passed by registers, >> even handlers which admit fewer than six parameters may be viewed as >> having the above type. >> >> Fixup comparisons in VDSO to avoid pointer-integer comparison. Introduce >> explicit cast on systems with SPUs. >> >> Signed-off-by: Rohan McLure >> --- >> V1 -> V2: New patch. >> V2 -> V3: Remove unnecessary cast from const syscall_fn to syscall_fn >> --- >> arch/powerpc/include/asm/syscall.h | 7 +-- >> arch/powerpc/include/asm/syscalls.h | 1 + >> arch/powerpc/kernel/systbl.c| 6 +++--- >> arch/powerpc/kernel/vdso.c | 4 ++-- >> arch/powerpc/platforms/cell/spu_callbacks.c | 6 +++--- >> 5 files changed, 14 insertions(+), 10 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/syscall.h >> b/arch/powerpc/include/asm/syscall.h >> index 25fc8ad9a27a..d2a8dfd5de33 100644 >> --- a/arch/powerpc/include/asm/syscall.h >> +++ b/arch/powerpc/include/asm/syscall.h >> @@ -14,9 +14,12 @@ >> #include >> #include >> >> +typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, >> + unsigned long, unsigned long, unsigned long); >> + >> /* ftrace syscalls requires exporting the sys_call_table */ >> -extern const unsigned long sys_call_table[]; >> -extern const unsigned long compat_sys_call_table[]; >> +extern const syscall_fn sys_call_table[]; >> +extern const syscall_fn compat_sys_call_table[]; > > Ah you constify it in this patch. 
> I think the previous patch should have
> kept the const, and it should keep the unsigned long type rather than
> use void *. Either that or do this patch first.
> 
>> static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs)
>> {
>> diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h
>> index 91417dee534e..e979b7593d2b 100644
>> --- a/arch/powerpc/include/asm/syscalls.h
>> +++ b/arch/powerpc/include/asm/syscalls.h
>> @@ -8,6 +8,7 @@
>>  #include
>>  #include
>>
>> +#include
>>  #ifdef CONFIG_PPC64
>>  #include
>>  #endif
> 
> Is this necessary or should be in another patch?

Good spot. This belongs in the patch that produces systbl.c.

> 
>> diff --git a/arch/powerpc/kernel/systbl.c b/arch/powerpc/kernel/systbl.c
>> index 99ffdfef6b9c..b88a9c2a1f50 100644
>> --- a/arch/powerpc/kernel/systbl.c
>> +++ b/arch/powerpc/kernel/systbl.c
>> @@ -21,10 +21,10 @@
>>  #define __SYSCALL(nr, entry) [nr] = __powerpc_##entry,
>>  #define __powerpc_sys_ni_syscall sys_ni_syscall
>>  #else
>> -#define __SYSCALL(nr, entry) [nr] = entry,
>> +#define __SYSCALL(nr, entry) [nr] = (void *) entry,
>>  #endif
> 
> Also perhaps this should have been in the prior patch and this patch
> should change the cast from void to syscall_fn?

This cast to (void *) kicks in when casting functions with six or fewer
parameters to the six-parameter type accepting and returning u64. Sadly
I can't find a way to avoid -Wcast-function-type even with
(__force syscall_fn), short of an ugly cast to void * here. Any
suggestions?
> >> >> -void *sys_call_table[] = { >> +const syscall_fn sys_call_table[] = { >> #ifdef CONFIG_PPC64 >> #include >> #else >> @@ -35,7 +35,7 @@ void *sys_call_table[] = { >> #ifdef CONFIG_COMPAT >> #undef __SYSCALL_WITH_COMPAT >> #define __SYSCALL_WITH_COMPAT(nr, native, compat)__SYSCALL(nr, compat) >> -void *compat_sys_call_table[] = { >> +const syscall_fn compat_sys_call_table[] = { >> #include >> }; >> #endif /* CONFIG_COMPAT */ >> diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c >> index 0da287544054..080c9e7de0c0 100644 >> --- a/arch/powerpc/kernel/vdso.c >> +++ b/arch/powerpc/kernel/vdso.c >> @@ -313,10 +313,10 @@ static void __init vdso_setup_syscall_map(void) >> unsigned int i; >> >> for (i = 0; i < NR_syscalls; i++) { >> -if (sys_ca
Re: [PATCH v4 06/20] powerpc: Remove direct call to mmap2 syscall handlers
> On 12 Sep 2022, at 7:47 pm, Nicholas Piggin wrote:
> 
> On Wed Aug 24, 2022 at 12:05 PM AEST, Rohan McLure wrote:
>> Syscall handlers should not be invoked internally by their symbol names,
>> as these symbols are defined by the architecture-defined SYSCALL_DEFINE
>> macro. Move the compatibility syscall definition for mmap2 to
>> syscalls.c, so that all mmap implementations can share an inline helper
>> function, as is done with the personality handlers.
>>
>> Signed-off-by: Rohan McLure
> 
> Is there any point to keeping sys_ppc32.c at all? Might as well move them
> all to syscall.c IMO.

Currently it serves as a fairly arbitrary distinction between compat
calls and others, noting that a compat variant of personality is in
syscalls.c. May as well get rid of it.

> Reviewed-by: Nicholas Piggin
> 
>> ---
>> V1 -> V2: Move mmap2 compat implementation to asm/kernel/syscalls.c.
>> V3 -> V4: Move to be applied before syscall wrapper introduced.
>> ---
>>  arch/powerpc/kernel/sys_ppc32.c | 9 ---------
>>  arch/powerpc/kernel/syscalls.c  | 11 +++++++++++
>>  2 files changed, 11 insertions(+), 9 deletions(-)
>>
>> diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
>> index f4edcc9489fb..bc6491ed6454 100644
>> --- a/arch/powerpc/kernel/sys_ppc32.c
>> +++ b/arch/powerpc/kernel/sys_ppc32.c
>> @@ -25,7 +25,6 @@
>>  #include
>>  #include
>>  #include
>> -#include
>>  #include
>>  #include
>>  #include
>> @@ -48,14 +47,6 @@
>>  #include
>>  #include
>>
>> -unsigned long compat_sys_mmap2(unsigned long addr, size_t len,
>> -			       unsigned long prot, unsigned long flags,
>> -			       unsigned long fd, unsigned long pgoff)
>> -{
>> -	/* This should remain 12 even if PAGE_SIZE changes */
>> -	return sys_mmap(addr, len, prot, flags, fd, pgoff << 12);
>> -}
>> -
>>  /*
>>   * long long munging:
>>   * The 32 bit ABI passes long longs in an odd even register pair.
>> diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c >> index b8461128c8f7..32fadf3c2cd3 100644 >> --- a/arch/powerpc/kernel/syscalls.c >> +++ b/arch/powerpc/kernel/syscalls.c >> @@ -56,6 +56,17 @@ SYSCALL_DEFINE6(mmap2, unsigned long, addr, size_t, len, >> return do_mmap2(addr, len, prot, flags, fd, pgoff, PAGE_SHIFT-12); >> } >> >> +#ifdef CONFIG_COMPAT >> +COMPAT_SYSCALL_DEFINE6(mmap2, >> + unsigned long, addr, size_t, len, >> + unsigned long, prot, unsigned long, flags, >> + unsigned long, fd, unsigned long, pgoff) >> +{ >> +/* This should remain 12 even if PAGE_SIZE changes */ >> +return do_mmap2(addr, len, prot, flags, fd, pgoff << 12, PAGE_SHIFT-12); >> +} >> +#endif >> + >> SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len, >> unsigned long, prot, unsigned long, flags, >> unsigned long, fd, off_t, offset) >> -- >> 2.34.1 >
Re: [PATCH v4 03/20] powerpc/32: Remove powerpc select specialisation
> On 12 Sep 2022, at 7:03 pm, Nicholas Piggin wrote:
> 
> On Wed Aug 24, 2022 at 12:05 PM AEST, Rohan McLure wrote:
>> Syscall #82 has been implemented for 32-bit platforms in a unique way on
>> powerpc systems. This hack will in effect guess whether the caller is
>> expecting new select semantics or old select semantics. It does so via a
>> guess, based off the first parameter. In new select, this parameter
>> represents the length of a user-memory array of file descriptors, and in
>> old select this is a pointer to an arguments structure.
>>
>> The heuristic simply interprets sufficiently large values of its first
>> parameter as being a call to old select. The following is a discussion
>> on how this syscall should be handled.
>>
>> Link: https://lore.kernel.org/lkml/13737de5-0eb7-e881-9af0-163b0d29a...@csgroup.eu/
> 
> Seems okay to me, probably Christophe needs to ack it.
> Should some of that history be included directly in this changelog?
> 
> Should ppc64 compat be added back too, if this is being updated instead
> of removed? I don't know much about compat but it seems odd not to provide
> it (considering it's just using compat_sys_old_select, isn't it?)

That would make sense to me. I'll put that in syscall.tbl.

> Reviewed-by: Nicholas Piggin
> 
>>
>> As discussed in this thread, the existence of such a hack suggests that for
>> whatever powerpc binaries may predate glibc, it is most likely that they
>> would have taken use of the old select semantics. x86 and arm64 both
>> implement this syscall with old select semantics.
>>
>> Remove the powerpc implementation, and update syscall.tbl to emit a
>> reference to sys_old_select for 32-bit binaries, in keeping with how
>> other architectures support syscall #82.
>>
>> Signed-off-by: Rohan McLure
>> ---
>> V1 -> V2: Remove arch-specific select handler
>> V2 -> V3: Remove ppc_old_select prototype in .
Move to >> earlier in patch series >> --- >> arch/powerpc/include/asm/syscalls.h | 2 -- >> arch/powerpc/kernel/syscalls.c| 17 - >> arch/powerpc/kernel/syscalls/syscall.tbl | 2 +- >> .../arch/powerpc/entry/syscalls/syscall.tbl | 2 +- >> 4 files changed, 2 insertions(+), 21 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/syscalls.h >> b/arch/powerpc/include/asm/syscalls.h >> index 675a8f5ec3ca..739498c358a1 100644 >> --- a/arch/powerpc/include/asm/syscalls.h >> +++ b/arch/powerpc/include/asm/syscalls.h >> @@ -18,8 +18,6 @@ long sys_mmap2(unsigned long addr, size_t len, >> unsigned long fd, unsigned long pgoff); >> long ppc64_personality(unsigned long personality); >> long sys_rtas(struct rtas_args __user *uargs); >> -int ppc_select(int n, fd_set __user *inp, fd_set __user *outp, >> - fd_set __user *exp, struct __kernel_old_timeval __user *tvp); >> long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, >>u32 len_high, u32 len_low); >> >> diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c >> index fc999140bc27..ef5896bee818 100644 >> --- a/arch/powerpc/kernel/syscalls.c >> +++ b/arch/powerpc/kernel/syscalls.c >> @@ -63,23 +63,6 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len, >> return do_mmap2(addr, len, prot, flags, fd, offset, PAGE_SHIFT); >> } >> >> -#ifdef CONFIG_PPC32 >> -/* >> - * Due to some executables calling the wrong select we sometimes >> - * get wrong args. This determines how the args are being passed >> - * (a single ptr to them all args passed) then calls >> - * sys_select() with the appropriate args. 
-- Cort >> - */ >> -int >> -ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user >> *exp, struct __kernel_old_timeval __user *tvp) >> -{ >> -if ((unsigned long)n >= 4096) >> -return sys_old_select((void __user *)n); >> - >> -return sys_select(n, inp, outp, exp, tvp); >> -} >> -#endif >> - >> #ifdef CONFIG_PPC64 >> long ppc64_personality(unsigned long personality) >> { >> diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl >> b/arch/powerpc/kernel/syscalls/syscall.tbl >> index 2600b4237292..4cbbb810ae10 100644 >> --- a/arch/powerpc/kernel/syscalls/syscall.tbl >> +++ b/arch/powerpc/kernel/syscalls/syscall.tbl >> @@ -110,7 +110,7 @@ >> 79 common settimeofdaysys_settimeofday >> compat_sys_settimeofday >> 8
Re: [PATCH v4 08/20] powerpc: Include all arch-specific syscall prototypes
> On 12 Sep 2022, at 8:33 pm, Nicholas Piggin wrote: > > On Wed Aug 24, 2022 at 12:05 PM AEST, Rohan McLure wrote: >> Forward declare all syscall handler prototypes where a generic prototype >> is not provided in either linux/syscalls.h or linux/compat.h in >> asm/syscalls.h. This is required for compile-time type-checking for >> syscall handlers, which is implemented later in this series. >> >> 32-bit compatibility syscall handlers are expressed in terms of types in >> ppc32.h. Expose this header globally. >> >> Signed-off-by: Rohan McLure >> --- >> V1 -> V2: Explicitly include prototypes. >> V2 -> V3: Remove extraneous #include and ppc_fallocate >> prototype. Rename header. >> --- >> arch/powerpc/include/asm/syscalls.h | 90 +- >> .../ppc32.h => include/asm/syscalls_32.h}| 0 >> arch/powerpc/kernel/signal_32.c | 2 +- >> arch/powerpc/perf/callchain_32.c | 2 +- >> 4 files changed, 70 insertions(+), 24 deletions(-) >> >> diff --git a/arch/powerpc/include/asm/syscalls.h >> b/arch/powerpc/include/asm/syscalls.h >> index 3e3aff0835a6..91417dee534e 100644 >> --- a/arch/powerpc/include/asm/syscalls.h >> +++ b/arch/powerpc/include/asm/syscalls.h >> @@ -8,45 +8,91 @@ >> #include >> #include >> >> +#ifdef CONFIG_PPC64 >> +#include >> +#endif >> +#include >> +#include >> + >> struct rtas_args; >> >> +#ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER > > Do you need this ifdef? Good spot. I’ll introduce that with syscall wrappers. 
>> + >> +/* >> + * PowerPC architecture-specific syscalls >> + */ >> + >> +long sys_rtas(struct rtas_args __user *uargs); >> +long sys_ni_syscall(void); >> + >> +#ifdef CONFIG_PPC64 >> +long sys_ppc64_personality(unsigned long personality); >> +#ifdef CONFIG_COMPAT >> +long compat_sys_ppc64_personality(unsigned long personality); >> +#endif /* CONFIG_COMPAT */ >> +#endif /* CONFIG_PPC64 */ >> + >> +/* Parameters are reordered for powerpc to avoid padding */ >> +long sys_ppc_fadvise64_64(int fd, int advice, >> + u32 offset_high, u32 offset_low, >> + u32 len_high, u32 len_low); > > Should this be under PPC32 since you're adding the ifdefs? > > Because you added a new comment here... This is register padding > to do with something, even/odd pair calling convention? I can't > remember the details would you be able to expand the comment a bit > because I'm sure I'll forget it again too. Cool. I’ll neaten this up and change that comment. > > Thanks, > Nick
Re: [PATCH 3/3] powerpc: mm: support page table check
> On 12 Sep 2022, at 4:11 pm, Christophe Leroy wrote:
> 
> On 12/09/2022 at 03:47, Rohan McLure wrote:
>> On creation and clearing of a page table mapping, instrument such calls
>> by invoking page_table_check_pte_set and page_table_check_pte_clear
>> respectively. These calls serve as a sanity check against illegal
>> mappings.
>>
>> Enable ARCH_SUPPORTS_PAGE_TABLE_CHECK for all ppc64, and 32-bit
>> platforms implementing Book3S.
> 
> Why only book3s on 32 bits?

Sorry. I failed to update that commit message. This patch instead
supports page table checks on all platforms, but I began writing this
patch series to target just Book3S, and then updated it to include all
platforms. The only barrier to doing so was the need for the pud_pfn
and page_table_check_pud_{clear,set} bloat.

>> +++ b/arch/powerpc/include/asm/pgtable.h
>> @@ -166,7 +166,11 @@ static inline int pud_pfn(pud_t pud)
>>  	 * check so this should never be used. If it grows another user we
>>  	 * want to know about it.
>>  	 */
>> +#ifndef CONFIG_PAGE_TABLE_CHECK
>>  	BUILD_BUG();
>> +#else
>> +	BUG();
>> +#endif
> 
> Awful.

Quite right. I suspect you can infer the intention here, which is to
enforce that this dead code must not be included anywhere in generic
code, but rather be gated by pud_devmap. I will relax this to a WARN().
Re: [PATCH 2/3] powerpc: mm: add p{te,md,ud}_user_accessible_page helpers
>> +static inline bool pmd_user_accessible_page(pmd_t pmd)
>> +{
>> +	return pmd_is_leaf(pmd) && pmd_present(pmd)
>> +		&& pte_user(pmd_pte(pmd));
> 
> The && goes on the previous line.
> By the way, there is a great tool that can help you:
> 
> $ ./arch/powerpc/tools/checkpatch.sh --strict -g HEAD~

Thank you. Yes I should be using --strict with checkpatch.

> 
>> +}
>> +
>> +#else
>> +
>> +static inline bool pmd_user_accessible_page(pmd_t pmd)
>> +{
>> +	BUG();
> 
> Can you use BUILD_BUG() instead?

As it stands, p{m,u}d_user_accessible_page is invoked by
__page_table_check_p{m,u}d_{set,clear}. So unfortunately this function
will have to be compiled, but its body could be replaced with a WARN()
and return false on 32-bit.
Re: [PATCH 1/3] powerpc: mm: move pud_pfn stub to common pgtable header
This patch and its successor would be avoidable if architectures could specify that they wish to use page_table_check_p{ud,md}_{clear,set}. > On 12 Sep 2022, at 11:47 am, Rohan McLure wrote: > > The pud_pfn inline call is only referenced on 64-bit Book3S systems, > but its invocations are gated by pud_devmap() invocations, rendering the > body of this function as dead code. > > As such, this function is readily exportable to all platforms in the > instance where kernel features depend on it at least being defined. > > Signed-off-by: Rohan McLure > --- > arch/powerpc/include/asm/book3s/64/pgtable.h | 10 -- > arch/powerpc/include/asm/pgtable.h | 12 > 2 files changed, 12 insertions(+), 10 deletions(-) > > diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h > b/arch/powerpc/include/asm/book3s/64/pgtable.h > index 392ff48f77df..8874f2a3661d 100644 > --- a/arch/powerpc/include/asm/book3s/64/pgtable.h > +++ b/arch/powerpc/include/asm/book3s/64/pgtable.h > @@ -1411,16 +1411,6 @@ static inline int pgd_devmap(pgd_t pgd) > } > #endif /* CONFIG_TRANSPARENT_HUGEPAGE */ > > -static inline int pud_pfn(pud_t pud) > -{ > - /* > - * Currently all calls to pud_pfn() are gated around a pud_devmap() > - * check so this should never be used. If it grows another user we > - * want to know about it. 
> - */ > - BUILD_BUG(); > - return 0; > -} > #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION > pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *); > void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long, > diff --git a/arch/powerpc/include/asm/pgtable.h > b/arch/powerpc/include/asm/pgtable.h > index 33f4bf8d22b0..522145b16a07 100644 > --- a/arch/powerpc/include/asm/pgtable.h > +++ b/arch/powerpc/include/asm/pgtable.h > @@ -158,6 +158,18 @@ struct seq_file; > void arch_report_meminfo(struct seq_file *m); > #endif /* CONFIG_PPC64 */ > > +#define pud_pfn pud_pfn > +static inline int pud_pfn(pud_t pud) > +{ > + /* > + * Currently all calls to pud_pfn() are gated around a pud_devmap() > + * check so this should never be used. If it grows another user we > + * want to know about it. > + */ > + BUILD_BUG(); > + return 0; > +} > + > #endif /* __ASSEMBLY__ */ > > #endif /* _ASM_POWERPC_PGTABLE_H */ > -- > 2.34.1 >
[PATCH 3/3] powerpc: mm: support page table check
On creation and clearing of a page table mapping, instrument such calls
by invoking page_table_check_pte_set and page_table_check_pte_clear
respectively. These calls serve as a sanity check against illegal
mappings.

Enable ARCH_SUPPORTS_PAGE_TABLE_CHECK for all ppc64, and 32-bit
platforms implementing Book3S.

Change pud_pfn to be a runtime bug rather than a build bug as it is
consumed by page_table_check_pud_{clear,set} which are not called.

See also:

riscv support in commit 3fee229a8eb9 ("riscv/mm: enable
ARCH_SUPPORTS_PAGE_TABLE_CHECK")
arm64 in commit 42b2547137f5 ("arm64/mm: enable
ARCH_SUPPORTS_PAGE_TABLE_CHECK")
x86_64 in commit d283d422c6c4 ("x86: mm: add x86_64 support for page
table check")

Signed-off-by: Rohan McLure
---
 arch/powerpc/Kconfig                         | 1 +
 arch/powerpc/include/asm/book3s/32/pgtable.h | 7 ++++++-
 arch/powerpc/include/asm/book3s/64/pgtable.h | 9 ++++++++-
 arch/powerpc/include/asm/nohash/32/pgtable.h | 5 ++++-
 arch/powerpc/include/asm/nohash/64/pgtable.h | 2 ++
 arch/powerpc/include/asm/nohash/pgtable.h    | 1 +
 arch/powerpc/include/asm/pgtable.h           | 4 ++++
 7 files changed, 26 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 4c466acdc70d..6c213ac46a92 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -149,6 +149,7 @@ config PPC
 	select ARCH_STACKWALK
 	select ARCH_SUPPORTS_ATOMIC_RMW
 	select ARCH_SUPPORTS_DEBUG_PAGEALLOC	if PPC_BOOK3S || PPC_8xx || 40x
+	select ARCH_SUPPORTS_PAGE_TABLE_CHECK
 	select ARCH_USE_BUILTIN_BSWAP
 	select ARCH_USE_CMPXCHG_LOCKREF		if PPC64
 	select ARCH_USE_MEMTEST
diff --git a/arch/powerpc/include/asm/book3s/32/pgtable.h b/arch/powerpc/include/asm/book3s/32/pgtable.h
index 40041ac713d9..3d05c8fe4604 100644
--- a/arch/powerpc/include/asm/book3s/32/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/32/pgtable.h
@@ -53,6 +53,8 @@
 
 #ifndef __ASSEMBLY__
 
+#include
+
 static inline bool pte_user(pte_t pte)
 {
 	return pte_val(pte) & _PAGE_USER;
@@ -353,7 +355,9 @@ static inline int __ptep_test_and_clear_young(struct mm_struct *mm,
 static inline pte_t ptep_get_and_clear(struct mm_struct *mm, unsigned long addr,
 				       pte_t *ptep)
 {
-	return __pte(pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0));
+	unsigned long old = pte_update(mm, addr, ptep, ~_PAGE_HASHPTE, 0, 0);
+	page_table_check_pte_clear(mm, addr, __pte(old));
+	return __pte(old);
 }
 
 #define __HAVE_ARCH_PTEP_SET_WRPROTECT
@@ -541,6 +545,7 @@ static inline pte_t pte_modify(pte_t pte, pgprot_t newprot)
 static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
 				pte_t *ptep, pte_t pte, int percpu)
 {
+	page_table_check_pte_set(mm, addr, ptep, pte);
 #if defined(CONFIG_SMP) && !defined(CONFIG_PTE_64BIT)
 	/* First case is 32-bit Hash MMU in SMP mode with 32-bit PTEs. We use the
 	 * helper pte_update() which does an atomic update. We need to do that
diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 8874f2a3661d..cbb7bd99c897 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -179,6 +179,8 @@
 #define PAGE_AGP		(PAGE_KERNEL_NC)
 
 #ifndef __ASSEMBLY__
+#include
+
 /*
  * page table defines
  */
@@ -483,6 +485,7 @@ static inline pte_t ptep_get_and_clear(struct mm_struct *mm,
 				       unsigned long addr, pte_t *ptep)
 {
 	unsigned long old = pte_update(mm, addr, ptep, ~0UL, 0, 0);
+	page_table_check_pte_clear(mm, addr, __pte(old));
 	return __pte(old);
 }
 
@@ -491,12 +494,15 @@ static inline pte_t ptep_get_and_clear_full(struct mm_struct *mm,
 					    unsigned long addr,
 					    pte_t *ptep, int full)
 {
+	pte_t old_pte;
 	if (full && radix_enabled()) {
 		/*
 		 * We know that this is a full mm pte clear and
 		 * hence can be sure there is no parallel set_pte.
 		 */
-		return radix__ptep_get_and_clear_full(mm, addr, ptep, full);
+		old_pte = radix__ptep_get_and_clear_full(mm, addr, ptep, full);
+		page_table_check_pte_clear(mm, addr, old_pte);
+		return old_pte;
 	}
 	return ptep_get_and_clear(mm, addr, ptep);
 }
@@ -872,6 +878,7 @@ static inline void __set_pte_at(struct mm_struct *mm, unsigned long addr,
 	 */
 	pte = __pte_raw(pte_raw(pte) | cpu_to_be64(_PAGE_PTE));
 
+	page_table_check_pte_set(mm, addr, ptep, pte);
 	if (radix_enabled())
 		return radix__set_pte_at(mm, addr, ptep, pte, percpu);
 	return hash__set_pte_at(mm, addr, ptep, pte, percpu
[PATCH 2/3] powerpc: mm: add p{te,md,ud}_user_accessible_page helpers
Add the following helpers for detecting whether a page table entry is a
leaf and is accessible to user space:

 * pte_user_accessible_page
 * pmd_user_accessible_page
 * pud_user_accessible_page

The heavy lifting is done by pte_user, provided prior to this commit,
which checks user accessibility at a per-MMU level.

On 32-bit systems, provide stub implementations for these methods, with
BUG(), as debug features such as page table checks will emit functions
that call p{md,ud}_user_accessible_page but must not be used.

Signed-off-by: Rohan McLure
---
 arch/powerpc/include/asm/pgtable.h | 35
 1 file changed, 35 insertions(+)

diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 522145b16a07..8c1f5feb9360 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -170,6 +170,41 @@ static inline int pud_pfn(pud_t pud)
 	return 0;
 }
 
+static inline bool pte_user_accessible_page(pte_t pte)
+{
+	return (pte_val(pte) & _PAGE_PRESENT) && pte_user(pte);
+}
+
+#ifdef CONFIG_PPC64
+
+static inline bool pmd_user_accessible_page(pmd_t pmd)
+{
+	return pmd_is_leaf(pmd) && pmd_present(pmd)
+		&& pte_user(pmd_pte(pmd));
+}
+
+static inline bool pud_user_accessible_page(pud_t pud)
+{
+	return pud_is_leaf(pud) && pud_present(pud)
+		&& pte_user(pud_pte(pud));
+}
+
+#else
+
+static inline bool pmd_user_accessible_page(pmd_t pmd)
+{
+	BUG();
+	return false;
+}
+
+static inline bool pud_user_accessible_page(pud_t pud)
+{
+	BUG();
+	return false;
+}
+
+#endif /* CONFIG_PPC64 */
+
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_PGTABLE_H */
-- 
2.34.1
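A userspace sketch may make concrete what these predicates compute. The bit values and the struct below are made up for illustration — the real checks are per-MMU, which is why the patch delegates to pte_user() — but the "present AND user" shape is the same:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Simplified userspace model of pte_user_accessible_page (bit values
 * are illustrative, not powerpc's): a PTE maps a user-accessible page
 * iff it is present and marked user-readable. */
#define _PAGE_PRESENT 0x001UL
#define _PAGE_USER    0x004UL

typedef struct { uint64_t val; } pte_t;

static bool pte_user(pte_t pte)
{
	return pte.val & _PAGE_USER;
}

static bool pte_user_accessible_page(pte_t pte)
{
	return (pte.val & _PAGE_PRESENT) && pte_user(pte);
}
```

The pmd/pud variants add a leaf check on top, since a non-leaf entry points at a lower page-table level rather than at a page.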
[PATCH 1/3] powerpc: mm: move pud_pfn stub to common pgtable header
The pud_pfn inline function is only referenced on 64-bit Book3S systems,
and its invocations are gated by pud_devmap() checks, rendering the body
of this function dead code. As such, this function can readily be
exported to all platforms for cases where kernel features depend on it
at least being defined.

Signed-off-by: Rohan McLure
---
 arch/powerpc/include/asm/book3s/64/pgtable.h | 10 --
 arch/powerpc/include/asm/pgtable.h | 12
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/arch/powerpc/include/asm/book3s/64/pgtable.h b/arch/powerpc/include/asm/book3s/64/pgtable.h
index 392ff48f77df..8874f2a3661d 100644
--- a/arch/powerpc/include/asm/book3s/64/pgtable.h
+++ b/arch/powerpc/include/asm/book3s/64/pgtable.h
@@ -1411,16 +1411,6 @@ static inline int pgd_devmap(pgd_t pgd)
 }
 #endif /* CONFIG_TRANSPARENT_HUGEPAGE */
 
-static inline int pud_pfn(pud_t pud)
-{
-	/*
-	 * Currently all calls to pud_pfn() are gated around a pud_devmap()
-	 * check so this should never be used. If it grows another user we
-	 * want to know about it.
-	 */
-	BUILD_BUG();
-	return 0;
-}
 #define __HAVE_ARCH_PTEP_MODIFY_PROT_TRANSACTION
 pte_t ptep_modify_prot_start(struct vm_area_struct *, unsigned long, pte_t *);
 void ptep_modify_prot_commit(struct vm_area_struct *, unsigned long,
diff --git a/arch/powerpc/include/asm/pgtable.h b/arch/powerpc/include/asm/pgtable.h
index 33f4bf8d22b0..522145b16a07 100644
--- a/arch/powerpc/include/asm/pgtable.h
+++ b/arch/powerpc/include/asm/pgtable.h
@@ -158,6 +158,18 @@ struct seq_file;
 void arch_report_meminfo(struct seq_file *m);
 #endif /* CONFIG_PPC64 */
 
+#define pud_pfn pud_pfn
+static inline int pud_pfn(pud_t pud)
+{
+	/*
+	 * Currently all calls to pud_pfn() are gated around a pud_devmap()
+	 * check so this should never be used. If it grows another user we
+	 * want to know about it.
+	 */
+	BUILD_BUG();
+	return 0;
+}
+
 #endif /* __ASSEMBLY__ */
 #endif /* _ASM_POWERPC_PGTABLE_H */
-- 
2.34.1
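The BUILD_BUG() stub above relies on a "gated stub" pattern: the function may be referenced, but every caller guards the call behind a predicate that is false, so the call is never reached. A userspace analogue (illustrative names and return values, not kernel code — abort() stands in for the compile-time failure BUILD_BUG() provides):

```c
#include <assert.h>
#include <stdlib.h>

/* Userspace analogue of the gated-stub pattern: pud_pfn_model() must
 * never execute, because every caller first checks pud_devmap_model(),
 * which is false on this configuration. */
static int pud_devmap_model(void)
{
	return 0;	/* no devmap support in this model */
}

static int pud_pfn_model(void)
{
	abort();	/* unreachable: all callers gate on the check above */
}

static int pfn_for_devmap_pud(void)
{
	if (pud_devmap_model())
		return pud_pfn_model();
	return -1;	/* no pfn available */
}
```

The kernel variant is stronger: BUILD_BUG() fails the build if the compiler cannot eliminate the call, so a new ungated caller is caught at compile time rather than at runtime.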
Re: [PATCH v4 00/20] powerpc: Syscall wrapper and register clearing
Any comments for this revision? Hopefully these revisions address 32-bit
and embedded systems appropriately.

Thanks,

Rohan

> On 24 Aug 2022, at 12:05 pm, Rohan McLure wrote:
> 
> V3 available here:
> 
> Link:
> https://lore.kernel.org/all/4c3a8815-67ff-41eb-a703-981920ca1...@linux.ibm.com/T/
> 
> Implement a syscall wrapper, causing arguments to handlers to be passed
> via a struct pt_regs on the stack. The syscall wrapper is implemented
> for all platforms other than the Cell processor, whose SPUs expect the
> ability to directly call syscall handler symbols with the regular
> in-register calling convention.
> 
> Adopting syscall wrappers requires redefinition of architecture-specific
> syscalls and compatibility syscalls to use the SYSCALL_DEFINE and
> COMPAT_SYSCALL_DEFINE macros, as well as removal of direct references to
> the emitted syscall-handler symbols from within the kernel. This work
> led to the following modernisations of powerpc's syscall handlers:
> 
> - Replace syscall 82 semantics with sys_old_select and remove the
>   ppc_select handler, which features direct calls to both sys_old_select
>   and sys_select.
> - Use a generic fallocate compatibility syscall.
> 
> Replace the asm implementation of the syscall table with a C
> implementation for more compile-time checks.
> 
> Many compatibility syscalls are candidates to be removed in favour of
> generically defined handlers, but exhibit different parameter orderings
> and numberings due to 32-bit ABI support for 64-bit parameters. The
> parameter reorderings are however consistent with arm. A future patch
> series will serve to modernise syscalls by providing generic
> implementations featuring these reorderings.
> 
> The design of this syscall wrapper is very similar to the s390, x86 and
> arm64 implementations. See also commit 4378a7d4be30 ("arm64: implement
> syscall wrappers").
> The motivation for this change is that it allows register state to be
> cleared when entering the kernel via interrupt handlers on 64-bit
> servers. This serves to reduce the influence of values in registers
> carried over from the interrupted process, e.g. syscall parameters from
> user space, or user state at the site of a pagefault. All values in
> registers are saved and zeroized at the entry to an interrupt handler
> and restored afterward. While this may sound like a heavy-weight
> mitigation, many GPRs are already saved and restored on handling of an
> interrupt, and the mmap_bench benchmark on a Power 9 guest, repeatedly
> invoking the pagefault handler, suggests at most a ~0.8% regression in
> performance. Realistic workloads are not constantly producing
> interrupts, and so this does not indicate a realistic slowdown.
> 
> Using wrapped syscalls yields a performance improvement of ~5.6% on
> the null_syscall benchmark on pseries guests, by removing the need for
> system_call_exception to allocate its own stack frame. This amortises
> the additional costs of saving and restoring non-volatile registers
> (register clearing is cheap on superscalar platforms), and so the
> final mitigation actually yields a net performance improvement of ~0.6%
> on the null_syscall benchmark.
> 
> Patch Changelog:
> 
> - Fix instances where NULLIFY_GPRS were still present.
> - Minimise unrecoverable windows in entry_32.S between SRR0/1 restores
>   and RFI.
> - Remove all references to syscall symbols prior to introducing the
>   syscall wrapper.
> - Remove unnecessary duplication of syscall handlers with sys_... and
>   powerpc_sys_... symbols.
> - Clear non-volatile registers on Book3E systems, as some of these
>   systems feature hardware speculation, and we already unconditionally
>   restore NVGPRS.
> 
> Rohan McLure (20):
>   powerpc: Remove asmlinkage from syscall handler definitions
>   powerpc: Use generic fallocate compatibility syscall
>   powerpc/32: Remove powerpc select specialisation
>   powerpc: Provide do_ppc64_personality helper
>   powerpc: Remove direct call to personality syscall handler
>   powerpc: Remove direct call to mmap2 syscall handlers
>   powerpc: Adopt SYSCALL_DEFINE for arch-specific syscall handlers
>   powerpc: Include all arch-specific syscall prototypes
>   powerpc: Enable compile-time check for syscall handlers
>   powerpc: Use common syscall handler type
>   powerpc: Add ZEROIZE_GPRS macros for register clears
>   Revert "powerpc/syscall: Save r3 in regs->orig_r3"
>   powerpc: Provide syscall wrapper
>   powerpc/64s: Clear/restore caller gprs in syscall interrupt/return
>   powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlers
>   powerpc/32: Clarify interrupt restores with REST_GPR macro in
>     entry_32.S
>   powerpc/64e: Clarify register saves and clear
[PATCH v4 19/20] powerpc/64s: Clear gprs on interrupt routine entry in Book3S
Zero GPRs r0, r2-r11, r14-r31 on entry into the kernel for all other
interrupt sources, to limit the influence of user-space values in
potential speculation gadgets. The remaining GPRs are overwritten by
entry macros to interrupt handlers, irrespective of whether or not a
given handler consumes these register values.

Prior to this commit, r14-r31 are restored on a per-interrupt basis at
exit, but now they are always restored. Remove explicit REST_NVGPRS
invocations, as non-volatiles must now always be restored.

32-bit systems do not clear user registers on interrupt, and continue to
depend on the return value of interrupt_exit_user_prepare to determine
whether or not to restore non-volatiles.

The mmap_bench benchmark in selftests should rapidly invoke pagefaults.
We see a ~0.8% performance regression with this mitigation, but this
indicates the worst-case performance due to heavier-weight interrupt
handlers.

Signed-off-by: Rohan McLure
---
V1 -> V2: Add benchmark data
V2 -> V3: Use ZEROIZE_GPR{,S} macro renames, clarify
interrupt_exit_user_prepare changes in summary.
--- arch/powerpc/kernel/exceptions-64s.S | 21 - arch/powerpc/kernel/interrupt_64.S | 9 ++--- 2 files changed, 10 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index a3b51441b039..038e42fb2182 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -502,6 +502,7 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real, text) std r10,0(r1) /* make stack chain pointer */ std r0,GPR0(r1) /* save r0 in stackframe*/ std r10,GPR1(r1)/* save r1 in stackframe*/ + ZEROIZE_GPR(0) /* Mark our [H]SRRs valid for return */ li r10,1 @@ -538,14 +539,18 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r10,IAREA+EX_R10(r13) std r9,GPR9(r1) std r10,GPR10(r1) + ZEROIZE_GPRS(9, 10) ld r9,IAREA+EX_R11(r13)/* move r11 - r13 to stackframe */ ld r10,IAREA+EX_R12(r13) ld r11,IAREA+EX_R13(r13) std r9,GPR11(r1) std r10,GPR12(r1) std r11,GPR13(r1) + /* keep r12 ([H]SRR1/MSR), r13 (PACA) for interrupt routine */ + ZEROIZE_GPR(11) SAVE_NVGPRS(r1) + ZEROIZE_NVGPRS() .if IDAR .if IISIDE @@ -577,8 +582,8 @@ BEGIN_FTR_SECTION END_FTR_SECTION_IFSET(CPU_FTR_CFAR) ld r10,IAREA+EX_CTR(r13) std r10,_CTR(r1) - std r2,GPR2(r1) /* save r2 in stackframe*/ - SAVE_GPRS(3, 8, r1) /* save r3 - r8 in stackframe */ + SAVE_GPRS(2, 8, r1) /* save r2 - r8 in stackframe */ + ZEROIZE_GPRS(2, 8) mflrr9 /* Get LR, later save to stack */ ld r2,PACATOC(r13) /* get kernel TOC into r2 */ std r9,_LINK(r1) @@ -696,6 +701,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR) mtlrr9 ld r9,_CCR(r1) mtcrr9 + REST_NVGPRS(r1) REST_GPRS(2, 13, r1) REST_GPR(0, r1) /* restore original r1. */ @@ -1368,11 +1374,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX) b interrupt_return_srr 1: bl do_break - /* -* do_break() may have changed the NV GPRS while handling a breakpoint. -* If so, we need to restore them with their updated values. 
-*/ - REST_NVGPRS(r1) b interrupt_return_srr @@ -1598,7 +1599,6 @@ EXC_COMMON_BEGIN(alignment_common) GEN_COMMON alignment addir3,r1,STACK_FRAME_OVERHEAD bl alignment_exception - REST_NVGPRS(r1) /* instruction emulation may change GPRs */ b interrupt_return_srr @@ -1708,7 +1708,6 @@ EXC_COMMON_BEGIN(program_check_common) .Ldo_program_check: addir3,r1,STACK_FRAME_OVERHEAD bl program_check_exception - REST_NVGPRS(r1) /* instruction emulation may change GPRs */ b interrupt_return_srr @@ -2139,7 +2138,6 @@ EXC_COMMON_BEGIN(emulation_assist_common) GEN_COMMON emulation_assist addir3,r1,STACK_FRAME_OVERHEAD bl emulation_assist_interrupt - REST_NVGPRS(r1) /* instruction emulation may change GPRs */ b interrupt_return_hsrr @@ -2457,7 +2455,6 @@ EXC_COMMON_BEGIN(facility_unavailable_common) GEN_COMMON facility_unavailable addir3,r1,STACK_FRAME_OVERHEAD bl facility_unavailable_exception - REST_NVGPRS(r1) /* instruction emulation may change GPRs */ b interrupt_return_srr @@ -2485,7 +2482,6 @@ EXC_COMMON_BEGIN(h_facility_unavailable_common) GEN_COMMON h_facility_unavailable addir3,r1,STACK_FRAME_OVERHEAD
[PATCH v4 20/20] powerpc/64e: Clear gprs on interrupt routine entry
Zero GPRs r14-r31 on entry into the kernel for interrupt sources, to
limit the influence of user-space values in potential speculation
gadgets. Prior to this commit, all other GPRs are reassigned during the
common prologue to interrupt handlers and so need not be zeroized
explicitly.

This may be done safely, without loss of register state prior to the
interrupt, as the common prologue saves the initial values of
non-volatiles, which are unconditionally restored in interrupt_64.S.

Signed-off-by: Rohan McLure
---
V3 -> V4: New patch.
---
 arch/powerpc/kernel/exceptions-64e.S | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S
index 48c640ca425d..296b3bf6b2a6 100644
--- a/arch/powerpc/kernel/exceptions-64e.S
+++ b/arch/powerpc/kernel/exceptions-64e.S
@@ -401,7 +401,8 @@ exc_##n##_common:
 	std	r12,STACK_FRAME_OVERHEAD-16(r1); /* mark the frame */	\
 	std	r3,_TRAP(r1);	/* set trap number */			\
 	std	r0,RESULT(r1);	/* clear regs->result */		\
-	SAVE_NVGPRS(r1);
+	SAVE_NVGPRS(r1);						\
+	ZEROIZE_NVGPRS();	/* minimise speculation influence */
 
 #define EXCEPTION_COMMON(n) \
 	EXCEPTION_COMMON_LVL(n, SPRN_SPRG_GEN_SCRATCH, PACA_EXGEN)
@@ -1068,6 +1069,7 @@ bad_stack_book3e:
 	std	r11,_CTR(r1)
 	std	r12,_XER(r1)
 	SAVE_NVGPRS(r1)
+	ZEROIZE_NVGPRS()
 	lhz	r12,PACA_TRAP_SAVE(r13)
 	std	r12,_TRAP(r1)
 	addi	r11,r1,INT_FRAME_SIZE
-- 
2.34.1
[PATCH v4 14/20] powerpc/64s: Clear/restore caller gprs in syscall interrupt/return
Clear user state in gprs (assign to zero) to reduce the influence of user registers on speculation within kernel syscall handlers. Clears occur at the very beginning of the sc and scv 0 interrupt handlers, with restores occurring following the execution of the syscall handler. Signed-off-by: Rohan McLure --- V1 -> V2: Update summary V2 -> V3: Remove erroneous summary paragraph on syscall_exit_prepare V3 -> V4: Use ZEROIZE instead of NULLIFY --- arch/powerpc/kernel/interrupt_64.S | 22 ++ 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S index 0178aeba3820..a8065209dd8c 100644 --- a/arch/powerpc/kernel/interrupt_64.S +++ b/arch/powerpc/kernel/interrupt_64.S @@ -70,7 +70,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name) ld r2,PACATOC(r13) mfcrr12 li r11,0 - /* Can we avoid saving r3-r8 in common case? */ + /* Save syscall parameters in r3-r8 */ std r3,GPR3(r1) std r4,GPR4(r1) std r5,GPR5(r1) @@ -109,6 +109,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) * but this is the best we can do. */ + /* +* Zero user registers to prevent influencing speculative execution +* state of kernel code. +*/ + ZEROIZE_GPRS(5, 12) + ZEROIZE_NVGPRS() + /* Calling convention has r3 = orig r0, r4 = regs */ mr r3,r0 bl system_call_exception @@ -139,6 +146,7 @@ BEGIN_FTR_SECTION HMT_MEDIUM_LOW END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) + REST_NVGPRS(r1) cmpdi r3,0 bne .Lsyscall_vectored_\name\()_restore_regs @@ -181,7 +189,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r4,_LINK(r1) ld r5,_XER(r1) - REST_NVGPRS(r1) ld r0,GPR0(r1) mtcrr2 mtctr r3 @@ -249,7 +256,7 @@ END_BTB_FLUSH_SECTION ld r2,PACATOC(r13) mfcrr12 li r11,0 - /* Can we avoid saving r3-r8 in common case? */ + /* Save syscall parameters in r3-r8 */ std r3,GPR3(r1) std r4,GPR4(r1) std r5,GPR5(r1) @@ -300,6 +307,13 @@ END_BTB_FLUSH_SECTION wrteei 1 #endif + /* +* Zero user registers to prevent influencing speculative execution +* state of kernel code. 
+*/ + ZEROIZE_GPRS(5, 12) + ZEROIZE_NVGPRS() + /* Calling convention has r3 = orig r0, r4 = regs */ mr r3,r0 bl system_call_exception @@ -342,6 +356,7 @@ BEGIN_FTR_SECTION stdcx. r0,0,r1 /* to clear the reservation */ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) + REST_NVGPRS(r1) cmpdi r3,0 bne .Lsyscall_restore_regs /* Zero volatile regs that may contain sensitive kernel data */ @@ -377,7 +392,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) .Lsyscall_restore_regs: ld r3,_CTR(r1) ld r4,_XER(r1) - REST_NVGPRS(r1) mtctr r3 mtspr SPRN_XER,r4 ld r0,GPR0(r1) -- 2.34.1
[PATCH v4 18/20] powerpc/64s: Fix comment on interrupt handler prologue
Interrupt handlers on 64s systems will often need to save register state
from the interrupted process to make space for loading special purpose
registers or for internal state.

Fix a comment documenting a common code path macro at the beginning of
interrupt handlers, where r10 is saved to the PACA to afford space for
the value of the CFAR. The comment is currently written as if r10-r12
are saved to the PACA, but in fact only r10 is saved there, with r11-r12
saved much later. The distance in code between these saves has grown
over the many revisions of this macro. Fix this by signalling with a
comment where r11-r12 are saved to the PACA.

Signed-off-by: Rohan McLure
---
V1 -> V2: Given its own commit
V2 -> V3: Annotate r11-r12 save locations with comment.
---
 arch/powerpc/kernel/exceptions-64s.S | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S
index 3d0dc133a9ae..a3b51441b039 100644
--- a/arch/powerpc/kernel/exceptions-64s.S
+++ b/arch/powerpc/kernel/exceptions-64s.S
@@ -281,7 +281,7 @@ BEGIN_FTR_SECTION
 	mfspr	r9,SPRN_PPR
 END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR)
 	HMT_MEDIUM
-	std	r10,IAREA+EX_R10(r13)		/* save r10 - r12 */
+	std	r10,IAREA+EX_R10(r13)		/* save r10 */
 	.if ICFAR
 BEGIN_FTR_SECTION
 	mfspr	r10,SPRN_CFAR
@@ -321,7 +321,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR)
 	mfctr	r10
 	std	r10,IAREA+EX_CTR(r13)
 	mfcr	r9
-	std	r11,IAREA+EX_R11(r13)
+	std	r11,IAREA+EX_R11(r13)		/* save r11 - r12 */
 	std	r12,IAREA+EX_R12(r13)
 
 	/*
-- 
2.34.1
[PATCH v4 16/20] powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.S
Restoring the register state of the interrupted thread involves issuing a large number of predictable loads to the kernel stack frame. Issue the REST_GPR{,S} macros to clearly signal when this is happening, and bunch together restores at the end of the interrupt handler where the saved value is not consumed earlier in the handler code. Signed-off-by: Rohan McLure Reported-by: Christophe Leroy --- V2 -> V3: New patch. V3 -> V4: Minimise restores in the unrecoverable window between restoring SRR0/1 and return from interrupt. --- arch/powerpc/kernel/entry_32.S | 33 +--- 1 file changed, 13 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index 8d6e02ef5dc0..963d646915fe 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -68,7 +68,7 @@ prepare_transfer_to_handler: lwz r9,_MSR(r11)/* if sleeping, clear MSR.EE */ rlwinm r9,r9,0,~MSR_EE lwz r12,_LINK(r11) /* and return to address in LR */ - lwz r2, GPR2(r11) + REST_GPR(2, r11) b fast_exception_return _ASM_NOKPROBE_SYMBOL(prepare_transfer_to_handler) #endif /* CONFIG_PPC_BOOK3S_32 || CONFIG_E500 */ @@ -144,7 +144,7 @@ ret_from_syscall: lwz r7,_NIP(r1) lwz r8,_MSR(r1) cmpwi r3,0 - lwz r3,GPR3(r1) + REST_GPR(3, r1) syscall_exit_finish: mtspr SPRN_SRR0,r7 mtspr SPRN_SRR1,r8 @@ -152,8 +152,8 @@ syscall_exit_finish: bne 3f mtcrr5 -1: lwz r2,GPR2(r1) - lwz r1,GPR1(r1) +1: REST_GPR(2, r1) + REST_GPR(1, r1) rfi #ifdef CONFIG_40x b . 
/* Prevent prefetch past rfi */ @@ -165,10 +165,8 @@ syscall_exit_finish: REST_NVGPRS(r1) mtctr r4 mtxer r5 - lwz r0,GPR0(r1) - lwz r3,GPR3(r1) - REST_GPRS(4, 11, r1) - lwz r12,GPR12(r1) + REST_GPR(0, r1) + REST_GPRS(3, 12, r1) b 1b #ifdef CONFIG_44x @@ -260,9 +258,8 @@ fast_exception_return: beq 3f /* if not, we've got problems */ #endif -2: REST_GPRS(3, 6, r11) - lwz r10,_CCR(r11) - REST_GPRS(1, 2, r11) +2: lwz r10,_CCR(r11) + REST_GPRS(1, 6, r11) mtcrr10 lwz r10,_LINK(r11) mtlrr10 @@ -277,7 +274,7 @@ fast_exception_return: mtspr SPRN_SRR0,r12 REST_GPR(9, r11) REST_GPR(12, r11) - lwz r11,GPR11(r11) + REST_GPR(11, r11) rfi #ifdef CONFIG_40x b . /* Prevent prefetch past rfi */ @@ -454,9 +451,8 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return) lwz r3,_MSR(r1);\ andi. r3,r3,MSR_PR; \ bne interrupt_return; \ - lwz r0,GPR0(r1);\ - lwz r2,GPR2(r1);\ - REST_GPRS(3, 8, r1);\ + REST_GPR(0, r1);\ + REST_GPRS(2, 8, r1);\ lwz r10,_XER(r1); \ lwz r11,_CTR(r1); \ mtspr SPRN_XER,r10; \ @@ -475,11 +471,8 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return) lwz r12,_MSR(r1); \ mtspr exc_lvl_srr0,r11; \ mtspr exc_lvl_srr1,r12; \ - lwz r9,GPR9(r1);\ - lwz r12,GPR12(r1); \ - lwz r10,GPR10(r1); \ - lwz r11,GPR11(r1); \ - lwz r1,GPR1(r1);\ + REST_GPRS(9, 12, r1); \ + REST_GPR(1, r1);\ exc_lvl_rfi;\ b .; /* prevent prefetch past exc_lvl_rfi */ -- 2.34.1
[PATCH v4 17/20] powerpc/64e: Clarify register saves and clears with {SAVE,ZEROIZE}_GPRS
The common interrupt handler prologue macro and the bad_stack trampolines include consecutive sequences of register saves, and some register clears. Neaten such instances by expanding use of the SAVE_GPRS macro and employing the ZEROIZE_GPR macro when appropriate. Also simplify an invocation of SAVE_GPRS targetting all non-volatile registers to SAVE_NVGPRS. Signed-off-by: Rohan Mclure --- V3 -> V4: New commit. --- arch/powerpc/kernel/exceptions-64e.S | 27 +++--- 1 file changed, 11 insertions(+), 16 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64e.S b/arch/powerpc/kernel/exceptions-64e.S index 67dc4e3179a0..48c640ca425d 100644 --- a/arch/powerpc/kernel/exceptions-64e.S +++ b/arch/powerpc/kernel/exceptions-64e.S @@ -216,17 +216,14 @@ END_FTR_SECTION_IFSET(CPU_FTR_EMB_HV) mtlrr10 mtcrr11 - ld r10,GPR10(r1) - ld r11,GPR11(r1) - ld r12,GPR12(r1) + REST_GPRS(10, 12, r1) mtspr \scratch,r0 std r10,\paca_ex+EX_R10(r13); std r11,\paca_ex+EX_R11(r13); ld r10,_NIP(r1) ld r11,_MSR(r1) - ld r0,GPR0(r1) - ld r1,GPR1(r1) + REST_GPRS(0, 1, r1) mtspr \srr0,r10 mtspr \srr1,r11 ld r10,\paca_ex+EX_R10(r13) @@ -372,16 +369,15 @@ ret_from_mc_except: /* Core exception code for all exceptions except TLB misses. 
*/ #define EXCEPTION_COMMON_LVL(n, scratch, excf) \ exc_##n##_common: \ - std r0,GPR0(r1);/* save r0 in stackframe */ \ - std r2,GPR2(r1);/* save r2 in stackframe */ \ - SAVE_GPRS(3, 9, r1);/* save r3 - r9 in stackframe */\ + SAVE_GPR(0, r1);/* save r0 in stackframe */ \ + SAVE_GPRS(2, 9, r1);/* save r2 - r9 in stackframe */\ std r10,_NIP(r1); /* save SRR0 to stackframe */ \ std r11,_MSR(r1); /* save SRR1 to stackframe */ \ beq 2f; /* if from kernel mode */ \ 2: ld r3,excf+EX_R10(r13);/* get back r10 */ \ ld r4,excf+EX_R11(r13);/* get back r11 */ \ mfspr r5,scratch; /* get back r13 */ \ - std r12,GPR12(r1); /* save r12 in stackframe */\ + SAVE_GPR(12, r1); /* save r12 in stackframe */\ ld r2,PACATOC(r13);/* get kernel TOC into r2 */\ mflrr6; /* save LR in stackframe */ \ mfctr r7; /* save CTR in stackframe */\ @@ -390,7 +386,7 @@ exc_##n##_common: \ lwz r10,excf+EX_CR(r13);/* load orig CR back from PACA */ \ lbz r11,PACAIRQSOFTMASK(r13); /* get current IRQ softe */ \ ld r12,exception_marker@toc(r2); \ - li r0,0; \ + ZEROIZE_GPR(0); \ std r3,GPR10(r1); /* save r10 to stackframe */\ std r4,GPR11(r1); /* save r11 to stackframe */\ std r5,GPR13(r1); /* save it to stackframe */ \ @@ -1056,15 +1052,14 @@ bad_stack_book3e: mfspr r11,SPRN_ESR std r10,_DEAR(r1) std r11,_ESR(r1) - std r0,GPR0(r1);/* save r0 in stackframe */ \ - std r2,GPR2(r1);/* save r2 in stackframe */ \ - SAVE_GPRS(3, 9, r1);/* save r3 - r9 in stackframe */\ + SAVE_GPR(0, r1);/* save r0 in stackframe */ \ + SAVE_GPRS(2, 9, r1);/* save r2 - r9 in stackframe */\ ld r3,PACA_EXGEN+EX_R10(r13);/* get back r10 */\ ld r4,PACA_EXGEN+EX_R11(r13);/* get back r11 */\ mfspr r5,SPRN_SPRG_GEN_SCRATCH;/* get back r13 XXX can be wrong */ \ std r3,GPR10(r1); /* save r10 to stackframe */\ std r4,GPR11(r1); /* save r11 to stackframe */\ - std r12,GPR12(r1); /* save r12 in stackframe */\ + SAVE_GPR(12, r1); /* save r12 in stackframe */\ std r5,GPR13(r1); /* save it to stackframe */ \ mflrr10 mfctr r11 @@ -1072,12 +1067,12 @@ 
bad_stack_book3e: std r10,_LINK(r1) std r11,_CTR(r1) std r12,_XER(r1) - SAVE_GPRS(14, 31, r1) + SAVE_NVGPRS(r1) lhz r12,PACA_TRAP_SAVE(r13) std r12,_TRAP
[PATCH v4 15/20] powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlers
Use the convenience macros for saving/clearing/restoring gprs in keeping with syscall calling conventions. The plural variants of these macros can store a range of registers for concision. This works well when the user gpr value we are hoping to save is still live. In the syscall interrupt handlers, user register state is sometimes juggled between registers. Hold-off from issuing the SAVE_GPR macro for applicable neighbouring lines to highlight the delicate register save logic. Signed-off-by: Rohan McLure --- V1 -> V2: Update summary V2 -> V3: Update summary regarding exclusions for the SAVE_GPR marco. Acknowledge new name for ZEROIZE_GPR{,S} macros. --- arch/powerpc/kernel/interrupt_64.S | 43 ++-- 1 file changed, 9 insertions(+), 34 deletions(-) diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S index a8065209dd8c..ad302ad93433 100644 --- a/arch/powerpc/kernel/interrupt_64.S +++ b/arch/powerpc/kernel/interrupt_64.S @@ -71,12 +71,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name) mfcrr12 li r11,0 /* Save syscall parameters in r3-r8 */ - std r3,GPR3(r1) - std r4,GPR4(r1) - std r5,GPR5(r1) - std r6,GPR6(r1) - std r7,GPR7(r1) - std r8,GPR8(r1) + SAVE_GPRS(3, 8, r1) /* Zero r9-r12, this should only be required when restoring all GPRs */ std r11,GPR9(r1) std r11,GPR10(r1) @@ -157,17 +152,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) /* Could zero these as per ABI, but we may consider a stricter ABI * which preserves these if libc implementations can benefit, so * restore them for now until further measurement is done. 
*/ - ld r0,GPR0(r1) - ld r4,GPR4(r1) - ld r5,GPR5(r1) - ld r6,GPR6(r1) - ld r7,GPR7(r1) - ld r8,GPR8(r1) + REST_GPR(0, r1) + REST_GPRS(4, 8, r1) /* Zero volatile regs that may contain sensitive kernel data */ - li r9,0 - li r10,0 - li r11,0 - li r12,0 + ZEROIZE_GPRS(9, 12) mtspr SPRN_XER,r0 /* @@ -189,7 +177,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r4,_LINK(r1) ld r5,_XER(r1) - ld r0,GPR0(r1) + REST_GPR(0, r1) mtcrr2 mtctr r3 mtlrr4 @@ -257,12 +245,7 @@ END_BTB_FLUSH_SECTION mfcrr12 li r11,0 /* Save syscall parameters in r3-r8 */ - std r3,GPR3(r1) - std r4,GPR4(r1) - std r5,GPR5(r1) - std r6,GPR6(r1) - std r7,GPR7(r1) - std r8,GPR8(r1) + SAVE_GPRS(3, 8, r1) /* Zero r9-r12, this should only be required when restoring all GPRs */ std r11,GPR9(r1) std r11,GPR10(r1) @@ -360,16 +343,8 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) cmpdi r3,0 bne .Lsyscall_restore_regs /* Zero volatile regs that may contain sensitive kernel data */ - li r0,0 - li r4,0 - li r5,0 - li r6,0 - li r7,0 - li r8,0 - li r9,0 - li r10,0 - li r11,0 - li r12,0 + ZEROIZE_GPR(0) + ZEROIZE_GPRS(4, 12) mtctr r0 mtspr SPRN_XER,r0 .Lsyscall_restore_regs_cont: @@ -394,7 +369,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r4,_XER(r1) mtctr r3 mtspr SPRN_XER,r4 - ld r0,GPR0(r1) + REST_GPR(0, r1) REST_GPRS(4, 12, r1) b .Lsyscall_restore_regs_cont .Lsyscall_rst_end: -- 2.34.1
[PATCH v4 13/20] powerpc: Provide syscall wrapper
Implement a syscall wrapper as per s390, x86 and arm64. When enabled, it
causes handlers to accept parameters from a stack frame rather than from
user scratch register state. This allows user registers to be safely
cleared in order to reduce caller influence on speculation within
syscall routines. The wrapper is a macro that emits syscall handler
symbols that call into the target handler, obtaining its parameters from
a struct pt_regs on the stack.

As registers are already saved to the stack prior to calling
system_call_exception, it appears that this function is executed more
efficiently with the new stack-pointer convention than with parameters
passed by registers, avoiding the allocation of a stack frame for this
method. On a 32-bit system, we see >20% performance increases on the
null_syscall microbenchmark, and on a Power 8 the performance gains
amortise the cost of clearing and restoring registers which is
implemented at the end of this series, seeing a final result of ~5.6%
performance improvement on null_syscall.

Syscalls are wrapped in this fashion on all platforms except for the
Cell processor, as this commit does not provide SPU support. This can be
quickly fixed in a successive patch, but requires spu_sys_callback to
allocate a pt_regs structure to satisfy the wrapped calling convention.

Co-developed-by: Andrew Donnellan
Signed-off-by: Andrew Donnellan
Signed-off-by: Rohan McLure
---
V1 -> V2: Generate prototypes for symbols produced by the wrapper.
V2 -> V3: Rebased to remove conflict with 1547db7d1f44 ("powerpc: Move
system_call_exception() to syscall.c"). Also remove copy from gpr3 save
slot on stackframe to orig_r3's slot. Fix whitespace with preprocessor
defines in system_call_exception.
--- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/interrupt.h | 3 +- arch/powerpc/include/asm/syscall.h | 4 + arch/powerpc/include/asm/syscall_wrapper.h | 84 arch/powerpc/include/asm/syscalls.h| 25 +- arch/powerpc/kernel/entry_32.S | 6 +- arch/powerpc/kernel/interrupt_64.S | 16 ++-- arch/powerpc/kernel/syscall.c | 31 +++- arch/powerpc/kernel/systbl.c | 2 + arch/powerpc/kernel/vdso.c | 2 + 10 files changed, 142 insertions(+), 32 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 4c466acdc70d..ef6c83e79c9b 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -137,6 +137,7 @@ config PPC select ARCH_HAS_STRICT_KERNEL_RWX if (PPC_BOOK3S || PPC_8xx || 40x) && !HIBERNATION select ARCH_HAS_STRICT_KERNEL_RWX if FSL_BOOKE && !HIBERNATION && !RANDOMIZE_BASE select ARCH_HAS_STRICT_MODULE_RWX if ARCH_HAS_STRICT_KERNEL_RWX + select ARCH_HAS_SYSCALL_WRAPPER if !SPU_BASE select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST select ARCH_HAS_UACCESS_FLUSHCACHE select ARCH_HAS_UBSAN_SANITIZE_ALL diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h index 8069dbc4b8d1..3f9cad81585c 100644 --- a/arch/powerpc/include/asm/interrupt.h +++ b/arch/powerpc/include/asm/interrupt.h @@ -665,8 +665,7 @@ static inline void interrupt_cond_local_irq_enable(struct pt_regs *regs) local_irq_enable(); } -long system_call_exception(long r3, long r4, long r5, long r6, long r7, long r8, - unsigned long r0, struct pt_regs *regs); +long system_call_exception(unsigned long r0, struct pt_regs *regs); notrace unsigned long syscall_exit_prepare(unsigned long r3, struct pt_regs *regs, long scv); notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs); notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs); diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h index d2a8dfd5de33..3dd36c5e334a 100644 --- a/arch/powerpc/include/asm/syscall.h +++ 
b/arch/powerpc/include/asm/syscall.h @@ -14,8 +14,12 @@ #include #include +#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER +typedef long (*syscall_fn)(const struct pt_regs *); +#else typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long); +#endif /* ftrace syscalls requires exporting the sys_call_table */ extern const syscall_fn sys_call_table[]; diff --git a/arch/powerpc/include/asm/syscall_wrapper.h b/arch/powerpc/include/asm/syscall_wrapper.h new file mode 100644 index ..91bcfa40f740 --- /dev/null +++ b/arch/powerpc/include/asm/syscall_wrapper.h @@ -0,0 +1,84 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * syscall_wrapper.h - powerpc specific wrappers to syscall definitions + * + * Based on arch/{x86,arm64}/include/asm/syscall_wrapper.h + */ + +#ifndef __ASM_SYSCALL_WRAPPER_H +#define __ASM_SYSCALL_WRAPPER_H + +stru
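The wrapped calling convention this patch introduces — handlers taking a `const struct pt_regs *` and unpacking arguments from the saved-register frame — can be sketched in userspace C. The struct name, handler, and two-argument shape below are illustrative, not the kernel's emitted code:

```c
#include <assert.h>

/* Userspace sketch of the wrapped calling convention (illustrative):
 * the emitted sys_* symbol receives only a pointer to the saved
 * register frame and unpacks its arguments from it, so the caller's
 * live registers can be zeroized before the call. */
struct pt_regs_model {
	unsigned long gpr[32];	/* saved general-purpose registers */
};

/* the handler body, with an ordinary C signature */
static long do_sample(unsigned long a, unsigned long b)
{
	return (long)(a + b);
}

/* roughly what a SYSCALL_DEFINE2-style wrapper expands to: the powerpc
 * ABI passes the first syscall arguments in r3 and r4 */
static long sys_sample(const struct pt_regs_model *regs)
{
	return do_sample(regs->gpr[3], regs->gpr[4]);
}
```

This is why the series can clear r3-r8 in the entry code: the handler no longer reads its arguments from live registers, only from the frame saved before the clear.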
[PATCH v4 12/20] Revert "powerpc/syscall: Save r3 in regs->orig_r3"
This reverts commit 8875f47b7681aa4e4484a9b612577b044725f839.

Save the caller's original r3 state to the kernel stack frame before entering
system_call_exception. This allows for user registers to be cleared by the
time system_call_exception is entered, reducing the influence of user
registers on speculation within the kernel.

Prior to this commit, orig_r3 was saved at the beginning of
system_call_exception. Instead, save orig_r3 while the user value is still
live in r3.

Also replicate this early save in 32-bit. A similar save was removed in
commit 6f76a01173cc ("powerpc/syscall: implement system call entry/exit
logic in C for PPC32") when 32-bit adopted system_call_exception. Revert
its removal of orig_r3 saves.

Signed-off-by: Rohan McLure
---
V2 -> V3: New commit.
---
 arch/powerpc/kernel/entry_32.S     | 1 +
 arch/powerpc/kernel/interrupt_64.S | 2 ++
 arch/powerpc/kernel/syscall.c      | 1 -
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 1d599df6f169..44dfce9a60c5 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -101,6 +101,7 @@ __kuep_unlock:
 	.globl	transfer_to_syscall
 transfer_to_syscall:
+	stw	r3, ORIG_GPR3(r1)
 	stw	r11, GPR1(r1)
 	stw	r11, 0(r1)
 	mflr	r12
diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
index ce25b28cf418..71d2d9497283 100644
--- a/arch/powerpc/kernel/interrupt_64.S
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -91,6 +91,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name)
 	li	r11,\trapnr
 	std	r11,_TRAP(r1)
 	std	r12,_CCR(r1)
+	std	r3,ORIG_GPR3(r1)
 	addi	r10,r1,STACK_FRAME_OVERHEAD
 	ld	r11,exception_marker@toc(r2)
 	std	r11,-16(r10)	/* "regshere" marker */
@@ -275,6 +276,7 @@ END_BTB_FLUSH_SECTION
 	std	r10,_LINK(r1)
 	std	r11,_TRAP(r1)
 	std	r12,_CCR(r1)
+	std	r3,ORIG_GPR3(r1)
 	addi	r10,r1,STACK_FRAME_OVERHEAD
 	ld	r11,exception_marker@toc(r2)
 	std	r11,-16(r10)	/* "regshere" marker */
diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
index 81ace9e8b72b..64102a64fd84 100644
--- a/arch/powerpc/kernel/syscall.c
+++ b/arch/powerpc/kernel/syscall.c
@@ -25,7 +25,6 @@ notrace long system_call_exception(long r3, long r4, long r5,
 	kuap_lock();
 	add_random_kstack_offset();
-	regs->orig_gpr3 = r3;
 	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
 		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
--
2.34.1
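The point of the early save above is that r3 does double duty: it carries the first syscall argument into the kernel and the return value back out, so the original argument survives only as the saved copy. The following is a minimal user-space sketch of that lifecycle, using hypothetical `fake_*` names (not the kernel's actual entry code):

```c
#include <assert.h>

/* Sketch only: gpr3 models r3 (argument 0 on entry, return value on
 * exit); orig_gpr3 models the copy taken at syscall entry. */
struct fake_regs {
	long gpr3;      /* clobbered by the handler's return value */
	long orig_gpr3; /* preserved copy of the original argument */
};

static long fake_handler(long arg0)
{
	return arg0 * 2;
}

static void fake_syscall_entry(struct fake_regs *regs)
{
	/* Early save, as in the patch: taken while the user value is
	 * still live, before any handler can overwrite it. */
	regs->orig_gpr3 = regs->gpr3;
	regs->gpr3 = fake_handler(regs->gpr3);
}

static long fake_restart_arg(const struct fake_regs *regs)
{
	/* A restarted syscall must re-issue with the saved copy, since
	 * the live register now holds the (error) return value. */
	return regs->orig_gpr3;
}
```

This mirrors why mechanisms such as syscall restart read `regs->orig_gpr3` rather than `regs->gpr3`.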
[PATCH v4 11/20] powerpc: Add ZEROIZE_GPRS macros for register clears
Macros for restoring and saving registers to and from the stack exist.
Provide macros with the same interface for clearing a range of gprs by
setting each register's value in that range to zero.

The resulting macros are called ZEROIZE_GPRS and ZEROIZE_NVGPRS, keeping
with the naming of the accompanying restore and save macros, and the usage
of 'zeroize' to describe this operation elsewhere in the kernel.

Signed-off-by: Rohan McLure
---
V1 -> V2: Change 'ZERO' usage in naming to 'NULLIFY', a more obvious verb.
V2 -> V3: Change 'NULLIFY' usage in naming to 'ZEROIZE', which has
precedent in the kernel and explicitly specifies that we are zeroing.
V3 -> V4: Update commit message to use 'zeroize'.
---
 arch/powerpc/include/asm/ppc_asm.h | 22 ++
 1 file changed, 22 insertions(+)

diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h
index 83c02f5a7f2a..b95689ada59c 100644
--- a/arch/powerpc/include/asm/ppc_asm.h
+++ b/arch/powerpc/include/asm/ppc_asm.h
@@ -33,6 +33,20 @@
 	.endr
 .endm
 
+/*
+ * This expands to a sequence of register clears for regs start to end
+ * inclusive, of the form:
+ *
+ *     li rN, 0
+ */
+.macro ZEROIZE_REGS start, end
+	.Lreg=\start
+	.rept (\end - \start + 1)
+	li	.Lreg, 0
+	.Lreg=.Lreg+1
+	.endr
+.endm
+
 /*
  * Macros for storing registers into and loading registers from
  * exception frames.
@@ -49,6 +63,14 @@
 #define	REST_NVGPRS(base)		REST_GPRS(13, 31, base)
 #endif
 
+#define	ZEROIZE_GPRS(start, end)	ZEROIZE_REGS start, end
+#ifdef __powerpc64__
+#define	ZEROIZE_NVGPRS()		ZEROIZE_GPRS(14, 31)
+#else
+#define	ZEROIZE_NVGPRS()		ZEROIZE_GPRS(13, 31)
+#endif
+#define	ZEROIZE_GPR(n)			ZEROIZE_GPRS(n, n)
+
 #define SAVE_GPR(n, base)		SAVE_GPRS(n, n, base)
 #define REST_GPR(n, base)		REST_GPRS(n, n, base)
--
2.34.1
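The `.rept` loop in ZEROIZE_REGS unrolls to one `li rN, 0` per register in the inclusive range. As a host-side illustration (plain C, not kernel or assembler code), the sequence it generates can be sketched like this:

```c
#include <stdio.h>
#include <string.h>

/* Host-side sketch of the instruction sequence the ZEROIZE_REGS
 * assembler macro unrolls to: one "li rN, 0" per register in the
 * inclusive range [start, end]. Function name is illustrative only. */
static int zeroize_regs_expansion(int start, int end, char out[][16])
{
	int n = 0;
	for (int reg = start; reg <= end; reg++) {
		/* mirrors: li .Lreg, 0 with .Lreg stepping by one */
		snprintf(out[n], sizeof(out[n]), "li r%d, 0", reg);
		n++;
	}
	return n;
}
```

For example, `ZEROIZE_NVGPRS()` on 64-bit corresponds to the 18 instructions produced for the range 14..31.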
[PATCH v4 09/20] powerpc: Enable compile-time check for syscall handlers
The table of syscall handlers and registered compatibility syscall
handlers has in the past been produced using assembly, with function
references resolved at link time. This moves link-time errors to
compile-time, by rewriting systbl.S in C, and including the
linux/syscalls.h, linux/compat.h and asm/syscalls.h headers for
prototypes.

Reported-by: Arnd Bergmann
Signed-off-by: Rohan McLure
---
V1 -> V2: New patch.
---
 arch/powerpc/kernel/{systbl.S => systbl.c} | 27 ++--
 1 file changed, 14 insertions(+), 13 deletions(-)

diff --git a/arch/powerpc/kernel/systbl.S b/arch/powerpc/kernel/systbl.c
similarity index 59%
rename from arch/powerpc/kernel/systbl.S
rename to arch/powerpc/kernel/systbl.c
index cb3358886203..99ffdfef6b9c 100644
--- a/arch/powerpc/kernel/systbl.S
+++ b/arch/powerpc/kernel/systbl.c
@@ -10,31 +10,32 @@
  * PPC64 updates by Dave Engebretsen (engeb...@us.ibm.com)
  */
 
-#include
+#include
+#include
+#include
+#include
 
-.section .rodata,"a"
+#define __SYSCALL_WITH_COMPAT(nr, entry, compat) __SYSCALL(nr, entry)
 
-#ifdef CONFIG_PPC64
-	.p2align	3
-#define __SYSCALL(nr, entry)	.8byte entry
+#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER
+#define __SYSCALL(nr, entry) [nr] = __powerpc_##entry,
+#define __powerpc_sys_ni_syscall	sys_ni_syscall
 #else
-#define __SYSCALL(nr, entry)	.long entry
+#define __SYSCALL(nr, entry) [nr] = entry,
 #endif
-#define __SYSCALL_WITH_COMPAT(nr, native, compat)	__SYSCALL(nr, native)
 
-.globl sys_call_table
-sys_call_table:
+void *sys_call_table[] = {
 #ifdef CONFIG_PPC64
 #include
 #else
 #include
 #endif
+};
 
 #ifdef CONFIG_COMPAT
 #undef __SYSCALL_WITH_COMPAT
 #define __SYSCALL_WITH_COMPAT(nr, native, compat)	__SYSCALL(nr, compat)
-.globl compat_sys_call_table
-compat_sys_call_table:
-#define compat_sys_sigsuspend	sys_sigsuspend
+void *compat_sys_call_table[] = {
 #include
-#endif
+};
+#endif /* CONFIG_COMPAT */
--
2.34.1
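The compile-time checking this buys comes from C's designated array initializers: every `[nr] = entry` element is now type-checked against the array's element type, which an assembly `.8byte entry` table cannot do. A minimal sketch of the technique (illustrative names, not the kernel's table):

```c
#include <assert.h>

/* Sketch: a table built with designated initializers. The compiler
 * checks each entry against fn_t at compile time; in the kernel, gaps
 * are pre-filled with sys_ni_syscall by the generating macros, whereas
 * here unnamed slots simply stay NULL. */
typedef long (*fn_t)(long);

static long ni_syscall(long a)
{
	(void)a;
	return -38; /* -ENOSYS */
}

static long succ(long a)
{
	return a + 1;
}

static const fn_t demo_table[4] = {
	[0] = ni_syscall,
	[2] = succ,	/* slots 1 and 3 remain NULL in this sketch */
};
```

Assigning, say, an `int (*)(void)` to an element would now be a compile-time diagnostic instead of a silently mistyped table slot.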
[PATCH v4 10/20] powerpc: Use common syscall handler type
Cause syscall handlers to be typed as follows when called indirectly throughout the kernel. typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long); Since both 32 and 64-bit abis allow for at least the first six machine-word length parameters to a function to be passed by registers, even handlers which admit fewer than six parameters may be viewed as having the above type. Fixup comparisons in VDSO to avoid pointer-integer comparison. Introduce explicit cast on systems with SPUs. Signed-off-by: Rohan McLure --- V1 -> V2: New patch. V2 -> V3: Remove unnecessary cast from const syscall_fn to syscall_fn --- arch/powerpc/include/asm/syscall.h | 7 +-- arch/powerpc/include/asm/syscalls.h | 1 + arch/powerpc/kernel/systbl.c| 6 +++--- arch/powerpc/kernel/vdso.c | 4 ++-- arch/powerpc/platforms/cell/spu_callbacks.c | 6 +++--- 5 files changed, 14 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h index 25fc8ad9a27a..d2a8dfd5de33 100644 --- a/arch/powerpc/include/asm/syscall.h +++ b/arch/powerpc/include/asm/syscall.h @@ -14,9 +14,12 @@ #include #include +typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, + unsigned long, unsigned long, unsigned long); + /* ftrace syscalls requires exporting the sys_call_table */ -extern const unsigned long sys_call_table[]; -extern const unsigned long compat_sys_call_table[]; +extern const syscall_fn sys_call_table[]; +extern const syscall_fn compat_sys_call_table[]; static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs) { diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 91417dee534e..e979b7593d2b 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -8,6 +8,7 @@ #include #include +#include #ifdef CONFIG_PPC64 #include #endif diff --git a/arch/powerpc/kernel/systbl.c 
b/arch/powerpc/kernel/systbl.c index 99ffdfef6b9c..b88a9c2a1f50 100644 --- a/arch/powerpc/kernel/systbl.c +++ b/arch/powerpc/kernel/systbl.c @@ -21,10 +21,10 @@ #define __SYSCALL(nr, entry) [nr] = __powerpc_##entry, #define __powerpc_sys_ni_syscall sys_ni_syscall #else -#define __SYSCALL(nr, entry) [nr] = entry, +#define __SYSCALL(nr, entry) [nr] = (void *) entry, #endif -void *sys_call_table[] = { +const syscall_fn sys_call_table[] = { #ifdef CONFIG_PPC64 #include #else @@ -35,7 +35,7 @@ void *sys_call_table[] = { #ifdef CONFIG_COMPAT #undef __SYSCALL_WITH_COMPAT #define __SYSCALL_WITH_COMPAT(nr, native, compat) __SYSCALL(nr, compat) -void *compat_sys_call_table[] = { +const syscall_fn compat_sys_call_table[] = { #include }; #endif /* CONFIG_COMPAT */ diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 0da287544054..080c9e7de0c0 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -313,10 +313,10 @@ static void __init vdso_setup_syscall_map(void) unsigned int i; for (i = 0; i < NR_syscalls; i++) { - if (sys_call_table[i] != (unsigned long)&sys_ni_syscall) + if (sys_call_table[i] != (void *)&sys_ni_syscall) vdso_data->syscall_map[i >> 5] |= 0x8000UL >> (i & 0x1f); if (IS_ENABLED(CONFIG_COMPAT) && - compat_sys_call_table[i] != (unsigned long)&sys_ni_syscall) + compat_sys_call_table[i] != (void *)&sys_ni_syscall) vdso_data->compat_syscall_map[i >> 5] |= 0x8000UL >> (i & 0x1f); } } diff --git a/arch/powerpc/platforms/cell/spu_callbacks.c b/arch/powerpc/platforms/cell/spu_callbacks.c index fe0d8797a00a..e780c14c5733 100644 --- a/arch/powerpc/platforms/cell/spu_callbacks.c +++ b/arch/powerpc/platforms/cell/spu_callbacks.c @@ -34,15 +34,15 @@ * mbind, mq_open, ipc, ... 
*/ -static void *spu_syscall_table[] = { +static const syscall_fn spu_syscall_table[] = { #define __SYSCALL_WITH_COMPAT(nr, entry, compat) __SYSCALL(nr, entry) -#define __SYSCALL(nr, entry) [nr] = entry, +#define __SYSCALL(nr, entry) [nr] = (void *) entry, #include }; long spu_sys_callback(struct spu_syscall_block *s) { - long (*syscall)(u64 a1, u64 a2, u64 a3, u64 a4, u64 a5, u64 a6); + syscall_fn syscall; if (s->nr_ret >= ARRAY_SIZE(spu_syscall_table)) { pr_debug("%s: invalid syscall #%lld", __func__, s->nr_ret); -- 2.34.1
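The commit message's point that handlers with fewer than six parameters "may be viewed as having the above type" rests on the ABI: argument registers beyond those a callee reads are simply ignored. A sketch of the casting technique the patch applies in systbl.c and spu_callbacks.c (this relies on the calling convention, exactly as the kernel's casts do; it is not portable ISO C, and the handler names are invented):

```c
#include <assert.h>

/* The uniform six-argument handler type from asm/syscall.h. */
typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long,
			   unsigned long, unsigned long, unsigned long);

static long getpid_like(void)
{
	return 1234;		/* ignores all argument registers */
}

static long dup_like(unsigned long fd)
{
	return (long)fd;	/* reads only the first argument */
}

/* Both handlers share one table under the common type, via the same
 * (void *)/function-pointer casts the patch introduces. */
static const syscall_fn demo_table[] = {
	(syscall_fn)getpid_like,
	(syscall_fn)dup_like,
};
```

Calling `demo_table[i](a0, a1, a2, a3, a4, a5)` then works for either entry on ABIs that pass the first six machine-word arguments in registers.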
[PATCH v4 00/20] powerpc: Syscall wrapper and register clearing
V3 available here:

Link: https://lore.kernel.org/all/4c3a8815-67ff-41eb-a703-981920ca1...@linux.ibm.com/T/

Implement a syscall wrapper, causing arguments to handlers to be passed
via a struct pt_regs on the stack. The syscall wrapper is implemented for
all platforms other than the Cell processor, on which SPUs expect the
ability to directly call syscall handler symbols with the regular
in-register calling convention.

Adopting syscall wrappers requires redefinition of architecture-specific
syscalls and compatibility syscalls to use the SYSCALL_DEFINE and
COMPAT_SYSCALL_DEFINE macros, as well as removal of direct references to
the emitted syscall-handler symbols from within the kernel. This work led
to the following modernisations of powerpc's syscall handlers:

 - Replace syscall 82 semantics with sys_old_select and remove the
   ppc_select handler, which features direct calls to both sys_old_select
   and sys_select.
 - Use a generic fallocate compatibility syscall.
 - Replace the asm implementation of the syscall table with a C
   implementation for more compile-time checks.

Many compatibility syscalls are candidates to be removed in favour of
generically defined handlers, but exhibit different parameter orderings
and numberings due to 32-bit ABI support for 64-bit parameters. The
parameter reorderings are however consistent with arm. A future patch
series will serve to modernise syscalls by providing generic
implementations featuring these reorderings.

The design of this syscall wrapper is very similar to the s390, x86 and
arm64 implementations. See also commit 4378a7d4be30 ("arm64: implement
syscall wrappers").

The motivation for this change is that it allows for the clearing of
register state when entering the kernel through interrupt handlers on
64-bit servers. This serves to reduce the influence of values in
registers carried over from the interrupted process, e.g. syscall
parameters from user space, or user state at the site of a pagefault.
All values in registers are saved and zeroized at the entry to an
interrupt handler and restored afterward. While this may sound like a
heavy-weight mitigation, many gprs are already saved and restored on
handling of an interrupt, and the mmap_bench benchmark on a Power 9
guest, repeatedly invoking the pagefault handler, suggests at most a
~0.8% regression in performance. Realistic workloads are not constantly
producing interrupts, and so this does not indicate a realistic slowdown.

Using wrapped syscalls yields a performance improvement of ~5.6% on the
null_syscall benchmark on pseries guests, by removing the need for
system_call_exception to allocate its own stack frame. This amortises the
additional costs of saving and restoring non-volatile registers (register
clearing is cheap on superscalar platforms), and so the final mitigation
actually yields a net performance improvement of ~0.6% on the
null_syscall benchmark.

Patch Changelog:

 - Fix instances where NULLIFY_GPRS were still present.
 - Minimise unrecoverable windows in entry_32.S between SRR0/1 restores
   and RFI.
 - Remove all references to syscall symbols prior to introducing the
   syscall wrapper.
 - Remove unnecessary duplication of syscall handlers with sys_... and
   powerpc_sys_... symbols.
 - Clear non-volatile registers on Book3E systems, as some of these
   systems feature hardware speculation, and we already unconditionally
   restore NVGPRS.
Rohan McLure (20): powerpc: Remove asmlinkage from syscall handler definitions powerpc: Use generic fallocate compatibility syscall powerpc/32: Remove powerpc select specialisation powerpc: Provide do_ppc64_personality helper powerpc: Remove direct call to personality syscall handler powerpc: Remove direct call to mmap2 syscall handlers powerpc: Adopt SYSCALL_DEFINE for arch-specific syscall handlers powerpc: Include all arch-specific syscall prototypes powerpc: Enable compile-time check for syscall handlers powerpc: Use common syscall handler type powerpc: Add ZEROIZE_GPRS macros for register clears Revert "powerpc/syscall: Save r3 in regs->orig_r3" powerpc: Provide syscall wrapper powerpc/64s: Clear/restore caller gprs in syscall interrupt/return powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlers powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.S powerpc/64e: Clarify register saves and clears with {SAVE,ZEROIZE}_GPRS powerpc/64s: Fix comment on interrupt handler prologue powerpc/64s: Clear gprs on interrupt routine entry in Book3S powerpc/64e: Clear gprs on interrupt routine entry arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/compat.h| 5 + arch/powerpc/include/asm/interrupt.h | 3 +- arch/powerpc/include/asm/ppc_asm.h | 22 +++ arch/powerpc/include/asm/syscall.h | 11 +- arch/powerpc/include/asm/syscall_wrapper.h | 84 +++ arch/powerpc/include/asm/syscall
[PATCH v4 08/20] powerpc: Include all arch-specific syscall prototypes
Forward declare all syscall handler prototypes where a generic prototype is not provided in either linux/syscalls.h or linux/compat.h in asm/syscalls.h. This is required for compile-time type-checking for syscall handlers, which is implemented later in this series. 32-bit compatibility syscall handlers are expressed in terms of types in ppc32.h. Expose this header globally. Signed-off-by: Rohan McLure --- V1 -> V2: Explicitly include prototypes. V2 -> V3: Remove extraneous #include and ppc_fallocate prototype. Rename header. --- arch/powerpc/include/asm/syscalls.h | 90 +- .../ppc32.h => include/asm/syscalls_32.h}| 0 arch/powerpc/kernel/signal_32.c | 2 +- arch/powerpc/perf/callchain_32.c | 2 +- 4 files changed, 70 insertions(+), 24 deletions(-) diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 3e3aff0835a6..91417dee534e 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -8,45 +8,91 @@ #include #include +#ifdef CONFIG_PPC64 +#include +#endif +#include +#include + struct rtas_args; +#ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER + +/* + * PowerPC architecture-specific syscalls + */ + +long sys_rtas(struct rtas_args __user *uargs); +long sys_ni_syscall(void); + +#ifdef CONFIG_PPC64 +long sys_ppc64_personality(unsigned long personality); +#ifdef CONFIG_COMPAT +long compat_sys_ppc64_personality(unsigned long personality); +#endif /* CONFIG_COMPAT */ +#endif /* CONFIG_PPC64 */ + +/* Parameters are reordered for powerpc to avoid padding */ +long sys_ppc_fadvise64_64(int fd, int advice, + u32 offset_high, u32 offset_low, + u32 len_high, u32 len_low); +long sys_swapcontext(struct ucontext __user *old_ctx, +struct ucontext __user *new_ctx, long ctx_size); long sys_mmap(unsigned long addr, size_t len, unsigned long prot, unsigned long flags, unsigned long fd, off_t offset); long sys_mmap2(unsigned long addr, size_t len, unsigned long prot, unsigned long flags, unsigned long fd, unsigned long pgoff); 
-long sys_ppc64_personality(unsigned long personality); -long sys_rtas(struct rtas_args __user *uargs); -long sys_ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, - u32 len_high, u32 len_low); +long sys_switch_endian(void); -#ifdef CONFIG_COMPAT -unsigned long compat_sys_mmap2(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, unsigned long pgoff); - -compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count, - u32 reg6, u32 pos1, u32 pos2); +#ifdef CONFIG_PPC32 +long sys_sigreturn(void); +long sys_debug_setcontext(struct ucontext __user *ctx, int ndbg, + struct sig_dbg_op __user *dbg); +#endif -compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, compat_size_t count, - u32 reg6, u32 pos1, u32 pos2); +long sys_rt_sigreturn(void); -compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u32 count); +long sys_subpage_prot(unsigned long addr, + unsigned long len, u32 __user *map); -int compat_sys_truncate64(const char __user *path, u32 reg4, - unsigned long len1, unsigned long len2); +#ifdef CONFIG_COMPAT +long compat_sys_swapcontext(struct ucontext32 __user *old_ctx, + struct ucontext32 __user *new_ctx, + int ctx_size); +long compat_sys_old_getrlimit(unsigned int resource, + struct compat_rlimit __user *rlim); +long compat_sys_sigreturn(void); +long compat_sys_rt_sigreturn(void); -int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, - unsigned long len2); +/* Architecture-specific implementations in sys_ppc32.c */ +long compat_sys_mmap2(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, unsigned long pgoff); +long compat_sys_ppc_pread64(unsigned int fd, + char __user *ubuf, compat_size_t count, + u32 reg6, u32 pos1, u32 pos2); +long compat_sys_ppc_pwrite64(unsigned int fd, +const char __user *ubuf, compat_size_t count, +u32 reg6, u32 pos1, u32 pos2); +long 
compat_sys_ppc_readahead(int fd, u32 r4, + u32 offset1, u32 offset2, u32 count); +long compat_sys_ppc_truncate64(const char __user *path, u32 reg4, + unsigned long len1, unsigned long len2); +long compat_sys_ppc_ftruncate64(unsigned int fd,
[PATCH v4 07/20] powerpc: Adopt SYSCALL_DEFINE for arch-specific syscall handlers
Arch-specific implementations of syscall handlers are currently used over
generic implementations for the following reasons:

1. Semantics unique to powerpc.
2. Compatibility syscalls require 'argument padding' to comply with the
   64-bit argument convention in the ELF32 ABI.
3. Parameter types or order differ from those of other architectures.

These syscall handlers have been defined prior to this patch series
without invoking the SYSCALL_DEFINE or COMPAT_SYSCALL_DEFINE macros, with
custom input and output types. We remove every such direct definition in
favour of the aforementioned macros.

Also update syscalls.tbl in order to refer to the symbol names generated
by each of these macros.

Since ppc64_personality can be called by both 64-bit and 32-bit binaries
through compatibility, we must generate both compat_sys_ and sys_ symbols
for this handler.

A number of architectures including arm and powerpc agree on an
alternative argument order and numbering for most of these arch-specific
handlers. A future patch series may allow for asm/unistd.h to signal
through its defines that a generic implementation of these syscall
handlers with the correct calling convention be omitted, through the
__ARCH_WANT_COMPAT_SYS_... convention.

Signed-off-by: Rohan McLure
---
V1 -> V2: All syscall handlers wrapped by this macro.
V2 -> V3: Move creation of do_ppc64_personality helper to prior patch.
V3 -> V4: Fix parenthesis alignment. Don't emit sys_*** symbols.
--- arch/powerpc/include/asm/syscalls.h | 10 ++-- arch/powerpc/kernel/sys_ppc32.c | 45 ++ arch/powerpc/kernel/syscalls.c | 17 +-- arch/powerpc/kernel/syscalls/syscall.tbl | 22 - .../arch/powerpc/entry/syscalls/syscall.tbl | 22 - 5 files changed, 64 insertions(+), 52 deletions(-) diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 739498c358a1..3e3aff0835a6 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -16,10 +16,10 @@ long sys_mmap(unsigned long addr, size_t len, long sys_mmap2(unsigned long addr, size_t len, unsigned long prot, unsigned long flags, unsigned long fd, unsigned long pgoff); -long ppc64_personality(unsigned long personality); +long sys_ppc64_personality(unsigned long personality); long sys_rtas(struct rtas_args __user *uargs); -long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, - u32 len_high, u32 len_low); +long sys_ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, + u32 len_high, u32 len_low); #ifdef CONFIG_COMPAT unsigned long compat_sys_mmap2(unsigned long addr, size_t len, @@ -40,8 +40,8 @@ int compat_sys_truncate64(const char __user *path, u32 reg4, int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, unsigned long len2); -long ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, -size_t len, int advice); +long compat_sys_ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, + size_t len, int advice); long compat_sys_sync_file_range2(int fd, unsigned int flags, unsigned int offset1, unsigned int offset2, diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c index bc6491ed6454..dd9039671227 100644 --- a/arch/powerpc/kernel/sys_ppc32.c +++ b/arch/powerpc/kernel/sys_ppc32.c @@ -59,52 +59,55 @@ #define merge_64(high, low) ((u64)high << 32) | low #endif -compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count, -u32 reg6, u32 
pos1, u32 pos2) +COMPAT_SYSCALL_DEFINE6(ppc_pread64, + unsigned int, fd, + char __user *, ubuf, compat_size_t, count, + u32, reg6, u32, pos1, u32, pos2) { return ksys_pread64(fd, ubuf, count, merge_64(pos1, pos2)); } -compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, compat_size_t count, - u32 reg6, u32 pos1, u32 pos2) +COMPAT_SYSCALL_DEFINE6(ppc_pwrite64, + unsigned int, fd, + const char __user *, ubuf, compat_size_t, count, + u32, reg6, u32, pos1, u32, pos2) { return ksys_pwrite64(fd, ubuf, count, merge_64(pos1, pos2)); } -compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u32 count) +COMPAT_SYSCALL_DEFINE5(ppc_readahead, + int, fd, u32, r4, + u32, offset1, u32, offset2, u32, count) { return ksys_readahead(fd, merge_64(offset1, offset2), count); } -int compat_sys_truncate64(const char __user * path, u32 reg4, - unsigned long len1, unsigned long len2) +COMPAT_
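The "long long munging" these COMPAT_SYSCALL_DEFINE6 handlers perform all funnels through the `merge_64` macro shown above: the 32-bit ABI delivers a 64-bit argument as a high word and a low word in consecutive registers. A self-contained sketch of the big-endian form from the patch (a cast on `low` is added here so plain `int` arguments don't sign-extend; the kernel version receives `u32` parameters, which achieves the same):

```c
#include <stdint.h>

/* Reassemble a 64-bit value from the high/low word pair a 32-bit
 * caller passes in an (odd, even) register pair, as sys_ppc32.c's
 * merge_64 does for pread64, pwrite64, readahead, truncate64, etc. */
#define merge_64(high, low) (((uint64_t)(high) << 32) | (uint32_t)(low))
```

So a compat `pread64(fd, buf, count, pad, pos1, pos2)` becomes `ksys_pread64(fd, buf, count, merge_64(pos1, pos2))`, with `pad` (reg6) ignored.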
[PATCH v4 06/20] powerpc: Remove direct call to mmap2 syscall handlers
Syscall handlers should not be invoked internally by their symbol names,
as these symbols are defined by the architecture-defined SYSCALL_DEFINE
macro. Move the compatibility syscall definition for mmap2 to syscalls.c,
so that all mmap implementations can share an inline helper function, as
is done with the personality handlers.

Signed-off-by: Rohan McLure
---
V1 -> V2: Move mmap2 compat implementation to asm/kernel/syscalls.c.
V3 -> V4: Move to be applied before syscall wrapper introduced.
---
 arch/powerpc/kernel/sys_ppc32.c |  9 -
 arch/powerpc/kernel/syscalls.c  | 11 +++
 2 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c
index f4edcc9489fb..bc6491ed6454 100644
--- a/arch/powerpc/kernel/sys_ppc32.c
+++ b/arch/powerpc/kernel/sys_ppc32.c
@@ -25,7 +25,6 @@
 #include
 #include
 #include
-#include
 #include
 #include
 #include
@@ -48,14 +47,6 @@
 #include
 #include
 
-unsigned long compat_sys_mmap2(unsigned long addr, size_t len,
-			unsigned long prot, unsigned long flags,
-			unsigned long fd, unsigned long pgoff)
-{
-	/* This should remain 12 even if PAGE_SIZE changes */
-	return sys_mmap(addr, len, prot, flags, fd, pgoff << 12);
-}
-
 /*
  * long long munging:
  * The 32 bit ABI passes long longs in an odd even register pair.
diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c index b8461128c8f7..32fadf3c2cd3 100644 --- a/arch/powerpc/kernel/syscalls.c +++ b/arch/powerpc/kernel/syscalls.c @@ -56,6 +56,17 @@ SYSCALL_DEFINE6(mmap2, unsigned long, addr, size_t, len, return do_mmap2(addr, len, prot, flags, fd, pgoff, PAGE_SHIFT-12); } +#ifdef CONFIG_COMPAT +COMPAT_SYSCALL_DEFINE6(mmap2, + unsigned long, addr, size_t, len, + unsigned long, prot, unsigned long, flags, + unsigned long, fd, unsigned long, pgoff) +{ + /* This should remain 12 even if PAGE_SIZE changes */ + return do_mmap2(addr, len, prot, flags, fd, pgoff << 12, PAGE_SHIFT-12); +} +#endif + SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len, unsigned long, prot, unsigned long, flags, unsigned long, fd, off_t, offset) -- 2.34.1
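The shared helper's job is unit conversion: each entry point hands `do_mmap2` an offset in its own granularity (bytes for mmap, 4K units for mmap2) plus a shift that converts it to page units, rejecting misaligned offsets. A simplified sketch of just that arithmetic (the helper name and the 64K `PAGE_SHIFT = 16` in the tests are assumptions for illustration, not the kernel code):

```c
/* Sketch of do_mmap2's offset handling: `off` must be a multiple of
 * 2^shift; the result is the offset in page units. Returns -EINVAL
 * (-22) for a misaligned offset, as the real helper does. */
static long shift_off_to_pgoff(unsigned long off, int shift)
{
	if (off & ((1UL << shift) - 1))
		return -22; /* -EINVAL */
	return (long)(off >> shift);
}
```

With 64K pages, a byte offset from mmap uses `shift = 16`, while a 4K-unit offset from mmap2 uses `shift = 16 - 12 = 4`; both land on the same page offset.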
[PATCH v4 05/20] powerpc: Remove direct call to personality syscall handler
Syscall handlers should not be invoked internally by their symbol names,
as these symbols are defined by the architecture-defined SYSCALL_DEFINE
macro. Fortunately, in the case of ppc64_personality, its call to
sys_personality can be replaced with an invocation of the equivalent
ksys_personality inline helper in .

Signed-off-by: Rohan McLure
---
V1 -> V2: Use inline helper to deduplicate bodies in compat/regular
implementations.
V3 -> V4: Move to be applied before syscall wrapper.
---
 arch/powerpc/kernel/syscalls.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index 9f29e451e2de..b8461128c8f7 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -71,7 +71,7 @@ static inline long do_ppc64_personality(unsigned long personality)
 	if (personality(current->personality) == PER_LINUX32 &&
 	    personality(personality) == PER_LINUX)
 		personality = (personality & ~PER_MASK) | PER_LINUX32;
-	ret = sys_personality(personality);
+	ret = ksys_personality(personality);
 	if (personality(ret) == PER_LINUX32)
 		ret = (ret & ~PER_MASK) | PER_LINUX;
 	return ret;
 }
--
2.34.1
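The translation wrapped by this helper is worth making concrete: a PER_LINUX32 task asking for PER_LINUX really wants PER_LINUX32, but must always be told it is plain PER_LINUX. Below is a self-contained sketch with the relevant constants inlined (values as in `<linux/personality.h>`); `current_personality` and the `_stub`/`_sketch` helpers are simplified stand-ins, not kernel code:

```c
#define PER_MASK	0x00ffUL
#define PER_LINUX	0x0000UL
#define PER_LINUX32	0x0008UL
#define personality(p)	((p) & PER_MASK)

static unsigned long current_personality; /* stand-in for current->personality */

static long ksys_personality_stub(unsigned long p)
{
	unsigned long old = current_personality;
	current_personality = p;
	return (long)old; /* personality() returns the previous value */
}

static long do_ppc64_personality_sketch(unsigned long p)
{
	long ret;

	/* A PER_LINUX32 task requesting PER_LINUX keeps PER_LINUX32... */
	if (personality(current_personality) == PER_LINUX32 &&
	    personality(p) == PER_LINUX)
		p = (p & ~PER_MASK) | PER_LINUX32;
	ret = ksys_personality_stub(p);
	/* ...but is reported back as PER_LINUX. */
	if (personality(ret) == PER_LINUX32)
		ret = (ret & ~PER_MASK) | PER_LINUX;
	return ret;
}
```

This is the body both the native and compat definitions can now share.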
[PATCH v4 03/20] powerpc/32: Remove powerpc select specialisation
Syscall #82 has been implemented for 32-bit platforms in a unique way on
powerpc systems. This hack will in effect guess whether the caller is
expecting new select semantics or old select semantics, based on the
value of its first parameter. In new select, this parameter represents
the length of a user-memory array of file descriptors, and in old select
it is a pointer to an arguments structure. The heuristic simply
interprets sufficiently large values of the first parameter as calls to
old select.

The following is a discussion on how this syscall should be handled.

Link: https://lore.kernel.org/lkml/13737de5-0eb7-e881-9af0-163b0d29a...@csgroup.eu/

As discussed in this thread, the existence of such a hack suggests that
whatever powerpc binaries may predate glibc would most likely have made
use of the old select semantics. x86 and arm64 both implement this
syscall with old select semantics.

Remove the powerpc implementation, and update syscall.tbl to emit a
reference to sys_old_select for 32-bit binaries, in keeping with how
other architectures support syscall #82.

Signed-off-by: Rohan McLure
---
V1 -> V2: Remove arch-specific select handler.
V2 -> V3: Remove ppc_old_select prototype in .
Move to earlier in patch series --- arch/powerpc/include/asm/syscalls.h | 2 -- arch/powerpc/kernel/syscalls.c| 17 - arch/powerpc/kernel/syscalls/syscall.tbl | 2 +- .../arch/powerpc/entry/syscalls/syscall.tbl | 2 +- 4 files changed, 2 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 675a8f5ec3ca..739498c358a1 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -18,8 +18,6 @@ long sys_mmap2(unsigned long addr, size_t len, unsigned long fd, unsigned long pgoff); long ppc64_personality(unsigned long personality); long sys_rtas(struct rtas_args __user *uargs); -int ppc_select(int n, fd_set __user *inp, fd_set __user *outp, - fd_set __user *exp, struct __kernel_old_timeval __user *tvp); long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, u32 len_high, u32 len_low); diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c index fc999140bc27..ef5896bee818 100644 --- a/arch/powerpc/kernel/syscalls.c +++ b/arch/powerpc/kernel/syscalls.c @@ -63,23 +63,6 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len, return do_mmap2(addr, len, prot, flags, fd, offset, PAGE_SHIFT); } -#ifdef CONFIG_PPC32 -/* - * Due to some executables calling the wrong select we sometimes - * get wrong args. This determines how the args are being passed - * (a single ptr to them all args passed) then calls - * sys_select() with the appropriate args. 
-- Cort - */ -int -ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct __kernel_old_timeval __user *tvp) -{ - if ((unsigned long)n >= 4096) - return sys_old_select((void __user *)n); - - return sys_select(n, inp, outp, exp, tvp); -} -#endif - #ifdef CONFIG_PPC64 long ppc64_personality(unsigned long personality) { diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl index 2600b4237292..4cbbb810ae10 100644 --- a/arch/powerpc/kernel/syscalls/syscall.tbl +++ b/arch/powerpc/kernel/syscalls/syscall.tbl @@ -110,7 +110,7 @@ 79 common settimeofdaysys_settimeofday compat_sys_settimeofday 80 common getgroups sys_getgroups 81 common setgroups sys_setgroups -82 32 select ppc_select sys_ni_syscall +82 32 select sys_old_select sys_ni_syscall 82 64 select sys_ni_syscall 82 spu select sys_ni_syscall 83 common symlink sys_symlink diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl index 2600b4237292..4cbbb810ae10 100644 --- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl +++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl @@ -110,7 +110,7 @@ 79 common settimeofdaysys_settimeofday compat_sys_settimeofday 80 common getgroups sys_getgroups 81 common setgroups sys_setgroups -82 32 select ppc_select sys_ni_syscall +82 32 select sys_old_select sys_ni_syscall 82 64 select sys_ni_syscall 82 spu select sys_ni_syscall 83 common
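The removed heuristic is small enough to show in isolation: a first argument of 4096 or more cannot plausibly be an fd count, so ppc_select treated it as a user pointer to an old-select argument block (user pointers never point into the first page). A sketch, with an invented predicate name:

```c
/* The guess the removed ppc_select handler made: values >= 4096 are
 * interpreted as a pointer to an old_select arguments structure,
 * smaller values as a new-select fd count. */
static int looks_like_old_select(unsigned long first_arg)
{
	return first_arg >= 4096;
}
```

Any binary passing a genuine fd count of 4096 or more would have been misrouted, which is part of why the hack was removed in favour of sys_old_select.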
[PATCH v4 04/20] powerpc: Provide do_ppc64_personality helper
Avoid duplication in future patch that will define the ppc64_personality syscall handler in terms of the SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE macros, by extracting the common body of ppc64_personality into a helper function. Signed-off-by: Rohan McLure --- V2 -> V3: New commit. --- arch/powerpc/kernel/syscalls.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c index ef5896bee818..9f29e451e2de 100644 --- a/arch/powerpc/kernel/syscalls.c +++ b/arch/powerpc/kernel/syscalls.c @@ -64,7 +64,7 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len, } #ifdef CONFIG_PPC64 -long ppc64_personality(unsigned long personality) +static inline long do_ppc64_personality(unsigned long personality) { long ret; @@ -76,6 +76,10 @@ long ppc64_personality(unsigned long personality) ret = (ret & ~PER_MASK) | PER_LINUX; return ret; } +long ppc64_personality(unsigned long personality) +{ + return do_ppc64_personality(personality); +} #endif long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, -- 2.34.1
[PATCH v4 01/20] powerpc: Remove asmlinkage from syscall handler definitions
The asmlinkage macro has no special meaning in powerpc, and prior to this patch is used sporadically on some syscall handler definitions. On architectures that do not define asmlinkage, it resolves to extern "C" for C++ compilers and a nop otherwise. The current invocations of asmlinkage provide far from complete support for C++ toolchains, and so the macro serves no purpose in powerpc. Remove all invocations of asmlinkage in arch/powerpc. These incidentally only occur in syscall definitions and prototypes. Signed-off-by: Rohan McLure --- V2 -> V3: new patch --- arch/powerpc/include/asm/syscalls.h | 16 arch/powerpc/kernel/sys_ppc32.c | 8 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index a2b13e55254f..21c2faaa2957 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -10,14 +10,14 @@ struct rtas_args; -asmlinkage long sys_mmap(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, off_t offset); -asmlinkage long sys_mmap2(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, unsigned long pgoff); -asmlinkage long ppc64_personality(unsigned long personality); -asmlinkage long sys_rtas(struct rtas_args __user *uargs); +long sys_mmap(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, off_t offset); +long sys_mmap2(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, unsigned long pgoff); +long ppc64_personality(unsigned long personality); +long sys_rtas(struct rtas_args __user *uargs); int ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct __kernel_old_timeval __user *tvp); long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c index 
16ff0399a257..f4edcc9489fb 100644 --- a/arch/powerpc/kernel/sys_ppc32.c +++ b/arch/powerpc/kernel/sys_ppc32.c @@ -85,20 +85,20 @@ compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u3 return ksys_readahead(fd, merge_64(offset1, offset2), count); } -asmlinkage int compat_sys_truncate64(const char __user * path, u32 reg4, +int compat_sys_truncate64(const char __user * path, u32 reg4, unsigned long len1, unsigned long len2) { return ksys_truncate(path, merge_64(len1, len2)); } -asmlinkage long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, +long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, u32 len1, u32 len2) { return ksys_fallocate(fd, mode, ((loff_t)offset1 << 32) | offset2, merge_64(len1, len2)); } -asmlinkage int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, +int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, unsigned long len2) { return ksys_ftruncate(fd, merge_64(len1, len2)); @@ -111,7 +111,7 @@ long ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, advice); } -asmlinkage long compat_sys_sync_file_range2(int fd, unsigned int flags, +long compat_sys_sync_file_range2(int fd, unsigned int flags, unsigned offset1, unsigned offset2, unsigned nbytes1, unsigned nbytes2) { -- 2.34.1
[PATCH v4 02/20] powerpc: Use generic fallocate compatibility syscall
The powerpc fallocate compat syscall handler is identical to the generic implementation provided by commit 59c10c52f573f ("riscv: compat: syscall: Add compat_sys_call_table implementation"), and as such can be removed in favour of the generic implementation. A future patch series will replace more architecture-defined syscall handlers with generic implementations, dependent on introducing generic implementations that are compatible with powerpc and arm's parameter reorderings. Reported-by: Arnd Bergmann Signed-off-by: Rohan McLure --- V1 -> V2: Remove arch-specific fallocate handler. V2 -> V3: Remove generic fallocate prototype. Move to beginning of series. --- arch/powerpc/include/asm/compat.h | 5 + arch/powerpc/include/asm/syscalls.h | 2 -- arch/powerpc/include/asm/unistd.h | 1 + 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/compat.h b/arch/powerpc/include/asm/compat.h index dda4091fd012..f20caae3f019 100644 --- a/arch/powerpc/include/asm/compat.h +++ b/arch/powerpc/include/asm/compat.h @@ -16,6 +16,11 @@ typedef u16 compat_ipc_pid_t; #include #ifdef __BIG_ENDIAN__ +#define compat_arg_u64(name) u32 name##_hi, u32 name##_lo +#define compat_arg_u64_dual(name) u32, name##_hi, u32, name##_lo +#define compat_arg_u64_glue(name) (((u64)name##_lo & 0xffffffffUL) | \ +((u64)name##_hi << 32)) + #define COMPAT_UTS_MACHINE "ppc\0\0" #else #define COMPAT_UTS_MACHINE "ppcle\0\0" diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 21c2faaa2957..675a8f5ec3ca 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -39,8 +39,6 @@ compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u3 int compat_sys_truncate64(const char __user *path, u32 reg4, unsigned long len1, unsigned long len2); -long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, u32 len1, u32 len2); - int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long 
len1, unsigned long len2); diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h index b1129b4ef57d..659a996c75aa 100644 --- a/arch/powerpc/include/asm/unistd.h +++ b/arch/powerpc/include/asm/unistd.h @@ -45,6 +45,7 @@ #define __ARCH_WANT_SYS_UTIME #define __ARCH_WANT_SYS_NEWFSTATAT #define __ARCH_WANT_COMPAT_STAT +#define __ARCH_WANT_COMPAT_FALLOCATE #define __ARCH_WANT_COMPAT_SYS_SENDFILE #endif #define __ARCH_WANT_SYS_FORK -- 2.34.1
Re: [PATCH v3 18/18] powerpc/64s: Clear gprs on interrupt routine entry
> What about arch/powerpc/kernel/exceptions-64e.S, no change required
> inside it ? As interrupt_64.S applies to both 64s and 64e, I would have
> expected changes in exceptions-64e too.

As it stands, the changes in interrupt_64.S cause non-volatiles to be unconditionally restored. This may lead to a performance regression on Book3E, as previously interrupt_return_srr would restore non-volatiles only after handling a signal, otherwise assuming nvgprs to be intact. As some Book3E systems do feature speculation, it makes sense to perform the same mitigation on these systems as on Book3S systems.
Re: [PATCH v3 14/18] powerpc/64s: Clear/restore caller gprs in syscall interrupt/return
> On 19 Aug 2022, at 4:52 pm, Christophe Leroy > wrote: > > > > Le 19/08/2022 à 05:38, Rohan McLure a écrit : >> Clear user state in gprs (assign to zero) to reduce the influence of user >> registers on speculation within kernel syscall handlers. Clears occur >> at the very beginning of the sc and scv 0 interrupt handlers, with >> restores occurring following the execution of the syscall handler. >> >> Signed-off-by: Rohan McLure >> --- >> V1 -> V2: Update summary >> V2 -> V3: Remove erroneous summary paragraph on syscall_exit_prepare >> --- >> arch/powerpc/kernel/interrupt_64.S | 22 ++ >> 1 file changed, 18 insertions(+), 4 deletions(-) >> >> diff --git a/arch/powerpc/kernel/interrupt_64.S >> b/arch/powerpc/kernel/interrupt_64.S >> index 0178aeba3820..d9625113c7a5 100644 >> --- a/arch/powerpc/kernel/interrupt_64.S >> +++ b/arch/powerpc/kernel/interrupt_64.S >> @@ -70,7 +70,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name) >> ld r2,PACATOC(r13) >> mfcrr12 >> li r11,0 >> -/* Can we avoid saving r3-r8 in common case? */ >> +/* Save syscall parameters in r3-r8 */ >> std r3,GPR3(r1) >> std r4,GPR4(r1) >> std r5,GPR5(r1) >> @@ -109,6 +109,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >> * but this is the best we can do. >> */ >> >> +/* >> + * Zero user registers to prevent influencing speculative execution >> + * state of kernel code. >> + */ >> +NULLIFY_GPRS(5, 12) > > Macro name has changed. > >> +NULLIFY_NVGPRS() > > Why clearing non volatile GPRs ? They are supposed to be callee saved so > I can't see how they could be used for speculation. Do you have any > exemple ? 
> >> + >> /* Calling convention has r3 = orig r0, r4 = regs */ >> mr r3,r0 >> bl system_call_exception >> @@ -139,6 +146,7 @@ BEGIN_FTR_SECTION >> HMT_MEDIUM_LOW >> END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >> >> +REST_NVGPRS(r1) > > > >> cmpdi r3,0 >> bne .Lsyscall_vectored_\name\()_restore_regs >> >> @@ -181,7 +189,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >> ld r4,_LINK(r1) >> ld r5,_XER(r1) >> >> -REST_NVGPRS(r1) >> ld r0,GPR0(r1) >> mtcrr2 >> mtctr r3 >> @@ -249,7 +256,7 @@ END_BTB_FLUSH_SECTION >> ld r2,PACATOC(r13) >> mfcrr12 >> li r11,0 >> -/* Can we avoid saving r3-r8 in common case? */ >> +/* Save syscall parameters in r3-r8 */ >> std r3,GPR3(r1) >> std r4,GPR4(r1) >> std r5,GPR5(r1) >> @@ -300,6 +307,13 @@ END_BTB_FLUSH_SECTION >> wrteei 1 >> #endif >> >> +/* >> + * Zero user registers to prevent influencing speculative execution >> + * state of kernel code. >> + */ >> +NULLIFY_GPRS(5, 12) >> +NULLIFY_NVGPRS() >> + > > Name has changed. > >> /* Calling convention has r3 = orig r0, r4 = regs */ >> mr r3,r0 >> bl system_call_exception >> @@ -342,6 +356,7 @@ BEGIN_FTR_SECTION >> stdcx. r0,0,r1 /* to clear the reservation */ >> END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) >> >> +REST_NVGPRS(r1) > > Same question. > >> cmpdi r3,0 >> bne .Lsyscall_restore_regs >> /* Zero volatile regs that may contain sensitive kernel data */ >> @@ -377,7 +392,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) >> .Lsyscall_restore_regs: >> ld r3,_CTR(r1) >> ld r4,_XER(r1) >> -REST_NVGPRS(r1) >> mtctr r3 >> mtspr SPRN_XER,r4 >> ld r0,GPR0(r1) > Why clearing non volatile GPRs ? They are supposed to be callee saved so > I can't see how they could be used for speculation. Do you have any > exemple ? Even though non-volatiles will be callee-saved subject to ABI in the syscall handler prologue, it is still conceivable that a syscall handler will leave a register uninitialised on one branch outcome, assigning that register in the other. 
On speculative processors, there remains the possibility for untaken branches to be executed microarchitecturally (by mistraining the branch predictor or otherwise), whereby the microarchitectural effects of the execution present a side-channel. Such a hardening measure removes the burden for de
[PATCH v3 11/18] powerpc: Add ZEROIZE_GPRS macros for register clears
Macros for restoring and saving registers to and from the stack exist. Provide macros with the same interface for clearing a range of gprs by setting each register's value in that range to zero. The resulting macros are called ZEROIZE_GPRS and ZEROIZE_NVGPRS, keeping with the naming of the accompanying restore and save macros, and usage of zeroize to describe this operation elsewhere in the kernel. Signed-off-by: Rohan McLure --- V1 -> V2: Change 'ZERO' usage in naming to 'NULLIFY', a more obvious verb V2 -> V3: Change 'NULLIFY' usage in naming to 'ZEROIZE', which has precedent in kernel and explicitly specifies that we are zeroing. --- arch/powerpc/include/asm/ppc_asm.h | 22 ++ 1 file changed, 22 insertions(+) diff --git a/arch/powerpc/include/asm/ppc_asm.h b/arch/powerpc/include/asm/ppc_asm.h index 83c02f5a7f2a..b95689ada59c 100644 --- a/arch/powerpc/include/asm/ppc_asm.h +++ b/arch/powerpc/include/asm/ppc_asm.h @@ -33,6 +33,20 @@ .endr .endm +/* + * This expands to a sequence of register clears for regs start to end + * inclusive, of the form: + * + * li rN, 0 + */ +.macro ZEROIZE_REGS start, end + .Lreg=\start + .rept (\end - \start + 1) + li .Lreg, 0 + .Lreg=.Lreg+1 + .endr +.endm + /* * Macros for storing registers into and loading registers from * exception frames. @@ -49,6 +63,14 @@ #define REST_NVGPRS(base) REST_GPRS(13, 31, base) #endif +#define ZEROIZE_GPRS(start, end) ZEROIZE_REGS start, end +#ifdef __powerpc64__ +#define ZEROIZE_NVGPRS() ZEROIZE_GPRS(14, 31) +#else +#define ZEROIZE_NVGPRS() ZEROIZE_GPRS(13, 31) +#endif +#define ZEROIZE_GPR(n) ZEROIZE_GPRS(n, n) + #define SAVE_GPR(n, base) SAVE_GPRS(n, n, base) #define REST_GPR(n, base) REST_GPRS(n, n, base) -- 2.34.1
[PATCH v3 16/18] powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.S
Restoring the register state of the interrupted thread involves issuing a large number of predictable loads to the kernel stack frame. Issue the REST_GPR{,S} macros to clearly signal when this is happening, and bunch together restores at the end of the interrupt handler where the saved value is not consumed earlier in the handler code. Signed-off-by: Rohan McLure Reported-by: Christophe Leroy --- V2 -> V3: New patch. --- arch/powerpc/kernel/entry_32.S | 35 1 file changed, 13 insertions(+), 22 deletions(-) diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S index 8d6e02ef5dc0..03a105a5806a 100644 --- a/arch/powerpc/kernel/entry_32.S +++ b/arch/powerpc/kernel/entry_32.S @@ -68,7 +68,7 @@ prepare_transfer_to_handler: lwz r9,_MSR(r11)/* if sleeping, clear MSR.EE */ rlwinm r9,r9,0,~MSR_EE lwz r12,_LINK(r11) /* and return to address in LR */ - lwz r2, GPR2(r11) + REST_GPR(2, r11) b fast_exception_return _ASM_NOKPROBE_SYMBOL(prepare_transfer_to_handler) #endif /* CONFIG_PPC_BOOK3S_32 || CONFIG_E500 */ @@ -144,7 +144,7 @@ ret_from_syscall: lwz r7,_NIP(r1) lwz r8,_MSR(r1) cmpwi r3,0 - lwz r3,GPR3(r1) + REST_GPR(3, r1) syscall_exit_finish: mtspr SPRN_SRR0,r7 mtspr SPRN_SRR1,r8 @@ -152,8 +152,8 @@ syscall_exit_finish: bne 3f mtcrr5 -1: lwz r2,GPR2(r1) - lwz r1,GPR1(r1) +1: REST_GPR(2, r1) + REST_GPR(1, r1) rfi #ifdef CONFIG_40x b . 
/* Prevent prefetch past rfi */ @@ -165,10 +165,8 @@ syscall_exit_finish: REST_NVGPRS(r1) mtctr r4 mtxer r5 - lwz r0,GPR0(r1) - lwz r3,GPR3(r1) - REST_GPRS(4, 11, r1) - lwz r12,GPR12(r1) + REST_GPR(0, r1) + REST_GPRS(3, 12, r1) b 1b #ifdef CONFIG_44x @@ -260,24 +258,22 @@ fast_exception_return: beq 3f /* if not, we've got problems */ #endif -2: REST_GPRS(3, 6, r11) - lwz r10,_CCR(r11) - REST_GPRS(1, 2, r11) +2: lwz r10,_CCR(r11) mtcrr10 lwz r10,_LINK(r11) mtlrr10 /* Clear the exception_marker on the stack to avoid confusing stacktrace */ li r10, 0 stw r10, 8(r11) - REST_GPR(10, r11) #if defined(CONFIG_PPC_8xx) && defined(CONFIG_PERF_EVENTS) mtspr SPRN_NRI, r0 #endif mtspr SPRN_SRR1,r9 mtspr SPRN_SRR0,r12 - REST_GPR(9, r11) + REST_GPRS(1, 6, r11) + REST_GPRS(9, 10, r11) REST_GPR(12, r11) - lwz r11,GPR11(r11) + REST_GPR(11, r11) rfi #ifdef CONFIG_40x b . /* Prevent prefetch past rfi */ @@ -454,9 +450,7 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return) lwz r3,_MSR(r1);\ andi. r3,r3,MSR_PR; \ bne interrupt_return; \ - lwz r0,GPR0(r1);\ - lwz r2,GPR2(r1);\ - REST_GPRS(3, 8, r1);\ + REST_GPR(0, r1);\ lwz r10,_XER(r1); \ lwz r11,_CTR(r1); \ mtspr SPRN_XER,r10; \ @@ -475,11 +469,8 @@ _ASM_NOKPROBE_SYMBOL(interrupt_return) lwz r12,_MSR(r1); \ mtspr exc_lvl_srr0,r11; \ mtspr exc_lvl_srr1,r12; \ - lwz r9,GPR9(r1);\ - lwz r12,GPR12(r1); \ - lwz r10,GPR10(r1); \ - lwz r11,GPR11(r1); \ - lwz r1,GPR1(r1);\ + REST_GPRS(2, 12, r1); \ + REST_GPR(1, r1);\ exc_lvl_rfi;\ b .; /* prevent prefetch past exc_lvl_rfi */ -- 2.34.1
[PATCH v3 07/18] powerpc: Remove direct call to mmap2 syscall handlers
Syscall handlers should not be invoked internally by their symbol names, as these symbols are defined by the architecture-defined SYSCALL_DEFINE macro. Move the compatibility syscall definition for mmap2 to syscalls.c, so that all mmap implementations can share an inline helper function, as is done with the personality handlers. Signed-off-by: Rohan McLure --- V1 -> V2: Move mmap2 compat implementation to asm/kernel/syscalls.c. --- arch/powerpc/kernel/sys_ppc32.c | 10 -- arch/powerpc/kernel/syscalls.c | 11 +++ 2 files changed, 11 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c index 60cb5b4413b0..dd9039671227 100644 --- a/arch/powerpc/kernel/sys_ppc32.c +++ b/arch/powerpc/kernel/sys_ppc32.c @@ -25,7 +25,6 @@ #include #include #include -#include #include #include #include @@ -48,15 +47,6 @@ #include #include -COMPAT_SYSCALL_DEFINE6(mmap2, - unsigned long, addr, size_t, len, - unsigned long, prot, unsigned long, flags, - unsigned long, fd, unsigned long, pgoff) -{ - /* This should remain 12 even if PAGE_SIZE changes */ - return sys_mmap(addr, len, prot, flags, fd, pgoff << 12); -} - /* * long long munging: * The 32 bit ABI passes long longs in an odd even register pair. 
diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c index e083935c5bf2..0afbcbd50433 100644 --- a/arch/powerpc/kernel/syscalls.c +++ b/arch/powerpc/kernel/syscalls.c @@ -56,6 +56,17 @@ SYSCALL_DEFINE6(mmap2, unsigned long, addr, size_t, len, return do_mmap2(addr, len, prot, flags, fd, pgoff, PAGE_SHIFT-12); } +#ifdef CONFIG_COMPAT +COMPAT_SYSCALL_DEFINE6(mmap2, + unsigned long, addr, size_t, len, + unsigned long, prot, unsigned long, flags, + unsigned long, fd, unsigned long, pgoff) +{ + /* This should remain 12 even if PAGE_SIZE changes */ + return do_mmap2(addr, len, prot, flags, fd, pgoff << 12, PAGE_SHIFT-12); +} +#endif + SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len, unsigned long, prot, unsigned long, flags, unsigned long, fd, off_t, offset) -- 2.34.1
[PATCH v3 02/18] powerpc: Use generic fallocate compatibility syscall
The powerpc fallocate compat syscall handler is identical to the generic implementation provided by commit 59c10c52f573f ("riscv: compat: syscall: Add compat_sys_call_table implementation"), and as such can be removed in favour of the generic implementation. A future patch series will replace more architecture-defined syscall handlers with generic implementations, dependent on introducing generic implementations that are compatible with powerpc and arm's parameter reorderings. Reported-by: Arnd Bergmann Signed-off-by: Rohan McLure --- V1 -> V2: Remove arch-specific fallocate handler. V2 -> V3: Remove generic fallocate prototype. Move to beginning of series. --- arch/powerpc/include/asm/compat.h | 5 + arch/powerpc/include/asm/syscalls.h | 2 -- arch/powerpc/include/asm/unistd.h | 1 + 3 files changed, 6 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/include/asm/compat.h b/arch/powerpc/include/asm/compat.h index dda4091fd012..f20caae3f019 100644 --- a/arch/powerpc/include/asm/compat.h +++ b/arch/powerpc/include/asm/compat.h @@ -16,6 +16,11 @@ typedef u16 compat_ipc_pid_t; #include #ifdef __BIG_ENDIAN__ +#define compat_arg_u64(name) u32 name##_hi, u32 name##_lo +#define compat_arg_u64_dual(name) u32, name##_hi, u32, name##_lo +#define compat_arg_u64_glue(name) (((u64)name##_lo & 0xffffffffUL) | \ +((u64)name##_hi << 32)) + #define COMPAT_UTS_MACHINE "ppc\0\0" #else #define COMPAT_UTS_MACHINE "ppcle\0\0" diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 21c2faaa2957..675a8f5ec3ca 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -39,8 +39,6 @@ compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u3 int compat_sys_truncate64(const char __user *path, u32 reg4, unsigned long len1, unsigned long len2); -long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, u32 len1, u32 len2); - int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long 
len1, unsigned long len2); diff --git a/arch/powerpc/include/asm/unistd.h b/arch/powerpc/include/asm/unistd.h index b1129b4ef57d..659a996c75aa 100644 --- a/arch/powerpc/include/asm/unistd.h +++ b/arch/powerpc/include/asm/unistd.h @@ -45,6 +45,7 @@ #define __ARCH_WANT_SYS_UTIME #define __ARCH_WANT_SYS_NEWFSTATAT #define __ARCH_WANT_COMPAT_STAT +#define __ARCH_WANT_COMPAT_FALLOCATE #define __ARCH_WANT_COMPAT_SYS_SENDFILE #endif #define __ARCH_WANT_SYS_FORK -- 2.34.1
[PATCH v3 09/18] powerpc: Enable compile-time check for syscall handlers
The table of syscall handlers and registered compatibility syscall handlers has in the past been produced using assembly, with function references resolved at link time. This moves link-time errors to compile-time, by rewriting systbl.S in C, and including the linux/syscalls.h, linux/compat.h and asm/syscalls.h headers for prototypes. Reported-by: Arnd Bergmann Signed-off-by: Rohan McLure --- V1 -> V2: New patch. --- arch/powerpc/kernel/{systbl.S => systbl.c} | 27 ++-- 1 file changed, 14 insertions(+), 13 deletions(-) diff --git a/arch/powerpc/kernel/systbl.S b/arch/powerpc/kernel/systbl.c similarity index 59% rename from arch/powerpc/kernel/systbl.S rename to arch/powerpc/kernel/systbl.c index cb3358886203..99ffdfef6b9c 100644 --- a/arch/powerpc/kernel/systbl.S +++ b/arch/powerpc/kernel/systbl.c @@ -10,31 +10,32 @@ * PPC64 updates by Dave Engebretsen (engeb...@us.ibm.com) */ -#include +#include +#include +#include +#include -.section .rodata,"a" +#define __SYSCALL_WITH_COMPAT(nr, entry, compat) __SYSCALL(nr, entry) -#ifdef CONFIG_PPC64 - .p2align 3 -#define __SYSCALL(nr, entry) .8byte entry +#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER +#define __SYSCALL(nr, entry) [nr] = __powerpc_##entry, +#define __powerpc_sys_ni_syscall sys_ni_syscall #else -#define __SYSCALL(nr, entry) .long entry +#define __SYSCALL(nr, entry) [nr] = entry, #endif -#define __SYSCALL_WITH_COMPAT(nr, native, compat) __SYSCALL(nr, native) -.globl sys_call_table -sys_call_table: +void *sys_call_table[] = { #ifdef CONFIG_PPC64 #include #else #include #endif +}; #ifdef CONFIG_COMPAT #undef __SYSCALL_WITH_COMPAT #define __SYSCALL_WITH_COMPAT(nr, native, compat) __SYSCALL(nr, compat) -.globl compat_sys_call_table -compat_sys_call_table: -#define compat_sys_sigsuspend sys_sigsuspend +void *compat_sys_call_table[] = { #include -#endif +}; +#endif /* CONFIG_COMPAT */ -- 2.34.1
[PATCH v3 17/18] powerpc/64s: Fix comment on interrupt handler prologue
Interrupt handlers on 64s systems will often need to save register state from the interrupted process to make space for loading special purpose registers or for internal state. Fix a comment documenting a common code path macro in the beginning of interrupt handlers where r10 is saved to the PACA to afford space for the value of the CFAR. Comment is currently written as if r10-r12 are saved to PACA, but in fact only r10 is saved, with r11-r12 saved much later. The distance in code between these saves has grown over the many revisions of this macro. Fix this by signalling with a comment where r11-r12 are saved to the PACA. Signed-off-by: Rohan McLure --- V1 -> V2: Given its own commit V2 -> V3: Annotate r11-r12 save locations with comment. --- arch/powerpc/kernel/exceptions-64s.S | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index 3d0dc133a9ae..a3b51441b039 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -281,7 +281,7 @@ BEGIN_FTR_SECTION mfspr r9,SPRN_PPR END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) HMT_MEDIUM - std r10,IAREA+EX_R10(r13) /* save r10 - r12 */ + std r10,IAREA+EX_R10(r13) /* save r10 */ .if ICFAR BEGIN_FTR_SECTION mfspr r10,SPRN_CFAR @@ -321,7 +321,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR) mfctr r10 std r10,IAREA+EX_CTR(r13) mfcrr9 - std r11,IAREA+EX_R11(r13) + std r11,IAREA+EX_R11(r13) /* save r11 - r12 */ std r12,IAREA+EX_R12(r13) /* -- 2.34.1
[PATCH v3 18/18] powerpc/64s: Clear gprs on interrupt routine entry
Zero GPRs r0, r2-r11, r14-r31 on entry into the kernel for all other interrupt sources to limit influence of user-space values in potential speculation gadgets. The remaining GPRs are overwritten by entry macros to interrupt handlers, irrespective of whether or not a given handler consumes these register values. Prior to this commit, r14-r31 are restored on a per-interrupt basis at exit, but now they are always restored. Remove explicit REST_NVGPRS invocations as non-volatiles must now always be restored. 32-bit systems do not clear user registers on interrupt, and continue to depend on the return value of interrupt_exit_user_prepare to determine whether or not to restore non-volatiles. The mmap_bench benchmark in selftests rapidly invokes pagefaults. We see a ~0.8% performance regression with this mitigation, which indicates the worst-case overhead, as page faults exercise the heavier-weight interrupt handlers. Signed-off-by: Rohan McLure --- V1 -> V2: Add benchmark data V2 -> V3: Use ZEROIZE_GPR{,S} macro renames, clarify interrupt_exit_user_prepare changes in summary. 
--- arch/powerpc/kernel/exceptions-64s.S | 21 - arch/powerpc/kernel/interrupt_64.S | 9 ++--- 2 files changed, 10 insertions(+), 20 deletions(-) diff --git a/arch/powerpc/kernel/exceptions-64s.S b/arch/powerpc/kernel/exceptions-64s.S index a3b51441b039..038e42fb2182 100644 --- a/arch/powerpc/kernel/exceptions-64s.S +++ b/arch/powerpc/kernel/exceptions-64s.S @@ -502,6 +502,7 @@ DEFINE_FIXED_SYMBOL(\name\()_common_real, text) std r10,0(r1) /* make stack chain pointer */ std r0,GPR0(r1) /* save r0 in stackframe*/ std r10,GPR1(r1)/* save r1 in stackframe*/ + ZEROIZE_GPR(0) /* Mark our [H]SRRs valid for return */ li r10,1 @@ -538,14 +539,18 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r10,IAREA+EX_R10(r13) std r9,GPR9(r1) std r10,GPR10(r1) + ZEROIZE_GPRS(9, 10) ld r9,IAREA+EX_R11(r13)/* move r11 - r13 to stackframe */ ld r10,IAREA+EX_R12(r13) ld r11,IAREA+EX_R13(r13) std r9,GPR11(r1) std r10,GPR12(r1) std r11,GPR13(r1) + /* keep r12 ([H]SRR1/MSR), r13 (PACA) for interrupt routine */ + ZEROIZE_GPR(11) SAVE_NVGPRS(r1) + ZEROIZE_NVGPRS() .if IDAR .if IISIDE @@ -577,8 +582,8 @@ BEGIN_FTR_SECTION END_FTR_SECTION_IFSET(CPU_FTR_CFAR) ld r10,IAREA+EX_CTR(r13) std r10,_CTR(r1) - std r2,GPR2(r1) /* save r2 in stackframe*/ - SAVE_GPRS(3, 8, r1) /* save r3 - r8 in stackframe */ + SAVE_GPRS(2, 8, r1) /* save r2 - r8 in stackframe */ + ZEROIZE_GPRS(2, 8) mflrr9 /* Get LR, later save to stack */ ld r2,PACATOC(r13) /* get kernel TOC into r2 */ std r9,_LINK(r1) @@ -696,6 +701,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_CFAR) mtlrr9 ld r9,_CCR(r1) mtcrr9 + REST_NVGPRS(r1) REST_GPRS(2, 13, r1) REST_GPR(0, r1) /* restore original r1. */ @@ -1368,11 +1374,6 @@ ALT_MMU_FTR_SECTION_END_IFCLR(MMU_FTR_TYPE_RADIX) b interrupt_return_srr 1: bl do_break - /* -* do_break() may have changed the NV GPRS while handling a breakpoint. -* If so, we need to restore them with their updated values. 
-*/ - REST_NVGPRS(r1) b interrupt_return_srr @@ -1598,7 +1599,6 @@ EXC_COMMON_BEGIN(alignment_common) GEN_COMMON alignment addir3,r1,STACK_FRAME_OVERHEAD bl alignment_exception - REST_NVGPRS(r1) /* instruction emulation may change GPRs */ b interrupt_return_srr @@ -1708,7 +1708,6 @@ EXC_COMMON_BEGIN(program_check_common) .Ldo_program_check: addir3,r1,STACK_FRAME_OVERHEAD bl program_check_exception - REST_NVGPRS(r1) /* instruction emulation may change GPRs */ b interrupt_return_srr @@ -2139,7 +2138,6 @@ EXC_COMMON_BEGIN(emulation_assist_common) GEN_COMMON emulation_assist addir3,r1,STACK_FRAME_OVERHEAD bl emulation_assist_interrupt - REST_NVGPRS(r1) /* instruction emulation may change GPRs */ b interrupt_return_hsrr @@ -2457,7 +2455,6 @@ EXC_COMMON_BEGIN(facility_unavailable_common) GEN_COMMON facility_unavailable addir3,r1,STACK_FRAME_OVERHEAD bl facility_unavailable_exception - REST_NVGPRS(r1) /* instruction emulation may change GPRs */ b interrupt_return_srr @@ -2485,7 +2482,6 @@ EXC_COMMON_BEGIN(h_facility_unavailable_common) GEN_COMMON h_facility_unavailable addir3,r1,STACK_FRAME_OVERHEAD
[PATCH v3 05/18] powerpc: Adopt SYSCALL_DEFINE for arch-specific syscall handlers
Arch-specific implementations of syscall handlers are currently used over generic implementations for the following reasons:

1. Semantics unique to powerpc
2. Compatibility syscalls require 'argument padding' to comply with the 64-bit argument convention in the ELF32 ABI.
3. Parameter types or order differ on other architectures.

These syscall handlers have been defined prior to this patch series without invoking the SYSCALL_DEFINE or COMPAT_SYSCALL_DEFINE macros with custom input and output types. We remove every such direct definition in favour of the aforementioned macros. Also update syscall.tbl in order to refer to the symbol names generated by each of these macros. Since ppc64_personality can be called by both 64-bit and 32-bit binaries through compatibility, we must generate both compat_sys_ and sys_ symbols for this handler. A number of architectures including arm and powerpc agree on an alternative argument order and numbering for most of these arch-specific handlers. A future patch series may allow for asm/unistd.h to signal through its defines that a generic implementation of these syscall handlers with the correct calling convention be omitted, through the __ARCH_WANT_COMPAT_SYS_... convention. Signed-off-by: Rohan McLure --- V1 -> V2: All syscall handlers wrapped by this macro. V2 -> V3: Move creation of do_ppc64_personality helper to prior patch. 
--- arch/powerpc/include/asm/syscalls.h | 18 +++--- arch/powerpc/kernel/sys_ppc32.c | 52 ++ arch/powerpc/kernel/syscalls.c | 16 -- arch/powerpc/kernel/syscalls/syscall.tbl | 22 .../arch/powerpc/entry/syscalls/syscall.tbl | 22 5 files changed, 71 insertions(+), 59 deletions(-) diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 739498c358a1..0af7c2d8b2c9 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -11,15 +11,15 @@ struct rtas_args; long sys_mmap(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, off_t offset); + unsigned long prot, unsigned long flags, + unsigned long fd, off_t offset); long sys_mmap2(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, unsigned long pgoff); -long ppc64_personality(unsigned long personality); + unsigned long prot, unsigned long flags, + unsigned long fd, unsigned long pgoff); +long sys_ppc64_personality(unsigned long personality); long sys_rtas(struct rtas_args __user *uargs); -long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, - u32 len_high, u32 len_low); +long sys_ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, + u32 len_high, u32 len_low); #ifdef CONFIG_COMPAT unsigned long compat_sys_mmap2(unsigned long addr, size_t len, @@ -40,8 +40,8 @@ int compat_sys_truncate64(const char __user *path, u32 reg4, int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, unsigned long len2); -long ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, -size_t len, int advice); +long compat_sys_ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, + size_t len, int advice); long compat_sys_sync_file_range2(int fd, unsigned int flags, unsigned int offset1, unsigned int offset2, diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c index f4edcc9489fb..60cb5b4413b0 100644 
--- a/arch/powerpc/kernel/sys_ppc32.c +++ b/arch/powerpc/kernel/sys_ppc32.c @@ -48,9 +48,10 @@ #include #include -unsigned long compat_sys_mmap2(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, unsigned long pgoff) +COMPAT_SYSCALL_DEFINE6(mmap2, + unsigned long, addr, size_t, len, + unsigned long, prot, unsigned long, flags, + unsigned long, fd, unsigned long, pgoff) { /* This should remain 12 even if PAGE_SIZE changes */ return sys_mmap(addr, len, prot, flags, fd, pgoff << 12); @@ -68,52 +69,55 @@ unsigned long compat_sys_mmap2(unsigned long addr, size_t len, #define merge_64(high, low) ((u64)high << 32) | low #endif -compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count, -u32 reg6, u32 pos1, u32 pos2) +COMPAT_SYSCALL_DEFINE6(ppc_pread64, + unsigned int, fd, + char __user *, ubuf, compat_size_t, count, + u32, reg6, u32, pos1, u32, pos2) { return ksys_pread64(fd, ubuf
[PATCH v3 14/18] powerpc/64s: Clear/restore caller gprs in syscall interrupt/return
Clear user state in gprs (assign to zero) to reduce the influence of user registers on speculation within kernel syscall handlers. Clears occur at the very beginning of the sc and scv 0 interrupt handlers, with restores occurring following the execution of the syscall handler. Signed-off-by: Rohan McLure --- V1 -> V2: Update summary V2 -> V3: Remove erroneous summary paragraph on syscall_exit_prepare --- arch/powerpc/kernel/interrupt_64.S | 22 ++ 1 file changed, 18 insertions(+), 4 deletions(-) diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S index 0178aeba3820..d9625113c7a5 100644 --- a/arch/powerpc/kernel/interrupt_64.S +++ b/arch/powerpc/kernel/interrupt_64.S @@ -70,7 +70,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name) ld r2,PACATOC(r13) mfcrr12 li r11,0 - /* Can we avoid saving r3-r8 in common case? */ + /* Save syscall parameters in r3-r8 */ std r3,GPR3(r1) std r4,GPR4(r1) std r5,GPR5(r1) @@ -109,6 +109,13 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) * but this is the best we can do. */ + /* +* Zero user registers to prevent influencing speculative execution +* state of kernel code. +*/ + NULLIFY_GPRS(5, 12) + NULLIFY_NVGPRS() + /* Calling convention has r3 = orig r0, r4 = regs */ mr r3,r0 bl system_call_exception @@ -139,6 +146,7 @@ BEGIN_FTR_SECTION HMT_MEDIUM_LOW END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) + REST_NVGPRS(r1) cmpdi r3,0 bne .Lsyscall_vectored_\name\()_restore_regs @@ -181,7 +189,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r4,_LINK(r1) ld r5,_XER(r1) - REST_NVGPRS(r1) ld r0,GPR0(r1) mtcrr2 mtctr r3 @@ -249,7 +256,7 @@ END_BTB_FLUSH_SECTION ld r2,PACATOC(r13) mfcrr12 li r11,0 - /* Can we avoid saving r3-r8 in common case? */ + /* Save syscall parameters in r3-r8 */ std r3,GPR3(r1) std r4,GPR4(r1) std r5,GPR5(r1) @@ -300,6 +307,13 @@ END_BTB_FLUSH_SECTION wrteei 1 #endif + /* +* Zero user registers to prevent influencing speculative execution +* state of kernel code. 
+*/ + NULLIFY_GPRS(5, 12) + NULLIFY_NVGPRS() + /* Calling convention has r3 = orig r0, r4 = regs */ mr r3,r0 bl system_call_exception @@ -342,6 +356,7 @@ BEGIN_FTR_SECTION stdcx. r0,0,r1 /* to clear the reservation */ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) + REST_NVGPRS(r1) cmpdi r3,0 bne .Lsyscall_restore_regs /* Zero volatile regs that may contain sensitive kernel data */ @@ -377,7 +392,6 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) .Lsyscall_restore_regs: ld r3,_CTR(r1) ld r4,_XER(r1) - REST_NVGPRS(r1) mtctr r3 mtspr SPRN_XER,r4 ld r0,GPR0(r1) -- 2.34.1
[PATCH v3 15/18] powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlers
Use the convenience macros for saving/clearing/restoring gprs in keeping with syscall calling conventions. The plural variants of these macros can store a range of registers for concision. This works well when the user gpr value we are hoping to save is still live. In the syscall interrupt handlers, user register state is sometimes juggled between registers. Hold off from issuing the SAVE_GPR macro for applicable neighbouring lines to highlight the delicate register save logic. Signed-off-by: Rohan McLure --- V1 -> V2: Update summary V2 -> V3: Update summary regarding exclusions for the SAVE_GPR macro. Acknowledge new name for ZEROIZE_GPR{,S} macros. --- arch/powerpc/kernel/interrupt_64.S | 51 +++- 1 file changed, 13 insertions(+), 38 deletions(-) diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S index d9625113c7a5..ad302ad93433 100644 --- a/arch/powerpc/kernel/interrupt_64.S +++ b/arch/powerpc/kernel/interrupt_64.S @@ -71,12 +71,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name) mfcrr12 li r11,0 /* Save syscall parameters in r3-r8 */ - std r3,GPR3(r1) - std r4,GPR4(r1) - std r5,GPR5(r1) - std r6,GPR6(r1) - std r7,GPR7(r1) - std r8,GPR8(r1) + SAVE_GPRS(3, 8, r1) /* Zero r9-r12, this should only be required when restoring all GPRs */ std r11,GPR9(r1) std r11,GPR10(r1) @@ -113,8 +108,8 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) * Zero user registers to prevent influencing speculative execution * state of kernel code. */ - NULLIFY_GPRS(5, 12) - NULLIFY_NVGPRS() + ZEROIZE_GPRS(5, 12) + ZEROIZE_NVGPRS() /* Calling convention has r3 = orig r0, r4 = regs */ mr r3,r0 @@ -157,17 +152,10 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) /* Could zero these as per ABI, but we may consider a stricter ABI * which preserves these if libc implementations can benefit, so * restore them for now until further measurement is done.
*/ - ld r0,GPR0(r1) - ld r4,GPR4(r1) - ld r5,GPR5(r1) - ld r6,GPR6(r1) - ld r7,GPR7(r1) - ld r8,GPR8(r1) + REST_GPR(0, r1) + REST_GPRS(4, 8, r1) /* Zero volatile regs that may contain sensitive kernel data */ - li r9,0 - li r10,0 - li r11,0 - li r12,0 + ZEROIZE_GPRS(9, 12) mtspr SPRN_XER,r0 /* @@ -189,7 +177,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r4,_LINK(r1) ld r5,_XER(r1) - ld r0,GPR0(r1) + REST_GPR(0, r1) mtcrr2 mtctr r3 mtlrr4 @@ -257,12 +245,7 @@ END_BTB_FLUSH_SECTION mfcrr12 li r11,0 /* Save syscall parameters in r3-r8 */ - std r3,GPR3(r1) - std r4,GPR4(r1) - std r5,GPR5(r1) - std r6,GPR6(r1) - std r7,GPR7(r1) - std r8,GPR8(r1) + SAVE_GPRS(3, 8, r1) /* Zero r9-r12, this should only be required when restoring all GPRs */ std r11,GPR9(r1) std r11,GPR10(r1) @@ -311,8 +294,8 @@ END_BTB_FLUSH_SECTION * Zero user registers to prevent influencing speculative execution * state of kernel code. */ - NULLIFY_GPRS(5, 12) - NULLIFY_NVGPRS() + ZEROIZE_GPRS(5, 12) + ZEROIZE_NVGPRS() /* Calling convention has r3 = orig r0, r4 = regs */ mr r3,r0 @@ -360,16 +343,8 @@ END_FTR_SECTION_IFCLR(CPU_FTR_STCX_CHECKS_ADDRESS) cmpdi r3,0 bne .Lsyscall_restore_regs /* Zero volatile regs that may contain sensitive kernel data */ - li r0,0 - li r4,0 - li r5,0 - li r6,0 - li r7,0 - li r8,0 - li r9,0 - li r10,0 - li r11,0 - li r12,0 + ZEROIZE_GPR(0) + ZEROIZE_GPRS(4, 12) mtctr r0 mtspr SPRN_XER,r0 .Lsyscall_restore_regs_cont: @@ -394,7 +369,7 @@ END_FTR_SECTION_IFSET(CPU_FTR_HAS_PPR) ld r4,_XER(r1) mtctr r3 mtspr SPRN_XER,r4 - ld r0,GPR0(r1) + REST_GPR(0, r1) REST_GPRS(4, 12, r1) b .Lsyscall_restore_regs_cont .Lsyscall_rst_end: -- 2.34.1
[PATCH v3 13/18] powerpc: Provide syscall wrapper
Implement a syscall wrapper as per s390, x86 and arm64. When enabled, it causes handlers to accept their parameters from a stack frame rather than from user scratch register state. This allows user registers to be safely cleared, reducing caller influence on speculation within syscall routines.

The wrapper is a macro that emits syscall handler symbols that call into the target handler, obtaining its parameters from a struct pt_regs on the stack. As registers are already saved to the stack prior to calling system_call_exception, that function executes more efficiently with the new stack-pointer convention than with parameters passed by registers, avoiding the allocation of a stack frame for this method. On a 32-bit system, we see >20% performance increases on the null_syscall microbenchmark, and on a Power 8 the performance gains amortise the cost of clearing and restoring registers implemented at the end of this series, for a final result of ~5.6% performance improvement on null_syscall.

Syscalls are wrapped in this fashion on all platforms except for the Cell processor, as this commit does not provide SPU support. This can be fixed in a subsequent patch, but requires spu_sys_callback to allocate a pt_regs structure to satisfy the wrapped calling convention. Co-developed-by: Andrew Donnellan Signed-off-by: Andrew Donnellan Signed-off-by: Rohan McLure --- V1 -> V2: Generate prototypes for symbols produced by the wrapper. V2 -> V3: Rebased to remove conflict with 1547db7d1f44 ("powerpc: Move system_call_exception() to syscall.c"). Also remove copy from gpr3 save slot on stackframe to orig_r3's slot. Fix whitespace with preprocessor defines in system_call_exception.
--- arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/interrupt.h | 3 +- arch/powerpc/include/asm/syscall.h | 4 + arch/powerpc/include/asm/syscall_wrapper.h | 94 arch/powerpc/include/asm/syscalls.h| 25 +- arch/powerpc/kernel/entry_32.S | 6 +- arch/powerpc/kernel/interrupt_64.S | 16 ++-- arch/powerpc/kernel/syscall.c | 31 +++ arch/powerpc/kernel/systbl.c | 2 + arch/powerpc/kernel/vdso.c | 2 + 10 files changed, 152 insertions(+), 32 deletions(-) diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig index 4c466acdc70d..ef6c83e79c9b 100644 --- a/arch/powerpc/Kconfig +++ b/arch/powerpc/Kconfig @@ -137,6 +137,7 @@ config PPC select ARCH_HAS_STRICT_KERNEL_RWX if (PPC_BOOK3S || PPC_8xx || 40x) && !HIBERNATION select ARCH_HAS_STRICT_KERNEL_RWX if FSL_BOOKE && !HIBERNATION && !RANDOMIZE_BASE select ARCH_HAS_STRICT_MODULE_RWX if ARCH_HAS_STRICT_KERNEL_RWX + select ARCH_HAS_SYSCALL_WRAPPER if !SPU_BASE select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST select ARCH_HAS_UACCESS_FLUSHCACHE select ARCH_HAS_UBSAN_SANITIZE_ALL diff --git a/arch/powerpc/include/asm/interrupt.h b/arch/powerpc/include/asm/interrupt.h index 8069dbc4b8d1..3f9cad81585c 100644 --- a/arch/powerpc/include/asm/interrupt.h +++ b/arch/powerpc/include/asm/interrupt.h @@ -665,8 +665,7 @@ static inline void interrupt_cond_local_irq_enable(struct pt_regs *regs) local_irq_enable(); } -long system_call_exception(long r3, long r4, long r5, long r6, long r7, long r8, - unsigned long r0, struct pt_regs *regs); +long system_call_exception(unsigned long r0, struct pt_regs *regs); notrace unsigned long syscall_exit_prepare(unsigned long r3, struct pt_regs *regs, long scv); notrace unsigned long interrupt_exit_user_prepare(struct pt_regs *regs); notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs); diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h index d2a8dfd5de33..3dd36c5e334a 100644 --- a/arch/powerpc/include/asm/syscall.h +++ 
b/arch/powerpc/include/asm/syscall.h @@ -14,8 +14,12 @@ #include #include +#ifdef CONFIG_ARCH_HAS_SYSCALL_WRAPPER +typedef long (*syscall_fn)(const struct pt_regs *); +#else typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long); +#endif /* ftrace syscalls requires exporting the sys_call_table */ extern const syscall_fn sys_call_table[]; diff --git a/arch/powerpc/include/asm/syscall_wrapper.h b/arch/powerpc/include/asm/syscall_wrapper.h new file mode 100644 index ..4f4e9f919a32 --- /dev/null +++ b/arch/powerpc/include/asm/syscall_wrapper.h @@ -0,0 +1,94 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +/* + * syscall_wrapper.h - powerpc specific wrappers to syscall definitions + * + * Based on arch/{x86,arm64}/include/asm/syscall_wrapper.h + */ + +#ifndef __ASM_SYSCALL_WRAPPER_H +#define __ASM_SYSCALL_WRAPPER_H + +stru
[PATCH v3 10/18] powerpc: Use common syscall handler type
Cause syscall handlers to be typed as follows when called indirectly throughout the kernel. typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, unsigned long, unsigned long, unsigned long); Since both 32 and 64-bit ABIs allow for at least the first six machine-word-length parameters to a function to be passed by registers, even handlers which admit fewer than six parameters may be viewed as having the above type. Fix up comparisons in VDSO to avoid pointer-integer comparison. Introduce explicit cast on systems with SPUs. Signed-off-by: Rohan McLure --- V1 -> V2: New patch. V2 -> V3: Remove unnecessary cast from const syscall_fn to syscall_fn --- arch/powerpc/include/asm/syscall.h | 7 +-- arch/powerpc/include/asm/syscalls.h | 1 + arch/powerpc/kernel/systbl.c| 6 +++--- arch/powerpc/kernel/vdso.c | 4 ++-- arch/powerpc/platforms/cell/spu_callbacks.c | 6 +++--- 5 files changed, 14 insertions(+), 10 deletions(-) diff --git a/arch/powerpc/include/asm/syscall.h b/arch/powerpc/include/asm/syscall.h index 25fc8ad9a27a..d2a8dfd5de33 100644 --- a/arch/powerpc/include/asm/syscall.h +++ b/arch/powerpc/include/asm/syscall.h @@ -14,9 +14,12 @@ #include #include +typedef long (*syscall_fn)(unsigned long, unsigned long, unsigned long, + unsigned long, unsigned long, unsigned long); + /* ftrace syscalls requires exporting the sys_call_table */ -extern const unsigned long sys_call_table[]; -extern const unsigned long compat_sys_call_table[]; +extern const syscall_fn sys_call_table[]; +extern const syscall_fn compat_sys_call_table[]; static inline int syscall_get_nr(struct task_struct *task, struct pt_regs *regs) { diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 91417dee534e..e979b7593d2b 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -8,6 +8,7 @@ #include #include +#include #ifdef CONFIG_PPC64 #include #endif diff --git a/arch/powerpc/kernel/systbl.c
b/arch/powerpc/kernel/systbl.c index 99ffdfef6b9c..b88a9c2a1f50 100644 --- a/arch/powerpc/kernel/systbl.c +++ b/arch/powerpc/kernel/systbl.c @@ -21,10 +21,10 @@ #define __SYSCALL(nr, entry) [nr] = __powerpc_##entry, #define __powerpc_sys_ni_syscall sys_ni_syscall #else -#define __SYSCALL(nr, entry) [nr] = entry, +#define __SYSCALL(nr, entry) [nr] = (void *) entry, #endif -void *sys_call_table[] = { +const syscall_fn sys_call_table[] = { #ifdef CONFIG_PPC64 #include #else @@ -35,7 +35,7 @@ void *sys_call_table[] = { #ifdef CONFIG_COMPAT #undef __SYSCALL_WITH_COMPAT #define __SYSCALL_WITH_COMPAT(nr, native, compat) __SYSCALL(nr, compat) -void *compat_sys_call_table[] = { +const syscall_fn compat_sys_call_table[] = { #include }; #endif /* CONFIG_COMPAT */ diff --git a/arch/powerpc/kernel/vdso.c b/arch/powerpc/kernel/vdso.c index 0da287544054..080c9e7de0c0 100644 --- a/arch/powerpc/kernel/vdso.c +++ b/arch/powerpc/kernel/vdso.c @@ -313,10 +313,10 @@ static void __init vdso_setup_syscall_map(void) unsigned int i; for (i = 0; i < NR_syscalls; i++) { - if (sys_call_table[i] != (unsigned long)&sys_ni_syscall) + if (sys_call_table[i] != (void *)&sys_ni_syscall) vdso_data->syscall_map[i >> 5] |= 0x8000UL >> (i & 0x1f); if (IS_ENABLED(CONFIG_COMPAT) && - compat_sys_call_table[i] != (unsigned long)&sys_ni_syscall) + compat_sys_call_table[i] != (void *)&sys_ni_syscall) vdso_data->compat_syscall_map[i >> 5] |= 0x8000UL >> (i & 0x1f); } } diff --git a/arch/powerpc/platforms/cell/spu_callbacks.c b/arch/powerpc/platforms/cell/spu_callbacks.c index fe0d8797a00a..e780c14c5733 100644 --- a/arch/powerpc/platforms/cell/spu_callbacks.c +++ b/arch/powerpc/platforms/cell/spu_callbacks.c @@ -34,15 +34,15 @@ * mbind, mq_open, ipc, ... 
*/ -static void *spu_syscall_table[] = { +static const syscall_fn spu_syscall_table[] = { #define __SYSCALL_WITH_COMPAT(nr, entry, compat) __SYSCALL(nr, entry) -#define __SYSCALL(nr, entry) [nr] = entry, +#define __SYSCALL(nr, entry) [nr] = (void *) entry, #include }; long spu_sys_callback(struct spu_syscall_block *s) { - long (*syscall)(u64 a1, u64 a2, u64 a3, u64 a4, u64 a5, u64 a6); + syscall_fn syscall; if (s->nr_ret >= ARRAY_SIZE(spu_syscall_table)) { pr_debug("%s: invalid syscall #%lld", __func__, s->nr_ret); -- 2.34.1
[PATCH v3 03/18] powerpc/32: Remove powerpc select specialisation
Syscall #82 has been implemented for 32-bit platforms in a unique way on powerpc systems. This hack in effect guesses whether the caller expects new select semantics or old select semantics, based off the first parameter. In new select, this parameter represents the length of a user-memory array of file descriptors, while in old select it is a pointer to an arguments structure. The heuristic simply interprets sufficiently large values of the first parameter as calls to old select. The following is a discussion on how this syscall should be handled. Link: https://lore.kernel.org/lkml/13737de5-0eb7-e881-9af0-163b0d29a...@csgroup.eu/ As discussed in this thread, the existence of such a hack suggests that whatever powerpc binaries may predate glibc most likely made use of the old select semantics. x86 and arm64 both implement this syscall with old select semantics. Remove the powerpc implementation, and update syscall.tbl to emit a reference to sys_old_select for 32-bit binaries, in keeping with how other architectures support syscall #82. Signed-off-by: Rohan McLure --- V1 -> V2: Remove arch-specific select handler V2 -> V3: Remove ppc_old_select prototype in .
Move to earlier in patch series --- arch/powerpc/include/asm/syscalls.h | 2 -- arch/powerpc/kernel/syscalls.c| 17 - arch/powerpc/kernel/syscalls/syscall.tbl | 2 +- .../arch/powerpc/entry/syscalls/syscall.tbl | 2 +- 4 files changed, 2 insertions(+), 21 deletions(-) diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 675a8f5ec3ca..739498c358a1 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -18,8 +18,6 @@ long sys_mmap2(unsigned long addr, size_t len, unsigned long fd, unsigned long pgoff); long ppc64_personality(unsigned long personality); long sys_rtas(struct rtas_args __user *uargs); -int ppc_select(int n, fd_set __user *inp, fd_set __user *outp, - fd_set __user *exp, struct __kernel_old_timeval __user *tvp); long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, u32 len_high, u32 len_low); diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c index fc999140bc27..ef5896bee818 100644 --- a/arch/powerpc/kernel/syscalls.c +++ b/arch/powerpc/kernel/syscalls.c @@ -63,23 +63,6 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len, return do_mmap2(addr, len, prot, flags, fd, offset, PAGE_SHIFT); } -#ifdef CONFIG_PPC32 -/* - * Due to some executables calling the wrong select we sometimes - * get wrong args. This determines how the args are being passed - * (a single ptr to them all args passed) then calls - * sys_select() with the appropriate args. 
-- Cort - */ -int -ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct __kernel_old_timeval __user *tvp) -{ - if ((unsigned long)n >= 4096) - return sys_old_select((void __user *)n); - - return sys_select(n, inp, outp, exp, tvp); -} -#endif - #ifdef CONFIG_PPC64 long ppc64_personality(unsigned long personality) { diff --git a/arch/powerpc/kernel/syscalls/syscall.tbl b/arch/powerpc/kernel/syscalls/syscall.tbl index 2600b4237292..4cbbb810ae10 100644 --- a/arch/powerpc/kernel/syscalls/syscall.tbl +++ b/arch/powerpc/kernel/syscalls/syscall.tbl @@ -110,7 +110,7 @@ 79 common settimeofdaysys_settimeofday compat_sys_settimeofday 80 common getgroups sys_getgroups 81 common setgroups sys_setgroups -82 32 select ppc_select sys_ni_syscall +82 32 select sys_old_select sys_ni_syscall 82 64 select sys_ni_syscall 82 spu select sys_ni_syscall 83 common symlink sys_symlink diff --git a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl index 2600b4237292..4cbbb810ae10 100644 --- a/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl +++ b/tools/perf/arch/powerpc/entry/syscalls/syscall.tbl @@ -110,7 +110,7 @@ 79 common settimeofdaysys_settimeofday compat_sys_settimeofday 80 common getgroups sys_getgroups 81 common setgroups sys_setgroups -82 32 select ppc_select sys_ni_syscall +82 32 select sys_old_select sys_ni_syscall 82 64 select sys_ni_syscall 82 spu select sys_ni_syscall 83 common
[PATCH v3 08/18] powerpc: Include all arch-specific syscall prototypes
Forward declare all syscall handler prototypes where a generic prototype is not provided in either linux/syscalls.h or linux/compat.h in asm/syscalls.h. This is required for compile-time type-checking for syscall handlers, which is implemented later in this series. 32-bit compatibility syscall handlers are expressed in terms of types in ppc32.h. Expose this header globally. Signed-off-by: Rohan McLure --- V1 -> V2: Explicitly include prototypes. V2 -> V3: Remove extraneous #include and ppc_fallocate prototype. Rename header. --- arch/powerpc/include/asm/syscalls.h | 96 +- .../ppc32.h => include/asm/syscalls_32.h}| 0 arch/powerpc/kernel/signal_32.c | 2 +- arch/powerpc/perf/callchain_32.c | 2 +- 4 files changed, 73 insertions(+), 27 deletions(-) diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index 0af7c2d8b2c9..91417dee534e 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -8,45 +8,91 @@ #include #include +#ifdef CONFIG_PPC64 +#include +#endif +#include +#include + struct rtas_args; -long sys_mmap(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, off_t offset); -long sys_mmap2(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, unsigned long pgoff); -long sys_ppc64_personality(unsigned long personality); +#ifndef CONFIG_ARCH_HAS_SYSCALL_WRAPPER + +/* + * PowerPC architecture-specific syscalls + */ + long sys_rtas(struct rtas_args __user *uargs); -long sys_ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, - u32 len_high, u32 len_low); +long sys_ni_syscall(void); +#ifdef CONFIG_PPC64 +long sys_ppc64_personality(unsigned long personality); #ifdef CONFIG_COMPAT -unsigned long compat_sys_mmap2(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, unsigned long pgoff); +long compat_sys_ppc64_personality(unsigned long personality); +#endif /* 
CONFIG_COMPAT */ +#endif /* CONFIG_PPC64 */ + +/* Parameters are reordered for powerpc to avoid padding */ +long sys_ppc_fadvise64_64(int fd, int advice, + u32 offset_high, u32 offset_low, + u32 len_high, u32 len_low); +long sys_swapcontext(struct ucontext __user *old_ctx, +struct ucontext __user *new_ctx, long ctx_size); +long sys_mmap(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, off_t offset); +long sys_mmap2(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, unsigned long pgoff); +long sys_switch_endian(void); -compat_ssize_t compat_sys_pread64(unsigned int fd, char __user *ubuf, compat_size_t count, - u32 reg6, u32 pos1, u32 pos2); +#ifdef CONFIG_PPC32 +long sys_sigreturn(void); +long sys_debug_setcontext(struct ucontext __user *ctx, int ndbg, + struct sig_dbg_op __user *dbg); +#endif -compat_ssize_t compat_sys_pwrite64(unsigned int fd, const char __user *ubuf, compat_size_t count, - u32 reg6, u32 pos1, u32 pos2); +long sys_rt_sigreturn(void); -compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u32 count); +long sys_subpage_prot(unsigned long addr, + unsigned long len, u32 __user *map); -int compat_sys_truncate64(const char __user *path, u32 reg4, - unsigned long len1, unsigned long len2); +#ifdef CONFIG_COMPAT +long compat_sys_swapcontext(struct ucontext32 __user *old_ctx, + struct ucontext32 __user *new_ctx, + int ctx_size); +long compat_sys_old_getrlimit(unsigned int resource, + struct compat_rlimit __user *rlim); +long compat_sys_sigreturn(void); +long compat_sys_rt_sigreturn(void); -int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, - unsigned long len2); +/* Architecture-specific implementations in sys_ppc32.c */ +long compat_sys_mmap2(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, unsigned long pgoff); +long compat_sys_ppc_pread64(unsigned int fd, + char __user 
*ubuf, compat_size_t count, + u32 reg6, u32 pos1, u32 pos2); +long compat_sys_ppc_pwrite64(unsigned int fd, +const char __user *ubuf, compat_size_t count, +u32 reg6, u32 pos1, u32 pos2); +long compat_sys_ppc_readahead(int f
[PATCH v3 00/18] powerpc: Syscall wrapper and register clearing
V2 available here: Link: https://lore.kernel.org/all/20220725062039.117425-1-rmcl...@linux.ibm.com/ Implement a syscall wrapper, causing arguments to handlers to be passed via a struct pt_regs on the stack. The syscall wrapper is implemented for all platforms other than the Cell processor, from which SPUs expect the ability to directly call syscall handler symbols with the regular in-register calling convention. Adopting syscall wrappers requires redefinition of architecture-specific syscalls and compatibility syscalls to use the SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE macros, as well as removal of direct references to the emitted syscall-handler symbols from within the kernel. This work led to the following modernisations of powerpc's syscall handlers:

- Replace syscall 82 semantics with sys_old_select and remove the ppc_select handler, which featured direct calls to both sys_old_select and sys_select.
- Use a generic fallocate compatibility syscall.

Replace the asm implementation of the syscall table with a C implementation for more compile-time checks. Many compatibility syscalls are candidates to be removed in favour of generically defined handlers, but exhibit different parameter orderings and numberings due to 32-bit ABI support for 64-bit parameters. The parameter reorderings are however consistent with arm. A future patch series will serve to modernise syscalls by providing generic implementations featuring these reorderings. The design of this syscall wrapper is very similar to the s390, x86 and arm64 implementations. See also Commit 4378a7d4be30 (arm64: implement syscall wrappers). The motivation for this change is that it allows for the clearing of register state when entering the kernel through interrupt handlers on 64-bit servers. This serves to reduce the influence of values in registers carried over from the interrupted process, e.g. syscall parameters from user space, or user state at the site of a pagefault.
All values in registers are saved and nullified (assigned to zero) at the entry to an interrupt handler and restored afterward. While this may sound like a heavy-weight mitigation, many gprs are already saved and restored on handling of an interrupt, and the mmap_bench benchmark on a Power 9 guest, which repeatedly invokes the pagefault handler, suggests at most a ~0.8% regression in performance. Realistic workloads are not constantly producing interrupts, and so this does not indicate a realistic slowdown. Using wrapped syscalls yields a performance improvement of ~5.6% on the null_syscall benchmark on pseries guests, by removing the need for system_call_exception to allocate its own stack frame. This amortises the additional costs of saving and restoring non-volatile registers (register clearing is cheap on superscalar platforms), and so the final mitigation actually yields a net performance improvement of ~0.6% on the null_syscall benchmark.

Patch Changelog:

- Rename NULLIFY_GPRS macros to ZEROIZE_GPRS
- Clear up entry_32.S with new macros
- Acknowledge system_call_exception move to syscall.c
- Save caller r3 for system calls in interrupt handlers rather than in system_call_exception
- Remove asmlinkage from arch/powerpc
- Rearrange patches, realign changes to their relevant patches

Rohan McLure (18):

powerpc: Remove asmlinkage from syscall handler definitions
powerpc: Use generic fallocate compatibility syscall
powerpc/32: Remove powerpc select specialisation
powerpc: Provide do_ppc64_personality helper
powerpc: Adopt SYSCALL_DEFINE for arch-specific syscall handlers
powerpc: Remove direct call to personality syscall handler
powerpc: Remove direct call to mmap2 syscall handlers
powerpc: Include all arch-specific syscall prototypes
powerpc: Enable compile-time check for syscall handlers
powerpc: Use common syscall handler type
powerpc: Add ZEROIZE_GPRS macros for register clears
Revert "powerpc/syscall: Save r3 in regs->orig_r3"
powerpc: Provide syscall wrapper
powerpc/64s: Clear/restore caller gprs in syscall interrupt/return
powerpc/64s: Use {ZEROIZE,SAVE,REST}_GPRS macros in sc, scv 0 handlers
powerpc/32: Clarify interrupt restores with REST_GPR macro in entry_32.S
powerpc/64s: Fix comment on interrupt handler prologue
powerpc/64s: Clear gprs on interrupt routine entry

arch/powerpc/Kconfig | 1 + arch/powerpc/include/asm/compat.h| 5 + arch/powerpc/include/asm/interrupt.h | 3 +- arch/powerpc/include/asm/ppc_asm.h | 22 +++ arch/powerpc/include/asm/syscall.h | 11 +- arch/powerpc/include/asm/syscall_wrapper.h | 94 arch/powerpc/include/asm/syscalls.h | 128 + .../ppc32.h => include/asm/syscalls_32.h}| 0 arch/powerpc/include/asm/unistd.h| 1 + arch/powerpc/kernel/entry_32.S | 42 +++--- arch/powerpc/kernel/exceptions-64s.S | 23 ++- arch/powerpc/
[PATCH v3 04/18] powerpc: Provide do_ppc64_personality helper
Avoid duplication in a future patch that will define the ppc64_personality syscall handler in terms of the SYSCALL_DEFINE and COMPAT_SYSCALL_DEFINE macros, by extracting the common body of ppc64_personality into a helper function. Signed-off-by: Rohan McLure --- V2 -> V3: New commit. --- arch/powerpc/kernel/syscalls.c | 6 +- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c index ef5896bee818..9f29e451e2de 100644 --- a/arch/powerpc/kernel/syscalls.c +++ b/arch/powerpc/kernel/syscalls.c @@ -64,7 +64,7 @@ SYSCALL_DEFINE6(mmap, unsigned long, addr, size_t, len, } #ifdef CONFIG_PPC64 -long ppc64_personality(unsigned long personality) +static inline long do_ppc64_personality(unsigned long personality) { long ret; @@ -76,6 +76,10 @@ long ppc64_personality(unsigned long personality) ret = (ret & ~PER_MASK) | PER_LINUX; return ret; } +long ppc64_personality(unsigned long personality) +{ + return do_ppc64_personality(personality); +} #endif long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, -- 2.34.1
[PATCH v3 01/18] powerpc: Remove asmlinkage from syscall handler definitions
The asmlinkage macro has no special meaning in powerpc, and prior to this patch was used sporadically on some syscall handler definitions. On architectures that do not define asmlinkage, it resolves to extern "C" for C++ compilers and a nop otherwise. The current invocations of asmlinkage provide far from complete support for C++ toolchains, and so the macro serves no purpose in powerpc. Remove all invocations of asmlinkage in arch/powerpc. These incidentally only occur in syscall definitions and prototypes. Signed-off-by: Rohan McLure --- V2 -> V3: new patch --- arch/powerpc/include/asm/syscalls.h | 16 arch/powerpc/kernel/sys_ppc32.c | 8 2 files changed, 12 insertions(+), 12 deletions(-) diff --git a/arch/powerpc/include/asm/syscalls.h b/arch/powerpc/include/asm/syscalls.h index a2b13e55254f..21c2faaa2957 100644 --- a/arch/powerpc/include/asm/syscalls.h +++ b/arch/powerpc/include/asm/syscalls.h @@ -10,14 +10,14 @@ struct rtas_args; -asmlinkage long sys_mmap(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, off_t offset); -asmlinkage long sys_mmap2(unsigned long addr, size_t len, - unsigned long prot, unsigned long flags, - unsigned long fd, unsigned long pgoff); -asmlinkage long ppc64_personality(unsigned long personality); -asmlinkage long sys_rtas(struct rtas_args __user *uargs); +long sys_mmap(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, off_t offset); +long sys_mmap2(unsigned long addr, size_t len, + unsigned long prot, unsigned long flags, + unsigned long fd, unsigned long pgoff); +long ppc64_personality(unsigned long personality); +long sys_rtas(struct rtas_args __user *uargs); int ppc_select(int n, fd_set __user *inp, fd_set __user *outp, fd_set __user *exp, struct __kernel_old_timeval __user *tvp); long ppc_fadvise64_64(int fd, int advice, u32 offset_high, u32 offset_low, diff --git a/arch/powerpc/kernel/sys_ppc32.c b/arch/powerpc/kernel/sys_ppc32.c index
16ff0399a257..f4edcc9489fb 100644 --- a/arch/powerpc/kernel/sys_ppc32.c +++ b/arch/powerpc/kernel/sys_ppc32.c @@ -85,20 +85,20 @@ compat_ssize_t compat_sys_readahead(int fd, u32 r4, u32 offset1, u32 offset2, u3 return ksys_readahead(fd, merge_64(offset1, offset2), count); } -asmlinkage int compat_sys_truncate64(const char __user * path, u32 reg4, +int compat_sys_truncate64(const char __user * path, u32 reg4, unsigned long len1, unsigned long len2) { return ksys_truncate(path, merge_64(len1, len2)); } -asmlinkage long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, +long compat_sys_fallocate(int fd, int mode, u32 offset1, u32 offset2, u32 len1, u32 len2) { return ksys_fallocate(fd, mode, ((loff_t)offset1 << 32) | offset2, merge_64(len1, len2)); } -asmlinkage int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, +int compat_sys_ftruncate64(unsigned int fd, u32 reg4, unsigned long len1, unsigned long len2) { return ksys_ftruncate(fd, merge_64(len1, len2)); @@ -111,7 +111,7 @@ long ppc32_fadvise64(int fd, u32 unused, u32 offset1, u32 offset2, advice); } -asmlinkage long compat_sys_sync_file_range2(int fd, unsigned int flags, +long compat_sys_sync_file_range2(int fd, unsigned int flags, unsigned offset1, unsigned offset2, unsigned nbytes1, unsigned nbytes2) { -- 2.34.1
[PATCH v3 12/18] Revert "powerpc/syscall: Save r3 in regs->orig_r3"
This reverts commit 8875f47b7681aa4e4484a9b612577b044725f839.

Save the caller's original r3 state to the kernel stack frame before
entering system_call_exception. This allows user registers to be cleared
by the time system_call_exception is entered, reducing the influence of
user registers on speculation within the kernel.

Prior to this commit, orig_r3 was saved at the beginning of
system_call_exception. Instead, save orig_r3 while the user value is
still live in r3.

Also replicate this early save in 32-bit. A similar save was removed in
6f76a01173cc ("powerpc/syscall: implement system call entry/exit logic
in C for PPC32") when 32-bit adopted system_call_exception. Revert its
removal of orig_r3 saves.

Signed-off-by: Rohan McLure
---
V2 -> V3: New commit.
---
 arch/powerpc/kernel/entry_32.S     | 1 +
 arch/powerpc/kernel/interrupt_64.S | 2 ++
 arch/powerpc/kernel/syscall.c      | 1 -
 3 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/entry_32.S b/arch/powerpc/kernel/entry_32.S
index 1d599df6f169..44dfce9a60c5 100644
--- a/arch/powerpc/kernel/entry_32.S
+++ b/arch/powerpc/kernel/entry_32.S
@@ -101,6 +101,7 @@ __kuep_unlock:
 
 	.globl	transfer_to_syscall
 transfer_to_syscall:
+	stw	r3, ORIG_GPR3(r1)
 	stw	r11, GPR1(r1)
 	stw	r11, 0(r1)
 	mflr	r12
diff --git a/arch/powerpc/kernel/interrupt_64.S b/arch/powerpc/kernel/interrupt_64.S
index ce25b28cf418..71d2d9497283 100644
--- a/arch/powerpc/kernel/interrupt_64.S
+++ b/arch/powerpc/kernel/interrupt_64.S
@@ -91,6 +91,7 @@ _ASM_NOKPROBE_SYMBOL(system_call_vectored_\name)
 	li	r11,\trapnr
 	std	r11,_TRAP(r1)
 	std	r12,_CCR(r1)
+	std	r3,ORIG_GPR3(r1)
 	addi	r10,r1,STACK_FRAME_OVERHEAD
 	ld	r11,exception_marker@toc(r2)
 	std	r11,-16(r10)		/* "regshere" marker */
@@ -275,6 +276,7 @@ END_BTB_FLUSH_SECTION
 	std	r10,_LINK(r1)
 	std	r11,_TRAP(r1)
 	std	r12,_CCR(r1)
+	std	r3,ORIG_GPR3(r1)
 	addi	r10,r1,STACK_FRAME_OVERHEAD
 	ld	r11,exception_marker@toc(r2)
 	std	r11,-16(r10)		/* "regshere" marker */
diff --git a/arch/powerpc/kernel/syscall.c b/arch/powerpc/kernel/syscall.c
index 81ace9e8b72b..64102a64fd84 100644
--- a/arch/powerpc/kernel/syscall.c
+++ b/arch/powerpc/kernel/syscall.c
@@ -25,7 +25,6 @@ notrace long system_call_exception(long r3, long r4, long r5,
 
 	kuap_lock();
 	add_random_kstack_offset();
-	regs->orig_gpr3 = r3;
 
 	if (IS_ENABLED(CONFIG_PPC_IRQ_SOFT_MASK_DEBUG))
 		BUG_ON(irq_soft_mask_return() != IRQS_ALL_DISABLED);
-- 
2.34.1
[PATCH v3 06/18] powerpc: Remove direct call to personality syscall handler
Syscall handlers should not be invoked internally by their symbol names,
as these symbols are defined by the architecture-defined SYSCALL_DEFINE
macro. Fortunately, in the case of ppc64_personality, its call to
sys_personality can be replaced with an invocation of the equivalent
ksys_personality inline helper in <linux/syscalls.h>.

Signed-off-by: Rohan McLure
---
V1 -> V2: Use inline helper to deduplicate bodies in compat/regular
implementations.
---
 arch/powerpc/kernel/syscalls.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/powerpc/kernel/syscalls.c b/arch/powerpc/kernel/syscalls.c
index e89a2176b2a3..e083935c5bf2 100644
--- a/arch/powerpc/kernel/syscalls.c
+++ b/arch/powerpc/kernel/syscalls.c
@@ -71,7 +71,7 @@ static inline long do_ppc64_personality(unsigned long personality)
 	if (personality(current->personality) == PER_LINUX32
 	    && personality(personality) == PER_LINUX)
 		personality = (personality & ~PER_MASK) | PER_LINUX32;
-	ret = sys_personality(personality);
+	ret = ksys_personality(personality);
 	if (personality(ret) == PER_LINUX32)
 		ret = (ret & ~PER_MASK) | PER_LINUX;
 	return ret;
-- 
2.34.1
Re: [PATCH v2 11/14] powerpc/64s: Clear/restore caller gprs in syscall interrupt/return
> On 11 Aug 2022, at 8:11 pm, Andrew Donnellan wrote:
>
> On Mon, 2022-07-25 at 16:31 +1000, Rohan McLure wrote:
>> Clear user state in gprs (assign to zero) to reduce the influence of
>> user registers on speculation within kernel syscall handlers. Clears
>> occur at the very beginning of the sc and scv 0 interrupt handlers,
>> with restores occurring following the execution of the syscall
>> handler.
>>
>> One function of syscall_exit_prepare is to determine when
>> non-volatile regs must be restored, and it still serves that purpose
>> on 32-bit. Use it now for determining where to find XER, CTR, CR.
>
> I'm not sure exactly how syscall_exit_prepare comes into this?

Apologies, this comment belongs in patch 14 and concerns
interrupt_exit_user_prepare.