From: Ard Biesheuvel <[email protected]>

Ryan reports that get_random_u16() is dominant in the performance
profiling of syscall entry when kstack randomization is enabled [0].

This is the reason many architectures rely on a counter instead, and
that, in turn, is the reason for the convoluted way the (pseudo-)entropy
is gathered and recorded in a per-CPU variable.

Let's try to make the get_random_uXX() fast path faster, and switch to
get_random_u8() so that we'll hit the slow path 2x less often. Then, wire
it up in the syscall entry path, replacing the per-CPU variable, making
the logic at syscall exit redundant.

[0] https://lore.kernel.org/all/[email protected]/

Cc: Kees Cook <[email protected]>
Cc: Ryan Roberts <[email protected]>
Cc: Will Deacon <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Jeremy Linton <[email protected]>
Cc: Catalin Marinas <[email protected]>
Cc: Mark Rutland <[email protected]>
Cc: Jason A. Donenfeld <[email protected]>

Ard Biesheuvel (6):
  hexagon: Wire up cmpxchg64_local() to generic implementation
  arc: Wire up cmpxchg64_local() to generic implementation
  random: Use u32 to keep track of batched entropy generation
  random: Use a lockless fast path for get_random_uXX()
  random: Plug race in preceding patch
  randomize_kstack: Use get_random_u8() at entry for entropy

 arch/Kconfig                       |  9 ++--
 arch/arc/include/asm/cmpxchg.h     |  3 ++
 arch/hexagon/include/asm/cmpxchg.h |  4 ++
 drivers/char/random.c              | 49 ++++++++++++++------
 include/linux/randomize_kstack.h   | 36 ++------------
 init/main.c                        |  1 -
 6 files changed, 49 insertions(+), 53 deletions(-)

base-commit: ac3fd01e4c1efce8f2c054cdeb2ddd2fc0fb150d
-- 
2.52.0.107.ga0afd4fd5b-goog
