Hi Anders,

I'm not well versed in tuxrun, or in how to make it work with a qemu
binary outside of the container, so I'm not sure whether I'm comparing
apples to bananas.  Can you look and see if this fixes the kselftest
slowdown you reported?

Anyway, for a boot and shutdown of your rootfs, I see:

Before:

    11.13%  [.] aa64_va_parameters
     8.38%  [.] helper_lookup_tb_ptr
     7.37%  [.] pauth_computepac
     3.79%  [.] qht_lookup_custom

After:

     9.17%  [.] helper_lookup_tb_ptr
     8.05%  [.] pauth_computepac
     4.22%  [.] qht_lookup_custom
     3.68%  [.] pauth_addpac
     ...
     1.67%  [.] aa64_va_parameters

This is all due to the heavy use pauth makes of aa64_va_parameters.
It "only" needs 2 parameters, tsz and tbi, but tsz is probably the most
expensive part of aa64_va_parameters -- do anything about that and we
might as well cache the whole thing.

The change from struct+bitfields to uint32_t+FIELD is meant to combat
some really ugly code that gcc produced.  Seems like they should have
compiled to the same thing, more or less, but alas.

r~

Richard Henderson (4):
  target/arm: Flush only required tlbs for TCR_EL[12]
  target/arm: Store tbi for both insns and data in ARMVAParameters
  target/arm: Use FIELD for ARMVAParameters
  target/arm: Cache ARMVAParameters

 target/arm/cpu.h          |  30 +++++++
 target/arm/internals.h    |  21 +----
 target/arm/helper.c       | 177 ++++++++++++++++++++++++++++----------
 target/arm/pauth_helper.c |  39 +++++----
 target/arm/ptw.c          |  57 ++++++------
 5 files changed, 217 insertions(+), 107 deletions(-)

-- 
2.34.1
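
For readers unfamiliar with the uint32_t+FIELD idea mentioned above, here
is a minimal standalone sketch of packing two parameters into one 32-bit
word with explicit shift/mask accessors.  It is illustration only: the
field positions and widths, and the names field_ex32, field_dp32,
vap_pack, vap_tsz and vap_tbi, are all hypothetical and are not the
layout or the macros used in the series.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout, chosen for illustration only. */
#define VAP_TSZ_SHIFT 0
#define VAP_TSZ_LEN   8
#define VAP_TBI_SHIFT 8
#define VAP_TBI_LEN   2

/* Extract a field of 'len' bits starting at bit 'shift'. */
static inline uint32_t field_ex32(uint32_t v, unsigned shift, unsigned len)
{
    assert(len > 0 && len < 32 && shift + len <= 32);
    return (v >> shift) & ((1u << len) - 1);
}

/* Deposit 'val' into the field, leaving all other bits untouched. */
static inline uint32_t field_dp32(uint32_t v, unsigned shift, unsigned len,
                                  uint32_t val)
{
    assert(len > 0 && len < 32 && shift + len <= 32);
    uint32_t mask = ((1u << len) - 1) << shift;
    return (v & ~mask) | ((val << shift) & mask);
}

/* Pack both parameters into one word that is cheap to store and compare. */
static inline uint32_t vap_pack(uint32_t tsz, uint32_t tbi)
{
    uint32_t p = 0;
    p = field_dp32(p, VAP_TSZ_SHIFT, VAP_TSZ_LEN, tsz);
    p = field_dp32(p, VAP_TBI_SHIFT, VAP_TBI_LEN, tbi);
    return p;
}

static inline uint32_t vap_tsz(uint32_t p)
{
    return field_ex32(p, VAP_TSZ_SHIFT, VAP_TSZ_LEN);
}

static inline uint32_t vap_tbi(uint32_t p)
{
    return field_ex32(p, VAP_TBI_SHIFT, VAP_TBI_LEN);
}
```

With the whole parameter set held in a single integer, a cached copy can
be read back with one load and each accessor reduces to a shift and a
mask, which is the kind of code one would hope the compiler emits for
bitfields as well.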