Hello! v1? Yes: I intend to repost this with/without the last two patches depending on whether anyone thinks they are needed, and should be considered as part of this series, or separate.
This series started as a workaround for Neoverse-N1 #1349291, but has become an improvement in RAS error accounting for KVM on arm64. Neoverse-N1 affected by #1349291 may report an Uncontained RAS Errors as Unrecoverable. [0] This is the difference between killing the thread and killing the machine. The workaround is to treat all Unrecoverable SError as Uncontained. The arch code's SError handling already does this, due to its nascent kernel-first support. So only KVM needs some work as it has its own SError handling as we want KVM to handle guest:SError and the host to handle host:SError. Instead of working around the errata in KVM, we account SError as precisely as we can instead. This means moving the ESB-instruction into the guest-exit vectors, and deferring guest-entry if there is an SError pending. (so that the host's existing handling takes it). This is all good stuff, but it comes with the cost of a dsb in the world-switch code. It's the non-RAS non-VHE systems that will see this as costly. Benchmarked using kvm-ws-tests's do_hvc [1] on Seattle: | v5.2-rc1 mean:4339 stddev:33 | v5.2-rc1+patches1-4 mean:4476 stddev: 2 | with series 3.15% slower Patch 5 replaces this dsb with a nop if the system doesn't have v8.2 as these systems are unlikely to report errors in a way that we can handle. | 5.2-rc1+patches1-5 mean:4405 stddev:31 | with series 1.53% slower Patch 6 applies the same ISR_EL1 trick to avoid unmasking SError on guest-exit, which avoids a pstate-write and more system register reads. I'm aware 'vaxorcism' isn't an english word...) After all this: | v5.2-rc1+patches1-6 mean:4309 stddev:26 | with series 0.69% faster So for hardware that doesn't benefit from the extra work, we are back where we started. If the performance-game is valid, I intend to squash patch 5 into patch 3, and post patch 6 independently. I don't think patch 6 should be backported, but patch 5 would be fair game if its squashed in. Thanks, James [0] account-required: https://developer.arm.com/docs/sden885747/latest/arm-neoverse-n1-mp050-software-developer-errata-notice [1] https://git.kernel.org/pub/scm/linux/kernel/git/maz/kvm-ws-tests.git/ James Morse (6): KVM: arm64: Abstract the size of the HYP vectors pre-amble KVM: arm64: Consume pending SError as early as possible KVM: arm64: Defer guest entry when an asynchronous exception is pending arm64: Update silicon-errata.txt for Neoverse-N1 #1349291 KVM: arm64: nop out dsb in __guest_enter() unless we have v8.2 RAS KVM: arm64: Skip more of the SError vaxorcism Documentation/arm64/silicon-errata.txt | 1 + arch/arm64/include/asm/kvm_asm.h | 6 +++++ arch/arm64/kernel/traps.c | 4 ++++ arch/arm64/kvm/hyp/entry.S | 33 ++++++++++++++++++++------ arch/arm64/kvm/hyp/hyp-entry.S | 12 +++++++++- arch/arm64/kvm/va_layout.c | 7 +++--- 6 files changed, 51 insertions(+), 12 deletions(-) -- 2.20.1 _______________________________________________ kvmarm mailing list kvmarm@lists.cs.columbia.edu https://lists.cs.columbia.edu/mailman/listinfo/kvmarm