On 2021-01-20 at 21:09, Andy Lutomirski wrote:
This series fixes two regressions: a boot failure on AMD K7 and a
performance regression on everything.

I did a double-take here -- the regressions were reported by different
people, both named Krzysztof :)

Changes from v2:
  - Tidy up the if statements (Sean)
  - Changelog and comment improvements (Boris)

Changes from v1:
  - Fix MMX better -- MMX really does need FNINIT.
  - Improve the EFI code.
  - Rename the KFPU constants.
  - Changelog improvements.

Andy Lutomirski (4):
   x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize state
   x86/mmx: Use KFPU_387 for MMX string operations
   x86/fpu: Make the EFI FPU calling convention explicit
   x86/fpu/64: Don't FNINIT in kernel_fpu_begin()

Hi Andy,

I have tested the new patchset on the following CPUs running 5.4.90
(with some adjustments required for it to apply) and 5.10.9 kernels:
 - AMD Phenom(tm) II X3 B77 Processor (family: 0x10, model: 0x4, stepping: 0x3)
 - Intel(R) Xeon(R) CPU 3070  @ 2.66GHz (family: 0x6, model: 0xf, stepping: 0x6)
 - Intel(R) Xeon(R) CPU E3-1280 V2 @ 3.60GHz (family: 0x6, model: 0x3a, stepping: 0x9)

For all of them, it was possible to recover most of the performance lost
due to the introduction of "Reset MXCSR to default in kernel_fpu_begin":
 - B77: 90% instead of 82% for prefetch64-sse, 92% instead of 84% for generic_sse
 - 3070: 93% instead of 86% for prefetch64-sse, 93% instead of 88% for generic_sse
 - 1280v2: 99% instead of 88% for prefetch64-sse, 99% instead of 88% for generic_sse

With the patchset applied, 1280v2 (Ivy Bridge) for some reason sees
almost no remaining regression for prefetch64-sse and generic_sse. The
only issue is that AVX is still at 67% of its original performance,
although that is better than the 60% seen before. There is no AVX on
the other two CPUs.
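
To put the figures above on a common scale, here is a rough sketch that
computes how much of the lost performance each result wins back. It
assumes that every percentage is relative to a pre-regression baseline
of 100% and that the "instead of" value is the result without the
patchset. The "avx" label and the helper name recovered_share() are
made up for illustration; they are not benchmark or kernel names.

```python
# Figures copied from the report: (with patchset, without patchset),
# both as a percentage of the pre-regression baseline (assumed 100%).
results = {
    ("B77", "prefetch64-sse"): (90, 82),
    ("B77", "generic_sse"): (92, 84),
    ("3070", "prefetch64-sse"): (93, 86),
    ("3070", "generic_sse"): (93, 88),
    ("1280v2", "prefetch64-sse"): (99, 88),
    ("1280v2", "generic_sse"): (99, 88),
    ("1280v2", "avx"): (67, 60),  # "avx" is an illustrative label
}


def recovered_share(patched, unpatched, baseline=100):
    """Fraction of the lost performance that the patchset wins back."""
    lost = baseline - unpatched
    return (patched - unpatched) / lost


for (cpu, bench), (patched, unpatched) in results.items():
    share = recovered_share(patched, unpatched)
    print(f"{cpu:7s} {bench:15s} {unpatched}% -> {patched}%  "
          f"recovered {share:.0%} of the loss")
```

Measuring against the lost portion rather than against the baseline
makes results comparable across CPUs whose regressions differed in size.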

I was using 64-bit kernels for testing; please let me know if 32-bit
testing is also needed.

Tested-by: Krzysztof Piotr Olędzki <[email protected]>

Thanks,
 Krzysztof
