Hello folks,
here are the details of the AVX-512 story I mentioned during the last
hangout. It is only tangentially related to our discussion of FPU
context switching in HelenOS.
"Supporting AVX-512 increases the amount of data you need to store
across context switches, which increases per-thread overhead. Apple took
an innovative approach to this - disable AVX-512 by default, wait for a
thread to hit an illegal instruction, enable AVX-512 for that thread,
and replay the instruction, so you only take the overhead for threads
that *use* AVX-512. But that works badly for apps that follow Intel's
guide to detecting whether AVX-512 is available."
https://nondeterministic.computer/@mjg59/109824790414027030
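To make the detection problem concrete, here is a minimal sketch of the check that Intel's guide describes, as far as I understand it: the application looks at CPUID and also at XCR0 (read via XGETBV) to see whether the OS has enabled the AVX-512 state components. The function name and its parameters are my own, and the real inputs would come from CPUID and XGETBV rather than being passed in. The assumption here is that a lazy scheme leaves the AVX-512 bits of XCR0 clear until the thread's first trap, so the check reports "not available" and the application never issues the instruction that would trigger the enabling.

```c
#include <stdbool.h>
#include <stdint.h>

/* XCR0 state-component bits as defined in the Intel SDM. */
#define XCR0_SSE        (1u << 1)  /* XMM registers */
#define XCR0_AVX        (1u << 2)  /* upper halves of YMM registers */
#define XCR0_OPMASK     (1u << 5)  /* K0-7 */
#define XCR0_ZMM_HI256  (1u << 6)  /* upper halves of ZMM0-15 */
#define XCR0_HI16_ZMM   (1u << 7)  /* ZMM16-31 */

#define XCR0_AVX512_MASK \
    (XCR0_SSE | XCR0_AVX | XCR0_OPMASK | XCR0_ZMM_HI256 | XCR0_HI16_ZMM)

/*
 * Hypothetical helper modelling the user-space detection sequence.
 * cpuid_osxsave: CPUID.1:ECX.OSXSAVE[bit 27]
 * cpuid_avx512f: CPUID.(EAX=7,ECX=0):EBX.AVX512F[bit 16]
 * xcr0:          value that XGETBV would return for ECX=0
 */
bool avx512f_usable(bool cpuid_osxsave, bool cpuid_avx512f, uint64_t xcr0)
{
    if (!cpuid_osxsave)
        return false;  /* XGETBV itself is not usable */
    if ((xcr0 & XCR0_AVX512_MASK) != XCR0_AVX512_MASK)
        return false;  /* OS has not enabled the AVX-512 state */
    return cpuid_avx512f;
}
```

With an XCR0 of 0xE7 the function returns true; with 0x07 (only x87, SSE and AVX enabled, as a not-yet-promoted thread would presumably see) it returns false even though the CPU supports AVX-512.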
"Apple took this a step further by trying to optimise for whether they
needed to restore the AVX-512 registers by simply checking whether
ZMM0-31 were all 0 or not. Unfortunately it's legitimate to have ZMM0-31
all be 0 and still have state in K0-7, which then blows up"
https://nondeterministic.computer/@mjg59/109824804903603481
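The second problem can be illustrated with a small model. This is not Apple's code, just a sketch of the kind of check the post describes; the structure and function names are made up for illustration. A check that looks only at ZMM0-31 misses a thread whose vector registers are all zero but whose mask registers K0-7 still hold state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Simplified model of a thread's AVX-512 register state. */
struct avx512_state {
    uint64_t zmm[32][8];  /* ZMM0-31, 512 bits each */
    uint64_t k[8];        /* opmask registers K0-7 */
};

static bool zmm_all_zero(const struct avx512_state *s)
{
    for (int i = 0; i < 32; i++)
        for (int j = 0; j < 8; j++)
            if (s->zmm[i][j] != 0)
                return false;
    return true;
}

static bool k_all_zero(const struct avx512_state *s)
{
    for (int i = 0; i < 8; i++)
        if (s->k[i] != 0)
            return false;
    return true;
}

/* The shortcut as described in the post: only ZMM0-31 are inspected. */
bool needs_restore_zmm_only(const struct avx512_state *s)
{
    return !zmm_all_zero(s);
}

/* A check that also accounts for live state in K0-7. */
bool needs_restore_full(const struct avx512_state *s)
{
    return !zmm_all_zero(s) || !k_all_zero(s);
}
```

For a state with all ZMM registers zero and, say, K1 set to 0xFF, needs_restore_zmm_only() reports that nothing has to be restored while needs_restore_full() does not, which is exactly the case where the mask state would get lost.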
M.D.
_______________________________________________
HelenOS-devel mailing list
http://lists.modry.cz/listinfo/helenos-devel