CVSROOT:	/cvs
Module name:	src
Changes by:	guent...@cvs.openbsd.org	2018/06/05 00:39:11

Modified files:
	sys/arch/amd64/amd64: acpi_machdep.c cpu.c fpu.c genassym.cf
	                      ipifuncs.c locore.S machdep.c mptramp.S
	                      process_machdep.c trap.c vector.S via.c
	                      vm_machdep.c vmm.c
	sys/arch/amd64/include: codepatch.h cpu.h fpu.h intrdefs.h pcb.h
	                        proc.h specialreg.h

Log message:
Switch from lazy FPU switching to semi-eager FPU switching: track
whether curproc's xstate ("extended state") is loaded in the CPU or not.
 - context switch, sendsig(), vmm, and doing CPU crypto in the kernel
   all check the flag and, if set, save the old thread's state to the
   PCB, clear the flag, and then load the _blank_ state
 - when returning to userspace, if the flag is clear then set it and
   restore the thread's state

This simpler tracking also fixes the restoring of FPU state after nested
signal handlers.

With this, %cr0's TS flag is never set, the FPU #DNA trap can no longer
happen, and IPIs are no longer necessary for flushing or syncing FPU
state; on the other hand, restoring xstate while returning to userspace
means we have to handle xrstor faulting if we could be loading an
altered state.  If that happens, reset the state, fake a #GP fault
(SIGBUS), and recheck for ASTs.

While here, regularize fxsave/fxrstor vs xsave/xrstor handling, by using
codepatching to switch to xsave/xrstor when present in the CPU.  In
addition, code patch in use of xsaveopt in most places when the CPU
supports that.  Use the 64bit-wide variants of the instructions in all
cases so that x87 instruction fault IPs are reported correctly.

This change has three motivations:
 1) with modern clang, SSE registers are used even in rcrt0.o, making
    lazy FPU switching a smaller benefit vs trap costs
 2) the Intel SDM warns that lazy FPU switching may increase power costs
 3) post-Spectre rumors suggest that the %cr0 TS flag might not block
    speculation, permitting leaking of information about FPU state
    (AES keys?) across protection boundaries.

tested by many in snaps; prodding from deraadt@
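
The flag tracking described in the log message can be illustrated with a small userspace model. This is only a sketch under stated assumptions, not the committed kernel code: every name here (struct xstate, struct thread, struct cpu, fpu_kernel_enter(), fpu_return_to_user(), context_switch(), run_demo()) is hypothetical, the "registers" are a plain byte array, and the xrstor fault path and the codepatched save/restore instructions are not modeled.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the kernel's types or names. */
struct xstate { unsigned char regs[16]; };

struct thread {
	struct xstate saved;		/* per-thread save area ("PCB") */
};

struct cpu {
	struct xstate live;		/* simulated register contents */
	int xstate_loaded;		/* is curproc's xstate in the CPU? */
	struct thread *curproc;
};

static const struct xstate blank_state = { { 0 } };

/*
 * Kernel-side paths (context switch, signal delivery, in-kernel
 * crypto): if the flag is set, save to the PCB, clear the flag,
 * and load the blank state.
 */
static void
fpu_kernel_enter(struct cpu *ci)
{
	if (ci->xstate_loaded) {
		ci->curproc->saved = ci->live;
		ci->xstate_loaded = 0;
		ci->live = blank_state;
	}
}

/* Return to userspace: if the flag is clear, set it and restore. */
static void
fpu_return_to_user(struct cpu *ci)
{
	if (!ci->xstate_loaded) {
		ci->xstate_loaded = 1;
		ci->live = ci->curproc->saved;
	}
}

static void
context_switch(struct cpu *ci, struct thread *next)
{
	fpu_kernel_enter(ci);
	ci->curproc = next;
}

/* Walks two threads through the model; returns 0 if all checks hold. */
static int
run_demo(void)
{
	struct thread a, b;
	struct cpu ci;

	memset(&a.saved, 0xAA, sizeof(a.saved));
	memset(&b.saved, 0xBB, sizeof(b.saved));
	ci.live = blank_state;
	ci.xstate_loaded = 0;
	ci.curproc = &a;

	fpu_return_to_user(&ci);	/* a's state is now live */
	ci.live.regs[0] = 0x11;		/* "userspace" changes a register */

	context_switch(&ci, &b);	/* saves a, leaves blank state */
	assert(a.saved.regs[0] == 0x11);
	assert(ci.xstate_loaded == 0 && ci.live.regs[1] == 0);

	fpu_return_to_user(&ci);	/* b's state is restored lazily */
	assert(ci.live.regs[0] == 0xBB);

	fpu_kernel_enter(&ci);		/* e.g. in-kernel crypto */
	fpu_kernel_enter(&ci);		/* second call is a no-op */
	assert(b.saved.regs[0] == 0xBB && ci.live.regs[0] == 0);
	return 0;
}
```

In this model the state is restored only on the way back to userspace, so a thread that is switched away from and back to without ever leaving the kernel pays for at most one save and no restore. That is the sense in which the scheme is "semi-eager" rather than fully eager.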