On Wed, Aug 17, 2016 at 12:13 PM, Andy Lutomirski <l...@amacapital.net> wrote:
>
> It wouldn't surprise me if that were easier said than done. popf
> potentially changes AC, and AC affects address translation.

Right. A lot of magical flags are in eflags, and popf *may* change them.

But that's why it's slow today. The popf microcode probably unconditionally serializes things exactly because "things may change".

And the interrupt flag actually *is* pretty special too, in some respects more so than AC (because it needs to serialize with any pending interrupts).

And the microcode probably already has code that says "let's handle the easy case quickly", where the easy case is "only the arithmetic flags changed". The arithmetic flags are special anyway, because they aren't actually physically in the same register any more, but are separately tracked and renamed etc.

But I'm sure Intel already treats IF specially in microcode, because IF is really special in other ways (VIF handling in vm86 mode, but also modern virtualization).

Yes, Intel people tend to be afraid of the microcode stuff, and generally don't touch it. But the good news about popf is that it isn't a serializing instruction, so it really *could* be optimized pretty aggressively.

And it does have to check for pending interrupts (and *clearing* IF in particular needs to make sure that there isn't some pending interrupt that the CPU is about to react to). So it's not trivial. But the "enable interrupts" case for popf is actually easier for hardware than the disable case from a serializing standpoint, and I suspect the ucode doesn't take advantage of that right now, and it's all just fairly unoptimized microcode.

                Linus
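
To make the "easy case" distinction concrete, here is a minimal sketch in C of the classification described above. It is purely illustrative and says nothing about how any real microcode is written: the bit positions are the architectural EFLAGS ones, but `classify_popf()` and the `popf_class` enum are hypothetical names invented for this example.

```c
#include <stdint.h>

/* Architectural EFLAGS bit positions. */
#define X86_EFLAGS_CF (1u << 0)   /* carry */
#define X86_EFLAGS_PF (1u << 2)   /* parity */
#define X86_EFLAGS_AF (1u << 4)   /* auxiliary carry */
#define X86_EFLAGS_ZF (1u << 6)   /* zero */
#define X86_EFLAGS_SF (1u << 7)   /* sign */
#define X86_EFLAGS_IF (1u << 9)   /* interrupt enable */
#define X86_EFLAGS_OF (1u << 11)  /* overflow */
#define X86_EFLAGS_AC (1u << 18)  /* alignment check / SMAP override */

/* The arithmetic flags, which hardware tracks and renames separately. */
#define ARITH_FLAGS (X86_EFLAGS_CF | X86_EFLAGS_PF | X86_EFLAGS_AF | \
                     X86_EFLAGS_ZF | X86_EFLAGS_SF | X86_EFLAGS_OF)

/* Hypothetical classification of what a popf would have to deal with. */
enum popf_class {
    POPF_ARITH_ONLY,  /* easy case: nothing outside the arithmetic flags changed */
    POPF_IF_SET,      /* interrupts being enabled: the easier IF direction */
    POPF_IF_CLEARED,  /* interrupts being disabled: must order against pending IRQs */
    POPF_OTHER        /* AC or anything else changed: conservative slow path */
};

static enum popf_class classify_popf(uint32_t old_flags, uint32_t new_flags)
{
    uint32_t changed = old_flags ^ new_flags;

    /* Only arithmetic flags differ (or nothing differs at all). */
    if ((changed & ~ARITH_FLAGS) == 0)
        return POPF_ARITH_ONLY;

    /* IF differs, possibly together with arithmetic flags, and nothing else. */
    if ((changed & ~(ARITH_FLAGS | X86_EFLAGS_IF)) == 0)
        return (new_flags & X86_EFLAGS_IF) ? POPF_IF_SET : POPF_IF_CLEARED;

    return POPF_OTHER;
}
```

For instance, going from 0x002 to 0x202 only sets IF and would land in `POPF_IF_SET`, while any transition that toggles AC falls through to `POPF_OTHER`.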