* Ingo Molnar <[email protected]> wrote:
> Is there a testcase for the SkyLake 16-deep-call-stack problem that I could run?
> Is there a description of the exact speculative execution vulnerability that has
> to be addressed to begin with?
Ok, so for now I'm assuming that this is the 16-entry return-stack-buffer (RSB)
underflow condition where SkyLake falls back to the branch predictor (while other
CPUs wrap the buffer).
> If this approach is workable I'd much prefer it to any MSR writes in the syscall
> entry path not just because it's fast enough in practice to not be turned off by
> everyone, but also because everyone would agree that per function call overhead
> needs to go away on new CPUs. Both deployment and backporting is also _much_ more
> flexible, simpler, faster and more complete than microcode/firmware or compiler
> based solutions.
>
> Assuming the vulnerability can be addressed via this route that is, which is a big
> assumption!
So I talked this over with PeterZ, and I think it's all doable:
- the CALL __fentry__ callbacks maintain the depth tracking (on the kernel
  stack, fast to access), and issue an "RSB-stuffing sequence" when depth reaches
  16 entries.
- "the RSB-stuffing sequence" is a return trampoline that pushes a CALL on the
  stack which is executed on the RET.
- All asynchronous contexts (IRQs, NMIs, etc.) stuff the RSB before IRET. (The
  tracking could probably be made IRQ and maybe even NMI safe, but the worst-case
  nesting scenarios make my head ache.)
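
To make the depth-tracking idea above a bit more concrete, here is a toy
user-space model in C. It is only a sketch under my own assumptions, not kernel
code and not a statement about how real hardware behaves: the RSB is modeled as
a plain count of valid entries, the "stuffing" is modeled as refilling that
count when the frame at a multiple-of-16 depth returns, and the names
(rsb_model, model_call, model_ret, simulate_underflows) are all hypothetical.

```c
#include <assert.h>

#define RSB_ENTRIES 16  /* assumed RSB size, per the 16-entry discussion */

/* Hypothetical toy state: software call depth plus a modeled RSB fill level. */
struct rsb_model {
    int call_depth;  /* what the __fentry__-style hook would track */
    int rsb_valid;   /* number of valid entries in the modeled RSB */
    int underflows;  /* RETs that found the modeled RSB empty */
};

/* Model a CALL: depth grows, the RSB gains an entry until it saturates. */
static void model_call(struct rsb_model *m)
{
    m->call_depth++;
    if (m->rsb_valid < RSB_ENTRIES)
        m->rsb_valid++;  /* beyond 16, the oldest entry is simply lost */
}

/* Model a RET: consume an RSB entry, or count an underflow if none is left. */
static void model_ret(struct rsb_model *m, int tracking)
{
    assert(m->call_depth > 0);

    if (m->rsb_valid > 0)
        m->rsb_valid--;
    else
        m->underflows++;  /* the case the scheme is meant to avoid */

    /*
     * Assumption of this sketch: the frame entered at a multiple of 16
     * returns through the trampoline, which refills the RSB.
     */
    if (tracking && m->call_depth % RSB_ENTRIES == 0)
        m->rsb_valid = RSB_ENTRIES;

    m->call_depth--;
}

/* Call 'depth' levels deep, unwind fully, report modeled underflows. */
int simulate_underflows(int depth, int tracking)
{
    struct rsb_model m = { 0, 0, 0 };
    int i;

    for (i = 0; i < depth; i++)
        model_call(&m);
    for (i = 0; i < depth; i++)
        model_ret(&m, tracking);

    return m.underflows;
}
```

In this model, simulate_underflows(20, 0) counts 4 underflowing returns, while
simulate_underflows(20, 1) counts none, because the refill happens before the
valid entries run out. IRQ/NMI nesting is deliberately not modeled.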
I.e. IBRS can be mostly replaced with a kernel based solution that is better than
IBRS and which does not negatively impact any other non-SkyLake CPUs or general
code quality.
I.e. a full upstream Spectre solution.
Thanks,
Ingo