On Thu, Mar 26, 2026, at 2:44 AM, Yi Lai wrote:
> The existing 'sysret_rip' selftest asserts that 'regs->r11 ==
> regs->flags'. This check relies on the behavior of the SYSCALL
> instruction on legacy x86_64, which saves 'RFLAGS' into 'R11'.
>
> However, on systems with FRED (Flexible Return and Event Delivery)
> enabled, instead of using registers, all state is saved onto the stack.
> Consequently, 'R11' retains its userspace value, causing the assertion
> to fail.
>
> Fix this by detecting if FRED is enabled and skipping the register
> assertion in that case. The detection is done by checking if the RPL
> bits of the GS selector are preserved after a hardware exception.
> IDT (via IRET) clears the RPL bits of NULL selectors, while FRED (via
> ERETU) preserves them.
>

I don't really like this.  I think we have two credible choices:

1. Define the Linux ABI to be that, on FRED systems, SYSCALL preserves R11 and 
RCX on entry and exit.  And update the test to actually test this.

2. Define the Linux ABI to be what it has been for quite a few years: SYSCALL 
entry copies RFLAGS to R11 and RIP to RCX and SYSCALL exit preserves all 
registers.

I'm in favor of #2.  People love making new programming languages and runtimes 
and inline asm and, these days, vibe coded crap.  And it's *easier* to emit a 
SYSCALL and forget to tell the compiler / code generator that RCX and R11 are 
clobbered than it is to remember that they're clobbered.  And it's easy to test 
on FRED (well, not really, but it hopefully will be some day) and it's easy to 
publish one's code, and then everyone is a bit screwed when the resulting 
program crashes sometimes on non-FRED systems.  And it will be miserable to 
debug.

(It's *really* *really* easy to screw this up in a way that sort of works even 
on non-FRED: RCX and R11 are usually clobbered across function calls, so one 
can get into a situation in which one's generated code usually doesn't require 
that SYSCALL preserve one of these registers until an inlining decision changes 
or some code gets reordered, and then it will start failing.  And making the 
failure depend on hardware details is just nasty.)

So I think we should add the ~2 lines of code to fix the SYSCALL entry on FRED 
to match non-FRED.

--Andy
