Just ran into this issue again, with what I think may be the most compelling example yet of why this is problematic:
The tracee incurred a signal, we PTRACE_SYSEMU'd to the rt_sigreturn,
which the tracer tried to emulate by applying the state from the signal
frame. However, the PTRACE_SYSEMU stop is a syscall-stop, so the tracer's
write to x7 was ignored and x7 retained the value it had in the signal
handler, which broke the tracee.

Keno

On Sat, May 23, 2020 at 1:35 AM Keno Fischer <k...@juliacomputing.com> wrote:
>
> I got bitten by this again, so I decided to write up a simple example
> that shows the problem:
>
> https://gist.github.com/Keno/cde691b26e32373307fb7449ad305739
>
> This runs the same child twice. First vanilla where it prints "Hello world".
> The second time, using a textbook ptrace example, to only print "world".
> The problem here is that by the time the ptracer gets around to restoring
> the registers, it's no longer in a syscall stop, so the write to x7 does not
> get ignored and the correct value of x7 gets clobbered.
> I copied the syscall definition from musl, so the compiler thinks x7 is
> live, and we can see an assertion.
>
> Output on my machine (will depend on compiler version, etc.):
> ```
> $ gcc -g3 -O3 ptrace_lies.c
> $ ./a.out
> Hello World
> World
> a.out: ptrace_lies.c:49: do_child: Assertion `v3 == values[2]' failed.
> a.out: ptrace_lies.c:134: main: Assertion `WIFEXITED(status) &&
> WEXITSTATUS(status) == 0' failed.
> Aborted (core dumped)
> ```
>
> However, I don't think that whether or not the compiler thinks that x7 is
> live is the problem here. The problem is entirely that this mechanism
> prevents the ptracer from precisely controlling the register state. While
> basic ptracers don't need this feature (strace),
> more advanced ptracers (think criu, etc.) absolutely do want to precisely
> control what the register state is.
> The ptracer I'm working on (https://rr-project.org/)
> happens to be an extreme case of this, where it wants *bitwise* equivalent
> register states such that it can run the same code many times and get
> the exact same results.
>
> Also, if the issue was just that the kernel clobbered x7, that would be fine
> we could deal with that no problem. However, it's much worse than that,
> because the behavior of the kernel with respect to x7 depends on what
> kind of ptrace stop we're in and even worse, in some kinds of stop,
> there's absolutely no way to get at the actual value of x7.
>
> > Hmm, does that actually result in the SVC instruction getting inlined? I
> > think that's quite dangerous, since we document that we can trash the SVE
> > register state on a system call, for example. I'm also surprised that
> > the register variables are honoured by compilers if that inlining can occur.
>
> I haven't gotten to trying SVE yet, so I appreciate the warning :). That said,
> deterministic clobbering of registers is fine. Even changing the registers to
> random junk is fine. We're happy to read those registers through ptrace.
> The problem here is that the kernel lies about what the contents of the x7
> register is and discards any writes to it.
>
> I really hope we can come up with a solution here, I'm already dreading
> the next time I unexpectedly run into this and have to add yet
> another special case :(.
>
> Keno
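
For anyone trying to follow the two failure modes in this thread, here is a
toy model in C. It is a sketch only: it is not kernel code and makes no
ptrace calls. It assumes, as described above, that at a syscall-stop the
tracer reads a placeholder instead of the real x7 and that writes to x7 are
dropped, while at any other stop x7 reads and writes normally. The names
(struct tracee, tracer_get_x7, tracer_set_x7, X7_PLACEHOLDER) and the
placeholder value are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model, NOT kernel code. All names below are hypothetical. */
enum stop_kind { STOP_SYSCALL, STOP_OTHER };

struct tracee {
    uint64_t x7;          /* the tracee's real x7 */
    enum stop_kind stop;  /* kind of ptrace stop it is currently in */
};

/* Assumed stand-in for whatever the tracer sees in x7 at a syscall-stop. */
#define X7_PLACEHOLDER 0u

/* What the tracer reads: the real value is hidden at a syscall-stop. */
uint64_t tracer_get_x7(const struct tracee *t)
{
    return t->stop == STOP_SYSCALL ? X7_PLACEHOLDER : t->x7;
}

/* What the tracer writes: silently dropped at a syscall-stop. */
void tracer_set_x7(struct tracee *t, uint64_t v)
{
    if (t->stop != STOP_SYSCALL)
        t->x7 = v;
}

/* Case 1: emulating rt_sigreturn at a PTRACE_SYSEMU stop (a syscall-stop).
 * Returns the x7 the tracee resumes with. */
uint64_t emulate_sigreturn(uint64_t handler_x7, uint64_t frame_x7)
{
    struct tracee t = { .x7 = handler_x7, .stop = STOP_SYSCALL };
    tracer_set_x7(&t, frame_x7);   /* dropped, handler value survives */
    return t.x7;
}

/* Case 2: textbook save at a syscall-stop, restore at a later stop.
 * Returns the x7 the tracee resumes with. */
uint64_t save_then_restore(uint64_t live_x7)
{
    struct tracee t = { .x7 = live_x7, .stop = STOP_SYSCALL };
    uint64_t saved = tracer_get_x7(&t);  /* placeholder, not live_x7 */
    t.stop = STOP_OTHER;                 /* no longer in a syscall-stop */
    tracer_set_x7(&t, saved);            /* this write sticks and clobbers x7 */
    return t.x7;
}

/* Both cases leave the tracee with an x7 the tracer did not intend. */
void check_model(void)
{
    assert(emulate_sigreturn(0x1111, 0x2222) == 0x1111);
    assert(save_then_restore(0x1234) == X7_PLACEHOLDER);
}
```

In both cases the tracer does the obvious thing and the tracee still ends up
with the wrong x7, purely because of which kind of stop each access happened
in. Calling check_model() from a main() exercises both cases.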