On Thu, Oct 7, 2021 at 8:56 AM Alex Bennée <alex.ben...@linaro.org> wrote:
> Hi,
>
> I came across a use-case this week for ARM although this may be also
> applicable to architectures where QEMU's emulation is ahead of the
> hardware currently widely available - for example if you want to
> exercise SVE code on AArch64. When the linux-user architecture is not
> the same as the host architecture then binfmt_misc works perfectly fine.
>
> However in the case you are running same-on-same you can't use
> binfmt_misc to redirect execution to using QEMU because any attempt to
> trap native binaries will cause your userspace to hang as binfmt_misc
> will be invoked to run the QEMU binary needed to run your application
> and a deadlock ensues.
>
> There are some hacks you can apply at a local level like tweaking the
> elf header of the binaries you want to run under emulation and adjusting
> the binfmt_mask appropriately. This works but is messy and a faff to
> set-up.
>
> An ideal setup would be for the kernel to catch a SIGILL from a
> failing user space program and then to re-launch the process using QEMU
> with the old processes maps and execution state so it could continue.
> However I suspect there are enough moving parts to make this very
> fragile (e.g. what happens to the results of library feature probing
> code). So two approaches I can think of are:
>

32-bit arm had an 'eabi' section in ELF binaries. There it would have
been possible to look at that and make a decision before the binary
starts executing to see whether it should just run, or fork the
linux-user binary. It would take kernel changes, though.

> Trap execve in QEMU linux-user
> ------------------------------
>
> We could add a flag to QEMU so at the point of execve it manually
> invokes the new process with QEMU, passing on the flag to persist this
> behaviour.
>

The bsd-user code differs a little from linux-user in that it looks at
the binary being exec'd to determine what to do. It works OK, but isn't
really for this situation (we use it to optimize our package builds with
additional path processing for our mixed binary situation, where we have
native binaries execing emulated binaries that then exec native binaries
again). It is a bit of a hack, though, and I'm not completely happy with
it.

> Add path mask to binfmt_misc
> ----------------------------
>
> The other option would be to extend binfmt_misc to have a path mask so
> it only applies its alternative execution scheme to binaries in a
> particular section of the file-system (or maybe some sort of pattern?).
>
> Are there any other approaches you could take? Which do you think has
> the most merit?
>

In by-gone times, brandelf has been used for situations where you wanted
to run an ELF binary with one emulation that looks like another. But
that's also kernel hacks and also touching the local binary.

There's also the option of doing a VM86-like thing, which allowed people
to run 16-bit x86 binaries on 32-bit processors. There the system calls
would SEGV and you'd decode them inline, execute the emulation and move
the IP to execute the next instruction after the INT XX system call.

You could create a loader that knows how to load the binaries and catch
SIGILL and then emulate the new instructions on the old processor, but
that's somewhat different than how qemu user-mode works today. Knowing
that you'd need to do this is potentially hard, though. One could expand
the kernel to load in SIGILL handlers on-demand for programs that do
this, but that wouldn't work with old kernels and just feels weird...

Warner
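
To illustrate the "look at the binary being exec'd" idea mentioned above, here is a minimal Python sketch of such a decision. It is not the bsd-user code: the names elf_machine and choose_runner and the default emulator name are hypothetical, and it only reads e_machine from the ELF header. In the same-on-same case e_machine matches the host, so this check alone cannot tell the binaries apart, and some extra marker (a brand, a path, an ELF note) would still be needed.

```python
import struct

# e_machine values from the ELF specification
EM_ARM = 40
EM_X86_64 = 62
EM_AARCH64 = 183


def elf_machine(header: bytes):
    """Return e_machine from the start of an ELF file, or None if not ELF."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        return None
    # EI_DATA (offset 5): 1 = little-endian, 2 = big-endian
    if header[5] == 1:
        endian = "<"
    elif header[5] == 2:
        endian = ">"
    else:
        return None
    # e_type is at offset 16, e_machine at offset 18 (both 16-bit)
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return machine


def choose_runner(header: bytes, native_machine: int,
                  emulator: str = "qemu-aarch64") -> str:
    """Hypothetical policy: run natively unless e_machine is foreign."""
    machine = elf_machine(header)
    if machine is None or machine == native_machine:
        # Not ELF (e.g. a script) or already native: exec it directly.
        return "native"
    return emulator
```

A caller would read the first 20 bytes of the target file at execve time and pass them to choose_runner together with the host's own e_machine value.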