On Thu, Oct 7, 2021 at 4:32 PM Alex Bennée <alex.ben...@linaro.org> wrote:
>
> I came across a use-case this week for ARM although this may be also
> applicable to architectures where QEMU's emulation is ahead of the
> hardware currently widely available - for example if you want to
> exercise SVE code on AArch64. When the linux-user architecture is not
> the same as the host architecture then binfmt_misc works perfectly fine.
>
> However in the case you are running same-on-same you can't use
> binfmt_misc to redirect execution to using QEMU because any attempt to
> trap native binaries will cause your userspace to hang as binfmt_misc
> will be invoked to run the QEMU binary needed to run your application
> and a deadlock ensues.
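
To make the same-on-same problem concrete, here is a minimal Python sketch of magic/mask matching in the style binfmt_misc uses. The helpers `elf_header` and `matches`, the all-ones mask, and the three sample headers are assumptions made up for illustration; real rules (for example those written by QEMU's scripts/qemu-binfmt-conf.sh) use a looser mask, and this is not the kernel's code.

```python
import struct

EM_X86_64 = 62    # ELF e_machine value for x86-64
EM_AARCH64 = 183  # ELF e_machine value for AArch64


def elf_header(e_machine: int, e_type: int = 2) -> bytes:
    """Build the first 20 bytes of a 64-bit little-endian ELF header."""
    # e_ident: magic, ELFCLASS64, little-endian, version 1, System V ABI, padding
    ident = b"\x7fELF" + bytes([2, 1, 1, 0]) + bytes(8)
    # e_type (2 = ET_EXEC) and e_machine follow e_ident at offsets 16 and 18
    return ident + struct.pack("<HH", e_type, e_machine)


def matches(header: bytes, magic: bytes, mask: bytes) -> bool:
    """Return True if every masked header byte equals the masked magic byte."""
    if len(header) < len(magic):
        return False
    return all((h & m) == (g & m) for h, g, m in zip(header, magic, mask))


# Hypothetical rule: hand every AArch64 executable to qemu-aarch64.
AARCH64_MAGIC = elf_header(EM_AARCH64)
FULL_MASK = bytes([0xFF]) * len(AARCH64_MAGIC)  # simplified: compare all bytes

guest_app = elf_header(EM_AARCH64)    # the program we want emulated
qemu_on_x86 = elf_header(EM_X86_64)   # interpreter on a foreign-arch host
qemu_on_arm = elf_header(EM_AARCH64)  # interpreter on a same-arch host

for name, hdr in [("guest app", guest_app),
                  ("qemu on x86-64 host", qemu_on_x86),
                  ("qemu on AArch64 host", qemu_on_arm)]:
    print(f"{name}: rule matches = {matches(hdr, AARCH64_MAGIC, FULL_MASK)}")
```

On a foreign-arch host the interpreter does not match its own rule, so the lookup terminates. On a same-arch host the interpreter's header is indistinguishable from the guest's, so launching it triggers the same rule again.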

Can you clarify how the code would run in this case? Does qemu-user
still emulate every single instruction, both the compatible and the
incompatible ones, or is the idea here to run as much as possible
natively and only emulate the instructions that are not available
natively, using either SIGILL or searching through the object code
for those instructions?

> Trap execve in QEMU linux-user
> ------------------------------
>
> We could add a flag to QEMU so at the point of execve it manually
> invokes the new process with QEMU, passing on the flag to persist this
> behaviour.

This sounds like the obvious approach if you already do a full
instruction emulation. You'd still need to run the parent process
by calling qemu-user manually, but I suppose you need to do
something like this in any case.

> Add path mask to binfmt_misc
> ----------------------------
>
> The other option would be to extend binfmt_misc to have a path mask so
> it only applies it's alternative execution scheme to binaries in a
> particular section of the file-system (or maybe some sort of pattern?).

The main downside I see here is that it requires kernel modification,
so it would not work for old kernels.

> Are there any other approaches you could take? Which do you think has
> the most merit?

If we modify binfmt_misc in the kernel, it might be helpful to do it
by extending it with namespace support, so it could be constrained to
a single container without having to do the emulation outside.
Unfortunately that does not solve the problem of preventing the
qemu-user binary from triggering the binfmt_misc lookup.

      Arnd
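
For reference, the execve-trapping idea discussed above could be sketched roughly as follows. This is only a Python model of the argument rewriting, not QEMU code: the option name `--hypothetical-trap-execve`, the helper `wrap_execve`, and the interpreter path are all made up for illustration, and no such option is claimed to exist in QEMU.

```python
from typing import List

# Made-up option name: stands in for whatever flag would persist the behaviour.
PERSIST_FLAG = "--hypothetical-trap-execve"


def wrap_execve(qemu_path: str, target: str, argv: List[str]) -> List[str]:
    """Rewrite a guest execve(target, argv) so the child also runs under QEMU.

    The flag is passed along again, so the child's own execve calls get
    rewritten the same way. Preserving the guest's argv[0] is left out
    of this sketch.
    """
    return [qemu_path, PERSIST_FLAG, target, *argv[1:]]


# Assumed interpreter path, for illustration only.
new_argv = wrap_execve("/usr/bin/qemu-aarch64", "/bin/ls", ["ls", "-l"])
print(new_argv)
```

Because the redirection happens inside the emulator, no binfmt_misc rule is involved and the interpreter itself is started natively, which avoids the recursive lookup.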