Hi,

I came across a use-case this week for ARM, although it may also apply to any architecture where QEMU's emulation is ahead of the hardware that is currently widely available, for example if you want to exercise SVE code on AArch64. When the linux-user architecture is not the same as the host architecture, binfmt_misc works perfectly fine.
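
For reference, the match binfmt_misc performs is roughly a masked comparison of the first bytes of a file against a registered magic. Below is a minimal Python sketch of that rule. The helper names (elf_header, binfmt_matches) are made up for illustration, and the AArch64 magic/mask bytes are what I recall from QEMU's scripts/qemu-binfmt-conf.sh, so treat them as assumptions rather than canonical values.

```python
# Sketch of a binfmt_misc-style magic/mask match (illustrative only).
# ASSUMPTION: magic/mask bytes recalled from QEMU's qemu-binfmt-conf.sh.
AARCH64_MAGIC = bytes.fromhex("7f454c46020101000000000000000000" "0200b700")
AARCH64_MASK = bytes.fromhex("ffffffffffffff00ffffffffffffffff" "feffffff")

EM_X86_64 = 0x3E   # ELF e_machine value for x86_64
EM_AARCH64 = 0xB7  # ELF e_machine value for AArch64


def elf_header(ei_class: int, e_type: int, e_machine: int) -> bytes:
    """Build the first 20 bytes of a little-endian ELF header (hypothetical helper)."""
    ident = bytes([0x7F, 0x45, 0x4C, 0x46, ei_class, 1, 1, 0]) + bytes(8)
    return ident + e_type.to_bytes(2, "little") + e_machine.to_bytes(2, "little")


def binfmt_matches(header: bytes, magic: bytes, mask: bytes, offset: int = 0) -> bool:
    """Return True if every masked byte of the header equals the masked magic."""
    window = header[offset:offset + len(magic)]
    if len(window) < len(magic):
        return False
    return all((h ^ g) & m == 0 for h, g, m in zip(window, magic, mask))


guest_app = elf_header(2, 3, EM_AARCH64)         # AArch64 PIE application
qemu_on_x86 = elf_header(2, 3, EM_X86_64)        # QEMU built for an x86_64 host
qemu_on_aarch64 = elf_header(2, 3, EM_AARCH64)   # QEMU built for an AArch64 host

print(binfmt_matches(guest_app, AARCH64_MAGIC, AARCH64_MASK))
print(binfmt_matches(qemu_on_x86, AARCH64_MAGIC, AARCH64_MASK))
print(binfmt_matches(qemu_on_aarch64, AARCH64_MAGIC, AARCH64_MASK))
```

On an x86_64 host only the guest binary matches, so the QEMU interpreter itself runs natively. On an AArch64 host the QEMU binary carries the same header fields as the application, so the same rule matches it too.
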
However, when you are running same-on-same you can't use binfmt_misc to redirect execution to QEMU: any attempt to trap native binaries will cause your userspace to hang, because binfmt_misc will also be invoked to run the QEMU binary needed to run your application, and a deadlock ensues.

There are some hacks you can apply at a local level, like tweaking the ELF header of the binaries you want to run under emulation and adjusting the binfmt_misc mask appropriately. This works but is messy and a faff to set up.

An ideal setup would be for the kernel to catch a SIGILL from a failing user-space program and then re-launch the process under QEMU with the old process's maps and execution state so it could continue. However, I suspect there are enough moving parts to make this very fragile (e.g. what happens to the results of library feature probing code?).

So the two approaches I can think of are:

Trap execve in QEMU linux-user
------------------------------

We could add a flag to QEMU so that at the point of execve it manually invokes the new process with QEMU, passing on the flag to persist this behaviour.

Add path mask to binfmt_misc
----------------------------

The other option would be to extend binfmt_misc with a path mask so that it only applies its alternative execution scheme to binaries in a particular section of the file-system (or maybe some sort of pattern?).

Are there any other approaches you could take? Which do you think has the most merit?

Thanks,

--
Alex Bennée