On Thu, Oct 07, 2021 at 03:32:19PM +0100, Alex Bennée wrote:
> Hi,
>
> I came across a use-case this week for ARM although this may be also
> applicable to architectures where QEMU's emulation is ahead of the
> hardware currently widely available - for example if you want to
> exercise SVE code on AArch64. When the linux-user architecture is not
> the same as the host architecture then binfmt_misc works perfectly fine.
>
> However in the case you are running same-on-same you can't use
> binfmt_misc to redirect execution to using QEMU because any attempt to
> trap native binaries will cause your userspace to hang as binfmt_misc
> will be invoked to run the QEMU binary needed to run your application
> and a deadlock ensues.
>
> There are some hacks you can apply at a local level like tweaking the
> elf header of the binaries you want to run under emulation and adjusting
> the binfmt_mask appropriately. This works but is messy and a faff to
> set-up.
>
> An ideal setup would be for the kernel to catch a SIGILL from a
> failing user space program and then to re-launch the process using QEMU
> with the old processes maps and execution state so it could continue.
> However I suspect there are enough moving parts to make this very
> fragile (e.g. what happens to the results of library feature probing
> code). So two approaches I can think of are:
>
> Trap execve in QEMU linux-user
> ------------------------------
>
> We could add a flag to QEMU so at the point of execve it manually
> invokes the new process with QEMU, passing on the flag to persist this
> behaviour.
>
>
> Add path mask to binfmt_misc
> ----------------------------
>
> The other option would be to extend binfmt_misc to have a path mask so
> it only applies its alternative execution scheme to binaries in a
> particular section of the file-system (or maybe some sort of pattern?).
>
> Are there any other approaches you could take? Which do you think has
> the most merit?

Could a new Linux personality flag be useful in combination with a new
flag in binfmt_misc? E.g. a flag "E" for binfmt_misc which indicates the
rule must only be applied if the process is execve()d with the
PER_USE_BINFMT personality set.

That would let you add a native match rule to binfmt_misc without it
affecting your system initially. To then run native binaries via
qemu-user you just need to set the personality() flag, and then only
that sub-process tree gets redirected.

Regards,
Daniel
-- 
|: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
|: https://libvirt.org         -o-            https://fstop138.berrange.com :|
|: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
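
To make the personality idea a little more concrete, here is a rough
sketch of what the launcher side might look like. It assumes a
PER_USE_BINFMT flag exists, which it does not in Linux today; the bit
value below is an arbitrary placeholder, and the helper names
with_binfmt_flag() and exec_with_flag() are made up for illustration.
Only personality(2) and execvp() are real interfaces here.

```python
import ctypes
import os

# HYPOTHETICAL: Linux has no PER_USE_BINFMT flag; this bit value is an
# arbitrary placeholder for the flag proposed in this mail.
PER_USE_BINFMT = 0x0010000

# personality(0xffffffff) returns the current persona without changing it.
PERSONALITY_QUERY = 0xFFFFFFFF


def with_binfmt_flag(persona):
    """Return persona with the hypothetical PER_USE_BINFMT bit set."""
    return (persona | PER_USE_BINFMT) & 0xFFFFFFFF


def exec_with_flag(argv):
    """Set the flag on this process, then replace it with argv via execvp."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.personality.argtypes = [ctypes.c_ulong]
    libc.personality.restype = ctypes.c_int

    # Read the current persona so existing bits are preserved.
    current = libc.personality(PERSONALITY_QUERY)
    if current == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

    if libc.personality(with_binfmt_flag(current)) == -1:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))

    # The persona is kept across execve() and inherited by child
    # processes, so only this sub-process tree would carry the flag.
    os.execvp(argv[0], argv)
```

Something like exec_with_flag(["/bin/ls", "-l"]) would then play the
same role that setarch plays for the existing personality flags. On
current kernels the placeholder bit has no defined meaning, so this only
shows the shape of the launcher, not working redirection.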