Approaches for same-on-same linux-user execve?

Alex Bennée Thu, 07 Oct 2021 07:55:02 -0700

Hi,

I came across a use-case this week for ARM, although this may also be
applicable to other architectures where QEMU's emulation is ahead of the
hardware currently widely available - for example if you want to
exercise SVE code on AArch64. When the linux-user architecture is not
the same as the host architecture then binfmt_misc works perfectly fine.


However, when you are running same-on-same you can't use binfmt_misc to
redirect execution to QEMU, because any attempt to trap native binaries
will cause your userspace to hang: the QEMU binary is itself a native
binary, so binfmt_misc will be invoked again to run the QEMU needed to
run your application, and a deadlock ensues.

There are some hacks you can apply at a local level, like tweaking the
ELF header of the binaries you want to run under emulation and adjusting
the binfmt_misc magic and mask to match. This works but is messy and a
faff to set-up.
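
As a rough sketch of that hack: binfmt_misc compares the start of the
file against a magic byte string under a mask, so a tweaked header byte
can tell "emulate me" binaries apart from native ones. Which byte to
tweak and what value to use are assumptions made up for this example
(TAG_OFFSET, TAG_VALUE), as is the interpreter path; the rest of the
magic and mask follow the usual AArch64 ELF pattern.

```python
# Illustrative only: TAG_OFFSET/TAG_VALUE are made-up choices for this sketch.
EM_AARCH64 = 0xB7
TAG_OFFSET = 15   # assumption: last e_ident padding byte carries the tag
TAG_VALUE = 0x51  # assumption: arbitrary marker value


def elf_header_prefix(tagged: bool) -> bytes:
    """First 20 bytes of a little-endian 64-bit AArch64 ELF executable."""
    ident = bytearray(16)
    ident[0:4] = b"\x7fELF"
    ident[4] = 2  # ELFCLASS64
    ident[5] = 1  # ELFDATA2LSB
    ident[6] = 1  # EV_CURRENT
    if tagged:
        ident[TAG_OFFSET] = TAG_VALUE
    e_type = (2).to_bytes(2, "little")  # ET_EXEC
    e_machine = EM_AARCH64.to_bytes(2, "little")
    return bytes(ident) + e_type + e_machine


def matches(header: bytes, magic: bytes, mask: bytes, offset: int = 0) -> bool:
    """Masked comparison: every masked bit of header must equal magic."""
    window = header[offset:offset + len(magic)]
    if len(window) < len(magic):
        return False
    return all((h ^ m) & k == 0 for h, m, k in zip(window, magic, mask))


def escape(data: bytes) -> str:
    """Render bytes as \\xNN escapes, as used in the register string."""
    return "".join(f"\\x{b:02x}" for b in data)


def register_line(name: str, magic: bytes, mask: bytes, interpreter: str) -> str:
    """Build a :name:type:offset:magic:mask:interpreter:flags entry."""
    return f":{name}:M:0:{escape(magic)}:{escape(mask)}:{interpreter}:"


MAGIC = elf_header_prefix(tagged=True)
# Ignore the OS ABI byte and the low bit of e_type (ET_EXEC vs ET_DYN).
MASK = bytes([0xFF] * 7 + [0x00] + [0xFF] * 8 + [0xFE, 0xFF, 0xFF, 0xFF])

if __name__ == "__main__":
    # Hypothetical interpreter path; the line would be written to
    # /proc/sys/fs/binfmt_misc/register by root.
    print(register_line("qemu-aarch64-tagged", MAGIC, MASK, "/usr/bin/qemu-aarch64"))
```

With this rule an untagged native binary (QEMU included) no longer
matches, so only the binaries you have edited get redirected.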

An ideal setup would be for the kernel to catch a SIGILL from a failing
user space program and then to re-launch the process using QEMU with the
old process's maps and execution state so it could continue. However I
suspect there are enough moving parts to make this very fragile (e.g.
what happens to the results of library feature probing code). So two
approaches I can think of are:

Trap execve in QEMU linux-user
------------------------------

We could add a flag to QEMU so at the point of execve it manually
invokes the new process with QEMU, passing on the flag to persist this
behaviour.
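
To make the idea concrete, here is a minimal sketch of the argument
rewriting such a flag would imply. It is not QEMU code: wrap_execve, the
QEMU path and the flag name are all hypothetical, and details such as
preserving the guest's argv[0] or handling "#!" scripts are glossed over.

```python
from typing import List, Sequence, Tuple


def wrap_execve(path: str, argv: Sequence[str],
                qemu_path: str, persist_flag: str) -> Tuple[str, List[str]]:
    """Return the (path, argv) pair to exec so the child also runs under QEMU.

    The guest's argv[0] is dropped here; QEMU is simply handed the target
    path followed by the remaining arguments. persist_flag is passed on so
    that the child QEMU keeps rewriting its own children's execve calls.
    """
    new_argv = [qemu_path, persist_flag, path] + list(argv[1:])
    return qemu_path, new_argv


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(wrap_execve("/bin/ls", ["ls", "-l"],
                      "/usr/bin/qemu-aarch64", "--hypothetical-persist-execve"))
```

Because the flag is re-inserted on every rewrite, the behaviour carries
through the whole process tree without any help from binfmt_misc.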


Add path mask to binfmt_misc
----------------------------

The other option would be to extend binfmt_misc to have a path mask so
it only applies its alternative execution scheme to binaries in a
particular section of the file-system (or maybe some sort of pattern?).
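
As a sketch of what such a check might look like (binfmt_misc has no
such feature today, so the function names and the example paths are
purely hypothetical), both a prefix match and a glob-style pattern are
shown:

```python
import fnmatch
import posixpath


def under_prefix(binary_path: str, prefix: str) -> bool:
    """True if binary_path lives inside the directory tree rooted at prefix.

    Both paths are assumed to be absolute. Comparing whole path components
    avoids treating /opt/rootfs2 as being inside /opt/rootfs.
    """
    prefix = posixpath.normpath(prefix)
    binary_path = posixpath.normpath(binary_path)
    return posixpath.commonpath([prefix, binary_path]) == prefix


def matches_pattern(binary_path: str, pattern: str) -> bool:
    """Glob-style alternative; note that '*' also matches across '/'."""
    return fnmatch.fnmatchcase(posixpath.normpath(binary_path), pattern)


if __name__ == "__main__":
    # Hypothetical layout: emulated binaries live under /opt/sve-rootfs,
    # while QEMU itself is installed outside of it.
    for candidate in ("/opt/sve-rootfs/bin/ls", "/usr/bin/qemu-aarch64"):
        print(candidate, under_prefix(candidate, "/opt/sve-rootfs"))
```

The point of the restriction is that the QEMU binary sits outside the
masked tree, so running it never re-triggers the rule.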

Are there any other approaches you could take? Which do you think has
the most merit?

Thanks,

-- 
Alex Bennée
