Hi,

This patch adds a single plugin, contrib/plugins/dlcall.c (~240 lines,
no changes to QEMU core), that lets a linux-user guest call functions in the
host's native shared libraries instead of emulating them.

It is the natural next step on top of the vCPU syscall-filter callback that I
contributed and that was merged earlier:

  https://lore.kernel.org/qemu-devel/[email protected]/

Why bother? Because it turns slow, instruction-by-instruction emulation of a
library into a native host call. Some results, all on completely unmodified
guest binaries:

  * minizip (the stock zlib utility) compresses several times faster, because
    the actual deflate runs natively on the host instead of being translated.
  * Real OpenGL/Vulkan games run under qemu-user: SuperTuxKart and Hollow
    Knight are playable, with their graphics calls going straight to the host
    GPU.

You can watch the demos build, run, and report timings in CI, without checking
anything out:

  https://github.com/rover2024/qemu-passthrough-test/actions/runs/27671747420

How it works
============

The guest makes a system call with a reserved number (4096 by default) that no
real Linux ABI uses. Its first argument selects a pass-through operation; the
rest carry operands:

  syscall(4096, op, arg1, arg2, ...)
          |     |    \............ operands (pointers / values)
          |     \................. which pass-through operation
          \....................... the reserved "magic" number

The plugin registers a vCPU syscall filter: before QEMU forwards a syscall to
the host kernel, the filter runs, sees 4096, performs the operation on the
host, writes the result back, and tells QEMU the syscall is consumed, so the
real kernel never sees it.
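
To make the control flow concrete, here is a minimal standalone model of that
filter decision. It is a sketch, not the code in dlcall.c: it does not use the
QEMU plugin API at all, and the function name, the op numbering, the attribute
value and the -ENOSYS result for an unknown op are all assumptions made for
illustration. Only the magic number 4096 and the "consume or forward" behaviour
come from the description above.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DLCALL_SYSCALL_NUM 4096   /* default magic number from the text */

/* Hypothetical op codes; the real plugin's numbering may differ. */
enum dlcall_op {
    DLCALL_OP_QUERY = 0,          /* "query a host attribute" */
};

#define DLCALL_MODEL_ATTR 1       /* hypothetical attribute value */

/*
 * Model of the filter decision.  args[0] is the op, args[1..] are operands.
 * Returns true when the syscall was consumed (result stored in *sysret),
 * false when it should be forwarded to the host kernel untouched.
 */
static bool dlcall_filter_model(int64_t num, const uint64_t args[],
                                int64_t *sysret)
{
    assert(args != NULL && sysret != NULL);

    if (num != DLCALL_SYSCALL_NUM) {
        return false;             /* ordinary syscall: not ours */
    }

    switch (args[0]) {
    case DLCALL_OP_QUERY:
        *sysret = DLCALL_MODEL_ATTR;
        break;
    default:
        *sysret = -ENOSYS;        /* assumed result for an unknown op */
        break;
    }
    return true;                  /* consumed: the kernel never sees it */
}
```

Calling dlcall_filter_model(4096, args, &ret) with args[0] == DLCALL_OP_QUERY
consumes the call and sets ret, while any other syscall number returns false
and leaves ret alone.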

The whole interface is just a handful of primitives:

  * query a host attribute
  * dlopen / dlclose a host shared library
  * dlsym a symbol, and read the last dlerror
  * invoke a resolved host function with a void(void *, void *) signature

That is all the plugin does. It knows nothing about zlib, X11 or OpenGL, or
about any library's calling convention.
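
As an illustration of how a single void(void *, void *) signature can front an
arbitrary library function, here is a small sketch of a host-side thunk. It
wraps strlen purely as a stand-in; the struct layout, the names, and the
convention that the first pointer carries packed arguments and the second
receives the result are my assumptions for this example, not something the
plugin prescribes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The one signature the plugin is said to invoke. */
typedef void (*dlcall_fn)(void *, void *);

/* Hypothetical argument block agreed between guest and host thunks. */
struct strlen_args {
    const char *s;
};

/*
 * Host-side thunk: unpack the argument block, call the real function with
 * its native calling convention, and store the result through 'ret'.
 */
static void thunk_strlen(void *args, void *ret)
{
    const struct strlen_args *a = args;

    assert(a != NULL && ret != NULL);
    *(size_t *)ret = strlen(a->s);
}

/* Stand-in for the "invoke" primitive: it only knows the generic signature. */
static void dlcall_invoke_model(dlcall_fn fn, void *args, void *ret)
{
    fn(args, ret);
}
```

With struct strlen_args a = { "deflate" } and a size_t n, the call
dlcall_invoke_model(thunk_strlen, &a, &n) stores 7 in n. The invoking side
never needs to know the wrapped function's prototype, so all of that knowledge
stays in the thunk.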

The same machinery also runs in reverse: when a host function needs to call
back into the guest (a qsort comparator, an allocator, a GUI or game callback),
control re-enters the guest to run the callback and then resumes the suspended
host call. This reentry is what lets stateful, callback-driven APIs work, not
just leaf functions.
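
The shape of that reentry can be shown with a single-process model. This is
only a sketch of the control flow: here the "guest" comparator is an ordinary
C function and the "reentry" is a plain indirect call, whereas the real
implementation has to resume guest execution and later return to the
suspended host call. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef int (*guest_cmp_fn)(const void *, const void *);

/* Callback registered for the host call currently in progress.  A single
 * global keeps the sketch short; it is not reentrant or thread-safe. */
static guest_cmp_fn current_guest_cmp;

/* Host-side trampoline handed to qsort.  In the real design this is the
 * point where control would re-enter the guest to run the callback. */
static int reenter_guest_cmp(const void *a, const void *b)
{
    assert(current_guest_cmp != NULL);
    return current_guest_cmp(a, b);
}

/* Host call that needs a guest callback: it stays suspended inside qsort
 * while each comparison runs "in the guest", then resumes. */
static void host_sort_ints(int *v, size_t n, guest_cmp_fn cmp)
{
    current_guest_cmp = cmp;
    qsort(v, n, sizeof *v, reenter_guest_cmp);
    current_guest_cmp = NULL;
}

/* Stand-in for a comparator that lives in guest code. */
static int guest_cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;

    return (x > y) - (x < y);
}
```

Calling host_sort_ints(v, 3, guest_cmp_int) on { 3, 1, 2 } leaves v sorted in
ascending order, with every comparison routed through the trampoline.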

Why the plugin belongs in QEMU, and the rest does not
=====================================================

Only the plugin lives in the tree. Everything else is ordinary userspace:

  --- userspace (out of tree, not tied to any DBT) -------------
      guest: unmodified program  ->  guest runtime + thunk libs
  --------------------------------------------------------------
                 |  syscall(4096, op, args)   (only crossing point)
                 v
  === inside QEMU: THIS PATCH, ~240 lines ======================
      dlcall plugin:  dlopen / dlsym / invoke a host fn
  ==============================================================
                 |
                 v
  --- userspace (out of tree) ----------------------------------
      host: host runtime + thunk libs  ->  real libz / libGL ...
  --------------------------------------------------------------

A complete reference implementation, with the minizip and OpenGL/X11 examples
above, is here:

  https://github.com/rover2024/qemu-passthrough-test

The split is deliberate, and it is why only this one file is proposed for the
tree:

  * This plugin defines the most general interaction interface for native
    pass-through: the magic-syscall ABI between an emulated guest and its
    emulator. That contract is what every pass-through implementation builds
    on, so it belongs in a stable, shared place.
  * It is also the only piece that is inherently QEMU-specific: it plugs into
    QEMU's syscall-filter hook and runs inside the QEMU process. The argument
    marshalling, calling conventions, callbacks/reentry and per-library
    coverage are not tied to any particular DBT and behave as ordinary
    userspace, so they should stay out of tree rather than couple QEMU to them.

Background: we presented this approach at KVM Forum 2025, "Lorelei: Enable QEMU
to Leverage Native Shared Libraries":

  https://www.youtube.com/watch?v=_jioQFm7wyU

A note on automation
====================

The userspace thunks in that reference implementation are currently
hand-written rather than generated by the LLVM-based toolchain from the talk.
That is a deliberate choice for a demo: the automated toolchain pulls in a full
LLVM installation, which adds substantial setup time, and the example projects
are slow to build and cannot be reduced to a single Makefile. Hand-writing the
thunks was the cheaper path to a self-contained, reproducible demo, and it was
already enough to get minizip working end to end.

For real, large-scale use I will rely on the automated toolchain. The key point
is that hand-written vs. generated thunks is entirely decoupled from this plugin
and from the magic-syscall interface it defines: the toolchain only emits
out-of-tree userspace code and never touches the in-tree plugin. Automation
becomes mandatory for complex, callback-heavy targets such as the OpenGL/Vulkan
games, which is the direction this work is heading next.

The plugin itself is fully opt-in (loaded with -plugin) and targets
linux-user, where the guest and host already share a trust domain. The test
cases use x86_64 guests and run on x86_64, arm64 and riscv64 Linux hosts.

The changes to the documentation (docs/about/emulation.rst) will be sent as a
separate patch.

Feedback on the plugin and on the pass-through approach is welcome.

Changes since v2:

  * Dropped the RFC tag; the approach was positively received on v2.
  * Rebased on master and adjusted the syscall-filter callback to its updated
    signature (int64_t sysret, added userdata, dropped the plugin id
    argument).

Changes since v1:

  * Renamed the plugin from "passthrough" to "dlcall" (Pierrick Bouvier).
    The old name was too generic; "dlcall" reflects what the plugin actually
    does (dlopen/dlsym a host symbol and call it) and avoids confusion with
    QEMU's existing plugin hostcall concept (QEMU_PLUGIN_*_HOSTCALL).
  * Made the magic syscall number configurable at load time via the
    "syscall_num=N" argument, defaulting to 4096 and rejecting values low
    enough to clash with a real syscall (Pierrick Bouvier).

v1: https://lore.kernel.org/qemu-devel/[email protected]/
v2: https://lore.kernel.org/qemu-devel/[email protected]/

Thanks,
Ziyang Zhang

Ziyang Zhang (1):
  contrib/plugins: add a minimal dlcall plugin

 contrib/plugins/dlcall.c    | 238 ++++++++++++++++++++++++++++++++++++
 contrib/plugins/meson.build |   5 +
 2 files changed, 243 insertions(+)
 create mode 100644 contrib/plugins/dlcall.c

-- 
2.34.1

