On 4/30/26 10:47 AM, Teddy Astie wrote:
On 30/04/2026 at 10:51, Val Packett wrote:
On 4/30/26 5:11 AM, Teddy Astie wrote:
On 30/04/2026 at 06:06, Val Packett wrote:
[..]
I'd like to get some early feedback for this patch, particularly
the general stuff:
* is this whole thing acceptable in general?
* should it be extracted into a different file?
* (from the Xen side) any input on the xenstore keys, what goes where?
* anything else to keep in mind?
It does seem simple enough, so hopefully this can be done?
The corresponding userspace-side WIP is available at:
https://github.com/QubesOS/xen-vhost-frontend
And the required DMOP for firing the evtchn events will be sent
to xen-devel shortly as well.
Could that be done through evtchn_send (or its userland counterpart)?
Actually, yes… The use of DMOPs is only dictated by the current Linux
privcmd.c code (the irqfds created by the kernel react to events by
executing HYPERVISOR_dm_op with a stored operation). We can avoid the
need to modify Xen by simply extending the privcmd driver to make
"evtchn fds". Sounds good, will do.
Given that the event channel used by device models is exposed through
ioreq.vp_eport ("evtchn for notifications to/from device model"), I
don't think you need to expand the privcmd interface; you should be
able to do this instead:
open /dev/xen/evtchn
perform IOCTL_EVTCHN_BIND_INTERDOMAIN (for each guest vCPU)
with remote_domain=guest_domid, remote_port=ioreq.vp_eport
Then interact with the event channel through IOCTL_EVTCHN_NOTIFY (with
local port given by IOCTL_EVTCHN_BIND_INTERDOMAIN) and read/write on the
file descriptor.
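The steps above can be sketched as a few small C helpers. This is only a minimal sketch: the two ioctl definitions are copied from the Linux uapi header <xen/evtchn.h> so that the snippet compiles without Xen headers installed (real code should include the header instead), and the helper names (evtchn_open, evtchn_wait, and so on) are made up for illustration.

```c
#include <fcntl.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/ioctl.h>
#include <unistd.h>

/* Copied from the Linux uapi header <xen/evtchn.h> for a standalone build. */
struct ioctl_evtchn_bind_interdomain {
    unsigned int remote_domain, remote_port;
};
#define IOCTL_EVTCHN_BIND_INTERDOMAIN \
    _IOC(_IOC_NONE, 'E', 1, sizeof(struct ioctl_evtchn_bind_interdomain))

struct ioctl_evtchn_notify {
    unsigned int port;
};
#define IOCTL_EVTCHN_NOTIFY \
    _IOC(_IOC_NONE, 'E', 4, sizeof(struct ioctl_evtchn_notify))

/* Open the evtchn device; returns an fd, or -1 (e.g. when not running on Xen). */
int evtchn_open(void)
{
    return open("/dev/xen/evtchn", O_RDWR);
}

/* Bind a fresh local port to <domid, remote_port> (e.g. ioreq.vp_eport of
 * one guest vCPU). Returns the allocated local port, or -1 on error. */
int evtchn_bind_interdomain(int fd, unsigned int domid, unsigned int remote_port)
{
    struct ioctl_evtchn_bind_interdomain bind = {
        .remote_domain = domid,
        .remote_port = remote_port,
    };
    return ioctl(fd, IOCTL_EVTCHN_BIND_INTERDOMAIN, &bind);
}

/* Signal the remote end of a bound local port. Returns 0, or -1 on error. */
int evtchn_notify(int fd, unsigned int local_port)
{
    struct ioctl_evtchn_notify notify = { .port = local_port };
    return ioctl(fd, IOCTL_EVTCHN_NOTIFY, &notify);
}

/* Block until some bound port fires; returns that local port, or -1. */
int evtchn_wait(int fd)
{
    uint32_t port;
    if (read(fd, &port, sizeof(port)) != (ssize_t)sizeof(port))
        return -1;
    return (int)port;
}

/* Writing a port number back to the fd unmasks it so it can fire again. */
int evtchn_unmask(int fd, unsigned int local_port)
{
    uint32_t port = local_port;
    return write(fd, &port, sizeof(port)) == (ssize_t)sizeof(port) ? 0 : -1;
}
```

A device model would call evtchn_bind_interdomain() once per guest vCPU, then loop on evtchn_wait(), handle the request, call evtchn_unmask() on the port that fired, and evtchn_notify() the guest.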
So the reason there's currently an ioctl to bind an eventfd to fire a
stored DMOP is that the whole idea is to (efficiently!) support generic,
hypervisor-neutral device server implementations via the vhost-user
protocol.
Now of course, the current implementation isn't *entirely* hypervisor-
neutral as e.g. the vm-memory Rust crate (inside of the "neutral" vhost-
user device servers) does need to be built with the `xen` feature. But
still, that's how it works. What can be made generic is generic.
xen-vhost-frontend, which is the thing that integrates these with Xen,
actually used to handle the interrupts in userspace[1] by firing the
DMOP itself (which is where I could "just replace that with
IOCTL_EVTCHN_NOTIFY") but that was offloaded to the kernel with the
introduction of IOCTL_PRIVCMD_IRQFD[2], similarly to KVM_IRQFD.
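For comparison, handling the interrupt in userspace boils down to a relay: wait for the device server to signal an eventfd, then kick the guest. The sketch below is not the xen-vhost-frontend code; relay_once, kick_fn and count_kicks are hypothetical names, and the kick is left as a callback because it could be a DMOP, IOCTL_EVTCHN_NOTIFY, or anything else.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Hypothetical callback: interrupt the guest by whatever means is in use. */
typedef int (*kick_fn)(void *ctx);

/*
 * Drain a non-blocking eventfd and relay the pending signals as one kick.
 * Returns the number of coalesced signals, 0 if nothing was pending,
 * or -1 on error.
 */
long relay_once(int efd, kick_fn kick, void *ctx)
{
    uint64_t count;
    ssize_t n = read(efd, &count, sizeof(count));

    if (n < 0)
        return errno == EAGAIN ? 0 : -1;
    if (n != (ssize_t)sizeof(count))
        return -1;
    if (kick(ctx) < 0)
        return -1;
    return (long)count;
}

/* Stand-in kick for experimentation: just counts how often it was called. */
int count_kicks(void *ctx)
{
    ++*(int *)ctx;
    return 0;
}
```

Every interrupt in this model costs a wakeup of the relaying process plus a syscall for the kick, which is the overhead that moving the binding into the kernel removes.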
I think what would be preferable for your use case would be to have a
way to bind an event channel to an eventfd object, which should be a
primitive that lives in the evtchn device.
Yeah, it would be an ioctl on the evtchn device, definitely. I wasn't
being exact when I said "extend privcmd", sorry. I just meant "handling
it on the Linux side" generally!
The current interface kinda assumes that you're looking to implement a
fully emulated virtio device with no Xen specifics, but it looks like
that's not exactly what you're implementing.
It's already implemented, and I'm not looking to change it much, just to
make it work on x86_64. The only thing that wasn't already compatible
was firing the host-to-guest interrupt, because on x86_64 we don't have
anything like the (v)GIC with its massive arbitrary IRQ number space.
Event channels are the only way to interrupt a PVH guest, hence using
xenbus in the guest to provision the device.
As you actually plan to switch to using event channels for notifying the
guest, I think it would be preferable to do the same the other way
(event channels to notify the host) so you only have event channels to
worry about here.
The other direction is already implemented perfectly well in
IOCTL_PRIVCMD_IOEVENTFD. The MMIO area is set up like so:
- ioreq is mapped with
IOCTL_PRIVCMD_MMAP_RESOURCE(XENMEM_resource_ioreq_server, ..);
- vp_eport event channels (per cpu) are bound to the current domain via
IOCTL_EVTCHN_BIND_INTERDOMAIN;
- those are passed, along with the ioreq page itself, to
IOCTL_PRIVCMD_IOEVENTFD to get an eventfd that fires when a virtqueue is
ready;
- xen-vhost-frontend then passes that eventfd to the vhost-user
device server.
So for this direction, it's not a 1:1 mapping but rather a specific
contraption designed to efficiently handle this use case:
- when an ioreq event channel (for any of the vcpus) fires,
- the kernel handler (ioeventfd_interrupt) checks if it's specifically
an IOREQ_WRITE write to the VIRTIO_MMIO_QUEUE_NOTIFY offset,
- and if so, it signals the eventfd for any virtqueue that has new data
(waking the generic device server which has the eventfd, so bypassing
xen-vhost-frontend), pings the guest back via evtchn, and returns
IRQ_HANDLED;
- otherwise the request is handled in userspace by xen-vhost-frontend
(virtio configuration register access).
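The decision described in this list can be modelled in userspace roughly as follows. This is a simplified model, not the kernel's ioeventfd_interrupt: the struct layouts and names (model_ioreq, model_ioeventfd, dispatch_ioreq) are invented for illustration, the ioreq state machine and locking are left out, and the evtchn ping back to the guest is only noted in a comment.

```c
#include <stddef.h>
#include <stdint.h>
#include <sys/eventfd.h>

#define VIRTIO_MMIO_QUEUE_NOTIFY 0x050 /* QueueNotify register offset */
#define MODEL_IOREQ_WRITE 0            /* same value as Xen's IOREQ_WRITE */
#define MODEL_IOREQ_READ  1            /* same value as Xen's IOREQ_READ */

/* Simplified stand-in for the per-vCPU ioreq slot. */
struct model_ioreq {
    uint64_t addr; /* guest-physical address of the MMIO access */
    uint64_t data; /* value written; for QueueNotify, the queue index */
    uint32_t size;
    uint8_t  dir;
};

/* One registered (device, virtqueue) -> eventfd binding. */
struct model_ioeventfd {
    uint64_t base; /* MMIO base address of the virtio device */
    uint32_t vq;   /* virtqueue index this eventfd stands for */
    int      efd;
};

enum dispatch { DISPATCH_USERSPACE = 0, DISPATCH_EVENTFD = 1 };

/*
 * Fast path: a write to QueueNotify for a registered virtqueue signals the
 * matching eventfd directly (the real handler would also ping the guest
 * back via the event channel). Everything else goes to userspace.
 */
enum dispatch dispatch_ioreq(const struct model_ioreq *req,
                             const struct model_ioeventfd *tbl, size_t n)
{
    if (req->dir != MODEL_IOREQ_WRITE)
        return DISPATCH_USERSPACE;

    for (size_t i = 0; i < n; i++) {
        if (req->addr == tbl[i].base + VIRTIO_MMIO_QUEUE_NOTIFY &&
            req->data == tbl[i].vq) {
            if (eventfd_write(tbl[i].efd, 1) < 0)
                return DISPATCH_USERSPACE;
            return DISPATCH_EVENTFD;
        }
    }
    return DISPATCH_USERSPACE;
}
```

In this model a notify for queue 1 wakes only the holder of that queue's eventfd, while a write to any other register (for example the status register) falls through to the userspace path.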
It just works :)
~val