On 18/11/2025 13.02, Halil Pasic wrote:
On Tue, 18 Nov 2025 10:39:45 +0100
Thomas Huth <[email protected]> wrote:

Consider the following nested setup: An L1 host uses some virtio device
(e.g. virtio-keyboard) for the L2 guest, and this L2 guest passes this
device through to the L3 guest. Since the L3 guest sees a virtio device,
it might send virtio notifications to the QEMU in L2 for that device.

Hm, but conceptually the notification is sent to the virtio device,
regardless of hypervisors, right? But because for virtio-ccw the
notification is a DIAG 500, we have the usual cascade of intercept
handling. And because we have never considered this scenario up till now
the intercept handler in L2 QEMU gets called, because it is usually the
responsibility of L2 QEMU to emulate instructions for an L3 guest.

Right.

I think vfio-ccw pass through was supposed to be only about DASD.

Yes. And we only noticed this bug by accident - while trying to pass through a DASD device, the wrong device was used for VFIO and suddenly QEMU crashed.

But since the QEMU in L2 defined this device as vfio-ccw, the function
handle_virtio_ccw_notify() cannot handle this and crashes: It calls
virtio_ccw_get_vdev() that casts sch->driver_data into a VirtioCcwDevice,
but since "sch" belongs to a vfio-ccw device, that driver_data rather
points to a CcwDevice instead. So as soon as QEMU tries to use some
VirtioCcwDevice specific data from that device, we've lost.

We must not take virtio notifications for such devices. Thus fix the
issue by adding a check to the handle_virtio_ccw_notify() handler to
refuse all devices that are not our own virtio devices.
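To illustrate the kind of check described above, here is a minimal, self-contained sketch in C. It is not the actual QEMU code: the types, the "kind" tag and the function names (notify_sketch(), to_virtio_ccw()) are hypothetical stand-ins, and the -EINVAL return value is an assumption about the error convention. It only shows the idea of verifying the device type before downcasting driver_data.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define MAX_QUEUES 8

/* Hypothetical, simplified stand-ins; NOT QEMU's real definitions. */
enum dev_kind { DEV_KIND_VIRTIO, DEV_KIND_VFIO };

/* Base type shared by all CCW devices in this sketch. */
struct ccw_device {
    enum dev_kind kind;
};

/* Virtio-specific subtype; the base type is its first member. */
struct virtio_ccw_device {
    struct ccw_device parent;
    unsigned int num_queues;
    unsigned int notified[MAX_QUEUES];
};

struct subchannel {
    void *driver_data;   /* points at a ccw_device or one of its subtypes */
};

/* Downcast helper: only valid if the caller has checked the kind. */
static struct virtio_ccw_device *to_virtio_ccw(struct ccw_device *dev)
{
    assert(dev->kind == DEV_KIND_VIRTIO);
    return (struct virtio_ccw_device *)dev;
}

/*
 * Sketch of a notification handler: refuse every device that is not a
 * virtio device before touching any virtio-specific field.
 */
int notify_sketch(struct subchannel *sch, unsigned int queue)
{
    struct ccw_device *dev;
    struct virtio_ccw_device *vdev;

    if (sch == NULL || sch->driver_data == NULL) {
        return -EINVAL;
    }

    dev = sch->driver_data;
    if (dev->kind != DEV_KIND_VIRTIO) {
        /* e.g. a passed-through device: not ours, report an error */
        return -EINVAL;
    }

    vdev = to_virtio_ccw(dev);
    if (queue >= vdev->num_queues || queue >= MAX_QUEUES) {
        return -EINVAL;
    }

    vdev->notified[queue]++;   /* stand-in for the real queue notification */
    return 0;
}
```

With this shape, a notification for a subchannel whose driver_data is a plain ccw_device ends in an error return to the caller instead of an access to fields that do not exist.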

I'm on board with this patch! Virtio notifications are only supported
for virtio devices, and if a guest for whatever reason attempts
to do a virtio notification on a non-virtio device, that should be
handled accordingly. Which would be some sort of a program exception
I guess. Maybe you could add what kind of exception we end up
with to the commit message. I would guess specification exception.

But I would argue that the L3 guest didn't do anything wrong.

That's the point - the L3 guest just sees a virtio device, so we should not punish it with program exceptions just because it tried to send a notification for the device.

Pass-through of virtio-ccw devices is simply not properly implemented
yet. And even if we were to swallow that notification silently,
it would effectively be a loss of initiative I guess.

I think the current patch does the right thing: It returns an error value to the guest (just like we already do in other spots in this function), so the guest sees that error value and can then give up on using the device.

So I think it would really make sense to prevent passing through
virtio-ccw devices with vfio-ccw.

That could be a nice addition on top (in another patch), but we have to fix handle_virtio_ccw_notify() anyway so that the L3 guest cannot crash QEMU, so it's certainly not a replacement for this patch, I think.

 Thomas

