On Tue, 9 Jan 2024, Dave Voutila wrote:

> 
> Stefan Fritsch <[email protected]> writes:
> 
> > On 08.01.24 22:24, Alexander Bluhm wrote:
> >> Hi,
> >> When running a guest in vmm and doing ifconfig operations on vio
> >> interface, I can crash the guest.
> >> I run these loops in the guest:
> >> while doas ifconfig vio1 inet 10.188.234.74/24; do :; done
> >> while doas ifconfig vio1 -inet; do :; done
> >> while doas ifconfig vio1 down; do :; done
> >> And from host I ping the guest:
> >> ping -f 10.188.234.74
> >
> > I suspect there is a race condition in vmd. The vio(4) kernel driver
> > resets the device and then frees all the mbufs from the tx and rx
> > rings. If vmd continues doing dma for a bit after the reset, this
> > could result in corruption. From this code in vmd's vionet.c
> >
> >                 case VIODEV_MSG_IO_WRITE:
> >                         /* Write IO: no reply needed */
> >                         if (handle_io_write(&msg, dev) == 1)
> >                                 virtio_assert_pic_irq(dev, 0);
> >                         break;
> >
> > it looks like the main vmd process will just send a pio write message
> > to the vionet process but does not wait for the vionet process to
> > actually execute the device reset. The pio write instruction in the
> > vcpu must complete after the device reset is complete.
> 
> Are you saying we need to wait for the emulation of the OUT instruction
> that the vcpu is executing? I don't believe we should be blocking the
> vcpu here as that's not how port io works with real hardware. It makes
> no sense to block on an OUT until the device finishes emulation.
> 
> I *do* think there could be something wrong in the device status
> register emulation, but blocking the vcpu on an OUT isn't the way to
> solve this. In fact, that's what previously happened before I split
> device emulation out into subprocesses...so if there's a bug in the
> emulation logic, it was hiding it.

I am pretty sure that this is what qemu is doing with the OUT instruction. 
This is the safe thing to do, because virtio 0.9 to 1.1 do not specify 
exactly when the reset is complete. However, virtio 1.2 states:

  The driver SHOULD consider a driver-initiated reset complete when it 
  reads device status as 0.

Linux reads the value back once after writing 0. 

So, the virtio kernel driver should read the value back, too. What vmd 
should do is debatable. Blocking the OUT instruction for the device reset 
would be more robust, but that's not a strong opinion.

@bluhm: Does the attached patch fix the panic? 

The fdt part is completely untested, testers welcome.

diff --git a/sys/dev/fdt/virtio_mmio.c b/sys/dev/fdt/virtio_mmio.c
index 4f1e9eba9b7..27fb17d6102 100644
--- a/sys/dev/fdt/virtio_mmio.c
+++ b/sys/dev/fdt/virtio_mmio.c
@@ -200,11 +200,19 @@ virtio_mmio_set_status(struct virtio_softc *vsc, int 
status)
        struct virtio_mmio_softc *sc = (struct virtio_mmio_softc *)vsc;
        int old = 0;
 
-       if (status != 0)
+       if (status == 0) {
+               bus_space_write_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_STATUS,
+                   0);
+               while (bus_space_read_4(sc->sc_iot, sc->sc_ioh,
+                   VIRTIO_MMIO_STATUS) != 0) {
+                       CPU_BUSY_CYCLE();
+               }
+       } else  {
                old = bus_space_read_4(sc->sc_iot, sc->sc_ioh,
-                                      VIRTIO_MMIO_STATUS);
-       bus_space_write_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_STATUS,
-                         status|old);
+                   VIRTIO_MMIO_STATUS);
+               bus_space_write_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_STATUS,
+                   status|old);
+       }
 }
 
 int
diff --git a/sys/dev/pci/virtio_pci.c b/sys/dev/pci/virtio_pci.c
index 398dc960f6d..ef95c834823 100644
--- a/sys/dev/pci/virtio_pci.c
+++ b/sys/dev/pci/virtio_pci.c
@@ -282,15 +282,29 @@ virtio_pci_set_status(struct virtio_softc *vsc, int 
status)
        int old = 0;
 
        if (sc->sc_sc.sc_version_1) {
-               if (status != 0)
+               if (status == 0) {
+                       CWRITE(sc, device_status, 0);
+                       while (CREAD(sc, device_status) != 0) {
+                               CPU_BUSY_CYCLE();
+                       }
+               } else {
                        old = CREAD(sc, device_status);
-               CWRITE(sc, device_status, status|old);
+                       CWRITE(sc, device_status, status|old);
+               }
        } else {
-               if (status != 0)
+               if (status == 0) {
+                       bus_space_write_1(sc->sc_iot, sc->sc_ioh,
+                           VIRTIO_CONFIG_DEVICE_STATUS, status|old);
+                       while (bus_space_read_1(sc->sc_iot, sc->sc_ioh,
+                           VIRTIO_CONFIG_DEVICE_STATUS) != 0) {
+                               CPU_BUSY_CYCLE();
+                       }
+               } else {
                        old = bus_space_read_1(sc->sc_iot, sc->sc_ioh,
                            VIRTIO_CONFIG_DEVICE_STATUS);
-               bus_space_write_1(sc->sc_iot, sc->sc_ioh,
-                   VIRTIO_CONFIG_DEVICE_STATUS, status|old);
+                       bus_space_write_1(sc->sc_iot, sc->sc_ioh,
+                           VIRTIO_CONFIG_DEVICE_STATUS, status|old);
+               }
        }
 }
 

Reply via email to