Stefan Fritsch <s...@openbsd.org> writes:
> On Tue, 9 Jan 2024, Dave Voutila wrote: > >> >> Stefan Fritsch <s...@openbsd.org> writes: >> >> > On 08.01.24 22:24, Alexander Bluhm wrote: >> >> Hi, >> >> When running a guest in vmm and doing ifconfig operations on vio >> >> interface, I can crash the guest. >> >> I run these loops in the guest: >> >> while doas ifconfig vio1 inet 10.188.234.74/24; do :; done >> >> while doas ifconfig vio1 -inet; do :; done >> >> while doas ifconfig vio1 down; do :; done >> >> And from host I ping the guest: >> >> ping -f 10.188.234.74 >> > >> > I suspect there is a race condition in vmd. The vio(4) kernel driver >> > resets the device and then frees all the mbufs from the tx and rx >> > rings. If vmd continues doing dma for a bit after the reset, this >> > could result in corruption. From this code in vmd's vionet.c >> > >> > case VIODEV_MSG_IO_WRITE: >> > /* Write IO: no reply needed */ >> > if (handle_io_write(&msg, dev) == 1) >> > virtio_assert_pic_irq(dev, 0); >> > break; >> > >> > it looks like the main vmd process will just send a pio write message >> > to the vionet process but does not wait for the vionet process to >> > actually execute the device reset. The pio write instruction in the >> > vcpu must complete after the device reset is complete. >> >> Are you saying we need to wait for the emulation of the OUT instruction >> that the vcpu is executing? I don't believe we should be blocking the >> vcpu here as that's not how port io works with real hardware. It makes >> no sense to block on an OUT until the device finishes emulation. >> >> I *do* think there could be something wrong in the device status >> register emulation, but blocking the vcpu on an OUT isn't the way to >> solve this. In fact, that's what previously happened before I split >> device emulation out into subprocesses...so if there's a bug in the >> emulation logic, it was hiding it. > > I am pretty sure that this is what qemu is doing with the OUT instruction. > This is the safe thing to do, because virtio 0.9 to 1.1 do not specify > exactly when the reset is complete. However, virtio 1.2 states: > > The driver SHOULD consider a driver-initiated reset complete when it > reads device status as 0. > > Linux reads the value back once after writing 0. It looks like FreeBSD's virtio pci does as well, using a busy loop like you're proposing. > > So, the virtio kernel driver should read the value back, too. What vmd > should do is debatable. Blocking the OUT instruction for the device reset > would be more robust, but that's not a strong opinion. > > @bluhm: Does the attached patch fix the panic? > > The fdt part is completely untested, testers welcome. > Diff reads fine to me. ok dv@, but I can't test the fdt part. > diff --git a/sys/dev/fdt/virtio_mmio.c b/sys/dev/fdt/virtio_mmio.c > index 4f1e9eba9b7..27fb17d6102 100644 > --- a/sys/dev/fdt/virtio_mmio.c > +++ b/sys/dev/fdt/virtio_mmio.c > @@ -200,11 +200,19 @@ virtio_mmio_set_status(struct virtio_softc *vsc, int > status) > struct virtio_mmio_softc *sc = (struct virtio_mmio_softc *)vsc; > int old = 0; > > - if (status != 0) > + if (status == 0) { > + bus_space_write_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_STATUS, > + 0); > + while (bus_space_read_4(sc->sc_iot, sc->sc_ioh, > + VIRTIO_MMIO_STATUS) != 0) { > + CPU_BUSY_CYCLE(); > + } > + } else { > old = bus_space_read_4(sc->sc_iot, sc->sc_ioh, > - VIRTIO_MMIO_STATUS); > - bus_space_write_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_STATUS, > - status|old); > + VIRTIO_MMIO_STATUS); > + bus_space_write_4(sc->sc_iot, sc->sc_ioh, VIRTIO_MMIO_STATUS, > + status|old); > + } > } > > int > diff --git a/sys/dev/pci/virtio_pci.c b/sys/dev/pci/virtio_pci.c > index 398dc960f6d..ef95c834823 100644 > --- a/sys/dev/pci/virtio_pci.c > +++ b/sys/dev/pci/virtio_pci.c > @@ -282,15 +282,29 @@ virtio_pci_set_status(struct virtio_softc *vsc, int > status) > int old = 0; > > if (sc->sc_sc.sc_version_1) { > - if (status != 0) > + if (status == 0) { > + CWRITE(sc, device_status, 0); > + while (CREAD(sc, device_status) != 0) { > + CPU_BUSY_CYCLE(); > + } > + } else { > old = CREAD(sc, device_status); > - CWRITE(sc, device_status, status|old); > + CWRITE(sc, device_status, status|old); > + } > } else { > - if (status != 0) > + if (status == 0) { > + bus_space_write_1(sc->sc_iot, sc->sc_ioh, > + VIRTIO_CONFIG_DEVICE_STATUS, status|old); > + while (bus_space_read_1(sc->sc_iot, sc->sc_ioh, > + VIRTIO_CONFIG_DEVICE_STATUS) != 0) { > + CPU_BUSY_CYCLE(); > + } > + } else { > old = bus_space_read_1(sc->sc_iot, sc->sc_ioh, > VIRTIO_CONFIG_DEVICE_STATUS); > - bus_space_write_1(sc->sc_iot, sc->sc_ioh, > - VIRTIO_CONFIG_DEVICE_STATUS, status|old); > + bus_space_write_1(sc->sc_iot, sc->sc_ioh, > + VIRTIO_CONFIG_DEVICE_STATUS, status|old); > + } > } > } >