Stefan Fritsch <[email protected]> writes:
> On 08.01.24 22:24, Alexander Bluhm wrote: >> Hi, >> When running a guest in vmm and doing ifconfig operations on vio >> interface, I can crash the guest. >> I run these loops in the guest: >> while doas ifconfig vio1 inet 10.188.234.74/24; do :; done >> while doas ifconfig vio1 -inet; do :; done >> while doas ifconfig vio1 down; do :; done >> And from host I ping the guest: >> ping -f 10.188.234.74 > > I suspect there is a race condition in vmd. The vio(4) kernel driver > resets the device and then frees all the mbufs from the tx and rx > rings. If vmd continues doing dma for a bit after the reset, this > could result in corruption. From this code in vmd's vionet.c > > case VIODEV_MSG_IO_WRITE: > /* Write IO: no reply needed */ > if (handle_io_write(&msg, dev) == 1) > virtio_assert_pic_irq(dev, 0); > break; > > it looks like the main vmd process will just send a pio write message > to the vionet process but does not wait for the vionet process to > actually execute the device reset. The pio write instruction in the > vcpu must complete after the device reset is complete. Are you saying we need to wait for the emulation of the OUT instruction that the vcpu is executing? I don't believe we should be blocking the vcpu here as that's not how port io works with real hardware. It makes no sense to block on an OUT until the device finishes emulation. I *do* think there could be something wrong in the device status register emulation, but blocking the vcpu on an OUT isn't the way to solve this. In fact, that's what previously happened before I split device emulation out into subprocesses...so if there's a bug in the emulation logic, it was hiding it. > > I could not reproduce this issue with kvm/qemu. > Thanks! > >> Then I see various kind of mbuf corruption: >> kernel: protection fault trap, code=0 >> Stopped at pool_do_put+0xc9: movq 0x8(%rcx),%rcx >> ddb> trace >> pool_do_put(ffffffff82519e30,fffffd807db89000) at pool_do_put+0xc9 >> pool_put(ffffffff82519e30,fffffd807db89000) at pool_put+0x53 >> m_extfree(fffffd807d330300) at m_extfree+0xa5 >> m_free(fffffd807d330300) at m_free+0x97 >> soreceive(fffffd806f33ac88,0,ffff80002a3e97f8,0,0,ffff80002a3e9724,76299c799030 >> 1bf1) at soreceive+0xa3e >> soo_read(fffffd807ed4a168,ffff80002a3e97f8,0) at soo_read+0x4a >> dofilereadv(ffff80002a399548,7,ffff80002a3e97f8,0,ffff80002a3e98c0) at >> dofilere >> adv+0x143 >> sys_read(ffff80002a399548,ffff80002a3e9870,ffff80002a3e98c0) at sys_read+0x55 >> syscall(ffff80002a3e9930) at syscall+0x33a >> Xsyscall() at Xsyscall+0x128 >> end of kernel >> end trace frame: 0x7469f8836930, count: -10 >> pool_do_put(ffffffff8259a500,fffffd807e7fa800) at pool_do_put+0xc9 >> pool_put(ffffffff8259a500,fffffd807e7fa800) at pool_put+0x53 >> m_extfree(fffffd807f838a00) at m_extfree+0xa5 >> m_free(fffffd807f838a00) at m_free+0x97 >> m_freem(fffffd807f838a00) at m_freem+0x38 >> vio_txeof(ffff800000030118) at vio_txeof+0x11d >> vio_tx_intr(ffff800000030118) at vio_tx_intr+0x31 >> virtio_check_vqs(ffff800000024800) at virtio_check_vqs+0x102 >> virtio_pci_legacy_intr(ffff800000024800) at virtio_pci_legacy_intr+0x65 >> intr_handler(ffff80002a52dae0,ffff800000081000) at intr_handler+0x3c >> Xintr_legacy5_untramp() at Xintr_legacy5_untramp+0x1a3 >> Xspllower() at Xspllower+0x1d >> vio_ioctl(ffff8000000822a8,80206910,ffff80002a52dd00) at vio_ioctl+0x16a >> ifioctl(fffffd807c0ba7a0,80206910,ffff80002a52dd00,ffff80002a41c810) at >> ifioctl >> +0x721 >> sys_ioctl(ffff80002a41c810,ffff80002a52de00,ffff80002a52de50) at >> sys_ioctl+0x2a >> b >> syscall(ffff80002a52dec0) at syscall+0x33a >> Xsyscall() at Xsyscall+0x128 >> end of kernel >> end trace frame: 0x7b3d36d55eb0, count: -17 >> panic: pool_do_get: mcl2k free list modified: page >> 0xfffffd80068bd000; item add >> r 0xfffffd80068bf800; offset 0x0=0xa0000 != 0x83dcdb591c6b8bf >> Stopped at db_enter+0x14: popq %rbp >> TID PID UID PRFLAGS PFLAGS CPU COMMAND >> *143851 19121 0 0x3 0 0 ifconfig >> db_enter() at db_enter+0x14 >> panic(ffffffff8206e651) at panic+0xb5 >> pool_do_get(ffffffff824a1b30,2,ffff80002a4a55d4) at pool_do_get+0x320 >> pool_get(ffffffff824a1b30,2) at pool_get+0x7d >> m_clget(fffffd807c4e4f00,2,800) at m_clget+0x18d >> rtm_msg1(e,ffff80002a4a56f0) at rtm_msg1+0xde >> rtm_ifchg(ffff8000000822a8) at rtm_ifchg+0x65 >> if_down(ffff8000000822a8) at if_down+0xa4 >> ifioctl(fffffd8006898978,80206910,ffff80002a4a58c0,ffff80002a474ff0) at >> ifioctl >> +0xcd5 >> sys_ioctl(ffff80002a474ff0,ffff80002a4a59c0,ffff80002a4a5a10) at >> sys_ioctl+0x2a >> b >> syscall(ffff80002a4a5a80) at syscall+0x33a >> Xsyscall() at Xsyscall+0x128 >> end of kernel >> end trace frame: 0x7f6c22492130, count: 3 >> OpenBSD 7.4-current (GENERIC) #3213: Mon Jan 8 22:05:58 CET 2024 >> >> [email protected]:/home/bluhm/openbsd/cvs/src/sys/arch/amd64/compile/GENERIC*master >> real mem = 2130706432 (2032MB) >> avail mem = 2046525440 (1951MB) >> random: boothowto does not indicate good seed >> mpath0 at root >> scsibus0 at mpath0: 256 targets >> mainbus0 at root >> bios0 at mainbus0 >> acpi at bios0 not configured >> cpu0 at mainbus0: (uniprocessor) >> cpu0: Intel(R) Core(TM) i7-3520M CPU @ 2.90GHz, 2893.78 MHz, 06-3a-09 >> cpu0: >> FPU,VME,DE,PSE,TSC,MSR,PAE,CX8,SEP,PGE,CMOV,PAT,PSE36,CFLUSH,MMX,FXSR,SSE,SSE2,SSE3,PCLMUL,SSSE3,CX16,SSE4.1,SSE4.2,POPCNT,AES,XSAVE,AVX,F16C,RDRAND,HV,NXE,LONG,LAHF,ITSC,FSGSBASE,SMEP,ERMS,MD_CLEAR,MELTDOWN >> cpu0: 32KB 64b/line 8-way D-cache, 32KB 64b/line 8-way I-cache, 256KB >> 64b/line 8-way L2 cache, 4MB 64b/line 16-way L3 cache >> cpu0: smt 0, core 0, package 0 >> cpu0: using VERW MDS workaround >> pvbus0 at mainbus0: OpenBSD >> pvclock0 at pvbus0 >> pci0 at mainbus0 bus 0 >> pchb0 at pci0 dev 0 function 0 "OpenBSD VMM Host" rev 0x00 >> virtio0 at pci0 dev 1 function 0 "Qumranet Virtio RNG" rev 0x00 >> viornd0 at virtio0 >> virtio0: irq 3 >> virtio1 at pci0 dev 2 function 0 "Qumranet Virtio Network" rev 0x00 >> vio0 at virtio1: address 70:5f:ca:21:8d:74 >> virtio1: irq 5 >> virtio2 at pci0 dev 3 function 0 "Qumranet Virtio Network" rev 0x00 >> vio1 at virtio2: address 70:5f:ca:21:8d:84 >> virtio2: irq 6 >> virtio3 at pci0 dev 4 function 0 "Qumranet Virtio Network" rev 0x00 >> vio2 at virtio3: address 70:5f:ca:21:8d:94 >> virtio3: irq 7 >> virtio4 at pci0 dev 5 function 0 "Qumranet Virtio Storage" rev 0x00 >> vioblk0 at virtio4 >> scsibus1 at vioblk0: 1 targets >> sd0 at scsibus1 targ 0 lun 0: <VirtIO, Block Device, > >> sd0: 10240MB, 512 bytes/sector, 20971520 sectors >> virtio4: irq 9 >> virtio5 at pci0 dev 6 function 0 "Qumranet Virtio SCSI" rev 0x00 >> vioscsi0 at virtio5: qsize 128 >> scsibus2 at vioscsi0: 1 targets >> cd0 at scsibus2 targ 0 lun 0: <OpenBSD, VMM CD-ROM, 001> removable >> virtio5: irq 10 >> virtio6 at pci0 dev 7 function 0 "OpenBSD VMM Control" rev 0x00 >> vmmci0 at virtio6 >> virtio6: irq 11 >> isa0 at mainbus0 >> isadma0 at isa0 >> com0 at isa0 port 0x3f8/8 irq 4: ns8250, no fifo >> com0: console >> vscsi0 at root >> scsibus3 at vscsi0: 256 targets >> softraid0 at root >> scsibus4 at softraid0: 256 targets >> root on sd0a (16c451c199802b57.a) swap on sd0b dump on sd0b
