On Sat, Jan 30, 2021 at 12:37:31AM +0100, Reinoud Zandijk wrote:
> On Thu, Jan 28, 2021 at 11:56:30PM +1100, Paul Ripke wrote:
> > Just tried running a newly built kernel on a GCE instance, and ran into
> > this panic. The previously running kernel is 9.99.73 from back around
> > October last year.
> >
> > Anyone else tried booting -current on GCE recently? My suspicion is
> > the VirtIO changes committed around Jan 20. I'll sync back prior to
> > those and retry, if nobody else beats me to it.
>
> Not on GCE no. Have you tried the earlier version?

From the old, old kernel:

piixpm0: SMBus disabled
virtio0 at pci0 dev 3 function 0
virtio0: Virtio SCSI Device (rev. 0x00)
vioscsi0 at virtio0: Features: 0
virtio0: allocated 221184 byte for virtqueue 0 for control, size 8192
virtio0: allocated 221184 byte for virtqueue 1 for event, size 8192
virtio0: allocated 221184 byte for virtqueue 2 for request, size 8192
vioscsi0: cmd_per_lun 8 qsize 8192 seg_max 64 max_target 253 max_lun 1
virtio0: config interrupting at msix0 vec 0
virtio0: queues interrupting at msix0 vec 1
scsibus0 at vioscsi0: 16 targets, 1 lun per target
virtio1 at pci0 dev 4 function 0
virtio1: Virtio Network Device (rev. 0x00)
vioif0 at virtio1: Features: 0x30020<CTRL_VQ,STATUS,MAC>
vioif0: Ethernet address 42:01:0a:98:00:02
virtio1: allocated 114688 byte for virtqueue 0 for rx0, size 4096
virtio1: allocated 114688 byte for virtqueue 1 for tx0, size 4096
virtio1: config interrupting at msix1 vec 0
virtio1: queues interrupting at msix1 vec 1
virtio2 at pci0 dev 5 function 0
virtio2: Virtio Memory Balloon Device (rev. 0x00)
viomb0 at virtio2virtio2: allocated 12288 byte for virtqueue 0 for inflate, size 256
virtio2: allocated 12288 byte for virtqueue 1 for deflate, size 256
: Features: 0
virtio2: interrupting at ioapic0 pin 10
virtio3 at pci0 dev 6 function 0
virtio3: Virtio Entropy Device (rev. 0x00)
viornd0 at virtio3: Features: 0
virtio3: allocated 12288 byte for virtqueue 0 for Entropy request, size 256
virtio3: interrupting at ioapic0 pin 10
isa0 at pcib0

Confirmed that a kernel built immediately prior to the following commit
works, and one built with this commit fails:
https://github.com/NetBSD/src/commit/7bca0bcf21c9b3465a6ee4eef6c01be32c9de1eb

> > [ 1.0303647] piixpm0: SMBus disabled
> > [ 1.0303647] virtio0 at pci0 dev 3 function 0
> > [ 1.0303647] virtio0: SCSI device (rev. 0x00)
> > [ 1.0303647] vioscsi0 at virtio0: features: 0
> > [ 1.0303647] vioscsi0: cmd_per_lun 8 qsize 8192 seg_max 64 max_target 253 max_lun 1
> > [ 1.0303647] virtio0: config interrupting at msix0 vec 0
> > [ 1.0303647] virtio0: queues interrupting at msix0 vec 1
> > [ 1.0303647] scsibus0 at vioscsi0: 16 targets, 1 lun per target
> > [ 1.0303647] virtio1 at pci0 dev 4 function 0
> > [ 1.0303647] virtio1: network device (rev. 0x00)
> > [ 1.0303647] vioif0 at virtio1: features: 0x20030020<EVENT_IDX,CTRL_VQ,STATUS,MAC>
>
> Could you A) test with virtio v1 PCI devices? ie without legacy and if that
> fails too could you B) test with src/sys/dev/pci/if_vioif.c:832 commented out
> and see if that makes a difference? That's a new virtio 1.0 feature that was
> apparently negotiated and should work in transitional devices and should not
> be accepted in older. It could be that GCE is making a mistake there but
> negotiating EVENT_IDX shifts registers so has a big impact if it goes wrong.

A) Erm, how? I read through some of the source and saw mentions of v1.0 vs
v0.9, but didn't see a way of just disabling legacy support.

B)
[ 1.0265446] piixpm0: SMBus disabled
[ 1.0265446] virtio0 at pci0 dev 3 function 0
[ 1.0265446] virtio0: SCSI device (rev. 0x00)
[ 1.0265446] vioscsi0 at virtio0: features: 0
[ 1.0265446] vioscsi0: cmd_per_lun 8 qsize 8192 seg_max 64 max_target 253 max_lun 1
[ 1.0265446] virtio0: config interrupting at msix0 vec 0
[ 1.0265446] virtio0: queues interrupting at msix0 vec 1
[ 1.0265446] scsibus0 at vioscsi0: 16 targets, 1 lun per target
[ 1.0265446] virtio1 at pci0 dev 4 function 0
[ 1.0265446] virtio1: network device (rev. 0x00)
[ 1.0265446] vioif0 at virtio1: features: 0x30020<CTRL_VQ,STATUS,MAC>
[ 1.0265446] vioif0: Ethernet address 42:01:0a:98:00:02
[ 1.0265446] panic: _bus_virt_to_bus
[ 1.0265446] cpu0: Begin traceback...
[ 1.0265446] vpanic() at netbsd:vpanic+0x156
[ 1.0265446] snprintf() at netbsd:snprintf
[ 1.0265446] _bus_dma_alloc_bouncebuf() at netbsd:_bus_dma_alloc_bouncebuf
[ 1.0265446] bus_dmamap_load() at netbsd:bus_dmamap_load+0x9c
[ 1.0265446] vioif_dmamap_create_load.constprop.0() at netbsd:vioif_dmamap_create_load.constprop.0+0x7e
[ 1.0265446] vioif_attach() at netbsd:vioif_attach+0x1088
[ 1.0265446] config_attach_loc() at netbsd:config_attach_loc+0x17e
[ 1.0265446] virtio_pci_rescan() at netbsd:virtio_pci_rescan+0x48
[ 1.0265446] virtio_pci_attach() at netbsd:virtio_pci_attach+0x23a
[ 1.0265446] config_attach_loc() at netbsd:config_attach_loc+0x17e
[ 1.0265446] pci_probe_device() at netbsd:pci_probe_device+0x585
[ 1.0265446] pci_enumerate_bus() at netbsd:pci_enumerate_bus+0x1b5
[ 1.0265446] pcirescan() at netbsd:pcirescan+0x4e
[ 1.0265446] pciattach() at netbsd:pciattach+0x186
[ 1.0265446] config_attach_loc() at netbsd:config_attach_loc+0x17e
[ 1.0265446] mp_pci_scan() at netbsd:mp_pci_scan+0x9e
[ 1.0265446] amd64_mainbus_attach() at netbsd:amd64_mainbus_attach+0x236
[ 1.0265446] mainbus_attach() at netbsd:mainbus_attach+0x84
[ 1.0265446] config_attach_loc() at netbsd:config_attach_loc+0x17e
[ 1.0265446] cpu_configure() at netbsd:cpu_configure+0x38
[ 1.0265446] main() at netbsd:main+0x32c
[ 1.0265446] cpu0: End traceback...
[ 1.0265446] fatal breakpoint trap in supervisor mode
[ 1.0265446] trap type 1 code 0 rip 0xffffffff80221a35 cs 0x8 rflags 0x202 cr2 0 ilevel 0x8 rsp 0xffffffff81cfa5d0
[ 1.0265446] curlwp 0xffffffff81886e40 pid 0.0 lowest kstack 0xffffffff81cf52c0

> If commenting out the EVENT_IDX works, it is easily solvable by disabling it
> on PCI v0.9 attachments to work around GCE.
>
> Qemu 5.1.0 does work fine with the new kernel:
>
> [ 1.0189426] virtio5 at pci0 dev 8 function 0
> [ 1.0189426] virtio5: network device (rev. 0x00)
> [ 1.0189426] vioif0 at virtio5: features: 0x31070020<EVENT_IDX,INDIRECT_DESC,NOTIFY_ON_EMPTY,CTRL_RX,CTRL_VQ,STATUS,MAC>
> [ 1.0189426] vioif0: Ethernet address 52:54:00:12:34:56
> [ 1.0189426] virtio5: config interrupting at msix2 vec 0
> [ 1.0189426] virtio5: queues interrupting at msix2 vec 1
> [ 1.0189426] isa0 at pcib0
>
> interestingly INDIRECT_DESC and NOTIFY_ON_EMPTY (v0.9) are not negotiated with
> GCE in your config; EVENT_IDX is the successor of NOTIFY_ON_EMPTY and should
> work fine on its own since it's always alone in v1.0.
>
> What is strange, is the INDIRECT_DESC not being negotiated. I haven't touched
> that code at all and Qemu always gives it. Is this also the case with older
> kernels?
>
> Thanks in advance,
> Reinoud
>

-- 
Paul Ripke
"Great minds discuss ideas, average minds discuss events, small minds
 discuss people."
-- Disputed: Often attributed to Eleanor Roosevelt. 1948.
