The VM is fine until I issue a reboot on the guest OS, in this case it happened right at [315765.858431]. That is, once I boot the host, the guest starts fine, I am able to use the tape drive fine, but when I reboot for whatever reason, I guess the segfault.
Is this enough or do you want the full dumpxml ? <hostdev mode='subsystem' type='pci' managed='yes'> <driver name='vfio'/> <source> <address domain='0x0000' bus='0x1a' slot='0x00' function='0x0'/> </source> <alias name='hostdev0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/> </hostdev> Makes sense to me what you've explained about libvirt returning to the host driver, but unfortunately I don't have enough knowledge to comment, sorry. lspci -vvv: 1a:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev 08) Subsystem: Hewlett-Packard Company SC44Ge Host Bus Adapter Physical Slot: 3 Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr+ Stepping- SERR+ FastB2B- DisINTx+ Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx- Latency: 0, Cache Line Size: 64 bytes Interrupt: pin A routed to IRQ 27 Region 0: I/O ports at 2000 [disabled] [size=256] Region 1: Memory at 97a10000 (64-bit, non-prefetchable) [size=16K] Region 3: Memory at 97a00000 (64-bit, non-prefetchable) [size=64K] Expansion ROM at <ignored> [disabled] Capabilities: [50] Power Management version 2 Flags: PMEClk- DSI- D1+ D2+ AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-) Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME- Capabilities: [68] Express (v1) Endpoint, MSI 00 DevCap: MaxPayload 4096 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us ExtTag+ AttnBtn- AttnInd- PwrInd- RBE- FLReset- DevCtl: Report errors: Correctable- Non-Fatal+ Fatal+ Unsupported- RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ MaxPayload 256 bytes, MaxReadReq 4096 bytes DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend- LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us ClockPM- Surprise- LLActRep- BwNot- LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- CommClk- ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt- LnkSta: Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt- Capabilities: [98] MSI: Enable- Count=1/1 Maskable- 64bit+ Address: 0000000000000000 Data: 0000 Capabilities: [b0] MSI-X: Enable+ Count=1 Masked- Vector table: BAR=1 offset=00002000 PBA: BAR=1 offset=00003000 Capabilities: [100 v1] Advanced Error Reporting UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol- UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq+ ACSViol- UESvrt: DLP+ SDES- TLP+ FCP+ CmpltTO+ CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC+ UnsupReq- ACSViol- CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr- CEMsk: RxErr- BadTLP+ BadDLLP+ Rollover+ Timeout+ NonFatalErr- AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn- Kernel driver in use: vfio-pci Installed debuginfo packages and abrt caught it. The coredump is 11G but tar.bz2 everything is at 367M. Anything in specific you'd like from all the debug info generated? -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1596579 Title: segfault upon reboot Status in QEMU: New Bug description: [ 31.167946] VFIO - User Level meta-driver version: 0.3 [ 34.969182] kvm: zapping shadow pages for mmio generation wraparound [ 43.095077] vfio-pci 0000:1a:00.0: irq 50 for MSI/MSI-X [166493.891331] perf interrupt took too long (2506 > 2500), lowering kernel.perf_event_max_sample_rate to 50000 [315765.858431] qemu-kvm[1385]: segfault at 0 ip (null) sp 00007ffe5430db18 error 14 [315782.002077] vfio-pci 0000:1a:00.0: transaction is not cleared; proceeding with reset anyway [315782.910854] mptsas 0000:1a:00.0: Refused to change power state, currently in D3 [315782.911236] mptbase: ioc1: Initiating bringup [315782.911238] mptbase: ioc1: WARNING - Unexpected doorbell active! [315842.957613] mptbase: ioc1: ERROR - Failed to come READY after reset! IocState=f0000000 [315842.957670] mptbase: ioc1: WARNING - ResetHistory bit failed to clear! [315842.957675] mptbase: ioc1: ERROR - Diagnostic reset FAILED! (ffffffffh) [315842.957717] mptbase: ioc1: WARNING - NOT READY WARNING! [315842.957720] mptbase: ioc1: ERROR - didn't initialize properly! (-1) [315842.957890] mptsas: probe of 0000:1a:00.0 failed with error -1 The qemu-kvm segfault happens when I issue a reboot on the Windows VM. The card I have is: 1a:00.0 SCSI storage controller: LSI Logic / Symbios Logic SAS1068E PCI-Express Fusion-MPT SAS (rev ff) I have two of these cards (bought with many years difference), exact same model, and they fail the same way. I'm using PCI passthrough on this card for access to the tape drive. This is very easy to reproduce, so feel free to let me know what to try. Kernel 3.10.0-327.18.2.el7.x86_64 (Centos 7.2.1511). qemu-kvm-1.5.3-105.el7_2.4.x86_64 Reporting it here because of the segfault, but I guess I might have to open a bug report with mptbase as well? To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1596579/+subscriptions