The qtest reproducers are so nice.

writel 0x0 0xffffffff
outw 0x171 0x32a
    features := 0x2a
    count := 0x03;
outw 0x176 0x3570
    device := 0x70 (select device1)
    command := 0x35 (DMA WRITE EXT)
outl 0xcf8 0x80000903
outl 0xcfc 0x4e002700
outl 0xcf8 0x80000920
outb 0xcfc 0x5e
outb 0x58 0xe1
    bmdma_cmd_writeb val = 0xe1 [1110 0001]
                             DMA READ ^  ^ DMA Start
outw 0x57 0x0
    bmdma_cmd_writeb val = 0x00 [0000 0000]
                                         ^ DMA Cancel
EOF

This should be a straightforward DMA cancel. I added some more traces:

# After the 0x35 command write:
ide_exec_cmd IDE exec cmd: bus 0x561808b0ecc0; state 0x561808b0f118; cmd 0x35
ide_sector_start_dma IDEState 0x561808b0f118;
ide_start_dma IDEState 0x561808b0f118;

# After the 0xe1 bmdma kick:
ide_dma_cb_entry IDEState 0x561808b0f118; ret 0;
ide_dma_cb IDEState 0x561808b0f118; sector_num=1 n=259 cmd=DMA WRITE
ide_dma_cb_next IDEState 0x561808b0f118;

So far, pretty normal. IDE calls the HBA's DMA start, but the HBA doesn't
have DMA enabled, so it stalls. Later, when we turn on DMA, the HBA
engages the DMA callback and sets up the first transfer. This sets
s->bus->dma->aiocb.

Then, we try to cancel DMA:

ide_cancel_dma_sync IDEState 0x561808b0f118;
ide_cancel_dma_sync_remaining draining all remaining requests
1343877@1595891049.469050:dma_blk_cb dbs=0x55baededdc50 ret=0
1343877@1595891049.469054:dma_map_wait dbs=0x55baededdc50
qemu-system-i386: /home/jsnow/src/qemu/hw/ide/core.c:732: void ide_cancel_dma_sync(IDEState *): Assertion `s->bus->dma->aiocb == NULL' failed.

We still have a DMA callback out, so we try to synchronously cancel it;
but the blk_drain doesn't appear to be effective!

We apparently wind up here:

    if (dbs->iov.size == 0) {
        trace_dma_map_wait(dbs);
        dbs->bh = aio_bh_new(dbs->ctx, reschedule_dma, dbs);
        cpu_register_map_client(dbs->bh);
        return;
    }
    ...

The DMA simply re-schedules itself (?) when iov.size is zero.
Unfortunately for us, that means that the original point of scheduling
the drain doesn't work, because the DMA never returns all the way to the
IDE device emulation code.
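To make the failure mode above easier to follow, here is a small toy model in C. It is only a sketch and does not use any QEMU code: struct toy_dma and every toy_* function are hypothetical names invented for illustration. It also rests on one loud assumption, namely that a drain only waits for block requests that are already in flight, so a callback that parks itself with nothing mapped leaves nothing for the drain to wait on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy state; these are NOT QEMU types or fields. */
struct toy_dma {
    size_t iov_size;     /* bytes mapped for the current pass */
    int in_flight;       /* submitted block requests a drain can wait on */
    bool parked;         /* callback rescheduled itself instead of doing I/O */
    bool aiocb_pending;  /* stands in for s->bus->dma->aiocb != NULL */
};

/* Stand-in for the DMA callback: with nothing mapped it parks and returns. */
static void toy_dma_cb(struct toy_dma *d)
{
    if (d->iov_size == 0) {
        d->parked = true;   /* like queueing reschedule_dma for later */
        return;             /* no I/O submitted, request not completed */
    }
    d->in_flight++;         /* otherwise one block request is submitted */
}

/* Completing the last request is what clears the pending marker. */
static void toy_complete_one(struct toy_dma *d)
{
    assert(d->in_flight > 0);
    if (--d->in_flight == 0) {
        d->aiocb_pending = false;
    }
}

/* ASSUMPTION: a drain only waits for requests that are already in flight. */
static void toy_drain(struct toy_dma *d)
{
    while (d->in_flight > 0) {
        toy_complete_one(d);
    }
}

/* Returns true if the "aiocb == NULL" check after the drain would hold. */
static bool toy_cancel_sync(struct toy_dma *d)
{
    if (d->aiocb_pending) {
        toy_drain(d);
    }
    return !d->aiocb_pending;
}

/* Start a transfer with the given mapped size, then cancel it. */
static bool toy_run(size_t iov_size)
{
    struct toy_dma d = { .iov_size = iov_size, .aiocb_pending = true };
    toy_dma_cb(&d);
    return toy_cancel_sync(&d);
}
```

In this model, toy_run(512) returns true because the drain completes the one submitted request, while toy_run(0) returns false: the callback parked itself, the drain found nothing in flight, and the pending marker is still set when the check runs. That matches the shape of the assertion failure in the trace, though whether the real blk_drain behaves this way is exactly the open question.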
--
https://bugs.launchpad.net/bugs/1681439

Title:
  qemu-system-x86_64: hw/ide/core.c:685: ide_cancel_dma_sync: Assertion
  `s->bus->dma->aiocb == NULL' failed.

Status in QEMU:
  Confirmed

Bug description:
  Since upgrading to QEMU 2.8.0, my Windows 7 64-bit virtual machines
  started crashing due to the assertion quoted in the summary failing.
  The assertion in question was added by commit 9972354856 ("block: add
  BDS field to count in-flight requests"). My tests show that setting
  discard=unmap is needed to reproduce the issue. Speaking of
  reproduction, it is a bit flaky, because I have been unable to come up
  with specific instructions that would allow the issue to be triggered
  outside of my environment, but I do have a semi-sane way of testing
  that appears to depend on a specific initial state of data on the
  underlying storage volume, actions taken within the VM and waiting for
  about 20 minutes.

  Here is the shortest QEMU command line that I managed to reproduce the
  bug with:

      qemu-system-x86_64 \
          -machine pc-i440fx-2.7,accel=kvm \
          -m 3072 \
          -drive file=/dev/lvm/qemu,format=raw,if=ide,discard=unmap \
          -netdev tap,id=hostnet0,ifname=tap0,script=no,downscript=no,vhost=on \
          -device virtio-net-pci,netdev=hostnet0 \
          -vnc :0

  The underlying storage (/dev/lvm/qemu) is a thin LVM snapshot.

  QEMU was compiled using:

      ./configure --python=/usr/bin/python2.7 --target-list=x86_64-softmmu
      make -j3

  My virtualization environment is not really a critical one and
  reproduction is not that much of a hassle, so if you need me to gather
  further diagnostic information or test patches, I will be happy to
  help.