I tried bisecting as well, and I wound up at: 1a423896 -- five out of five boot attempts succeeded. d759c951 -- five out of five boot attempts failed.
d759c951f3287fad04210a52f2dc93f94cf58c7f is the first bad commit commit d759c951f3287fad04210a52f2dc93f94cf58c7f Author: Alex Bennée <alex.ben...@linaro.org> Date: Tue Feb 27 12:52:48 2018 +0300 replay: push replay_mutex_lock up the call tree My methodology was to boot QEMU like this: ./x86_64-softmmu/qemu-system-x86_64 -m 4096 -cpu host -M q35 -enable-kvm -smp 4 -drive id=sda,if=none,file=/home/bos/jhuston/windows_10.qcow -device ide-hd,drive=sda -qmp tcp::4444,server,nowait and run it three times with -snapshot and see if it hung during boot; if it did, I marked the commit bad. If it did not, I booted and attempted to log in and run CrystalDiskMark. If it froze before I even launched CDM, I marked it bad. Interestingly enough, on a subsequent (presumably bad) commit (6dc0f529) which hangs fairly reliably on bootup (66%) I can occasionally get into Windows 10 and run CDM -- and that unfortunately does not seem to trigger the error again, so CDM doesn't look like a reliable way to trigger the hangs. Anyway, d759c951 definitely appears to change the odds of AHCI locking up during boot for me, and I suppose it might have something to do with how it is changing the BQL acquisition/release in main-loop.c, but I am not sure why/what yet. Before this patch, we only lock the iothread and re-lock it if there was a timeout, and after this patch we *always* lock and unlock the iothread. This is probably just exposing some latent bug in the AHCI emulator that has always existed, but now the odds of seeing it are much higher. I'll have to dig as to what the race is -- I'm not sure just yet. If those of you who are seeing this bug too could confirm for me that d759c951 appears to be the guilty party, that probably wouldn't hurt. Thanks! --js -- You received this bug notification because you are a member of qemu- devel-ml, which is subscribed to QEMU. https://bugs.launchpad.net/bugs/1769189 Title: Issue with qemu 2.12.0 + SATA Status in QEMU: New Bug description: [EDIT: I first thought that OVMF was the issue, but it turns out to be SATA] I had a Windows 10 VM running perfectly fine with a SATA drive, since I upgraded to qemu 2.12, the guests hangs for a couple of minutes, works for a few seconds, and hangs again, etc. By "hang" I mean it doesn't freeze, but it looks like it's waiting on IO or something, I can move the mouse but everything needing disk access is unresponsive. What doesn't work: qemu 2.12 with SATA What works: using VirIO-SCSI with qemu 2.12 or downgrading qemu to 2.11.1 and keep using SATA. Platform is arch linux 4.16.7 on skylake and Haswell, I have attached the vm xml file. To manage notifications about this bug go to: https://bugs.launchpad.net/qemu/+bug/1769189/+subscriptions