Hi Sascha,

On Sat, Apr 4, 2020 at 5:52 PM Sascha Lucas <[email protected]> wrote:
> Apparently the qemu process crashed. I wonder if there is something in
> the logs[1] (/var/log/ganeti/kvm/<instance name>.log)?

I entirely forgot about the QEMU log files. Yes, we can see an error
message from the crashing QEMU process:

qemu-system-x86_64: /build/qemu-oknQD6/qemu-4.2/accel/kvm/kvm-all.c:653: kvm_log_clear_one_slot: Assertion `mem->dirty_bmap' failed.

Looking into that, I stumbled over these bugs:

https://bugzilla.redhat.com/show_bug.cgi?id=1771032
https://bugzilla.redhat.com/show_bug.cgi?id=1772774

It seems to be related to starting a migration while the QEMU process
(or rather: the VM inside) is still in the boot phase. The fix for this
is already upstream:

https://github.com/qemu/qemu/commit/9b3a31c745b61758aaa5466a3a9fc0526d409188

However, it seems the fix will only ship with the upcoming QEMU 5
release. I think we should open a Debian bug for this (however, after
quickly reading through the guide I am not 100% sure I understood how
to open a bug for QEMU in bullseye). Maybe the fix can be backported.

Any ideas how to work around this issue in the QA suite for now? Swap
the failover / migration tests? Ganeti users issuing
failover/reboot/start + migrate in rapid succession is probably not
very likely on production systems. But from what I understand, this can
also be an issue when someone triggers a live migration through Ganeti
while the VM is in a rebooting state internally. That, again, is
something which might hit people in production.

> > Simply adding a 'sleep 2' between the two ganeti commands fixes the
> > issue.
>
> Sounds why DRBD does not trigger the bug. The disk transition from
> disconnect, primary/primary reconnect includes this extra seconds.

From what we know now, this slight difference in timing should be
enough to get the QEMU process/VM out of its (re)booting state.

Cheers,
Rudi

-- 
Rudolph Bott - [email protected]
Telefon: +49 (0)211-63 55 56-41
Telefax: +49 (0)211-63 55 55-22

sipgate GmbH - Gladbacher Str. 74 - 40219 Düsseldorf
HRB Düsseldorf 39841 - Geschäftsführer: Thilo Salmon, Tim Mois
Steuernummer: 106/5724/7147, Umsatzsteuer-ID: DE219349391

www.sipgate.de - www.sipgate.co.uk
