Package: qemu-guest-agent
Version: 1:2.1+dfsg-12+deb8u6
Severity: important

I noticed that several of my Proxmox backups were hanging forever on Jessie VMs with the qemu guest agent enabled, while my Stretch VMs are OK. Upon further inspection, backups (or rather fs freeze) do not work until qemu-guest-agent has been restarted.

This is from the systemd journal after boot. Note the "transport endpoint not found" warning.

####
Apr 12 16:26:32 cns-2 systemd[1]: Starting LSB: QEMU Guest Agent startup script...
Apr 12 16:26:32 cns-2 qemu-guest-agent[1126]: qemu-ga: transport endpoint not found, not starting ... (warning).
Apr 12 16:26:32 cns-2 systemd[1]: Started LSB: QEMU Guest Agent startup script.

Apr 12 16:26:32 cns-2 systemd[1]: Starting LSB: QEMU Guest Agent startup script...
-- Subject: Unit qemu-guest-agent.service has begun start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit qemu-guest-agent.service has begun starting up.
Apr 12 16:26:32 cns-2 qemu-guest-agent[1126]: qemu-ga: transport endpoint not found, not starting ... (warning).
Apr 12 16:26:32 cns-2 systemd[1]: Started LSB: QEMU Guest Agent startup script.
-- Subject: Unit qemu-guest-agent.service has finished start-up
-- Defined-By: systemd
-- Support: http://lists.freedesktop.org/mailman/listinfo/systemd-devel
--
-- Unit qemu-guest-agent.service has finished starting up.
--
-- The start-up result is done.
####

What seems to happen is that /dev/virtio-ports/org.qemu.guest_agent.0 is not yet available when the init script runs during boot, although it is always there when I check after boot. That explains why the error disappears when I restart qemu-guest-agent later.

A simple "sleep 1" in /etc/default/qemu-guest-agent, as an ugly hack, solves the issue. So there seems to be some kind of race condition during boot, possibly a missing dependency on the virtio-serial device that should be added somewhere in the boot process.

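A slightly less arbitrary variant of the "sleep 1" hack would be to poll for the device node instead of sleeping for a fixed time. The following is only a minimal sketch, not a tested fix: it assumes that /etc/default/qemu-guest-agent is sourced by the init script before the transport check (which the effect of the sleep workaround suggests), that `sleep` accepts fractional seconds (as GNU coreutils does), and `wait_for_path` is a hypothetical helper name, not something shipped by the package.

```shell
# Hypothetical helper, not part of the qemu-guest-agent package.
# Polls until a path exists or the attempts run out.
#   $1 = path to wait for
#   $2 = maximum number of attempts, 0.1 s apart (default 50, about 5 s)
# Returns 0 if the path exists, non-zero otherwise.
wait_for_path() {
    wfp_path=$1
    wfp_tries=${2:-50}
    while [ "$wfp_tries" -gt 0 ]; do
        # Stop as soon as the path shows up.
        [ -e "$wfp_path" ] && return 0
        sleep 0.1   # fractional sleep: assumes GNU coreutils
        wfp_tries=$((wfp_tries - 1))
    done
    # Final check decides the return status.
    [ -e "$wfp_path" ]
}
```

In /etc/default/qemu-guest-agent this could be called as `wait_for_path /dev/virtio-ports/org.qemu.guest_agent.0 50 || true`, so that boot continues after roughly five seconds even when the VM has no guest agent channel configured. This still only works around the race; a proper fix would presumably make the service depend on the device itself.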
Although Stretch seems to have no issues, I do not know whether this is by design or whether the race just happens to end differently most of the time. I have tried the kernel and qemu-guest-agent (1:2.8+dfsg-3~bpo8+1) from jessie-backports, but the issue persists.

I marked the issue as important because a guest agent that is enabled for the VM but not actually running does more than make operations fail: in some cases it locks Jessie VMs, blocking further operations on platforms like Proxmox in common scenarios such as backups or snapshots. VMs with the guest agent enabled also do not shut down when told to, which again causes the operation to hang or time out, often ending in a hard stop with possible data corruption.