Hello qemu-devel!

I’m working on a KubeVirt fix for Windows VSS fsfreeze timeouts (PR #16653 <https://github.com/kubevirt/kubevirt/pull/16653>). Up to now we’ve relied on libvirt’s default QEMU agent response timeout of 5 seconds, which is often not enough for a VSS fsfreeze to complete. The PR proposes increasing the timeout to 60 seconds so the freeze can finish successfully.

The challenge, and the reason for this email, is that qemu-ga processes commands synchronously on a single connection. While guest-fsfreeze-freeze is running, the agent is effectively busy, and other commands (e.g. ping, status) hang until it returns, which can impact pod readiness probes.

I’m mainly looking to understand whether this can be addressed in qemu-ga, and to get guidance on the right direction:

- Is there a supported way to use multiple agent connections/channels?
- Or is an async guest-fsfreeze-freeze with a completion event the more appropriate solution?

More generally, any best-practice guidance around Windows fsfreeze timeouts and responsiveness would be very helpful!

Thanks in advance, and cc’ing qemu-ga maintainers.

Noam
KubeVirt Storage Ecosystem team
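
P.S. To make the blocking behaviour concrete, here is a minimal Python sketch of a host-side probe that talks to the agent channel directly and gives up after a timeout. It is an illustration only, not what KubeVirt or libvirt actually do: the helper names qga_request and qga_call are made up for this email, and it assumes the agent channel is exposed as a UNIX socket and that replies arrive as one JSON object per line.

```python
import json
import socket


def qga_request(command, arguments=None):
    """Encode one guest-agent command as a newline-terminated JSON line."""
    msg = {"execute": command}
    if arguments:
        msg["arguments"] = arguments
    return (json.dumps(msg) + "\n").encode()


def qga_call(sock, command, timeout_s, arguments=None):
    """Send one command and wait up to timeout_s for its reply line.

    Returns the decoded reply, or None if nothing arrived in time
    (for example because the agent is still busy with an earlier command).
    """
    sock.settimeout(timeout_s)
    sock.sendall(qga_request(command, arguments))
    buf = b""
    try:
        # Read until a full line (one JSON reply) has been received.
        while b"\n" not in buf:
            chunk = sock.recv(4096)
            if not chunk:
                raise ConnectionError("agent channel closed")
            buf += chunk
    except socket.timeout:
        return None
    return json.loads(buf.split(b"\n", 1)[0])
```

With a socket connected to a hypothetical path such as /tmp/qga.sock, qga_call(sock, "guest-ping", 5.0) returns None for as long as a guest-fsfreeze-freeze issued earlier on the same channel is still running, which is the readiness-probe symptom described above. Note that a late reply stays queued on the channel after a timeout, so a real client would need to resynchronize (e.g. with guest-sync) before trusting the next response.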
