Hi Noam,

I think Kostiantyn explained the deficiencies of the current qemu-ga quite well. I can suggest an alternative: there is a virtio-vsock to SSH bridge for Windows. It can be used to develop a snapshot mechanism that gives you more control over what happens on the Windows side.

Best regards,
Yan.

On Tue, Feb 3, 2026 at 2:58 PM Kostiantyn Kostiuk <[email protected]> wrote:
>
> Hi Noam
>
> QEMU agent was developed as a tool with a synchronous API, and adding any
> async commands requires a redesign of the API. QGA also does not support
> sending any events to the host.
>
> First of all, if you send 2 async "file-open-async" commands and get 2
> responses with an FD, how can you know which FD is for which file? Yes, you
> asked about the FS freeze API, but the idea is the same. FS freeze allows
> you to provide a list of volumes to freeze, so you can have 2 requests to
> freeze 2 sets of volumes, and you run into the same question.
>
> Regarding multiple agents, this is theoretically possible because QGA is an
> independent application. If you run each QGA instance with its own state
> folder and its own communication channel, it should work. The main problem
> is that the QGA instances will be independent: when QGA1 blocks all API
> execution because the guest has a frozen FS, QGA2 will still allow any
> command, including FS freeze.
>
> Unfortunately, I have no good answer for you. Windows VSS has a lot of
> limitations, and we are trying to somehow work with it. Windows VSS doesn't
> even have an API to report the FS state, so QGA builds and uses internal
> knowledge that will be out of sync after a snapshot restore.
>
> CC: @Yan Vugenfirer @Qianqian Zhu Do you have any ideas?
>
> Best Regards,
> Kostiantyn Kostiuk.
>
>
> On Tue, Feb 3, 2026 at 12:23 PM Noam Assouline <[email protected]> wrote:
>>
>> Hello qemu-devel!
>>
>> I'm working on a KubeVirt fix for Windows VSS fsfreeze timeouts (PR #16653).
>> Up to now we've relied on libvirt's default QEMU agent response timeout of 5
>> seconds, and that often isn't enough for VSS fsfreeze to complete. This PR
>> proposes increasing the timeout to 60 seconds so the freeze can finish
>> successfully.
>>
>> The challenge, and the reason for this email, is that qemu-ga processes
>> commands synchronously on a single connection. While guest-fsfreeze-freeze
>> is running, the agent is effectively busy and other commands (e.g. ping,
>> status) will hang until it returns, which can impact pod readiness probes.
>> I'm checking what we can do about this.
>>
>> I'm mainly looking to understand whether this can be addressed in qemu-ga,
>> and to get guidance on the right direction. Is there a supported way to use
>> multiple agent connections/channels, or is an async guest-fsfreeze-freeze
>> with a completion event the more appropriate solution? More generally, any
>> best-practice guidance around Windows fsfreeze timeouts and responsiveness
>> would be very helpful!
>>
>> Thanks in advance, and cc'ing qemu-ga maintainers.
>>
>> Noam
>> KubeVirt Storage Ecosystem team
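
For readers following the thread, the blocking behaviour Noam describes can be illustrated with a small toy model. This is only a sketch, not qemu-ga code: it assumes a single channel that executes one command at a time in the order sent, and the 30-second freeze duration, the `Command` class and both helper functions are hypothetical names and numbers chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Command:
    name: str
    sent_at: float   # seconds: when the host sends the command
    duration: float  # seconds the guest needs to run it (hypothetical value)


def simulate_sync_channel(commands):
    """Return {name: reply_time} for a channel that runs one command
    at a time, in the order the commands were sent."""
    replies = {}
    busy_until = 0.0
    for cmd in sorted(commands, key=lambda c: c.sent_at):
        start = max(cmd.sent_at, busy_until)  # wait for the previous command
        busy_until = start + cmd.duration
        replies[cmd.name] = busy_until
    return replies


def timed_out(commands, timeout):
    """Return {name: True/False}: did the reply arrive later than
    `timeout` seconds after the command was sent?"""
    replies = simulate_sync_channel(commands)
    return {c.name: (replies[c.name] - c.sent_at) > timeout for c in commands}


commands = [
    Command("guest-fsfreeze-freeze", sent_at=0.0, duration=30.0),
    Command("guest-ping", sent_at=1.0, duration=0.01),
]

print(timed_out(commands, timeout=5.0))
print(timed_out(commands, timeout=60.0))
```

In this model a 5-second timeout fails both the freeze and the ping queued behind it, while a 60-second timeout lets both complete. The ping still waits roughly 29 seconds for its reply, which is the readiness-probe concern raised above.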
