Hi Noam,

I think Kostiantyn explained the deficiencies of the current qemu-ga quite well.
I can suggest an alternative: there is a virtio-vsock to SSH bridge
for Windows. It can be used to develop a snapshot mechanism that will
give you more control over what happens on the Windows side.
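
As a rough illustration of what "more control" could mean, here is a
minimal Python sketch of a custom host-side channel that tags each
request with an id, so a slow freeze does not block a ping and each
completion can be matched to its request (the correlation problem
Kostiantyn describes below). Everything in it is hypothetical:
SnapshotChannel, the "freeze" and "ping" commands, and the simulated
guest are placeholders, and no real qemu-ga, VSS, or vsock API is used.

```python
import asyncio
import itertools


class SnapshotChannel:
    """Hypothetical host-side client, not a real qemu-ga or vsock API."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}    # request id -> Future awaiting the reply
        self._tasks = set()   # keep simulated guest tasks alive

    async def request(self, command, guest_delay, guest_result):
        """Send a tagged request and wait for the reply with the same id."""
        req_id = next(self._ids)
        reply = asyncio.get_running_loop().create_future()
        self._pending[req_id] = reply
        task = asyncio.create_task(
            self._fake_guest(req_id, command, guest_delay, guest_result))
        self._tasks.add(task)
        task.add_done_callback(self._tasks.discard)
        return req_id, command, await reply

    async def _fake_guest(self, req_id, command, delay, result):
        # Stand-in for the guest side: works for `delay` seconds, then replies.
        await asyncio.sleep(delay)
        self._on_reply({"id": req_id, "return": result})

    def _on_reply(self, message):
        # Match the reply to its request by id, regardless of arrival order.
        self._pending.pop(message["id"]).set_result(message["return"])


async def demo():
    channel = SnapshotChannel()
    freeze = asyncio.create_task(channel.request("freeze", 0.05, ["C:", "D:"]))
    ping = asyncio.create_task(channel.request("ping", 0.0, "pong"))
    # Collect replies in completion order, not submission order.
    return [await done for done in asyncio.as_completed([freeze, ping])]
```

Running asyncio.run(demo()) returns the ping reply before the freeze
reply, even though the freeze was submitted first, and each tuple
carries the id of the request it answers.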

Best regards,
Yan.


On Tue, Feb 3, 2026 at 2:58 PM Kostiantyn Kostiuk <[email protected]> wrote:
>
> Hi Noam
>
> The QEMU guest agent (QGA) was developed as a tool with a synchronous API,
> and adding any async commands requires a redesign of the API. QGA also does
> not support sending events to the host.
>
> First of all, if you send 2 async "file-open-async" commands and get 2
> responses, each with an FD, how can you know which FD is for which file? Yes,
> you asked about the FS freeze API, but the idea is the same. FS freeze allows
> you to provide a list of volumes to freeze, so you can have 2 requests to
> freeze 2 sets of volumes, and the same question arises.
>
> Regarding multiple agents, this is theoretically possible because QGA is an
> independent application. If you run each QGA instance with its own state
> folder and its own communication channel, it should work. The main problem
> is that the QGA instances will be independent: when QGA1 blocks all API
> execution because the guest has a frozen FS, QGA2 will still allow any
> command, including FS freeze.
>
> Unfortunately, I have no good answer for you. Windows VSS has a lot of
> limitations, and we are trying to work within them. Windows VSS doesn't
> even have an API to report the FS state, so QGA builds and uses internal
> knowledge that will be out of sync after a snapshot is restored.
>
> CC: @Yan Vugenfirer @Qianqian Zhu Do you have any ideas?
>
> Best Regards,
> Kostiantyn Kostiuk.
>
>
> On Tue, Feb 3, 2026 at 12:23 PM Noam Assouline <[email protected]> wrote:
>>
>> Hello qemu-devel!
>>
>> I’m working on a KubeVirt fix for Windows VSS fsfreeze timeouts (PR #16653). 
>> Up to now we’ve relied on libvirt’s default QEMU agent response timeout of 5 
>> seconds, and that often isn’t enough for VSS fsfreeze to complete. This PR 
>> proposes increasing the timeout to 60 seconds so the freeze can finish 
>> successfully.
>>
>> The challenge and the reason for this email is that qemu-ga processes 
>> commands synchronously on a single connection. While guest-fsfreeze-freeze 
>> is running, the agent is effectively busy and other commands (e.g. ping, 
>> status) will hang until it returns, which can impact pod readiness probes. 
>> I’m checking what we can do about this.
>>
>> I’m mainly looking to understand whether this can be addressed in qemu-ga, 
>> and to get guidance on the right direction. Is there a supported way to use 
>> multiple agent connections/channels, or is an async guest-fsfreeze-freeze 
>> with a completion event the more appropriate solution? More generally, any 
>> best‑practice guidance around Windows fsfreeze timeouts and responsiveness 
>> would be very helpful!
>>
>> Thanks in advance, and cc’ing qemu-ga maintainers.
>>
>> Noam
>> KubeVirt Storage Ecosystem team

