Re: QGA fsfreeze blocks QMP: need async fsfreeze or alternative?

Stefan Hajnoczi Tue, 03 Feb 2026 07:28:05 -0800

On Tue, Feb 03, 2026 at 02:58:29PM +0200, Kostiantyn Kostiuk wrote:
> Hi Noam
> 
> QEMU agent was developed as a tool with a synchronous API, and adding any
> async commands requires a redesign of the API. QGA also does not support
> sending any event to the host.


Regarding the lack of QMP events, this is unfortunate because the client
then has to poll for completion, which introduces latency, but polling is
a possibility. Enabling QMP events might also be an option?
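
To make the polling trade-off concrete, here is a minimal sketch of a
client-side poll loop. Nothing in it is an existing QGA or libvirt API:
is_done stands in for whatever hypothetical "is the freeze finished?"
query would exist, and the interval/timeout values are arbitrary. The
poll interval is the upper bound on the extra latency compared to an
event.

```python
import time


def poll_until_done(is_done, interval=0.5, timeout=60.0,
                    sleep=time.sleep, clock=time.monotonic):
    """Call is_done() until it returns True or the timeout expires.

    is_done is a hypothetical callable that asks the agent whether the
    async operation has completed. sleep and clock are injectable so the
    loop can be exercised without real waiting.
    Returns the number of polls that were needed.
    """
    deadline = clock() + timeout
    polls = 0
    while True:
        polls += 1
        if is_done():
            return polls
        if clock() >= deadline:
            raise TimeoutError("operation did not complete in time")
        # Completion is noticed at most `interval` seconds late.
        sleep(interval)
```

A shorter interval lowers the latency but sends more queries to the
agent, which is the cost an event would avoid.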

> First of all, if you send 2 async "file-open-async" commands and get 2
> responses with an FD, how can you know which FD is for which file? Yes, you
> asked about the FS freeze API, but the idea is the same. FS freeze allows
> providing a list of volumes to freeze, so you can have 2 requests to freeze
> 2 sets of volumes, and the same question arises.

An async freeze command could take a unique identifier argument that is
passed back to the client when completion is reported. This way the
client can correlate the completion to a specific command.
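
A rough sketch of what that correlation could look like on the client
side. The command name "guest-fsfreeze-freeze-async", the "id" argument,
and the shape of the completion message are all made up for
illustration. None of them exist in QGA today.

```python
import json


class PendingFreezes:
    """Track hypothetical async freeze requests by a caller-chosen id."""

    def __init__(self):
        self._pending = {}  # request id -> list of volumes requested

    def build_request(self, request_id, mountpoints):
        """Record the request and return the JSON text to send."""
        if request_id in self._pending:
            raise ValueError(f"id {request_id!r} is already in flight")
        self._pending[request_id] = list(mountpoints)
        return json.dumps({
            # Hypothetical command and arguments, not a real QGA command.
            "execute": "guest-fsfreeze-freeze-async",
            "arguments": {"id": request_id, "mountpoints": mountpoints},
        })

    def on_completion(self, message):
        """Match a completion (assumed to echo the id) to its request."""
        msg = json.loads(message)
        request_id = msg["id"]
        volumes = self._pending.pop(request_id)
        return request_id, volumes, msg.get("return")
```

Because the id is echoed back, two overlapping requests for different
volume sets can complete in any order and still be told apart.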

There are existing async QAPI APIs that can be used as a reference, for
example qapi/job.json. It's a 3-part API where jobs are launched, can
be queried, and can be managed (pause/cancel/dismiss). Querying is
read-only, so the dismiss command can be used to actually reap the job
and make it go away. Something similar could be done for fsfreeze. The
job API was supposed to be generic, but it's only used by the block
layer as far as I'm aware - maybe it could be reused here too?
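
To illustrate the launch/query/dismiss pattern, here is a toy in-memory
model. It is not QEMU code and not the job API itself: the class, its
method names, and the two-state lifecycle are simplifications I made up
for this email. The point is only that querying never changes state and
a finished job stays visible until the client dismisses it.

```python
class FreezeJobs:
    """Toy model of a launch/query/dismiss lifecycle for freeze jobs."""

    def __init__(self):
        self._jobs = {}  # job id -> {"status", "volumes", "error"}

    def start(self, job_id, volumes):
        """Launch: register a job under a caller-chosen id."""
        if job_id in self._jobs:
            raise ValueError(f"job {job_id!r} already exists")
        self._jobs[job_id] = {"status": "running",
                              "volumes": list(volumes),
                              "error": None}

    def conclude(self, job_id, error=None):
        """Called by the (imagined) agent when the freeze finishes."""
        job = self._jobs[job_id]
        job["status"] = "concluded"
        job["error"] = error

    def query(self):
        """Read-only: return copies, never modify or reap jobs."""
        return {job_id: dict(job) for job_id, job in self._jobs.items()}

    def dismiss(self, job_id):
        """Reap a concluded job so it no longer shows up in query()."""
        if self._jobs[job_id]["status"] != "concluded":
            raise RuntimeError(f"job {job_id!r} has not concluded")
        del self._jobs[job_id]
```

Keeping the result around until an explicit dismiss means a client that
reconnects or polls late can still learn how the freeze ended.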

> 
> Regarding multiple agents, this is theoretically possible because QGA is an
> independent application. If you run each QGA instance with a proper
> different state folder and a different communication channel, it should
> work. The main problem is that QGA instances will be independent, and when
> QGA1 blocks all API execution because the guest has frozen FS, QGA2 will
> allow any command, including FS freeze.
> 
> Unfortunately, I have no good answer for you. Windows VSS has a lot of
> limitations, and we are trying to somehow work with it. Windows VSS doesn't
> even have an API to report an FS state, so QGA builds and uses internal
> knowledge that will be out of sync after a snapshot is restored.
> 
> CC: @Yan Vugenfirer <[email protected]> @Qianqian Zhu <[email protected]> Do
> you have any idea?
> 
> Best Regards,
> Kostiantyn Kostiuk.
> 
> 
> On Tue, Feb 3, 2026 at 12:23 PM Noam Assouline <[email protected]> wrote:
> 
> > Hello qemu-devel!
> >
> > I’m working on a KubeVirt fix for Windows VSS fsfreeze timeouts (PR #16653
> > <https://github.com/kubevirt/kubevirt/pull/16653>). Up to now we’ve
> > relied on libvirt’s default QEMU agent response timeout of 5 seconds, and
> > that often isn’t enough for VSS fsfreeze to complete. This PR proposes
> > increasing the timeout to 60 seconds so the freeze can finish successfully.
> >
> > The challenge and the reason for this email is that qemu-ga processes
> > commands synchronously on a single connection. While guest-fsfreeze-freeze
> > is running, the agent is effectively busy and other commands (e.g. ping,
> > status) will hang until it returns, which can impact pod readiness probes.
> > I’m checking what we can do about this.
> >
> > I’m mainly looking to understand whether this can be addressed in qemu-ga,
> > and to get guidance on the right direction. Is there a supported way to use
> > multiple agent connections/channels, or is an async guest-fsfreeze-freeze
> > with a completion event the more appropriate solution? More generally, any
> > best‑practice guidance around Windows fsfreeze timeouts and responsiveness
> > would be very helpful!
> >
> > Thanks in advance, and cc’ing qemu-ga maintainers.
> >
> > Noam
> > KubeVirt Storage Ecosystem team
> >
