Hi

On Wed, Jun 12, 2024 at 10:50 PM Kim, Dongwon <dongwon....@intel.com> wrote:

> On 6/11/2024 10:44 PM, Marc-André Lureau wrote:
> > Hi
> >
> > On Wed, Jun 12, 2024 at 5:29 AM Kim, Dongwon <dongwon....@intel.com> wrote:
> >
> >     Hi,
> >
> >     From: Marc-André Lureau <marcandre.lur...@gmail.com>
> >     Sent: Wednesday, June 5, 2024 12:56 AM
> >     To: Kim, Dongwon <dongwon....@intel.com>
> >     Cc: qemu-devel@nongnu.org; Peter Xu <pet...@redhat.com>
> >     Subject: Re: [PATCH] ui/gtk: Wait until the current guest frame is
> >     rendered before switching to RUN_STATE_SAVE_VM
> >
> >     Hi
> >
> >     On Tue, Jun 4, 2024 at 9:49 PM Kim, Dongwon <dongwon....@intel.com> wrote:
> >     On 6/4/2024 4:12 AM, Marc-André Lureau wrote:
> >      > Hi
> >      >
> >      > On Thu, May 30, 2024 at 2:44 AM <dongwon....@intel.com> wrote:
> >      >
> >      >     From: Dongwon <dongwon....@intel.com>
> >      >
> >      >     Make sure rendering of the current frame is finished before
> >      >     switching the run state to RUN_STATE_SAVE_VM by waiting for
> >      >     egl-sync object to be signaled.
> >      >
> >      >
> >      > Can you expand on what this solves?
> >
> >     In the current scheme, the guest waits for the fence to be signaled
> >     for each frame it submits before moving on to the next frame. If the
> >     guest’s state is saved while it is still waiting for the fence, then
> >     after a restore to that point the guest will keep waiting for a fence
> >     that was signaled long ago. One way to prevent this is to let it
> >     finish the current frame before changing the run state.
> >
> >     After the UI sets a fence, hw_ops->gl_block(true) gets called, which
> >     will block virtio-gpu/virgl from processing commands (until the
> >     fence is signaled and gl_block/false called again).
> >
> >     But this "blocking" state is not saved. So how does this affect
> >     save/restore? Please give more details, thanks
> >
> >     Yeah, sure. The "blocking" state is not saved, but the guest's state
> >     is saved while it is still waiting for the response to its last
> >     resource-flush virtio message. That response is sent to the guest
> >     when the pipeline is unblocked (that is, once the fence is
> >     signaled). After the guest's state is saved, the current instance
> >     of the guest continues and receives the response as usual. The
> >     problem appears when we restore the saved state: the restored guest
> >     waits for a response that was already sent, a while ago, to the
> >     original instance.
> >
> >
> > Where is the pending response saved? Can you detail how you test this?
> >
>
> There is no pending response for the guest's restored point, which is a
> problem. The response is sent out after saving is done.
>
> Normal cycle :
>
> resource-flush (scanout flush) -> gl block -> render -> gl unblock
> (after fence is signaled) -> pending response sent out to the guest ->
> guest (virtio-gpu drv) processes the next scanout frame -> (next cycle)
> resource-flush -> gl block ......
>
> When vm state is saved in the middle :
>
> resource-flush (scanout-flush) -> gl block -> saving vm-state -> render
> -> gl unblock -> pending response (resp #1) sent out to the guest ->
> guest (virtio-gpu drv) processes the next scanout frame -> (next cycle)
> resource-flush -> gl block ......
>
> Now, we restore the vm-state we saved
>
> vm-state is restored -> guest (virtio-gpu drv) can't move on as this
> state is still waiting for the response (resp #1)
>

Ok, so it's actually more of a device state issue than a UI/GTK one. We end
up not saving a state that reflects the guest state. My understanding is
that the guest is waiting for a fence reply, and we don't save that. IMHO, a
better fix would be to either save the fenceq (but then, what else is
missing to complete the operation on resume?), or have a way to delay the
migration until the fences are flushed.
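
To make the failure mode and the second option concrete, here is a toy
Python model. It is only a sketch and not QEMU code: Guest, ToyDevice,
fence_pending, save(drain_first=...), restore() and is_stuck() are all
hypothetical names, and "draining" is simplified to signaling the fence
immediately instead of really waiting on a sync object.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Guest:
    """Toy guest: tracks whether it is blocked on a resource-flush response."""
    frames_done: int = 0
    waiting_for_resp: bool = False


@dataclass
class ToyDevice:
    """Toy host side: at most one fence in flight, never part of a snapshot."""
    guest: Guest = field(default_factory=Guest)
    fence_pending: bool = False  # host-only state, not saved (assumption)

    def resource_flush(self):
        # Guest submits a frame and blocks; the host sets a fence (gl block).
        self.guest.waiting_for_resp = True
        self.fence_pending = True

    def fence_signaled(self):
        # Rendering done: gl unblock, then the pending response goes out.
        if self.fence_pending:
            self.fence_pending = False
            self.guest.waiting_for_resp = False
            self.guest.frames_done += 1

    def save(self, drain_first=False):
        # drain_first models "delay the save until the fences are flushed".
        if drain_first:
            self.fence_signaled()
        return copy.deepcopy(self.guest)  # only guest-visible state is saved


def restore(saved_guest):
    # A restored instance starts with no host-side fence in flight.
    return ToyDevice(guest=copy.deepcopy(saved_guest), fence_pending=False)


def is_stuck(dev):
    # The guest waits for a response that no pending fence will ever trigger.
    return dev.guest.waiting_for_resp and not dev.fence_pending
```

In this model, a snapshot taken between resource_flush() and
fence_signaled() restores into a guest for which is_stuck() is true, while
save(drain_first=True) produces a snapshot that restores cleanly. Saving
the fenceq instead would correspond to including fence_pending in the
snapshot, which still leaves open how the restored side gets it signaled.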


> So we need to make sure the vm-state is saved after the cycle is completed.
>
> This situation only happens if you use blob=true with the virtio-gpu
> driver as KMS on a Linux guest. Do you have any similar setup?
>
>
No, further details to reproduce would help. Even better would be having
some automated test.


-- 
Marc-André Lureau
