Hi

On Thu, Jun 13, 2024 at 9:27 PM Kim, Dongwon <dongwon....@intel.com> wrote:

> Hi Marc-André,
>
> On 6/13/2024 6:16 AM, Marc-André Lureau wrote:
> > Hi
> >
> > On Wed, Jun 12, 2024 at 10:50 PM Kim, Dongwon <dongwon....@intel.com
> > <mailto:dongwon....@intel.com>> wrote:
> >
> >     On 6/11/2024 10:44 PM, Marc-André Lureau wrote:
> >      > Hi
> >      >
> >      > On Wed, Jun 12, 2024 at 5:29 AM Kim, Dongwon
> >     <dongwon....@intel.com <mailto:dongwon....@intel.com>
> >      > <mailto:dongwon....@intel.com <mailto:dongwon....@intel.com>>>
> wrote:
> >      >
> >      >     Hi,
> >      >
> >      >     From: Marc-André Lureau <marcandre.lur...@gmail.com
> >     <mailto:marcandre.lur...@gmail.com>
> >      >     <mailto:marcandre.lur...@gmail.com
> >     <mailto:marcandre.lur...@gmail.com>>>
> >      >     Sent: Wednesday, June 5, 2024 12:56 AM
> >      >     To: Kim, Dongwon <dongwon....@intel.com
> >     <mailto:dongwon....@intel.com> <mailto:dongwon....@intel.com
> >     <mailto:dongwon....@intel.com>>>
> >      >     Cc: qemu-devel@nongnu.org <mailto:qemu-devel@nongnu.org>
> >     <mailto:qemu-devel@nongnu.org <mailto:qemu-devel@nongnu.org>>;
> Peter Xu
> >      >     <pet...@redhat.com <mailto:pet...@redhat.com>
> >     <mailto:pet...@redhat.com <mailto:pet...@redhat.com>>>
> >      >     Subject: Re: [PATCH] ui/gtk: Wait until the current guest
> >     frame is
> >      >     rendered before switching to RUN_STATE_SAVE_VM
> >      >
> >      >     Hi
> >      >
> >      >     On Tue, Jun 4, 2024 at 9:49 PM Kim, Dongwon
> >      >     <mailto:dongwon....@intel.com <mailto:dongwon....@intel.com>
> >     <mailto:dongwon....@intel.com <mailto:dongwon....@intel.com>>>
> wrote:
> >      >     On 6/4/2024 4:12 AM, Marc-André Lureau wrote:
> >      >      > Hi
> >      >      >
> >      >      > On Thu, May 30, 2024 at 2:44 AM
> >     <mailto:dongwon....@intel.com <mailto:dongwon....@intel.com>
> >      >     <mailto:dongwon....@intel.com <mailto:dongwon....@intel.com>>
> >      >      > <mailto:mailto <mailto:mailto> <mailto:mailto
> >     <mailto:mailto>>:dongwon....@intel.com <mailto:dongwon....@intel.com
> >
> >      >     <mailto:dongwon....@intel.com
> >     <mailto:dongwon....@intel.com>>>> wrote:
> >      >      >
> >      >      >     From: Dongwon <mailto:dongwon....@intel.com
> >     <mailto:dongwon....@intel.com>
> >      >     <mailto:dongwon....@intel.com <mailto:dongwon....@intel.com>>
> >     <mailto:mailto <mailto:mailto>
> >      >     <mailto:mailto <mailto:mailto>>:dongwon....@intel.com
> >     <mailto:dongwon....@intel.com> <mailto:dongwon....@intel.com
> >     <mailto:dongwon....@intel.com>>>>
> >      >      >
> >      >      >     Make sure rendering of the current frame is finished
> >     before
> >      >     switching
> >      >      >     the run state to RUN_STATE_SAVE_VM by waiting for
> egl-sync
> >      >     object to be
> >      >      >     signaled.
> >      >      >
> >      >      >
> >      >      > Can you expand on what this solves?
> >      >
> >      >     In the current scheme, the guest waits for the fence to be
> >      >     signaled for each frame it submits before moving to the next
> >      >     frame. If the guest's state is saved while it is still waiting
> >      >     for the fence, the guest will continue to wait for a fence that
> >      >     was signaled a while ago when it is restored to that point. One
> >      >     way to prevent this is to have it finish the current frame
> >      >     before changing the state.
> >      >
> >      >     After the UI sets a fence, hw_ops->gl_block(true) gets
> >     called, which
> >      >     will block virtio-gpu/virgl from processing commands (until
> the
> >      >     fence is signaled and gl_block/false called again).
> >      >
> >      >     But this "blocking" state is not saved. So how does this
> affect
> >      >     save/restore? Please give more details, thanks
> >      >
> >      >     Yeah sure. "Blocking" state is not saved but guest's state is
> >     saved
> >      >     while it was still waiting for the response for its last
> >      >     resource-flush virtio msg. This virtio response, by the way
> >     is set
> >      >     to be sent to the guest when the pipeline is unblocked (and
> >     when the
> >      >     fence is signaled.). Once the guest's state is saved, current
> >      >     instance of guest will be continued and receives the response
> as
> >      >     usual. The problem is happening when we restore the saved
> guest's
> >      >     state again because what guest does will be waiting for the
> >     response
> >      >     that was sent a while ago to the original instance.
> >      >
> >      >
> >      > Where is the pending response saved? Can you detail how you test
> >     this?
> >      >
> >
> >     There is no pending response for the guest's restored point, which
> is a
> >     problem. The response is sent out after saving is done.
> >
> >     Normal cycle :
> >
> >     resource-flush (scanout flush) -> gl block -> render -> gl unblock
> >     (after fence is signaled) -> pending response sent out to the guest
> ->
> >     guest (virtio-gpu drv) processes the next scanout frame -> (next
> cycle)
> >     resource-flush -> gl block ......
> >
> >     When vm state is saved in the middle :
> >
> >     resource-flush (scanout-flush) -> gl block -> saving vm-state ->
> render
> >     -> gl unblock -> pending response (resp #1) sent out to the guest ->
> >     guest (virtio-gpu drv) processes the next scanout frame -> (next
> cycle)
> >     resource-flush -> gl block ......
> >
> >     Now, we restore the vm-state we saved
> >
> >     vm-state is restored -> guest (virtio-gpu drv) can't move on as this
> >     state is still waiting for the response (resp #1)
> >
> >
> > Ok, so actually it's more of a device state issue than a UI/GTK. We end
> > up not saving a state that reflects the guest state. My understanding is
> > that the guest is waiting for a fence reply, and we don't save that.
> > Imho, a better fix would be to either save the fenceq (but then, what
> > else is missing to complete the operation on resume?), or have a wait to
> > delay the migration until the fences are flushed.
>
> The second method you are proposing here - 'have a wait'. I understand
> you mean delaying the start point of migration but don't you think the
> current patch is basically doing the similar thing? Assuming egl wait
> sync is what we need to use for a wait, do you have any suggestion where
> that should be called other than 'gd_change_runstate'?
>

It should be handled on the virtio-gpu side. I am not sure whether a runstate
handler or pre_save is the right place.
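
To make the "wait until the fences are flushed" option concrete, here is a
toy model of the cycle described above. It is only a sketch and not QEMU code:
ToyVm and every toy_* function are hypothetical names made up for this
illustration, and the drain loop merely stands in for waiting on the real
sync object.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these names are hypothetical and are not QEMU APIs. */
typedef struct {
    bool guest_waiting;   /* guest sent a flush and awaits its response */
    int  pending_fences;  /* host-side fences not yet signaled (never saved) */
} ToyVm;

/* Guest submits a resource-flush; the host sets a fence and blocks. */
static void toy_resource_flush(ToyVm *vm)
{
    vm->guest_waiting = true;
    vm->pending_fences++;
}

/* A fence signals; once none are left, the response reaches the guest. */
static void toy_fence_signaled(ToyVm *vm)
{
    assert(vm->pending_fences > 0);
    vm->pending_fences--;
    if (vm->pending_fences == 0) {
        vm->guest_waiting = false;
    }
}

/* The snapshot captures guest state only; the fence queue is dropped. */
static ToyVm toy_save(const ToyVm *vm)
{
    ToyVm snap = { .guest_waiting = vm->guest_waiting, .pending_fences = 0 };
    return snap;
}

/* Variant that delays the save until every pending fence is flushed. */
static ToyVm toy_save_after_drain(ToyVm *vm)
{
    while (vm->pending_fences > 0) {
        toy_fence_signaled(vm);   /* stands in for waiting on the fence */
    }
    return toy_save(vm);
}

/* A restored guest is stuck if it waits for a response nobody will send. */
static bool toy_restored_guest_stuck(const ToyVm *snap)
{
    return snap->guest_waiting && snap->pending_fences == 0;
}
```

In this model, calling toy_save() right after toy_resource_flush() yields a
snapshot for which toy_restored_guest_stuck() is true, while
toy_save_after_drain() yields one for which it is false. Saving the fence
queue instead would be the other option, which this sketch does not model.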

Peter, what is the correct way to delay migration until the host finishes
some work? (in this case we would need to wait for the rendering/UI, to
signal pending fences)

thanks

-- 
Marc-André Lureau
