Hi

On Wed, Jun 12, 2024 at 10:50 PM Kim, Dongwon <dongwon....@intel.com> wrote:
> On 6/11/2024 10:44 PM, Marc-André Lureau wrote:
> > Hi
> >
> > On Wed, Jun 12, 2024 at 5:29 AM Kim, Dongwon <dongwon....@intel.com>
> > wrote:
> >
> > Hi,
> >
> > From: Marc-André Lureau <marcandre.lur...@gmail.com>
> > Sent: Wednesday, June 5, 2024 12:56 AM
> > To: Kim, Dongwon <dongwon....@intel.com>
> > Cc: qemu-devel@nongnu.org; Peter Xu <pet...@redhat.com>
> > Subject: Re: [PATCH] ui/gtk: Wait until the current guest frame is
> > rendered before switching to RUN_STATE_SAVE_VM
> >
> > Hi
> >
> > On Tue, Jun 4, 2024 at 9:49 PM Kim, Dongwon <dongwon....@intel.com>
> > wrote:
> > On 6/4/2024 4:12 AM, Marc-André Lureau wrote:
> > > Hi
> > >
> > > On Thu, May 30, 2024 at 2:44 AM <dongwon....@intel.com> wrote:
> > >
> > > From: Dongwon <dongwon....@intel.com>
> > >
> > > Make sure rendering of the current frame is finished before
> > > switching the run state to RUN_STATE_SAVE_VM by waiting for
> > > egl-sync object to be signaled.
> > >
> > >
> > > Can you expand on what this solves?
> >
> > In current scheme, guest waits for the fence to be signaled for each
> > frame it submits before moving to the next frame. If the guest’s state
> > is saved while it is still waiting for the fence, The guest will
> > continue to wait for the fence that was signaled while ago when it is
> > restored to the point. One way to prevent it is to get it finish the
> > current frame before changing the state.
> >
> > After the UI sets a fence, hw_ops->gl_block(true) gets called, which
> > will block virtio-gpu/virgl from processing commands (until the
> > fence is signaled and gl_block/false called again).
> >
> > But this "blocking" state is not saved. So how does this affect
> > save/restore? Please give more details, thanks
> >
> > Yeah sure. "Blocking" state is not saved but guest's state is saved
> > while it was still waiting for the response for its last
> > resource-flush virtio msg. This virtio response, by the way is set
> > to be sent to the guest when the pipeline is unblocked (and when the
> > fence is signaled.). Once the guest's state is saved, current
> > instance of guest will be continued and receives the response as
> > usual. The problem is happening when we restore the saved guest's
> > state again because what guest does will be waiting for the response
> > that was sent a while ago to the original instance.
> >
> >
> > Where is the pending response saved? Can you detail how you test this?
> >
>
> There is no pending response for the guest's restored point, which is a
> problem. The response is sent out after saving is done.
>
> Normal cycle :
>
> resource-flush (scanout flush) -> gl block -> render -> gl unblock
> (after fence is signaled) -> pending response sent out to the guest ->
> guest (virtio-gpu drv) processes the next scanout frame -> (next cycle)
> resource-flush -> gl block ......
>
> When vm state is saved in the middle :
>
> resource-flush (scanout-flush) -> gl block -> saving vm-state -> render
> -> gl unblock -> pending response (resp #1) sent out to the guest ->
> guest (virtio-gpu drv) processes the next scanout frame -> (next cycle)
> resource-flush -> gl block ......
>
> Now, we restore the vm-state we saved
>
> vm-state is restored -> guest (virtio-gpu drv) can't move on as this
> state is still waiting for the response (resp #1)
>

Ok, so actually it's more of a device state issue than a UI/GTK.
We end up not saving a state that reflects the guest state. My
understanding is that the guest is waiting for a fence reply, and we
don't save that. Imho, a better fix would be to either save the fenceq
(but then, what else is missing to complete the operation on resume?),
or have a wait to delay the migration until the fences are flushed.

> So we need to make sure vm-state is saved after the cycle is completed.
>
> This situation would be only happening if you use blob=true with
> virtio-gpu drv as KMS on the linux guest. Do you have any similar setup?
>

No, further details to reproduce would help. Even better would be having
some automated test.

-- 
Marc-André Lureau
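
For illustration only, the cycle discussed in this thread can be modelled
as a toy state machine. This is a rough sketch, not QEMU code: every name
below (Guest, Device, resource_flush, fence_signaled, save_vm,
restore_and_run, stuck_after_restore) is hypothetical, and the model
assumes the snapshot captures only the guest-visible state while the
device-side pending response is dropped.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Guest:
    frame: int = 0
    waiting: bool = False  # waiting for the resource-flush response


@dataclass
class Device:
    # Responses held back while the pipeline is gl-blocked (hypothetical).
    pending: list = field(default_factory=list)


def resource_flush(guest, dev):
    """Guest submits a frame; the device blocks and queues the response."""
    guest.waiting = True
    dev.pending.append(guest.frame)


def fence_signaled(guest, dev):
    """gl unblock: every queued response is delivered to the guest."""
    while dev.pending:
        dev.pending.pop(0)
        guest.waiting = False
        guest.frame += 1


def save_vm(guest, dev, flush_first=False):
    """Snapshot the guest only; the device queue is NOT part of the snapshot."""
    if flush_first:
        fence_signaled(guest, dev)  # delay the save until fences are flushed
    return copy.deepcopy(guest)


def restore_and_run(snapshot):
    """Restore into a fresh device that has no pending response to send."""
    guest, dev = copy.deepcopy(snapshot), Device()
    fence_signaled(guest, dev)  # nothing queued, so nothing is delivered
    return guest


def stuck_after_restore(flush_first):
    guest, dev = Guest(), Device()
    resource_flush(guest, dev)
    snap = save_vm(guest, dev, flush_first)
    fence_signaled(guest, dev)  # the original instance carries on as usual
    return restore_and_run(snap).waiting


if __name__ == "__main__":
    print("stuck without flush:", stuck_after_restore(False))
    print("stuck with flush:   ", stuck_after_restore(True))
```

In this model the restored guest stays in the waiting state unless the
save is delayed until the queued response has been delivered, which
corresponds to the "delay until the fences are flushed" option above.
Saving the queue itself would be the other option and is not modelled.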