On Fri, Apr 05, 2024 at 12:48:15AM +0800, Wang, Lei wrote:
> On 4/5/2024 0:25, Wang, Wei W wrote:
> > On Thursday, April 4, 2024 10:12 PM, Peter Xu wrote:
> >> On Thu, Apr 04, 2024 at 06:05:50PM +0800, Wei Wang wrote:
> >>> Before loading the guest states, ensure that the preempt channel has
> >>> been ready to use, as some of the states (e.g. via virtio_load) might
> >>> trigger page faults that will be handled through the preempt channel.
> >>> So yield to the main thread in the case that the channel create event
> >>> has been dispatched.
> >>>
> >>> Originally-by: Lei Wang <lei4.w...@intel.com>
> >>> Link: https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced249@intel.com/T/
> >>> Suggested-by: Peter Xu <pet...@redhat.com>
> >>> Signed-off-by: Lei Wang <lei4.w...@intel.com>
> >>> Signed-off-by: Wei Wang <wei.w.w...@intel.com>
> >>> ---
> >>>  migration/savevm.c | 17 +++++++++++++++++
> >>>  1 file changed, 17 insertions(+)
> >>>
> >>> diff --git a/migration/savevm.c b/migration/savevm.c
> >>> index 388d7af7cd..fbc9f2bdd4 100644
> >>> --- a/migration/savevm.c
> >>> +++ b/migration/savevm.c
> >>> @@ -2342,6 +2342,23 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> >>>
> >>>      QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
> >>>
> >>> +    /*
> >>> +     * Before loading the guest states, ensure that the preempt channel
> >>> +     * has been ready to use, as some of the states (e.g. via virtio_load)
> >>> +     * might trigger page faults that will be handled through the preempt
> >>> +     * channel. So yield to the main thread in the case that the channel
> >>> +     * create event has been dispatched.
> >>> +     */
> >>> +    do {
> >>> +        if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
> >>> +            mis->postcopy_qemufile_dst) {
> >>> +            break;
> >>> +        }
> >>> +
> >>> +        aio_co_schedule(qemu_get_current_aio_context(),
> >>> +                        qemu_coroutine_self());
> >>> +        qemu_coroutine_yield();
> >>> +    } while (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done, 1));
> >>
> >> I think we need s/!// here, so the same mistake I made?  I think we need
> >> to rework the retval of qemu_sem_timedwait() at some point later..
> >
> > No. qemu_sem_timedwait returns false when timeout, which means sem isn't
> > posted yet. So it needs to go back to the loop. (the patch was tested)
>
> On timeout, qemu_sem_timedwait() will return -1. I think the patch test may
> have passed because you will always get at least one yield (the first yield
> in the do ... while ...) in loadvm_handle_cmd_packaged()?
My guess is that here the kick will work and qemu_sem_timedwait() later will
hit ETIMEDOUT, so qemu_sem_timedwait() returns -1, and the loop just breaks.
That aio schedule should make sure the file is ready anyway; the preempt
thread must run before this, otherwise that thread would hang.

I think it more or less justifies that the retval needs to be properly
defined. :( The confusion is compounded by the fact that libpthread returns
positive error codes.

Thanks,

-- 
Peter Xu