On Friday, April 5, 2024 10:33 AM, Peter Xu wrote:
> On Fri, Apr 05, 2024 at 01:38:31AM +0000, Wang, Wei W wrote:
> > On Friday, April 5, 2024 4:57 AM, Peter Xu wrote:
> > > On Fri, Apr 05, 2024 at 12:48:15AM +0800, Wang, Lei wrote:
> > > > On 4/5/2024 0:25, Wang, Wei W wrote:
> > > > > On Thursday, April 4, 2024 10:12 PM, Peter Xu wrote:
> > > > >> On Thu, Apr 04, 2024 at 06:05:50PM +0800, Wei Wang wrote:
> > > > >>> Before loading the guest states, ensure that the preempt
> > > > >>> channel has been ready to use, as some of the states (e.g. via
> > > > >>> virtio_load) might trigger page faults that will be handled
> > > > >>> through the preempt channel. So yield to the main thread in
> > > > >>> the case that the channel create event has been dispatched.
> > > > >>>
> > > > >>> Originally-by: Lei Wang <lei4.w...@intel.com>
> > > > >>> Link: https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced249@intel.com/T/
> > > > >>> Suggested-by: Peter Xu <pet...@redhat.com>
> > > > >>> Signed-off-by: Lei Wang <lei4.w...@intel.com>
> > > > >>> Signed-off-by: Wei Wang <wei.w.w...@intel.com>
> > > > >>> ---
> > > > >>>  migration/savevm.c | 17 +++++++++++++++++
> > > > >>>  1 file changed, 17 insertions(+)
> > > > >>>
> > > > >>> diff --git a/migration/savevm.c b/migration/savevm.c
> > > > >>> index 388d7af7cd..fbc9f2bdd4 100644
> > > > >>> --- a/migration/savevm.c
> > > > >>> +++ b/migration/savevm.c
> > > > >>> @@ -2342,6 +2342,23 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
> > > > >>>
> > > > >>>      QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
> > > > >>>
> > > > >>> +    /*
> > > > >>> +     * Before loading the guest states, ensure that the preempt channel has
> > > > >>> +     * been ready to use, as some of the states (e.g. via virtio_load) might
> > > > >>> +     * trigger page faults that will be handled through the preempt channel.
> > > > >>> +     * So yield to the main thread in the case that the channel create event
> > > > >>> +     * has been dispatched.
> > > > >>> +     */
> > > > >>> +    do {
> > > > >>> +        if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
> > > > >>> +            mis->postcopy_qemufile_dst) {
> > > > >>> +            break;
> > > > >>> +        }
> > > > >>> +
> > > > >>> +        aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
> > > > >>> +        qemu_coroutine_yield();
> > > > >>> +    } while (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done, 1));
> > > > >>
> > > > >> I think we need s/!// here, so the same mistake I made? I think we
> > > > >> need to rework the retval of qemu_sem_timedwait() at some point later..
> > > > >
> > > > > No. qemu_sem_timedwait returns false when timeout, which means the sem
> > > > > isn't posted yet. So it needs to go back to the loop. (the patch was tested)
> > > >
> > > > When timeout, qemu_sem_timedwait() will return -1. I think the patch
> > > > test passed maybe because you will always have at least one yield (the
> > > > first yield in the do ... while ...) when loadvm_handle_cmd_packaged()?
> > >
> > > My guess is that here the kick will work and qemu_sem_timedwait() later
> > > will ETIMEDOUT -> qemu_sem_timedwait() returns -1, then the loop just
> > > broke. That aio schedule should make sure anyway that the file is ready;
> > > the preempt thread must run before this to not hang that thread.
> >
> > Yes, misread of the return value. It still worked because the loop
> > broke at the "if (mis->postcopy_qemufile_dst)" check.
> >
> > Even below will work:
> > do {
> >     if (mis->postcopy_qemufile_dst) {
> >         break;
> >     }
> >     ...
> > } while (1);
> >
> > I still don't see the value of using the postcopy_qemufile_dst_done sem
> > in the code though. It simply blocks the main thread from creating the
> > preempt channel for 1ms (regardless of whether the sem has been posted
> > or not; we add it for the case it is not posted and need to go back to
> > the loop).
>
> I think it used to only wait() in the preempt thread, so that is needed.
> It's also needed when postcopy is interrupted and needs a recovery, see
> loadvm_postcopy_handle_resume(); in that case it's the postcopy ram load
> thread that waits for it rather than the main thread or preempt thread.
>
> Indeed if we move channel creation out of the preempt thread then it seems
> we don't need the sem in this path. However the other path will still need
> it; then when the new channel is created (postcopy_preempt_new_channel)
> we'll need to identify a "switch to postcopy" case or a "postcopy recovery"
> case, and only post the sem in the former. I think it might complicate the
> code. I'll think again tomorrow after a sleep so my brain will work better,
> but I doubt this is what we want to do at rc3.
Yes, it's a bit rushed (no need to remove it completely at rc3).

> If you feel comfortable, please feel free to send a version that you think
> is the most correct so far (if you prefer no timedwait it's fine), and make
> sure the test works the best on your side. Then I'll smoke it a bit during
> the weekend. Please always keep in mind that if this is for rc3 it should
> be a non-intrusive change; or we keep it slow for 9.1 and we don't need to
> rush.

Sounds good.