On 4/5/2024 0:25, Wang, Wei W wrote:
> On Thursday, April 4, 2024 10:12 PM, Peter Xu wrote:
>> On Thu, Apr 04, 2024 at 06:05:50PM +0800, Wei Wang wrote:
>>> Before loading the guest states, ensure that the preempt channel is
>>> ready to use, as some of the states (e.g. via virtio_load) might
>>> trigger page faults that will be handled through the preempt channel.
>>> So yield to the main thread in the case that the channel create event
>>> has been dispatched.
>>>
>>> Originally-by: Lei Wang <lei4.w...@intel.com>
>>> Link: https://lore.kernel.org/all/9aa5d1be-7801-40dd-83fd-f7e041ced249@intel.com/T/
>>> Suggested-by: Peter Xu <pet...@redhat.com>
>>> Signed-off-by: Lei Wang <lei4.w...@intel.com>
>>> Signed-off-by: Wei Wang <wei.w.w...@intel.com>
>>> ---
>>>  migration/savevm.c | 17 +++++++++++++++++
>>>  1 file changed, 17 insertions(+)
>>>
>>> diff --git a/migration/savevm.c b/migration/savevm.c
>>> index 388d7af7cd..fbc9f2bdd4 100644
>>> --- a/migration/savevm.c
>>> +++ b/migration/savevm.c
>>> @@ -2342,6 +2342,23 @@ static int loadvm_handle_cmd_packaged(MigrationIncomingState *mis)
>>>
>>>      QEMUFile *packf = qemu_file_new_input(QIO_CHANNEL(bioc));
>>>
>>> +    /*
>>> +     * Before loading the guest states, ensure that the preempt channel has
>>> +     * been ready to use, as some of the states (e.g. via virtio_load) might
>>> +     * trigger page faults that will be handled through the preempt channel.
>>> +     * So yield to the main thread in the case that the channel create event
>>> +     * has been dispatched.
>>> +     */
>>> +    do {
>>> +        if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
>>> +            mis->postcopy_qemufile_dst) {
>>> +            break;
>>> +        }
>>> +
>>> +        aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
>>> +        qemu_coroutine_yield();
>>> +    } while (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done, 1));
>>
>> I think we need s/!// here, so the same mistake I made?  I think we need to
>> rework the retval of qemu_sem_timedwait() at some point later..
> 
> No. qemu_sem_timedwait returns false on timeout, which means the sem isn't
> posted yet. So it needs to go back to the loop. (the patch was tested)

When it times out, qemu_sem_timedwait() returns -1, not false, so the "!" makes
the loop exit while the channel may still not be ready. I think the patch test
passed probably because you always get at least one yield (the first yield in
the do ... while) in loadvm_handle_cmd_packaged(), which may already be enough
for the main loop to dispatch the channel create event?

> 
>>
>> Besides, this patch kept the sem_wait() in postcopy_preempt_thread() so it
>> will wait() on this sem again.  If this qemu_sem_timedwait() accidentally
>> consumed the sem count then I think the other thread can hang forever?
> 
> I see the issue you mentioned, and it seems better to place the wait before the
> creation of the preempt thread. Then we probably don't need the sem_wait in the
> preempt thread, as the channel is guaranteed to be ready when it runs?
> 
> Update will be:
> 
> diff --git a/migration/postcopy-ram.c b/migration/postcopy-ram.c
> index eccff499cb..5a70ce4f23 100644
> --- a/migration/postcopy-ram.c
> +++ b/migration/postcopy-ram.c
> @@ -1254,6 +1254,15 @@ int postcopy_ram_incoming_setup(MigrationIncomingState *mis)
>      }
> 
>      if (migrate_postcopy_preempt()) {
> +        do {
> +            if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
> +                mis->postcopy_qemufile_dst) {
> +                break;
> +            }
> +            aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
> +            qemu_coroutine_yield();
> +        } while (!qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done, 1));
> +
>          /*
>           * This thread needs to be created after the temp pages because
>           * it'll fetch RAM_CHANNEL_POSTCOPY PostcopyTmpPage immediately.
> @@ -1743,12 +1752,6 @@ void *postcopy_preempt_thread(void *opaque)
> 
>      qemu_sem_post(&mis->thread_sync_sem);
> 
> -    /*
> -     * The preempt channel is established in asynchronous way.  Wait
> -     * for its completion.
> -     */
> -    qemu_sem_wait(&mis->postcopy_qemufile_dst_done);
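
One thing to double check with this move: the relocated loop carries over the
inverted qemu_sem_timedwait() check, so it would still exit on the first
timeout. A rough sketch of the moved block with the "!" dropped (everything
else kept as in your diff, same assumption about the 0/-1 return convention):

        do {
            if (!migrate_postcopy_preempt() || !qemu_in_coroutine() ||
                mis->postcopy_qemufile_dst) {
                break;
            }
            aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
            qemu_coroutine_yield();
            /* -1 (timeout) keeps looping; 0 (sem posted) ends the wait */
        } while (qemu_sem_timedwait(&mis->postcopy_qemufile_dst_done, 1));

With that in place, by the time postcopy_preempt_thread() runs the channel
should indeed be ready, so dropping its qemu_sem_wait() looks fine to me.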