On Wed, Jul 21, 2021 at 03:34:06PM -0400, Peter Xu wrote:
> Accessing from_dst_file is potentially racy in current code base like below:
> 
>   if (s->from_dst_file)
>     do_something(s->from_dst_file);
> 
> Because from_dst_file can be reset right after the check in another
> thread (rp_thread).  One example is migrate_fd_cancel().
> 
> Use the same qemu_file_lock to protect it too, just like to_dst_file.
> 
> When it's safe to access without lock, comment it.
> 
> There's one special reference in migration_thread() that can be replaced by
> the newly introduced rp_thread_created flag.
> 
> Reported-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
> Signed-off-by: Peter Xu <pet...@redhat.com>
> ---
>  migration/migration.c | 32 +++++++++++++++++++++++++-------
>  migration/migration.h |  8 +++++---
>  migration/ram.c       |  1 +
>  3 files changed, 31 insertions(+), 10 deletions(-)
> 
> diff --git a/migration/migration.c b/migration/migration.c
> index 21b94f75a3..fa70400f98 100644
> --- a/migration/migration.c
> +++ b/migration/migration.c
> @@ -1879,10 +1879,12 @@ static void migrate_fd_cancel(MigrationState *s)
>      QEMUFile *f = migrate_get_current()->to_dst_file;
>      trace_migrate_fd_cancel();
>  
> +    qemu_mutex_lock(&s->qemu_file_lock);
>      if (s->rp_state.from_dst_file) {
>          /* shutdown the rp socket, so causing the rp thread to shutdown */
>          qemu_file_shutdown(s->rp_state.from_dst_file);
>      }
> +    qemu_mutex_unlock(&s->qemu_file_lock);

Worth using WITH_QEMU_LOCK_GUARD?

> @@ -2827,11 +2845,13 @@ out:
>               * Maybe there is something we can do: it looks like a
>               * network down issue, and we pause for a recovery.
>               */
> -            qemu_fclose(rp);
> -            ms->rp_state.from_dst_file = NULL;
> +            migration_release_from_dst_file(ms);
>              rp = NULL;
>              if (postcopy_pause_return_path_thread(ms)) {
> -                /* Reload rp, reset the rest */
> +                /*
> +                 * Reload rp, reset the rest.  Referencing it is save since

s/save/safe/

> +                 * it's reset only by us above, or when migration completes
> +                 */
>                  rp = ms->rp_state.from_dst_file;
>                  ms->rp_state.error = false;
>                  goto retry;

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.           +1-919-301-3266
Virtualization:  qemu.org | libvirt.org


Reply via email to