Re: [PATCH v5 8/8] migration: Add a wrapper to cleanup migration files

2023-09-05 Thread Fabiano Rosas
Peter Xu  writes:

> On Fri, Sep 01, 2023 at 03:29:51PM -0300, Fabiano Rosas wrote:
>> Peter Xu  writes:
>> 
>> > On Thu, Aug 31, 2023 at 03:39:16PM -0300, Fabiano Rosas wrote:
>> >> @@ -1166,16 +1183,9 @@ static void migrate_fd_cleanup(MigrationState *s)
>> >>  qemu_mutex_lock_iothread();
>> >>  
>> >>  multifd_save_cleanup();
>> >> -qemu_mutex_lock(&s->qemu_file_lock);
>> >> -tmp = s->to_dst_file;
>> >> -s->to_dst_file = NULL;
>> >> -qemu_mutex_unlock(&s->qemu_file_lock);
>> >> -/*
>> >> - * Close the file handle without the lock to make sure the
>> >> - * critical section won't block for long.
>> >> - */
>> >> -migration_ioc_unregister_yank_from_file(tmp);
>> >> -qemu_fclose(tmp);
>> >> +
>> >> +migration_ioc_unregister_yank_from_file(s->to_dst_file);
>> >
>> > I think you suggested that we should always take the file lock when
>> > operating on them, so this is slightly going backwards to not hold any lock
>> > when doing it. But doing so in migrate_fd_cleanup() is probably fine (as it
>> > serializes with bql on all the rest qmp commands, neither should migration
>> > thread exist at this point).  Your call; it's still much cleaner.
>> 
>> I think I was mistaken. We need the lock on the thread that clears the
>> pointer so that we can safely dereference it on another thread under the
>> lock.
>> 
>> Here we're accessing it from the same thread that later does the
>> clearing. So that's a slightly different problem.
>
> But this is not the only place to clear it, so you still need to justify
> why the other call sites (e.g., postcopy_pause() won't happen in parallel
> with this call site.
>
> The good thing about your proposal (of always taking that lock) is we avoid
> those justifications, as you said before. :)
>

Yes, I should probably try harder to keep it under the lock.

The issue is that without using the QIOChannel reference count or
keeping a flag there's no way to pair the register/unregister of the
yank. Because 1) we'll never be sure whether the yank was previously
registered when calling the unregister and 2) we don't store the ioc, so
we need to access it from the QEMUFile, but then several QEMUFiles can
have the same ioc.

The easiest way to keep it under the lock would be to add a flag:

migration_file_release(QEMUFile **file, bool unregister_yank);

... and only set it when we're sure the yank has been registered. It is
still a bit hand-wavy though.



Re: [PATCH v5 8/8] migration: Add a wrapper to cleanup migration files

2023-09-05 Thread Peter Xu
On Fri, Sep 01, 2023 at 03:29:51PM -0300, Fabiano Rosas wrote:
> Peter Xu  writes:
> 
> > On Thu, Aug 31, 2023 at 03:39:16PM -0300, Fabiano Rosas wrote:
> >> @@ -1166,16 +1183,9 @@ static void migrate_fd_cleanup(MigrationState *s)
> >>  qemu_mutex_lock_iothread();
> >>  
> >>  multifd_save_cleanup();
> >> -qemu_mutex_lock(&s->qemu_file_lock);
> >> -tmp = s->to_dst_file;
> >> -s->to_dst_file = NULL;
> >> -qemu_mutex_unlock(&s->qemu_file_lock);
> >> -/*
> >> - * Close the file handle without the lock to make sure the
> >> - * critical section won't block for long.
> >> - */
> >> -migration_ioc_unregister_yank_from_file(tmp);
> >> -qemu_fclose(tmp);
> >> +
> >> +migration_ioc_unregister_yank_from_file(s->to_dst_file);
> >
> > I think you suggested that we should always take the file lock when
> > operating on them, so this is slightly going backwards to not hold any lock
> > when doing it. But doing so in migrate_fd_cleanup() is probably fine (as it
> > serializes with bql on all the rest qmp commands, neither should migration
> > thread exist at this point).  Your call; it's still much cleaner.
> 
> I think I was mistaken. We need the lock on the thread that clears the
> pointer so that we can safely dereference it on another thread under the
> lock.
> 
> Here we're accessing it from the same thread that later does the
> clearing. So that's a slightly different problem.

But this is not the only place to clear it, so you still need to justify
why the other call sites (e.g., postcopy_pause() won't happen in parallel
with this call site.

The good thing about your proposal (of always taking that lock) is we avoid
those justifications, as you said before. :)

Thanks,

-- 
Peter Xu




Re: [PATCH v5 8/8] migration: Add a wrapper to cleanup migration files

2023-09-01 Thread Fabiano Rosas
Peter Xu  writes:

> On Thu, Aug 31, 2023 at 03:39:16PM -0300, Fabiano Rosas wrote:
>> @@ -1166,16 +1183,9 @@ static void migrate_fd_cleanup(MigrationState *s)
>>  qemu_mutex_lock_iothread();
>>  
>>  multifd_save_cleanup();
>> -qemu_mutex_lock(&s->qemu_file_lock);
>> -tmp = s->to_dst_file;
>> -s->to_dst_file = NULL;
>> -qemu_mutex_unlock(&s->qemu_file_lock);
>> -/*
>> - * Close the file handle without the lock to make sure the
>> - * critical section won't block for long.
>> - */
>> -migration_ioc_unregister_yank_from_file(tmp);
>> -qemu_fclose(tmp);
>> +
>> +migration_ioc_unregister_yank_from_file(s->to_dst_file);
>
> I think you suggested that we should always take the file lock when
> operating on them, so this is slightly going backwards to not hold any lock
> when doing it. But doing so in migrate_fd_cleanup() is probably fine (as it
> serializes with bql on all the rest qmp commands, neither should migration
> thread exist at this point).  Your call; it's still much cleaner.

I think I was mistaken. We need the lock on the thread that clears the
pointer so that we can safely dereference it on another thread under the
lock.

Here we're accessing it from the same thread that later does the
clearing. So that's a slightly different problem.



Re: [PATCH v5 8/8] migration: Add a wrapper to cleanup migration files

2023-09-01 Thread Peter Xu
On Thu, Aug 31, 2023 at 03:39:16PM -0300, Fabiano Rosas wrote:
> @@ -1166,16 +1183,9 @@ static void migrate_fd_cleanup(MigrationState *s)
>  qemu_mutex_lock_iothread();
>  
>  multifd_save_cleanup();
> -qemu_mutex_lock(&s->qemu_file_lock);
> -tmp = s->to_dst_file;
> -s->to_dst_file = NULL;
> -qemu_mutex_unlock(&s->qemu_file_lock);
> -/*
> - * Close the file handle without the lock to make sure the
> - * critical section won't block for long.
> - */
> -migration_ioc_unregister_yank_from_file(tmp);
> -qemu_fclose(tmp);
> +
> +migration_ioc_unregister_yank_from_file(s->to_dst_file);

I think you suggested that we should always take the file lock when
operating on them, so this is slightly going backwards to not hold any lock
when doing it. But doing so in migrate_fd_cleanup() is probably fine (as it
serializes with bql on all the rest qmp commands, neither should migration
thread exist at this point).  Your call; it's still much cleaner.

Reviewed-by: Peter Xu 

-- 
Peter Xu