Prasad Pandit <[email protected]> writes:

> On Wed, 21 Jan 2026 at 18:12, Fabiano Rosas <[email protected]> wrote:
>> In general, I think that whenever we determine what protects a data
>> structure from concurrent access we should make it obvious.
>>
>> So this assert is here to provide _some_ assurance that this routine is 
>> protected.
>
> * Yes, but it is not obvious that we are checking bql_locked() to
> protect from concurrent access to the global MigrationState object.
> Secondly for concurrent access, other threads should be running.
> ===
> Thread 1 "qemu-system-x86" hit Breakpoint 1, migrate_params_apply
> (params=0x7ffcb6627840) at ../migration/options.c:1392
> 1392    {
> (gdb) n
> 1393        MigrationState *s = migrate_get_current();
> (gdb) p bql_locked()
> $3 = true
> (gdb) p migration_is_running()
> $4 = false
> (gdb)
>
> Thread 1 "qemu-system-x86" hit Breakpoint 1, migrate_params_apply
> (params=0x7ffcb6627840) at ../migration/options.c:1392
> 1392    {
> (gdb) n
> 1393        MigrationState *s = migrate_get_current();
> (gdb) p bql_locked()
> $5 = true
> (gdb) p migration_is_running()
> $6 = false
> (gdb)
> ===
> * migrate_params_apply() is called twice:
>     1. At the beginning when libvirtd(8) initiates migration
>     2. To reset migration options if we kill migration on the
> destination side by killing VM or connection.

This is one usage. People are free to invoke QMP commands whenever they want.

> Both times it is called from the main Thread-1, BQL is locked and
> other migration threads are not running.
>
> * If we are trying to futureproof this function so that no one should
> be able to call migrate_params_apply() from other threads without bql
> - that seems like a long shot. Because it is reachable only from
> qmp_migrate_set_parameters(), which is invoked by an external QMP
> user. Ie. someone would have to explicitly send
> qmp_migrate_set_parameters() command while migration is running, but
> even then it'll still be invoked by the main thread-1, with
> bql_locked=true. no?
>

Setting parameters while running is common, our tests do it all the time
to adjust convergence options.

Where the command is invoked from is a bad assumption to make. There's
coroutines, iocontexts, oob, etc. These things change all the time and
we don't see it. Look at all the work done in the block layer to assert 

In this case, it will be invoked from the main loop and with BQL taken,
yes. That's not code under the migration subsystem's control, we should
not rely on that never changing. If we can check the requirements at the
entry point into the subsystem, we'll be better protected against
changes in other parts of QEMU that we didn't oversee.

Peter is now going through the painful exercise of removing the incoming
coroutine and playing the whack-a-mole of "BQL or no BQL?". Locks should
be fine-grained, protect specific bits of data and be heavily
documented. Analysing where the function is currently invoked from and
inferring about thread-safety is a pitfall.

(note that a simple /*called with BQL taken*/ would already count)

> * I'm thinking maybe we can skip assert(bql_locked()) and related
> header inclusion.
>

Fine, on the basis that it's out of scope for the patch anyway.

> Thank you.
> ---
>   - Prasad

Reply via email to