"Denis V. Lunev" <d...@openvz.org> wrote:
> Release qemu global mutex before calling synchronize_rcu().
> synchronize_rcu() waits for all readers to finish their critical
> sections. There is at least one critical section in which we try
> to get QGM (the critical section is in address_space_rw() and
> prepare_mmio_access() tries to acquire QGM).
>
> Both functions (migration_end() and migration_bitmap_extend())
> are called from main thread which is holding QGM.
>
> Thus there is a race condition that ends up with deadlock:
> main thread     working thread
> Lock QGM                |
> |             Call KVM_EXIT_IO handler
> |                       |
> |        Open rcu reader's critical section
> Migration cleanup bh    |
> |                       |
> synchronize_rcu() is    |
> waiting for readers     |
> |            prepare_mmio_access() is waiting for QGM
>   \                   /
>          deadlock
>
> The patch changes bitmap freeing from direct g_free after synchronize_rcu
> to free inside call_rcu.
>
> Signed-off-by: Denis V. Lunev <d...@openvz.org>
> Reported-by: Igor Redko <red...@virtuozzo.com>
> Tested-by: Igor Redko <red...@virtuozzo.com>
> CC: Anna Melekhova <an...@virtuozzo.com>
> CC: Juan Quintela <quint...@redhat.com>
> CC: Amit Shah <amit.s...@redhat.com>
> CC: Paolo Bonzini <pbonz...@redhat.com>
> CC: Wen Congyang <we...@cn.fujitsu.com>

Reviewed-by: Juan Quintela <quint...@redhat.com>

Applied to my tree.

PS: no, I still don't understand how we got so many RCU corner cases wrong.
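
For readers following the thread, the difference between waiting for readers and deferring the free can be sketched with a single-threaded toy model. Everything below is hypothetical illustration: the toy_* names are invented here, the "threads" are interleaved by hand, and none of it is QEMU's RCU implementation or the code from the patch. It only shows why queueing the free (the call_rcu approach) lets the main thread carry on and release QGM, where a blocking wait would hold QGM while the reader is itself waiting for it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Toy stand-ins only; these are NOT QEMU's RCU primitives. */
struct toy_deferred {
    void (*func)(void *);
    void *arg;
    struct toy_deferred *next;
};

static int toy_active_readers;            /* readers inside a critical section */
static struct toy_deferred *toy_pending;  /* queued reclamation callbacks */
static int toy_freed_count;               /* bitmaps actually freed */

static void toy_read_lock(void)   { toy_active_readers++; }
static void toy_read_unlock(void) { toy_active_readers--; }

/* In the spirit of call_rcu: queue the callback and return at once,
 * so the caller never blocks while holding the big lock. */
static void toy_call_deferred(void (*func)(void *), void *arg)
{
    struct toy_deferred *node = malloc(sizeof *node);
    assert(node != NULL);
    node->func = func;
    node->arg = arg;
    node->next = toy_pending;
    toy_pending = node;
}

/* Run from a context that holds no big lock. Callbacks run only once
 * no reader is active; returns false if it had to back off. */
static bool toy_reclaim(void)
{
    if (toy_active_readers > 0) {
        return false;
    }
    while (toy_pending != NULL) {
        struct toy_deferred *node = toy_pending;
        toy_pending = node->next;
        node->func(node->arg);
        free(node);
    }
    return true;
}

static void toy_free_bitmap(void *bitmap)
{
    free(bitmap);
    toy_freed_count++;
}

/* Hand-interleaved version of the scenario in the diagram.
 * Returns 1 if the free was deferred past the reader and then ran. */
static int demo_deferred_free(void)
{
    int freed_before = toy_freed_count;
    unsigned long *bitmap = calloc(4, sizeof *bitmap);
    assert(bitmap != NULL);

    toy_read_lock();                            /* worker: enters critical section */
    toy_call_deferred(toy_free_bitmap, bitmap); /* main: queues free, does not wait */
    bool ran_early = toy_reclaim();             /* reader still active: nothing freed */
    toy_read_unlock();                          /* worker: got the lock, finished */
    bool ran_late = toy_reclaim();              /* now the bitmap is freed */

    return !ran_early && ran_late && toy_freed_count == freed_before + 1;
}
```

Calling demo_deferred_free() returns 1 in this model: the bitmap survives while the reader is active and is freed only afterwards. Real grace-period tracking is considerably more involved than a single reader counter.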
