On Fri, 2015-09-25 at 17:46 +0800, Wen Congyang wrote:
> On 09/25/2015 05:09 PM, Denis V. Lunev wrote:
> > Release qemu global mutex before call synchronize_rcu().
> > synchronize_rcu() waiting for all readers to finish their critical
> > sections. There is at least one critical section in which we try
> > to get QGM (critical section is in address_space_rw() and
> > prepare_mmio_access() is trying to acquire QGM).
> >
> > Both functions (migration_end() and migration_bitmap_extend())
> > are called from main thread which is holding QGM.
> >
> > Thus there is a race condition that ends up with deadlock:
> > main thread                         working thread
> > Lock QGM                                  |
> >       |                          Call KVM_EXIT_IO handler
> >       |                                   |
> >       |                    Open rcu reader's critical section
> > Migration cleanup bh                      |
> >       |                                   |
> > synchronize_rcu() is                      |
> > waiting for readers                       |
> >       |            prepare_mmio_access() is waiting for QGM
> >         \                                /
> >                     deadlock
> >
> > Patches here are quick and dirty, compile-tested only to validate the
> > architectural approach.
> >
> > Igor, Anna, can you pls start your tests with these patches instead of your
> > original one. Thank you.
>
> Can you give me the backtrace of the working thread?
>
> I think it is very bad to wait some lock in rcu reader's critical section.

#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007f1ef113ccfd in __GI___pthread_mutex_lock (mutex=0x7f1ef4145ce0 <qemu_global_mutex>) at ../nptl/pthread_mutex_lock.c:80
#2  0x00007f1ef3c36546 in qemu_mutex_lock (mutex=0x7f1ef4145ce0 <qemu_global_mutex>) at util/qemu-thread-posix.c:73
#3  0x00007f1ef387ff46 in qemu_mutex_lock_iothread () at /home/user/my_qemu/qemu/cpus.c:1170
#4  0x00007f1ef38514a2 in prepare_mmio_access (mr=0x7f1ef612f200) at /home/user/my_qemu/qemu/exec.c:2390
#5  0x00007f1ef385157e in address_space_rw (as=0x7f1ef40ec940 <address_space_io>, addr=49402, attrs=..., buf=0x7f1ef3f97000 "\001", len=1, is_write=true) at /home/user/my_qemu/qemu/exec.c:2425
#6  0x00007f1ef3897c53 in kvm_handle_io (port=49402, attrs=..., data=0x7f1ef3f97000, direction=1, size=1, count=1) at /home/user/my_qemu/qemu/kvm-all.c:1680
#7  0x00007f1ef3898144 in kvm_cpu_exec (cpu=0x7f1ef5010fc0) at /home/user/my_qemu/qemu/kvm-all.c:1849
#8  0x00007f1ef387fa91 in qemu_kvm_cpu_thread_fn (arg=0x7f1ef5010fc0) at /home/user/my_qemu/qemu/cpus.c:979
#9  0x00007f1ef113a6aa in start_thread (arg=0x7f1eef0b9700) at pthread_create.c:333
#10 0x00007f1ef0e6feed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

>
> >
> > Signed-off-by: Denis V. Lunev <d...@openvz.org>
> > CC: Igor Redko <red...@virtuozzo.com>
> > CC: Anna Melekhova <an...@virtuozzo.com>
> > CC: Juan Quintela <quint...@redhat.com>
> > CC: Amit Shah <amit.s...@redhat.com>
> >
> > Denis V. Lunev (2):
> >   migration: bitmap_set is unnecessary as bitmap_new uses g_try_malloc0
> >   migration: fix deadlock
> >
> >  migration/ram.c | 45 ++++++++++++++++++++++++++++-----------------
> >  1 file changed, 28 insertions(+), 17 deletions(-)
> >
>
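
For illustration only, the lock ordering described in the patch cover letter can be modelled with plain pthreads. This is a toy sketch, not QEMU code: big_lock stands in for QGM, the toy_* helpers are a crude reader-counting stand-in for RCU (real synchronize_rcu() only waits for readers that started before the call), and every name below is hypothetical. It shows the fixed ordering, where the main thread drops the big lock before waiting for readers.

```c
/* Toy model of the deadlock and its fix. NOT QEMU code; all names are hypothetical. */
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t big_lock = PTHREAD_MUTEX_INITIALIZER;   /* stands in for QGM */
static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER; /* protects the reader bookkeeping */
static pthread_cond_t state_cond = PTHREAD_COND_INITIALIZER;
static int active_readers;   /* readers currently inside a read-side section */
static bool reader_started;  /* set once the worker has entered its section */
static bool reader_done;     /* written and read under big_lock */

/* Enter a read-side critical section (crude stand-in for rcu_read_lock()). */
static void toy_read_lock(void)
{
    pthread_mutex_lock(&state_lock);
    active_readers++;
    reader_started = true;
    pthread_cond_broadcast(&state_cond);
    pthread_mutex_unlock(&state_lock);
}

/* Leave the read-side critical section. */
static void toy_read_unlock(void)
{
    pthread_mutex_lock(&state_lock);
    active_readers--;
    pthread_cond_broadcast(&state_cond);
    pthread_mutex_unlock(&state_lock);
}

/* Crude stand-in for synchronize_rcu(): block until no reader is active. */
static void toy_synchronize(void)
{
    pthread_mutex_lock(&state_lock);
    while (active_readers > 0) {
        pthread_cond_wait(&state_cond, &state_lock);
    }
    pthread_mutex_unlock(&state_lock);
}

/* Worker: takes the big lock inside its read-side section,
 * like prepare_mmio_access() called from address_space_rw(). */
static void *worker(void *arg)
{
    (void)arg;
    toy_read_lock();
    pthread_mutex_lock(&big_lock);
    reader_done = true;
    pthread_mutex_unlock(&big_lock);
    toy_read_unlock();
    return NULL;
}

/* Returns 0 if the worker finished and the main thread did not hang. */
int run_demo(void)
{
    pthread_t tid;
    int result;

    active_readers = 0;
    reader_started = false;
    reader_done = false;

    pthread_mutex_lock(&big_lock);  /* main thread holds the big lock */
    if (pthread_create(&tid, NULL, worker, NULL) != 0) {
        pthread_mutex_unlock(&big_lock);
        return -1;
    }

    /* Force the problematic interleaving: wait until the reader is inside its section. */
    pthread_mutex_lock(&state_lock);
    while (!reader_started) {
        pthread_cond_wait(&state_cond, &state_lock);
    }
    pthread_mutex_unlock(&state_lock);

    /* Calling toy_synchronize() here, with big_lock still held, would never return:
     * the reader needs big_lock to finish, and we would be waiting for the reader. */
    pthread_mutex_unlock(&big_lock);  /* the fix: drop the big lock first */
    toy_synchronize();
    pthread_mutex_lock(&big_lock);    /* retake it before continuing */

    result = reader_done ? 0 : 1;
    pthread_mutex_unlock(&big_lock);
    pthread_join(tid, NULL);
    return result;
}
```

Calling run_demo() from a main() and building with -pthread should be enough to try it. Moving toy_synchronize() above the pthread_mutex_unlock(&big_lock) call should reproduce the hang, with the worker parked in pthread_mutex_lock() much like frame #1 of the backtrace above.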