From: Hao Wang <wanghao...@huawei.com>

If an error occurs while the multifd send thread is being created (e.g. the
channel breaks because the new domain is destroyed by the destination),
multifd_tls_handshake_thread may exit silently, leaving the main migration
thread hanging (ram_save_setup -> multifd_send_sync_main ->
qemu_sem_wait(&p->sem_sync)).

Fix that by adding error handling in multifd_tls_handshake_thread.

Signed-off-by: Hao Wang <wanghao...@huawei.com>
Message-Id: <20210209104237.2250941-3-wanghao...@huawei.com>
Reviewed-by: Daniel P. Berrangé <berra...@redhat.com>
Reviewed-by: Chuan Zheng <zhengch...@huawei.com>
Signed-off-by: Dr. David Alan Gilbert <dgilb...@redhat.com>
---
 migration/multifd.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/migration/multifd.c b/migration/multifd.c
index 2a1ea85ade..03527c564c 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -739,7 +739,16 @@ static void multifd_tls_outgoing_handshake(QIOTask *task,
     } else {
         trace_multifd_tls_outgoing_handshake_complete(ioc);
     }
-    multifd_channel_connect(p, ioc, err);
+
+    if (!multifd_channel_connect(p, ioc, err)) {
+        /*
+         * Error happen, mark multifd_send_thread status as 'quit' although it
+         * is not created, and then tell who pay attention to me.
+         */
+        p->quit = true;
+        qemu_sem_post(&multifd_send_state->channels_ready);
+        qemu_sem_post(&p->sem_sync);
+    }
 }
 
 static void *multifd_tls_handshake_thread(void *opaque)
-- 
2.30.2
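
For readers unfamiliar with the hang, the pattern behind the fix can be
sketched outside QEMU with plain POSIX threads and semaphores. This is only an
illustration under stated assumptions, not QEMU code: Channel,
channel_connect, handshake_thread and run_setup are hypothetical names, the
fail_connect flag merely simulates a broken channel, and the second semaphore
from the patch (channels_ready) is left out for brevity. It assumes a platform
with unnamed POSIX semaphores, such as Linux. The point it shows is that the
setup thread marks the channel as quit and posts the semaphore on failure, so
the waiter always wakes up.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a send channel; not QEMU's MultiFDSendParams. */
typedef struct {
    sem_t sem_sync;     /* the main thread blocks on this */
    bool quit;          /* set when the channel will never become usable */
    bool fail_connect;  /* illustration only: simulate a broken channel */
} Channel;

/* Hypothetical connect step: returns true on success, false on error. */
static bool channel_connect(Channel *c)
{
    return !c->fail_connect;
}

/* Setup thread: must wake the waiter on BOTH the success and error paths. */
static void *handshake_thread(void *opaque)
{
    Channel *c = opaque;

    if (!channel_connect(c)) {
        /* Error path: record the failure, then wake the waiter anyway. */
        c->quit = true;
        sem_post(&c->sem_sync);
        return NULL;
    }

    /* Success path: a real sender would start here and post when synced. */
    sem_post(&c->sem_sync);
    return NULL;
}

/* Returns true if the channel came up, false if it was marked as quit. */
static bool run_setup(bool fail_connect)
{
    Channel c = { .quit = false, .fail_connect = fail_connect };
    pthread_t tid;
    int rc;

    rc = sem_init(&c.sem_sync, 0, 0);
    assert(rc == 0);
    rc = pthread_create(&tid, NULL, handshake_thread, &c);
    assert(rc == 0);
    (void)rc;

    /* Without the error-path sem_post above, this wait would never return. */
    while (sem_wait(&c.sem_sync) == -1 && errno == EINTR) {
        /* retry if interrupted by a signal */
    }

    pthread_join(tid, NULL);
    sem_destroy(&c.sem_sync);
    return !c.quit;
}
```

Calling run_setup(true) returns false instead of blocking forever; removing the
sem_post from the error path reproduces the kind of hang described in the
commit message.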