When a send thread encounters an error (as happens with yank), it
sets multifd_send_state->exiting and the other send threads exit too.
This races with multifd_send_sync_main(), which then hangs at
qemu_sem_wait(&p->sem_sync) (line 647), waiting for threads that
have already exited.

Fix this by calling multifd_send_kick_main() to kick the semaphores
before a send thread exits.

I encountered this hang when stress testing the colo unit test,
though I was unable to write a migration test to reliably hit this.

Signed-off-by: Lukas Straub <[email protected]>
---
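For illustration only (not part of the patch): a minimal standalone
sketch of the hang and the kick, using POSIX threads and semaphores.
The Channel struct, the worker() and run_demo() names and the global
exiting flag are hypothetical stand-ins, not QEMU code, and the sketch
assumes the kick amounts to posting the semaphore the main thread is
waiting on.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for one send channel (not QEMU's struct). */
typedef struct {
    sem_t sem;       /* main -> worker: work is pending, or please exit */
    sem_t sem_sync;  /* worker -> main: sync request has been handled */
} Channel;

static atomic_bool exiting;  /* models multifd_send_state->exiting */

static void *worker(void *opaque)
{
    Channel *c = opaque;

    for (;;) {
        sem_wait(&c->sem);
        if (atomic_load(&exiting)) {
            /* The kick: wake a main thread that may already be waiting.
             * Without this post, main blocks forever on sem_sync. */
            sem_post(&c->sem_sync);
            break;
        }
        sem_post(&c->sem_sync);  /* normal path: acknowledge the sync */
    }
    return NULL;
}

/* Returns the number of sync waits that completed in the main thread. */
int run_demo(void)
{
    Channel c;
    pthread_t t;
    int completed = 0;

    sem_init(&c.sem, 0, 0);
    sem_init(&c.sem_sync, 0, 0);
    atomic_store(&exiting, false);
    pthread_create(&t, NULL, worker, &c);

    /* Normal sync round trip. */
    sem_post(&c.sem);
    sem_wait(&c.sem_sync);
    completed++;

    /* An error elsewhere sets the flag, then main requests a sync. */
    atomic_store(&exiting, true);
    sem_post(&c.sem);
    sem_wait(&c.sem_sync);  /* returns only because the worker kicked */
    completed++;

    pthread_join(t, NULL);
    sem_destroy(&c.sem);
    sem_destroy(&c.sem_sync);
    return completed;
}
```

Calling run_demo() from a main() and building with -pthread on Linux
should return after both waits; removing the sem_post() in the exiting
branch makes the second sem_wait() block forever, which is the hang
described above.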
 migration/multifd.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/migration/multifd.c b/migration/multifd.c
index 220ed8564960fdabc58e4baa069dd252c8ad293c..e8c85cb6c48deaee2c9bda7b821a976166d78c9c 100644
--- a/migration/multifd.c
+++ b/migration/multifd.c
@@ -677,6 +677,7 @@ static void *multifd_send_thread(void *opaque)
         qemu_sem_wait(&p->sem);
 
         if (multifd_send_should_exit()) {
+            multifd_send_kick_main(p);
             break;
         }
 

-- 
2.39.5

