On 2/2/24 15:42, Fabiano Rosas wrote:
Cédric Le Goater <c...@redhat.com> writes:
In case of error, close_return_path_on_source() can perform a shutdown
to exit the return-path thread. However, in migrate_fd_cleanup(),
'to_dst_file' is closed before calling close_return_path_on_source()
and the shutdown fails, leaving the source and destination waiting for
an event to occur.
At close_return_path_on_source, qemu_file_shutdown() and checking
ms->to_dst_file are done under the qemu_file_lock, so how could
migrate_fd_cleanup() have cleared the pointer but the ms->to_dst_file
check have passed?
This is not a locking issue, it's much simpler. migrate_fd_cleanup()
clears the ms->to_dst_file pointer and closes the QEMUFile and then
calls close_return_path_on_source() which then tries to use resources
which are not available anymore.
Thanks,
C.
Close the file after calling close_return_path_on_source() so that the
shutdown succeeds and the return-path thread exits.
Signed-off-by: Cédric Le Goater <c...@redhat.com>
---
migration/migration.c | 12 +++++-------
1 file changed, 5 insertions(+), 7 deletions(-)
diff --git a/migration/migration.c b/migration/migration.c
index
2c3362235c7651c11d581f3c3639571f1f9636ef..1e0b6acaedc272e8ce26ad40be2c42177f5fd14e
100644
--- a/migration/migration.c
+++ b/migration/migration.c
@@ -1314,6 +1314,7 @@ void migrate_set_state(int *state, int old_state, int
new_state)
static void migrate_fd_cleanup(MigrationState *s)
{
int file_error = 0;
+ QEMUFile *tmp = NULL;
g_free(s->hostname);
s->hostname = NULL;
@@ -1323,8 +1324,6 @@ static void migrate_fd_cleanup(MigrationState *s)
qemu_savevm_state_cleanup();
if (s->to_dst_file) {
- QEMUFile *tmp;
-
trace_migrate_fd_cleanup();
bql_unlock();
if (s->migration_thread_running) {
@@ -1344,15 +1343,14 @@ static void migrate_fd_cleanup(MigrationState *s)
* critical section won't block for long.
*/
migration_ioc_unregister_yank_from_file(tmp);
- qemu_fclose(tmp);
}
- /*
- * We already cleaned up to_dst_file, so errors from the return
- * path might be due to that, ignore them.
- */
close_return_path_on_source(s, file_error);
+ if (tmp) {
+ qemu_fclose(tmp);
+ }
+
assert(!migration_is_active(s));
if (s->state == MIGRATION_STATUS_CANCELLING) {