Cédric Le Goater <c...@redhat.com> writes:

> On 2/2/24 15:42, Fabiano Rosas wrote:
>> Cédric Le Goater <c...@redhat.com> writes:
>>
>>> In case of error, close_return_path_on_source() can perform a shutdown
>>> to exit the return-path thread. However, in migrate_fd_cleanup(),
>>> 'to_dst_file' is closed before calling close_return_path_on_source()
>>> and the shutdown fails, leaving the source and destination waiting for
>>> an event to occur.
>>
>> At close_return_path_on_source, qemu_file_shutdown() and checking
>> ms->to_dst_file are done under the qemu_file_lock, so how could
>> migrate_fd_cleanup() have cleared the pointer but the ms->to_dst_file
>> check have passed?
>
> This is not a locking issue, it's much simpler. migrate_fd_cleanup()
> clears the ms->to_dst_file pointer and closes the QEMUFile and then
> calls close_return_path_on_source() which then tries to use resources
> which are not available anymore.

I'm missing something here. Which resources? I assume you're talking
about this:

    WITH_QEMU_LOCK_GUARD(&ms->qemu_file_lock) {
        if (ms->to_dst_file && ms->rp_state.from_dst_file &&
            qemu_file_get_error(ms->to_dst_file)) {
            qemu_file_shutdown(ms->rp_state.from_dst_file);
        }
    }

How do we get past the 'if (ms->to_dst_file)'?
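
To make the question concrete, here is a minimal standalone model of the ordering under discussion. It is only a sketch and not QEMU code: struct file_model, struct state_model and both functions are hypothetical stand-ins, and the lock is omitted because the model is single-threaded. It assumes the cleanup path clears the pointer before the check runs, as described above. Under that assumption the condition is false and the shutdown branch is never entered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a QEMUFile: only the two facts the check needs. */
struct file_model {
    bool has_error;  /* models a non-zero qemu_file_get_error() result */
    bool shut_down;  /* set when the modelled shutdown is performed */
};

/* Hypothetical stand-in for the migration state. */
struct state_model {
    struct file_model *to_dst_file;
    struct file_model *from_dst_file;
};

/* Same shape as the quoted condition, without the lock guard.
 * Returns true only if the shutdown branch was taken. */
bool shutdown_return_path_if_error(struct state_model *ms)
{
    if (ms->to_dst_file && ms->from_dst_file &&
        ms->to_dst_file->has_error) {
        ms->from_dst_file->shut_down = true;
        return true;
    }
    return false;
}

/* Assumed ordering: the pointer is cleared first, then the check runs. */
bool cleanup_then_close_return_path(struct state_model *ms)
{
    ms->to_dst_file = NULL;
    return shutdown_return_path_if_error(ms);
}

/* Exercises both orderings on fresh state. */
void run_model(void)
{
    struct file_model to_dst = { true, false };
    struct file_model from_dst = { false, false };
    struct state_model ms = { &to_dst, &from_dst };

    /* Pointer still set and in error: the shutdown branch is taken. */
    assert(shutdown_return_path_if_error(&ms));
    assert(from_dst.shut_down);

    /* Pointer cleared first: the branch is skipped. */
    from_dst.shut_down = false;
    assert(!cleanup_then_close_return_path(&ms));
    assert(!from_dst.shut_down);
}
```

In this model, clearing the pointer first means the shutdown is skipped rather than attempted on a closed file, which is why I'm asking which resources are being used.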