Peter Xu <pet...@redhat.com> writes: > On Wed, Aug 02, 2023 at 05:04:45PM -0300, Fabiano Rosas wrote: >> Peter Xu <pet...@redhat.com> writes: >> >> >> + if (await_return_path_close_on_source(s)) { >> >> + trace_migration_return_path_pause_err(); >> >> + return MIG_THR_ERR_FATAL; >> >> + } >> > >> > I see that here on return path failures we'll bail out, and actually it's >> > against the instinction (that when pause it should have failed, so it's >> > weird why it's returning 0). >> > >> > So how about above suggestion, plus here we just call >> > await_return_path_close_on_source(), without caring about the retval? >> >> So you are suggesting to remove the knowledge of the retry entirely from >> the thread. It just reports the error and the postcopy_pause takes the >> responsibility of ignoring it when we want to retry... It could be >> clearer that way indeed. > > That error doesn't really important IMHO here, because the to-dst-file > should have already errored out anyway. > > I just think it cleaner if we reset rp_error only until the new thread > created. >
ok >> >> It would trigger when a rp error happened that wasn't related to the >> QEMUFile. If we go with your suggestion above, then this goes away. > > With your current patch where rp_error seems to be always reset when thread > quit, if that's true then it'll 100% happen that this will not trigger. > > But yeah this is a trivial spot, feel free to choose the best if you plan > to reorganize this patch a bit. Thanks. My patch just resets the error when doing postcopy and the error is a QEMUFile error. The header validation at the start of the loop could still set rp_state.error and return without going through the postcopy retry: if (header_type >= MIG_RP_MSG_MAX || header_type == MIG_RP_MSG_INVALID) { error_report("RP: Received invalid message 0x%04x length 0x%04x", header_type, header_len); mark_source_rp_bad(ms); goto out; }