* Fabiano Rosas (faro...@suse.de) wrote: > Peter Xu <pet...@redhat.com> writes: > > > On Thu, May 23, 2024 at 04:05:36PM -0300, Fabiano Rosas wrote: > >> We've been up until now cleaning up any file descriptors that have > >> been passed into QEMU and never duplicated[1,2]. A file descriptor > >> without duplicates indicates that no part of QEMU has made use of > >> it. This approach is starting to show some cracks now that we're > >> starting to consume fds from the migration code: > >> > >> - Doing cleanup every time the last monitor connection closes works to > >> reap unused fds, but also has the side effect of forcing the > >> management layer to pass the file descriptors again in case of a > >> disconnect/re-connect, if that happened to be the only monitor > >> connection. > >> > >> Another side effect is that removing an fd with qmp_remove_fd() is > >> effectively delayed until the last monitor connection closes. > >> > >> The reliance on mon_refcount is also problematic because it's racy. > >> > >> - Checking runstate_is_running() skips the cleanup unless the VM is > >> running and avoids premature cleanup of the fds, but also has the > >> side effect of blocking the legitimate removal of an fd via > >> qmp_remove_fd() if the VM happens to be in another state. > >> > >> This affects qmp_remove_fd() and qmp_query_fdsets() in particular > >> because requesting a removal at a bad time (guest stopped) might > >> cause an fd to never be removed, or to be removed at a much later > >> point in time, causing the query command to continue showing the > >> supposedly removed fd/fdset. > >> > >> Note that file descriptors that *have* been duplicated are owned by > >> the code that uses them and will be removed after qemu_close() is > >> called. Therefore we've decided that the best course of action to > >> avoid the undesired side-effects is to stop managing non-duplicated > >> file descriptors. > >> > >> 1- efb87c1697 ("monitor: Clean up fd sets on monitor disconnect") > >> 2- ebe52b592d ("monitor: Prevent removing fd from set during init") > >> > >> Signed-off-by: Fabiano Rosas <faro...@suse.de> > >> --- > >> monitor/fds.c | 15 ++++++++------- > >> monitor/hmp.c | 2 -- > >> monitor/monitor-internal.h | 1 - > >> monitor/qmp.c | 2 -- > >> 4 files changed, 8 insertions(+), 12 deletions(-) > >> > >> diff --git a/monitor/fds.c b/monitor/fds.c > >> index 101e21720d..f7b98814fa 100644 > >> --- a/monitor/fds.c > >> +++ b/monitor/fds.c > >> @@ -169,6 +169,11 @@ int monitor_get_fd(Monitor *mon, const char *fdname, > >> Error **errp) > >> > >> static void monitor_fdset_free(MonFdset *mon_fdset) > >> { > >> + /* > >> + * Only remove an empty fdset. The fds are owned by the user and > >> + * should have been removed with qmp_remove_fd(). The dup_fds are > >> + * owned by QEMU and should have been removed with qemu_close(). > >> + */ > >> if (QLIST_EMPTY(&mon_fdset->fds) && QLIST_EMPTY(&mon_fdset->dup_fds)) > >> { > >> QLIST_REMOVE(mon_fdset, next); > >> g_free(mon_fdset); > >> @@ -189,9 +194,7 @@ static void monitor_fdset_cleanup(MonFdset *mon_fdset) > >> MonFdsetFd *mon_fdset_fd_next; > >> > >> QLIST_FOREACH_SAFE(mon_fdset_fd, &mon_fdset->fds, next, > >> mon_fdset_fd_next) { > >> - if ((mon_fdset_fd->removed || > >> - (QLIST_EMPTY(&mon_fdset->dup_fds) && mon_refcount == 0)) > >> && > >> - runstate_is_running()) { > >> + if (mon_fdset_fd->removed) { > > > > I don't know the code well, but I like it. > > > >> monitor_fdset_fd_free(mon_fdset_fd); > >> } > >> } > >> @@ -206,7 +209,7 @@ void monitor_fdsets_cleanup(void) > >> > >> QEMU_LOCK_GUARD(&mon_fdsets_lock); > >> QLIST_FOREACH_SAFE(mon_fdset, &mon_fdsets, next, mon_fdset_next) { > >> - monitor_fdset_cleanup(mon_fdset); > >> + monitor_fdset_free(mon_fdset); > >> } > >> } > >> > >> @@ -479,9 +482,7 @@ void monitor_fdset_dup_fd_remove(int dup_fd) > >> if (mon_fdset_fd_dup->fd == dup_fd) { > >> QLIST_REMOVE(mon_fdset_fd_dup, next); > >> g_free(mon_fdset_fd_dup); > >> - if (QLIST_EMPTY(&mon_fdset->dup_fds)) { > >> - monitor_fdset_cleanup(mon_fdset); > >> - } > >> + monitor_fdset_free(mon_fdset); > > > > This and above changes are not crystal clear to me where the _cleanup() > > does extra check "removed" and clean those fds. > > > > I think it'll make it easier for me to understand if this one and the next > > are squashed together. But maybe it's simply because I didn't fully > > understand. > > monitor_fdsets_cleanup() was doing three things previously: > > 1- Remove the removed=true fds. This is weird, but ok. > > 2- Remove fds from an fdset that has an empty dup_fds list, but only if > the guest is not running and the monitor is closed. This is problematic. > > 3- Remove/free fdsets that have become empty due to the above > removals. This is ok. > > This patch covers (2), because that is the only change that has a > complex reasoning behind it and we need to document that without > conflating it with the rest of the changes.
The ebe52b592d you reference, makes me think that the only reason for the 'is not running' was as a way to detect when init had finished; and there are certainly better ways to do that (now?). However, the efb87c1697 talks about cleaning up non-dup's on all monitors closed, to stop getting left-over fd's that were added but then not used (because a disconnect happened between adding and being used). In your world when do these get cleaned up? Dave > Since (3) is still a reasonable thing to do, this patch merely keeps it > around, but using the helpers introduced in the previous patch. > > The next patch removed the need for (1), making the removal immediate > instead of delayed. It has it's own much less complex reasoning, which > is: "we don't need to wait to remove the fd". > > I hope this clarifies a bit. I would prefer if we kept this and the next > patch separate to avoid having a single patch that does too many > things. I hope to be as explicit as possible with the reason behind > these changes to avoid putting future people in the situation that we > are in now, i.e. having to guess at the reasons behind this code. > > If you still think we should squash or if you have more questions, let > me know. > > >> return; > >> } > >> } > >> diff --git a/monitor/hmp.c b/monitor/hmp.c > >> index 69c1b7e98a..460e8832f6 100644 > >> --- a/monitor/hmp.c > >> +++ b/monitor/hmp.c > >> @@ -1437,11 +1437,9 @@ static void monitor_event(void *opaque, > >> QEMUChrEvent event) > >> monitor_resume(mon); > >> } > >> qemu_mutex_unlock(&mon->mon_lock); > >> - mon_refcount++; > >> break; > >> > >> case CHR_EVENT_CLOSED: > >> - mon_refcount--; > >> monitor_fdsets_cleanup(); > >> break; > >> > >> diff --git a/monitor/monitor-internal.h b/monitor/monitor-internal.h > >> index 252de85681..cb628f681d 100644 > >> --- a/monitor/monitor-internal.h > >> +++ b/monitor/monitor-internal.h > >> @@ -168,7 +168,6 @@ extern bool qmp_dispatcher_co_shutdown; > >> extern QmpCommandList qmp_commands, qmp_cap_negotiation_commands; > >> extern QemuMutex monitor_lock; > >> extern MonitorList mon_list; > >> -extern int mon_refcount; > >> > >> extern HMPCommand hmp_cmds[]; > >> > >> diff --git a/monitor/qmp.c b/monitor/qmp.c > >> index a239945e8d..5e538f34c0 100644 > >> --- a/monitor/qmp.c > >> +++ b/monitor/qmp.c > >> @@ -466,7 +466,6 @@ static void monitor_qmp_event(void *opaque, > >> QEMUChrEvent event) > >> data = qmp_greeting(mon); > >> qmp_send_response(mon, data); > >> qobject_unref(data); > >> - mon_refcount++; > >> break; > >> case CHR_EVENT_CLOSED: > >> /* > >> @@ -479,7 +478,6 @@ static void monitor_qmp_event(void *opaque, > >> QEMUChrEvent event) > >> json_message_parser_destroy(&mon->parser); > >> json_message_parser_init(&mon->parser, handle_qmp_command, > >> mon, NULL); > >> - mon_refcount--; > >> monitor_fdsets_cleanup(); > >> break; > >> case CHR_EVENT_BREAK: > > > > I like this too when mon_refcount can be dropped. It turns out I like this > > patch and the next a lot, and I hope nothing will break. > > > > Aside, you seem to have forgot removal of the "int mon_refcount" in > > monitor.c. > > Yes, I'll fix that. Thanks. -- -----Open up your eyes, open up your mind, open up your code ------- / Dr. David Alan Gilbert | Running GNU/Linux | Happy \ \ dave @ treblig.org | | In Hex / \ _________________________|_____ http://www.treblig.org |_______/