Peter Xu <pet...@redhat.com> writes: > On Fri, May 03, 2024 at 04:56:08PM -0300, Fabiano Rosas wrote: >> Peter Xu <pet...@redhat.com> writes: >> >> > On Fri, Apr 26, 2024 at 11:20:35AM -0300, Fabiano Rosas wrote: >> >> When the migration using the "file:" URI was implemented, I don't >> >> think any of us noticed that if you pass in a file name with the >> >> format "/dev/fdset/N", this allows a file descriptor to be passed in >> >> to QEMU and that behaves just like the "fd:" URI. So the "file:" >> >> support has been added without regard for the fdset part and we got >> >> some things wrong. >> >> >> >> The first issue is that we should not truncate the migration file if >> >> we're allowing an fd + offset. We need to leave the file contents >> >> untouched. >> > >> > I'm wondering whether we can use fallocate() instead on the ranges so that >> > we always don't open() with O_TRUNC. Before that.. could you remind me >> > why do we need to truncate in the first place? I definitely missed >> > something else here too. >> >> AFAIK, just to avoid any issues if the file is pre-existing. I don't see >> the difference between O_TRUNC and fallocate in this case. > > Then, shall we avoid truncations at all, leaving all the feasibility to > user (also errors prone to make)? >
Is this a big deal? I'd rather close that possible gap and avoid the bug reports. >> >> > >> >> >> >> The second issue is that there's an expectation that QEMU removes the >> >> fd after the migration has finished. That's what the "fd:" code >> >> does. Otherwise a second migration on the same VM could attempt to >> >> provide an fdset with the same name and QEMU would reject it. >> > >> > Let me check what we do when with "fd:" and when migration completes or >> > cancels. >> > >> > IIUC it's qio_channel_file_close() that does the final cleanup work on >> > e.g. to_dst_file, right? Then there's qemu_close(), and it has: >> > >> > /* Close fd that was dup'd from an fdset */ >> > fdset_id = monitor_fdset_dup_fd_find(fd); >> > if (fdset_id != -1) { >> > int ret; >> > >> > ret = close(fd); >> > if (ret == 0) { >> > monitor_fdset_dup_fd_remove(fd); >> > } >> > >> > return ret; >> > } >> > >> > Shouldn't this done the work already? >> >> That removes the mon_fdset_fd_dup->fd, we want to remove the >> mon_fdset_fd->fd. > > What I read so far is when we are removing the dup-fds, we'll do one more > thing: > > monitor_fdset_dup_fd_find_remove(): > if (QLIST_EMPTY(&mon_fdset->dup_fds)) { > monitor_fdset_cleanup(mon_fdset); > } > > It means if we removed all the dup-fds correctly, we should also remove the > whole fdset, which includes the ->fds, IIUC. > Since mon_fdset_fd->removed == false, we hit the runstate_is_running() problem. I'm not sure, but probably mon_refcount > 0 as well. So the fd would not be removed. But I'll retest this on Monday just be sure, it's been a while since I wrote some parts of this. >> >> > >> > Off topic: I think this code is over complicated too, maybe I missed >> > something, but afaict we don't need monitor_fdset_dup_fd_find at all.. we >> > simply walk the list and remove stuff.. I attach a patch at the end that I >> > tried to clean that up, just in case there's early comments. But we can >> > ignore that so we don't get side-tracked, and focus on the direct-io >> > issues. >> >> Well, I'm not confident touching this code. This is more than a decade >> old, I have no idea what the original motivations were. The possible >> interactions with the user via command-line (-add-fd), QMP (add-fd) and >> the monitor lifetime make me confused. Not to mention the fdset part >> being plumbed into the guts of a widely used qemu_open_internal() that >> very misleadingly presents itself as just a wrapper for open(). > > If to make QEMU long live, we'll probably need to touch it at some > point.. or at least discuss about it and figure things out. We pay tech > debts like this when there's no good comment / docs to refer in this case, > then the earlier, perhaps also the better.. to try taking the stab, imho. > > Definitely not a request to clean everything up. :) Let's see whether > others can chim in with better knowledge of the history. > >> >> > >> > Thanks, >> > >> > ======= >> > >> > From 2f6b6d1224486d8ee830a7afe34738a07003b863 Mon Sep 17 00:00:00 2001 >> > From: Peter Xu <pet...@redhat.com> >> > Date: Fri, 3 May 2024 11:27:20 -0400 >> > Subject: [PATCH] monitor: Drop monitor_fdset_dup_fd_add() >> > MIME-Version: 1.0 >> > Content-Type: text/plain; charset=UTF-8 >> > Content-Transfer-Encoding: 8bit >> > >> > This function is not needed, one remove function should already work. >> > Clean it up. >> > >> > Here the code doesn't really care about whether we need to keep that dupfd >> > around if close() failed: when that happens something got very wrong, >> > keeping the dup_fd around the fdsets may not help that situation so far. >> > >> > Cc: Dr. David Alan Gilbert <d...@treblig.org> >> > Cc: Markus Armbruster <arm...@redhat.com> >> > Cc: Philippe Mathieu-Daudé <phi...@linaro.org> >> > Cc: Paolo Bonzini <pbonz...@redhat.com> >> > Cc: Daniel P. Berrangé <berra...@redhat.com> >> > Signed-off-by: Peter Xu <pet...@redhat.com> >> > --- >> > include/monitor/monitor.h | 1 - >> > monitor/fds.c | 27 +++++---------------------- >> > stubs/fdset.c | 5 ----- >> > util/osdep.c | 15 +-------------- >> > 4 files changed, 6 insertions(+), 42 deletions(-) >> > >> > diff --git a/include/monitor/monitor.h b/include/monitor/monitor.h >> > index 965f5d5450..fd9b3f538c 100644 >> > --- a/include/monitor/monitor.h >> > +++ b/include/monitor/monitor.h >> > @@ -53,7 +53,6 @@ AddfdInfo *monitor_fdset_add_fd(int fd, bool >> > has_fdset_id, int64_t fdset_id, >> > const char *opaque, Error **errp); >> > int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags); >> > void monitor_fdset_dup_fd_remove(int dup_fd); >> > -int64_t monitor_fdset_dup_fd_find(int dup_fd); >> > >> > void monitor_register_hmp(const char *name, bool info, >> > void (*cmd)(Monitor *mon, const QDict *qdict)); >> > diff --git a/monitor/fds.c b/monitor/fds.c >> > index d86c2c674c..d5aecfb70e 100644 >> > --- a/monitor/fds.c >> > +++ b/monitor/fds.c >> > @@ -458,7 +458,7 @@ int monitor_fdset_dup_fd_add(int64_t fdset_id, int >> > flags) >> > #endif >> > } >> > >> > -static int64_t monitor_fdset_dup_fd_find_remove(int dup_fd, bool remove) >> > +void monitor_fdset_dup_fd_remove(int dup_fd) >> > { >> > MonFdset *mon_fdset; >> > MonFdsetFd *mon_fdset_fd_dup; >> > @@ -467,31 +467,14 @@ static int64_t monitor_fdset_dup_fd_find_remove(int >> > dup_fd, bool remove) >> > QLIST_FOREACH(mon_fdset, &mon_fdsets, next) { >> > QLIST_FOREACH(mon_fdset_fd_dup, &mon_fdset->dup_fds, next) { >> > if (mon_fdset_fd_dup->fd == dup_fd) { >> > - if (remove) { >> > - QLIST_REMOVE(mon_fdset_fd_dup, next); >> > - g_free(mon_fdset_fd_dup); >> > - if (QLIST_EMPTY(&mon_fdset->dup_fds)) { >> > - monitor_fdset_cleanup(mon_fdset); >> > - } >> > - return -1; >> > - } else { >> > - return mon_fdset->id; >> > + QLIST_REMOVE(mon_fdset_fd_dup, next); >> > + g_free(mon_fdset_fd_dup); >> > + if (QLIST_EMPTY(&mon_fdset->dup_fds)) { >> > + monitor_fdset_cleanup(mon_fdset); >> > } >> > } >> > } >> > } >> > - >> > - return -1; >> > -} >> > - >> > -int64_t monitor_fdset_dup_fd_find(int dup_fd) >> > -{ >> > - return monitor_fdset_dup_fd_find_remove(dup_fd, false); >> > -} >> > - >> > -void monitor_fdset_dup_fd_remove(int dup_fd) >> > -{ >> > - monitor_fdset_dup_fd_find_remove(dup_fd, true); >> > } >> > >> > int monitor_fd_param(Monitor *mon, const char *fdname, Error **errp) >> > diff --git a/stubs/fdset.c b/stubs/fdset.c >> > index d7c39a28ac..389e368a29 100644 >> > --- a/stubs/fdset.c >> > +++ b/stubs/fdset.c >> > @@ -9,11 +9,6 @@ int monitor_fdset_dup_fd_add(int64_t fdset_id, int flags) >> > return -1; >> > } >> > >> > -int64_t monitor_fdset_dup_fd_find(int dup_fd) >> > -{ >> > - return -1; >> > -} >> > - >> > void monitor_fdset_dup_fd_remove(int dupfd) >> > { >> > } >> > diff --git a/util/osdep.c b/util/osdep.c >> > index e996c4744a..2d9749d060 100644 >> > --- a/util/osdep.c >> > +++ b/util/osdep.c >> > @@ -393,21 +393,8 @@ int qemu_open_old(const char *name, int flags, ...) >> > >> > int qemu_close(int fd) >> > { >> > - int64_t fdset_id; >> > - >> > /* Close fd that was dup'd from an fdset */ >> > - fdset_id = monitor_fdset_dup_fd_find(fd); >> > - if (fdset_id != -1) { >> > - int ret; >> > - >> > - ret = close(fd); >> > - if (ret == 0) { >> > - monitor_fdset_dup_fd_remove(fd); >> > - } >> > - >> > - return ret; >> > - } >> > - >> > + monitor_fdset_dup_fd_remove(fd); >> > return close(fd); >> > } >> > >> > -- >> > 2.44.0 >>