On Wed, Sep 17, 2025 at 06:06:55PM -0400, Brian Song wrote:
>
>
> On 9/17/25 9:01 AM, Hanna Czenczek wrote:
> > On 15.09.25 07:43, Brian Song wrote:
> > > Hi Hanna,
> >
> > Hi Brian!
> >
> > (Thanks for your heads-up!)
> >
> > > Stefan raised the above issue and proposed a preliminary solution: keep
> > > closing the file descriptor in the delete section, but perform
> > > umount separately for FUSE uring and traditional FUSE in the shutdown
> > > and delete sections respectively. This approach avoids the race
> > > condition on the file descriptor.
> > >
> > > In the case of FUSE uring, umount must be performed in the shutdown
> > > section. The reason is that the kernel currently lacks an interface to
> > > explicitly cancel submitted SQEs. Performing umount forces the kernel to
> > > flush all pending SQEs and return their CQEs. Without this step, CQEs
> > > may arrive after the export has already been deleted, and invoking the
> > > CQE handler at that point would dereference freed memory and trigger a
> > > segmentation fault.
> >
> > The commit message says that incrementing the BB reference would be
> > enough to solve the problem (i.e. deleting is delayed until all requests
> > are done). Why isn’t it?
>
> Hanna:
>
> If we place umount in the delete section instead of the shutdown section,
> the kernel FUSE driver will continue waiting for user FUSE requests and
> therefore won't return CQEs to userspace. As a result, the BB reference
> remains held (since the reference is acquired during registration and
> submission and only released once the CQE returns), preventing the delete
> operation from being invoked (invoked once the reference is decreased to 0).
> This is why umount must be placed in the shutdown section.
>
> >
> > > I’m curious about traditional FUSE: is it strictly necessary to perform
> > > umount in the delete section, or could it also be done in shutdown?
> >
> > Looking into libfuse, fuse_session_unmount() (in fuse_kern_unmount())
> > closes the FUSE FD. I can imagine that might result in the potential
> > problems Stefan described.
> >
> > > Additionally, what is the correct ordering between close(fd) and
> > > umount, does one need to precede the other?
> >
> > fuse_kern_unmount() closes the (queue 0) FD first before actually
> > unmounting, with a comment: “Need to close file descriptor, otherwise
> > synchronous umount would recurse into filesystem, and deadlock.”
> >
> > Given that, I assume the FDs should all be closed before unmounting.
> >
> > (Though to be fair, before looking into it now, I don’t think I’ve ever
> > given it much thought…)
> >
> > Hanna
>
>
> Stefan:
>
> I roughly went through the umount and close system calls:
>
> umount:
> fuse_kill_sb_anon -> fuse_sb_destroy -> fuse_abort_conn
>
> close:
> __fput -> file->f_op->release(inode, file) -> fuse_dev_release ->
> fuse_abort_conn
> (this only runs after all /dev/fuse FDs have been closed).
>
> And as Hanna mentioned, libfuse points out: “Need to close file descriptor,
> otherwise synchronous umount would recurse into filesystem, and deadlock.”
>
> So ideally, we should close each queue FD first, then call umount at the end
> — even though calling umount directly also works. The root issue is that the
> kernel doesn't provide an interface to cancel already submitted SQEs.

Hi Bernd,
I wanted to check with you to see if you have thought more about
ASYNC_CANCEL support for FUSE-over-io_uring SQEs?
If you don't have time to implement it, maybe you could share your
thoughts on how one would go about doing this? That would be a nice
starting point if someone else wants to try it out.
Thanks,
Stefan
>
> You mentioned that in fuse over io_uring mode we perform close in the
> shutdown path, but at that point the server may still be processing
> requests. While handling requests, it may still write to the FD, but that FD
> might not be /dev/fuse. I’m not sure how this gets triggered, since in fuse
> uring mode all FUSE requests are handled by io_uring, and our FUSE requests
> should be completed via io_uring. After shutdown closes the FD, it may call
> fuse_abort_conn, which terminates all request processing in the kernel.
> There’s also locking in place to protect the termination of requests and the
> subsequent uring cleanup.
>
> That’s why I think the best approach for now is:
>
> in shutdown, handle close and umount for fuse over io_uring;
>
> in delete, handle close and umount for traditional FUSE.
>
> > > Thanks,
> > > Brian
> > >
> > > On 9/9/25 3:33 PM, Stefan Hajnoczi wrote:
> > > > On Fri, Aug 29, 2025 at 10:50:24PM -0400, Brian Song wrote:
> > > >> @@ -901,24 +941,15 @@ static void fuse_export_shutdown(BlockExport
> > > *blk_exp)
> > > >> */
> > > >> g_hash_table_remove(exports, exp->mountpoint);
> > > >> }
> > > >> -}
> > > >> -
> > > >> -static void fuse_export_delete(BlockExport *blk_exp)
> > > >> -{
> > > >> - FuseExport *exp = container_of(blk_exp, FuseExport, common);
> > > >>
> > > >> - for (int i = 0; i < exp->num_queues; i++) {
> > > >> + for (size_t i = 0; i < exp->num_queues; i++) {
> > > >> FuseQueue *q = &exp->queues[i];
> > > >>
> > > >> /* Queue 0's FD belongs to the FUSE session */
> > > >> if (i > 0 && q->fuse_fd >= 0) {
> > > >> close(q->fuse_fd);
> > > >
> > > > > This changes the behavior of the non-io_uring code. Now all fuse fds and
> > > > > fuse_session are closed while requests are potentially still being
> > > > > processed.
> > > > >
> > > > > There is a race condition: if an IOThread is processing a request here
> > > > > then it may invoke a system call on q->fuse_fd just after it has been
> > > > > closed but not set to -1. If another thread has also opened a new file
> > > > > then the fd could be reused, resulting in an accidental write(2) to the
> > > > > new file. I'm not sure whether there is a way to trigger this in
> > > > > practice, but it looks like a problem waiting to happen.
> > > > >
> > > > > Simply setting q->fuse_fd to -1 here doesn't fix the race. It would be
> > > > > necessary to stop processing fuse_fd in the thread before closing it
> > > > > here or to schedule a BH in each thread so that fuse_fd can be closed
> > > > > in the thread that uses the fd.
> > >
> >
>
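
For reference, the fd-reuse hazard from the quoted review (a stale copy of
q->fuse_fd ending up pointing at an unrelated file) can be shown with a
minimal POSIX sketch. This is not QEMU code: the function name
fd_number_reused_after_close() is hypothetical, /dev/null stands in for both
/dev/fuse and the "new file", and the racing IOThread is only simulated by
keeping a copy of the descriptor number. It relies on the POSIX rule that
open(2) returns the lowest-numbered descriptor that is not currently open.

```c
#include <fcntl.h>
#include <unistd.h>

/*
 * Illustrative only, not QEMU code.
 * Returns 1 if the kernel hands out the same descriptor number again
 * right after close(), 0 if it does not, and -1 on error.
 */
int fd_number_reused_after_close(void)
{
    /* Stand-in for q->fuse_fd (assumption: /dev/null instead of /dev/fuse). */
    int fuse_fd = open("/dev/null", O_WRONLY);
    if (fuse_fd < 0) {
        return -1;
    }

    /* What a request handler in another thread would still be holding. */
    int stale = fuse_fd;

    /* Shutdown path closes the fd while the stale copy is still around. */
    close(fuse_fd);

    /* "Another thread has also opened a new file". */
    int other = open("/dev/null", O_WRONLY);
    if (other < 0) {
        return -1;
    }

    /* If the numbers match, a write(2) on 'stale' would hit the new file. */
    int reused = (other == stale);

    close(other);
    return reused;
}
```

In a single-threaded process the function returns 1, because nothing else
opens or closes descriptors between the two open() calls. In a multithreaded
process the outcome depends on timing, which is what makes the problem hard
to trigger on purpose and why closing the fd from the thread that uses it
(or after that thread has stopped processing it) sidesteps the issue.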
