On Wed, Jan 6, 2021 at 5:29 AM Amir Goldstein <[email protected]> wrote:
>
> On Mon, Jan 4, 2021 at 8:57 PM Dr. David Alan Gilbert
> <[email protected]> wrote:
> >
> > * Vivek Goyal ([email protected]) wrote:
> > > On Mon, Jan 04, 2021 at 04:00:13PM +0000, Dr. David Alan Gilbert wrote:
> > > > Hi,
> > > >   On virtio-fs we're hitting a problem with NFS, where
> > > > unlinking a file in a directory and then rmdir'ing that
> > > > directory fails complaining about the directory not being empty.
> > > >
> > > > The problem here is that if a file has an open fd, NFS doesn't
> > > > actually delete the file on unlink, it just renames it to
> > > > a hidden file (e.g. .nfs*******).  That open file is there because
> > > > the 'FORGET' hasn't completed yet by the time the rmdir is issued.
> > > >
> > > > Question:
> > > >   a) In the FUSE protocol, are requests assumed to complete in order;
> > > >      i.e. unlink, forget, rmdir - is it required that 'forget' completes
> > > >      before the rmdir is processed?
> > > >      (In virtiofs we've been processing requests in parallel, and
> > > >      have sent forgets down a separate queue to keep them out of the way).
> > > >
> > > >   b) 'forget' doesn't send a reply - so the kernel can't wait for the
> > > >      client to have finished it; do we need a synchronous forget here?
> > >
> > > Even if we introduce a synchronous forget, will that really fix the
> > > issue. For example, this could also happen if file has been unlinked
> > > but it is still open and directory is being removed.
> > >
> > > fd = open(foo/bar.txt)
> > > unlink foo/bar.txt
> > > rmdir foo
> > > close(fd)
> > >
> > > In this case, final forget should go after fd has been closed. Its
> > > not a forget race.
> > >
> > > I wrote a test case for this and it works on regular file systems.
> > >
> > > https://github.com/rhvgoyal/misc/blob/master/virtiofs-tests/rmdir.c
> > >
> > > I suspect it will fail on nfs because I am assuming that temporary
> > > file will be there till final close(fd) happens. If that's the
> > > case this is a NFS specific issue because its behavior is different
> > > from other file systems.
> >
> > That's true; but that's NFS just being NFS; in our case we're keeping
> > an fd open even though the guest has been smart enough not to; so we're
> > causing the NFS oddity when it wouldn't normally happen.
>
> Are you sure that you really need this oddity?
>
> My sense from looking at virtiofsd is that the open O_PATH fd
> in InodeData for non-directories is an overkill, and even the need
> for an open fd for all directories is questionable.
>
> If you store a FileHandle (name_to_handle_at(2)) instead of an open fd
> for non-directories, you won't be keeping a reference on the underlying inode,
> so no unlink issue.
>
> open_by_handle_at(2) is very cheap for a non-directory when the underlying inode
> is cached, and as cheap as it can get even when the inode is not in cache, so no
> performance penalty is expected.
You are perfectly right that using file handles would solve a number of
issues, one being too many open file descriptors.

The issue with open_by_handle_at(2) is that it needs CAP_DAC_READ_SEARCH
in the initial user namespace.  That currently makes it impossible to use
in containers and such.

Not sure if there have been proposals for making open_by_handle_at(2)
usable in user namespaces.  The difficulty is in verifying that an open
file would have been obtainable by path lookup by the given user.

Thanks,
Miklos

_______________________________________________
Virtio-fs mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/virtio-fs
