On Wed, Jan 6, 2021 at 5:29 AM Amir Goldstein <[email protected]> wrote:
>
> On Mon, Jan 4, 2021 at 8:57 PM Dr. David Alan Gilbert
> <[email protected]> wrote:
> >
> > * Vivek Goyal ([email protected]) wrote:
> > > On Mon, Jan 04, 2021 at 04:00:13PM +0000, Dr. David Alan Gilbert wrote:
> > > > Hi,
> > > >   On virtio-fs we're hitting a problem with NFS, where
> > > > unlinking a file in a directory and then rmdir'ing that
> > > > directory fails complaining about the directory not being empty.
> > > >
> > > > The problem here is that if a file has an open fd, NFS doesn't
> > > > actually delete the file on unlink, it just renames it to
> > > > a hidden file (e.g. .nfs*******).  That open file is there because
> > > > the 'FORGET' hasn't completed yet by the time the rmdir is issued.
> > > >
> > > > Question:
> > > >   a) In the FUSE protocol, are requests assumed to complete in
> > > > order? I.e. for unlink, forget, rmdir, is it required that 'forget'
> > > > completes before the rmdir is processed?
> > > >      (In virtiofs we've been processing requests in parallel, and
> > > > have sent forgets down a separate queue to keep them out of the way.)
> > > >
> > > >   b) 'forget' doesn't send a reply - so the kernel can't wait for the
> > > > client to have finished it;  do we need a synchronous forget here?
> > >
> > > Even if we introduce a synchronous forget, will that really fix the
> > > issue? For example, this could also happen if a file has been unlinked
> > > but is still open while its directory is being removed:
> > >
> > > fd = open(foo/bar.txt)
> > > unlink foo/bar.txt
> > > rmdir foo
> > > close(fd)
> > >
> > > In this case, the final forget should go out after the fd has been
> > > closed. It's not a forget-ordering race.
> > >
> > > I wrote a test case for this and it works on regular file systems.
> > >
> > > https://github.com/rhvgoyal/misc/blob/master/virtiofs-tests/rmdir.c
> > >
> > > I suspect it will fail on NFS, because I assume the temporary
> > > file will stay around until the final close(fd) happens. If that's
> > > the case, this is an NFS-specific issue, because its behavior differs
> > > from other file systems.
> >
> > That's true; but that's NFS just being NFS. In our case we're keeping
> > an fd open even though the guest has been smart enough not to, so we're
> > triggering the NFS oddity when it wouldn't normally happen.
> >
>
> Are you sure that you really need this oddity?
>
> My sense from looking at virtiofsd is that the open O_PATH fds
> in InodeData for non-directories are overkill, and even the need
> for an open fd for all directories is questionable.
>
> If you store a FileHandle (name_to_handle_at(2)) instead of an open fd
> for non-directories, you won't be keeping a reference on the underlying
> inode, so there is no unlink issue.
>
> open_by_handle_at(2) is very cheap for a non-directory when the underlying
> inode is cached, and as cheap as it can get even when the inode is not in
> cache, so no performance penalty is expected.

You are perfectly right that using file handles would solve a number
of issues, one being too many open file descriptors.

The issue with open_by_handle_at(2) is that it needs
CAP_DAC_READ_SEARCH in the initial user namespace. That currently
makes it impossible to use in containers and such.

I'm not sure whether there have been proposals for making
open_by_handle_at(2) usable in user namespaces.  The difficulty is in
verifying that the open file would have been obtainable by path lookup
by the given user.

Thanks,
Miklos

_______________________________________________
Virtio-fs mailing list
[email protected]
https://www.redhat.com/mailman/listinfo/virtio-fs
