The commit call also uses the global fd, and RHEL 6.3 clients do send commit
in your case. With any recent client, though, the open fd is probably due to
getattr.
Ganesha seems to close global file descriptors in the reaper call if we set
cache_fds to FALSE in our ganesha config. Since the reaper runs only
periodically, we may not see much performance degradation, if any, from
closing global fds (with NFSv3) every 90 seconds or so. That is a temporary
workaround we are going to use in the short term. Opening files and never
closing them is a bad implementation...
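For reference, the workaround above amounts to a config change along these
lines (the block name and parameter placement may differ between Ganesha
versions; treat this as a sketch, not verified syntax):

```
CACHEINODE {
    # Do not cache the global fd; let the periodic reaper close it.
    Cache_FDs = FALSE;
}
```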
Regards, Malahal.
On Fri, Sep 22, 2017 at 5:15 AM, Frank Filz <ffilz...@mindspring.com> wrote:
> Philippe discovered that recent Ganesha will no longer allow compiling the
> Linux kernel due to dangling open file descriptors.
>
> I'm not sure there is any true leak. The simple test of echo foo >
> /mnt/foo does show a remaining open fd for /mnt/foo; however, that is the
> global fd opened in the course of doing a getattrs in FSAL_VFS.
>
> We have been talking about how the current management of open file
> descriptors doesn't really work, so I have a couple proposals:
>
> 1. We really should have a limit on the number of states we allow. Now that
> NLM locks and shares also have a state_t, it would be simple to keep a count
> of how many are in use and return a resource error if an operation requires
> creating a new one past the limit. This can be a hard limit with no grace:
> if the limit is hit, then alloc_state fails.
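A minimal sketch of what that hard limit could look like — the names and the
limit value here are illustrative, not Ganesha's actual alloc_state API:

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Illustrative hard limit; in practice this would be configurable. */
#define STATE_T_HARD_LIMIT 4

static atomic_int state_count;

typedef struct state { int id; } state_t;

/* Returns NULL (a resource error in the real code) once the hard
 * limit is reached; there is no grace period. */
state_t *alloc_state_limited(void)
{
    int old = atomic_fetch_add(&state_count, 1);

    if (old >= STATE_T_HARD_LIMIT) {
        atomic_fetch_sub(&state_count, 1);
        return NULL;
    }
    return calloc(1, sizeof(state_t));
}

void free_state_limited(state_t *s)
{
    if (s) {
        free(s);
        atomic_fetch_sub(&state_count, 1);
    }
}
```

The atomic fetch-add makes the check race-free without a lock: concurrent
allocators may briefly push the counter past the limit, but each loser backs
its increment out before failing.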
>
> 2. Management of the global fd is more complex, so here goes:
>
> Part of the proposal is a way for the FSAL to indicate that an FSAL call
> used the global fd in a way that consumes some kind of resource the FSAL
> would like managed.
>
> FSAL_PROXY should never indicate that (anonymous I/O should be done using a
> special stateid, and a simple file create should result in the open stateid
> immediately being closed); if that's not the case, then it's easy enough to
> indicate use of a limited resource.
>
> FSAL_VFS would indicate use of the resource any time it utilizes the global
> fd. If it uses a temp fd that is closed after performing the operation, it
> would not indicate use of the limited resource.
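One way the per-operation indication could be surfaced is a flag in the
operation result; everything below is an illustrative sketch, not Ganesha's
actual FSAL interface:

```c
#include <stdbool.h>

/* Hypothetical result an FSAL op could return so the caller knows
 * whether to charge the object against the managed resource. */
struct fsal_op_result {
    int status;             /* 0 on success */
    bool used_global_fd;    /* true => insert/bump in the resource LRU */
};

/* A VFS-style getattrs that opens the global fd reports use... */
static struct fsal_op_result vfs_getattrs_sketch(void)
{
    return (struct fsal_op_result){ .status = 0, .used_global_fd = true };
}

/* ...while an op that used a temp fd, closed before returning, does not. */
static struct fsal_op_result vfs_tempfd_op_sketch(void)
{
    return (struct fsal_op_result){ .status = 0, .used_global_fd = false };
}
```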
>
> FSAL_GPFS, FSAL_GLUSTER, and FSAL_CEPH should all be similar to FSAL_VFS.
>
> FSAL_RGW only has a global fd, and I don't quite understand how it is
> managed.
>
> The main part of the proposal is to actually create a new LRU queue for
> objects that are using the limited resource.
>
> If we are at the hard limit on the limited resource and an entry that is
> not already in the LRU uses the resource, then we would reap an existing
> entry and call fsal_close on it to release the resource. If no entry was
> available to be reaped, we would temporarily exceed the limit just like we
> do with mdcache entries.
>
> If an FSAL call resulted in use of the resource and the entry was already
> in the resource LRU, then it would be bumped to the MRU of L1.
>
> The LRU run thread for the resource would demote objects from the LRU of L1
> to the MRU of L2, and call fsal_close on and remove objects from the LRU of
> L2. I think it should work to close any files that have not been used within
> some amount of time, really using L1 and L2 to give a shorter life to
> objects for which the resource is used once and then not used again, whereas
> a file that is accessed multiple times would have more resistance to being
> closed. I think the exact mechanics here may need some tuning, but that's
> the general idea.
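The two-level mechanic above can be sketched as follows. This is a simplified
model (it demotes all of L1 each run and closes all of L2, rather than working
from the tails in batches), and none of the names are Ganesha's real mdcache
LRU code; the point is just the use-once-dies-fast / reused-survives behavior:

```c
#include <stdbool.h>
#include <stddef.h>

/* An entry that may hold the managed resource (the global fd). */
struct entry {
    const char *name;
    bool open;                  /* still holds the resource */
    struct entry *prev, *next;
};

/* Circular doubly-linked queue with a sentinel:
 * head.next is the MRU end, head.prev the LRU end. */
struct lru_q { struct entry head; };

static void q_init(struct lru_q *q)
{
    q->head.next = q->head.prev = &q->head;
}

static void q_remove(struct entry *e)
{
    e->prev->next = e->next;
    e->next->prev = e->prev;
}

static void q_push_mru(struct lru_q *q, struct entry *e)
{
    e->next = q->head.next;
    e->prev = &q->head;
    q->head.next->prev = e;
    q->head.next = e;
}

static struct lru_q L1, L2;

/* FSAL reported use of the resource: (re)insert at the MRU of L1. */
static void resource_used(struct entry *e)
{
    if (e->open)
        q_remove(e);            /* may currently sit in L1 or L2 */
    e->open = true;
    q_push_mru(&L1, e);
}

/* Periodic LRU run: close everything sitting in L2, then demote L1
 * into L2.  An entry must go untouched for two runs to be closed. */
static void lru_run(void)
{
    while (L2.head.prev != &L2.head) {
        struct entry *e = L2.head.prev;

        q_remove(e);
        e->open = false;        /* stands in for fsal_close() */
    }
    while (L1.head.prev != &L1.head) {
        struct entry *e = L1.head.prev;

        q_remove(e);
        q_push_mru(&L2, e);
    }
}
```

An entry used once drifts from L1 to L2 and is closed on the second run; an
entry touched again between runs is bumped back to the MRU of L1, so repeated
access buys resistance to being closed, as described above.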
>
> The idea here is to be constantly closing files that have not been accessed
> recently, and also to better manage a count of the files for which we are
> actually using the resource, and not keep a file open just because for some
> reason we do lots of lookups or stats of it (we might have to open it for
> getattrs, but then we might serve a bunch of cached attrs, which doesn't go
> to disk, so we might as well close the fd).
>
> I also propose making the limit for the resource configurable independently
> of the ulimit for file descriptors. However, if an FSAL that actually uses
> file descriptors for open files is loaded, it should check that the ulimit
> is big enough; that check should also include the limit on state_t. Of
> course, it will be impossible to account for file descriptors used for
> sockets, log files, config files, or random libraries that like to open
> files...
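That sanity check could be as simple as comparing the configured limits
against RLIMIT_NOFILE at FSAL load time. The function below is a hypothetical
sketch (getrlimit itself is the standard POSIX API); the slack argument is a
guess at the fds consumed by sockets, logs, config files, and libraries:

```c
#include <stdbool.h>
#include <sys/resource.h>

/* Return true if the configured fd limit plus the state_t limit,
 * plus some slack for everything we can't account for, fits under
 * the process's soft open-file limit. */
static bool fd_limits_fit(rlim_t fd_limit, rlim_t state_limit, rlim_t slack)
{
    struct rlimit rl;

    if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
        return false;
    return fd_limit + state_limit + slack <= rl.rlim_cur;
}
```

An FSAL that doesn't use real file descriptors (FSAL_PROXY, say) would simply
skip the check.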
>
> The time has come to fix this...
>
> Frank
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel
>