Philippe discovered that with recent Ganesha, compiling the Linux kernel no longer works due to dangling open file descriptors.
I'm not sure there is any true leak. The simple test of echo foo > /mnt/foo does show a remaining open fd for /mnt/foo, but that is the global fd opened in the course of doing a getattrs in FSAL_VFS. We have been talking about how the current management of open file descriptors doesn't really work, so I have a couple of proposals:

1. We really should have a limit on the number of states we allow. Now that NLM locks and shares also have a state_t, it would be simple to keep a count of how many are in use and return a resource error if an operation requires creating a new one past the limit. This can be a hard limit with no grace: if the limit is hit, then alloc_state fails (a rough sketch is below).

2. Management of the global fd is more complex, so here goes.

Part of the proposal is a way for the FSAL to indicate that an FSAL call used the global fd in a way that consumes some kind of resource the FSAL would like managed. FSAL_PROXY should never indicate that (anonymous I/O should be done using a special stateid, and a simple file create should result in the open stateid immediately being closed; if that's not the case, it's easy enough to indicate use of a limited resource). FSAL_VFS would indicate use of the resource any time it utilizes the global fd; if it uses a temp fd that is closed after performing the operation, it would not indicate use of the limited resource. FSAL_GPFS, FSAL_GLUSTER, and FSAL_CEPH should all be similar to FSAL_VFS. FSAL_RGW only has a global fd, and I don't quite understand how it is managed.

The main part of the proposal is to create a new LRU queue for objects that are using the limited resource. If we are at the hard limit on the limited resource and an entry that is not already in the LRU uses the resource, then we would reap an existing entry and call fsal_close on it to release the resource. If no entry was available to be reaped, we would temporarily exceed the limit, just like we do with mdcache entries. If an FSAL call resulted in use of the resource and the entry was already in the resource LRU, it would be bumped to MRU of L1. The LRU run thread for the resource would demote objects from LRU of L1 to MRU of L2, and call fsal_close on and remove objects from LRU of L2 (see the second sketch below).

I think it should work to close any files that have not been used in some amount of time, really using L1 and L2 to give a shorter life to objects for which the resource is used once and then not used again, whereas a file that is accessed multiple times would have more resistance to being closed. The exact mechanics here may need some tuning, but that's the general idea: constantly close files that have not been accessed recently, and better manage a count of the files for which we are actually using the resource, rather than keeping a file open just because for some reason we do lots of lookups or stats of it (we might have to open it for getattrs, but then we might serve a bunch of cached attrs, which doesn't go to disk, so we might as well close the fd).

I also propose making the limit for the resource configurable independently of the ulimit for file descriptors, though if an FSAL that actually uses file descriptors for open files is loaded, we should check that the ulimit is big enough; that check should also include the limit on state_t (see the last sketch below). Of course it will be impossible to account for file descriptors used for sockets, log files, config files, or random libraries that like to open files...

The time has come to fix this...
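For proposal 1, here is a minimal sketch of what the hard limit could look like, assuming a global counter checked in alloc_state. The names state_count and state_limit (and leaving the resource-error mapping to the caller) are just for illustration, not existing code:

/*
 * Sketch only: count state_t allocations and fail hard at a configured
 * limit.  state_count and state_limit are invented names.
 */
#include <stdatomic.h>
#include <stdlib.h>

struct state_t;                      /* stands in for the real state_t */

static atomic_long state_count;      /* states currently allocated */
static long state_limit = 100000;    /* configurable hard limit */

struct state_t *alloc_state(size_t size)
{
	/* Reserve a slot first; back out if that pushed us over the limit. */
	if (atomic_fetch_add(&state_count, 1) >= state_limit) {
		atomic_fetch_sub(&state_count, 1);
		return NULL;         /* caller turns this into a resource error */
	}

	struct state_t *state = calloc(1, size);

	if (state == NULL)
		atomic_fetch_sub(&state_count, 1);
	return state;
}

void free_state(struct state_t *state)
{
	free(state);
	atomic_fetch_sub(&state_count, 1);
}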
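For the resource LRU in proposal 2, here is a very rough sketch of the bookkeeping I have in mind. The names (obj_entry, fd_resource_used, fd_lru_run, fd_hard_limit) and the sys/queue.h lists are invented for illustration, not Ganesha's actual structures, and a real LRU run thread would only process part of the queues per pass rather than everything at once:

/*
 * Sketch only: two-level LRU of objects holding the global-fd resource.
 * Entries land at MRU of L1 when an FSAL call reports it used the global
 * fd; a background pass demotes L1 entries to L2 and closes entries that
 * age out of L2.  Tail is MRU, head is LRU in both queues.
 */
#include <sys/queue.h>
#include <stdio.h>

struct obj_entry {
	TAILQ_ENTRY(obj_entry) fd_lru;  /* linkage in L1 or L2 */
	int level;                      /* 1 or 2; 0 = not queued */
	const char *name;
};

TAILQ_HEAD(fd_lru_q, obj_entry);

static struct fd_lru_q fd_l1 = TAILQ_HEAD_INITIALIZER(fd_l1);
static struct fd_lru_q fd_l2 = TAILQ_HEAD_INITIALIZER(fd_l2);
static int fd_in_use, fd_hard_limit = 3;

/* Pretend fsal_close(): releases the global fd held by this entry. */
static void fsal_close(struct obj_entry *obj)
{
	printf("fsal_close(%s)\n", obj->name);
	obj->level = 0;
	fd_in_use--;
}

/* Called when an FSAL operation reports it used the global fd. */
static void fd_resource_used(struct obj_entry *obj)
{
	if (obj->level != 0) {
		/* Already tracked: just bump to MRU of L1. */
		TAILQ_REMOVE(obj->level == 1 ? &fd_l1 : &fd_l2, obj, fd_lru);
	} else {
		/* New user of the resource: reap if we are at the limit. */
		if (fd_in_use >= fd_hard_limit) {
			struct fd_lru_q *q =
				!TAILQ_EMPTY(&fd_l2) ? &fd_l2 :
				!TAILQ_EMPTY(&fd_l1) ? &fd_l1 : NULL;

			if (q != NULL) {
				struct obj_entry *victim = TAILQ_FIRST(q);

				TAILQ_REMOVE(q, victim, fd_lru);
				fsal_close(victim);
			}
			/* else: temporarily exceed the limit, like mdcache */
		}
		fd_in_use++;
	}
	obj->level = 1;
	TAILQ_INSERT_TAIL(&fd_l1, obj, fd_lru);
}

/* One (grossly simplified) pass of the LRU run thread. */
static void fd_lru_run(void)
{
	struct obj_entry *obj;

	/* Entries that sat in L2 a whole pass have aged out: close them. */
	while ((obj = TAILQ_FIRST(&fd_l2)) != NULL) {
		TAILQ_REMOVE(&fd_l2, obj, fd_lru);
		fsal_close(obj);
	}

	/* Demote everything left in L1 to L2 for the next pass. */
	while ((obj = TAILQ_FIRST(&fd_l1)) != NULL) {
		TAILQ_REMOVE(&fd_l1, obj, fd_lru);
		obj->level = 2;
		TAILQ_INSERT_TAIL(&fd_l2, obj, fd_lru);
	}
}

The effect is what I described above: an object touched once falls from L1 to L2 and gets closed two passes later, while an object touched repeatedly keeps getting bumped back to MRU of L1 and survives.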
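And for the last piece, a sketch of the ulimit sanity check, assuming a simple getrlimit(RLIMIT_NOFILE) comparison at startup; check_fd_limits and FD_HEADROOM are invented names, and the headroom value is just a guess to cover sockets, log files, config files, and libraries:

/*
 * Sketch only: verify the configured fd-resource limit plus the state_t
 * limit fits under ulimit -n, with some headroom for everything else.
 */
#include <sys/resource.h>
#include <stdbool.h>
#include <stdio.h>

#define FD_HEADROOM 1024   /* guess for sockets, logs, config, libraries */

static bool check_fd_limits(rlim_t fd_limit, rlim_t state_limit)
{
	struct rlimit rl;

	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return false;

	if (fd_limit + state_limit + FD_HEADROOM > rl.rlim_cur) {
		fprintf(stderr,
			"configured fd limit %llu + state limit %llu does not "
			"fit under ulimit -n %llu\n",
			(unsigned long long)fd_limit,
			(unsigned long long)state_limit,
			(unsigned long long)rl.rlim_cur);
		return false;
	}
	return true;
}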
Frank