I don't intuit that removing the generality from the LRU framework is a desirable side effect.
Matt On Fri, Sep 22, 2017 at 8:51 AM, Daniel Gryniewicz <d...@redhat.com> wrote: > On 09/21/2017 07:45 PM, Frank Filz wrote: >> >> Philippe discovered that recent Ganesha will no longer allow compiling the >> linux kernel due to dangling open file descriptors. >> >> I'm not sure if there is any true leak, the simple test of echo foo > >> /mnt/foo does show a remaining open fd for /mnt/foo, however that is the >> global fd opened in the course of doing a getattrs on FSAL_ VFS. >> >> We have been talking about how the current management of open file >> descriptors doesn't really work, so I have a couple proposals: >> >> 1. We really should have a limit on the number of states we allow. Now >> that >> NLM locks and shares also have a state_t, it would be simple to have a >> count >> of how many are in use, and return a resource error if an operation >> requires >> creating a new one past the limit. This can be a hard limit with no grace, >> if the limit is hit, then alloc_state fails. > > > This I agree with. > >> >> 2. Management of the global fd is more complex, so here goes: >> >> Part of the proposal is a way for the FSAL to indicate that an FSAL call >> used the global fd in a way that consumes some kind of resource the FSAL >> would like managed. >> >> FSAL_PROXY should never indicate that (anonymous I/O should be done using >> a >> special stateid, and a simple file create should result in the open >> stateid >> immediately being closed, if that's not the case, then it's easy enough to >> indicate use of a limited resource. >> >> FSAL_VFS would indicate use of the resource any time it utilizes the >> global >> fd. If it uses a temp fd that is closed after performing the operation, it >> would not indicate use of the limited resource. >> >> FSAL_GPFS, FSAL_GLUSTER, and FSAL_CEPH should all be similar to FSAL_VFS. >> >> FSAL_RGW only has a global fd, and I don't quite understand how it is >> managed. > > > If only PROXY doesn't set this, then maybe it's added complexity we don't > need. Just assume it's set. > >> The main part of the proposal is to actually create a new LRU queue for >> objects that are using the limited resource. >> >> If we are at the hard limit on the limited resource and an entry that is >> not >> already in the LRU uses the resource, then we would reap an existing entry >> and call fsal_close on it to release the resource. If an entry was not >> available to be reaped, we would temporarily exceed the limit just like we >> do with mdcache entries. >> >> If an FSAL call resulted in use of the resource and the entry was already >> in >> the resource LRU, then it would be bumped to MRU of L1. >> >> The LRU run thread for the resource would demote objects from LRU L1 to >> MRU >> of L2, and call fsal_close and remove objects from LRU of L2. I think it >> should work to close any files that have not been used in the amount of >> time, really using the L1 and L2 to give a shorter life to objects for >> which >> the resource is used once and then not used again, whereas a file that is >> accessed multiple times would have more resistance to being closed. I >> think >> the exact mechanics here may need some tuning, but that's the general >> idea. >> >> The idea here is to be constantly closing files that have not been >> accessed >> recently, and also to better manage a count of the files for which we are >> actually using the resources, and not keep a file open just because for >> some >> reason we do lots of lookups or stats of it (we might have to open it for >> getattrs, but then we might serve a bunch of cached attrs, which doesn't >> go >> to disk, might as well close the fd). > > > This sounds almost exactly like the existing LRU thread, except that it > ignores refcount. If you remove global FD from the obj_handle, then the LRU > as it currently exists becomes unnecessary for MDCACHE entries, as they only > need a simple, single-level LRU based only on initial refcounts. The > current, multi-level LRU only exists to close the global FD when > transitioning LRU levels. > > So, what it sounds like to me is that you're splitting the LRU for entries > from the LRU for global FDs. Is this correct? If so, I think this > complicates the two sets of LRU transitions, but probably not insurmountably > so. > >> I also propose making the limit for the resource configurable independent >> of >> the ulimit for file descriptors, though if an FSAL is loaded that actually >> uses file descriptors for open files should check that the ulimit is big >> enough, it should also include the limit on state_t also. Of course it >> will >> be impossible to account for file descriptors used for sockets, log files, >> config files, or random libraries that like to open files... > > > Hmmm... I don't think we can do any kind of checking, if we're not going to > use ulimit by default, since it depends on which FSALs are in use at any > given time. I say we either default the limits to ulimit, or just ignore > ulimit entirely and log an appropriate error when EMFILE is returned. > > Daniel > > > ------------------------------------------------------------------------------ > Check out the vibrant tech community on one of the world's most > engaging tech sites, Slashdot.org! http://sdm.link/slashdot > _______________________________________________ > Nfs-ganesha-devel mailing list > Nfs-ganesha-devel@lists.sourceforge.net > https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel -- Matt Benjamin Red Hat, Inc. 315 West Huron Street, Suite 140A Ann Arbor, Michigan 48103 http://www.redhat.com/en/technologies/storage tel. 734-821-5101 fax. 734-769-8938 cel. 734-216-5309 ------------------------------------------------------------------------------ Check out the vibrant tech community on one of the world's most engaging tech sites, Slashdot.org! http://sdm.link/slashdot _______________________________________________ Nfs-ganesha-devel mailing list Nfs-ganesha-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel