I don't intuit that removing the generality from the LRU framework is
a desirable side effect.

Matt

On Fri, Sep 22, 2017 at 8:51 AM, Daniel Gryniewicz <d...@redhat.com> wrote:
> On 09/21/2017 07:45 PM, Frank Filz wrote:
>>
>> Philippe discovered that recent Ganesha will no longer allow compiling the
>> linux kernel due to dangling open file descriptors.
>>
>> I'm not sure if there is any true leak; the simple test of echo foo >
>> /mnt/foo does show a remaining open fd for /mnt/foo, however that is the
>> global fd opened in the course of doing a getattrs in FSAL_VFS.
>>
>> We have been talking about how the current management of open file
>> descriptors doesn't really work, so I have a couple proposals:
>>
>> 1. We really should have a limit on the number of states we allow. Now
>> that NLM locks and shares also have a state_t, it would be simple to keep
>> a count of how many are in use, and return a resource error if an
>> operation requires creating a new one past the limit. This can be a hard
>> limit with no grace: if the limit is hit, alloc_state fails.
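
A minimal sketch of that proposal; the names (try_alloc_state, STATE_HARD_LIMIT) are hypothetical stand-ins, the real limit would be a configurable parameter and the counter would be atomic, with the failure mapped to a resource error such as NFS4ERR_RESOURCE:

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical hard limit on state_t objects; in practice this would be
 * a configurable runtime parameter, not a compile-time constant. */
#define STATE_HARD_LIMIT 4

static uint64_t state_count;    /* would be an atomic in real code */

/* alloc_state-style check: hard limit with no grace. */
bool try_alloc_state(void)
{
	if (state_count >= STATE_HARD_LIMIT)
		return false;   /* caller returns a resource error */
	state_count++;
	return true;
}

void free_state(void)
{
	if (state_count > 0)
		state_count--;
}
```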
>
>
> This I agree with.
>
>>
>> 2. Management of the global fd is more complex, so here goes:
>>
>> Part of the proposal is a way for the FSAL to indicate that an FSAL call
>> used the global fd in a way that consumes some kind of resource the FSAL
>> would like managed.
>>
>> FSAL_PROXY should never indicate that (anonymous I/O should be done using
>> a special stateid, and a simple file create should result in the open
>> stateid immediately being closed). If that's not the case, then it's easy
>> enough to indicate use of a limited resource.
>>
>> FSAL_VFS would indicate use of the resource any time it utilizes the
>> global
>> fd. If it uses a temp fd that is closed after performing the operation, it
>> would not indicate use of the limited resource.
>>
>> FSAL_GPFS, FSAL_GLUSTER, and FSAL_CEPH should all be similar to FSAL_VFS.
>>
>> FSAL_RGW only has a global fd, and I don't quite understand how it is
>> managed.
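
The signaling Frank describes could be sketched roughly as below; the struct and function names are hypothetical, not the real fsal_obj_ops signatures. The point is that only an operation that leaves the global fd holding a resource sets the flag, while a temp fd closed before returning does not:

```c
#include <stdbool.h>

/* Hypothetical per-call result: the FSAL reports whether it consumed
 * the managed resource (the global fd). */
struct fsal_call_result {
	int status;
	bool used_global_fd;
};

/* VFS-style getattrs that opens and keeps the global fd: the resource
 * is consumed, so the flag is set. */
struct fsal_call_result vfs_getattrs_sketch(void)
{
	struct fsal_call_result r = { 0, false };

	/* ...open(2) the global fd, fstat(2) it... */
	r.used_global_fd = true;
	return r;
}

/* An operation using a temp fd closed before returning does not set
 * the flag, so it costs nothing against the managed limit. */
struct fsal_call_result vfs_tempfd_op_sketch(void)
{
	struct fsal_call_result r = { 0, false };

	/* ...open(2) a temp fd, use it, close(2) it... */
	return r;
}
```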
>
>
> If only PROXY doesn't set this, then maybe it's added complexity we don't
> need.  Just assume it's set.
>
>> The main part of the proposal is to actually create a new LRU queue for
>> objects that are using the limited resource.
>>
>> If we are at the hard limit on the limited resource and an entry that is
>> not
>> already in the LRU uses the resource, then we would reap an existing entry
>> and call fsal_close on it to release the resource. If an entry was not
>> available to be reaped, we would temporarily exceed the limit just like we
>> do with mdcache entries.
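
The reap-at-the-limit behavior might look like this sketch; FD_HARD_LIMIT, take_fd_resource, and try_reap_one are all hypothetical names, and the "reapable" decision is reduced to a boolean where the real code would walk the LRU looking for an unreferenced entry to fsal_close:

```c
#include <stdbool.h>

#define FD_HARD_LIMIT 2		/* hypothetical tiny limit for illustration */

static int open_fds;

/* Stand-in for reaping the LRU-most entry and calling fsal_close on it. */
static bool try_reap_one(bool reapable_available)
{
	if (!reapable_available)
		return false;
	open_fds--;		/* fsal_close() released the resource */
	return true;
}

/* Take the resource for a new entry.  Returns true if we had to
 * temporarily exceed the limit, just as mdcache entries can. */
bool take_fd_resource(bool reapable_available)
{
	if (open_fds >= FD_HARD_LIMIT && !try_reap_one(reapable_available)) {
		open_fds++;	/* nothing reapable: go over the limit */
		return true;
	}
	open_fds++;
	return false;
}
```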
>>
>> If an FSAL call resulted in use of the resource and the entry was already
>> in
>> the resource LRU, then it would be bumped to MRU of L1.
>>
>> The LRU run thread for the resource would demote objects from the LRU of
>> L1 to the MRU of L2, and call fsal_close on and remove objects from the
>> LRU of L2. I think it should work to close any files that have not been
>> used within a certain amount of time, using L1 and L2 to give a shorter
>> life to objects for which the resource is used once and then not used
>> again, whereas a file that is accessed multiple times would have more
>> resistance to being closed. I think the exact mechanics here may need
>> some tuning, but that's the general idea.
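
The two-level mechanics can be sketched with toy array queues (the real queues would be intrusive lists with per-lane locks, and the names here are hypothetical). A use bumps the entry to MRU of L1, promoting it back if it had been demoted; each run-thread pass closes the L2 tail and demotes the L1 tail, so a once-used entry survives only two passes while a re-used entry keeps getting pulled back to L1:

```c
#include <stddef.h>

#define QMAX 8

/* Entries are ids; index 0 is MRU, n-1 is LRU. */
struct fd_lru {
	int l1[QMAX]; size_t n1;
	int l2[QMAX]; size_t n2;
	int closed[QMAX]; size_t nclosed;  /* ids we "fsal_close"d */
};

static void q_remove(int *q, size_t *n, int id)
{
	for (size_t i = 0; i < *n; i++) {
		if (q[i] == id) {
			for (size_t j = i + 1; j < *n; j++)
				q[j - 1] = q[j];
			(*n)--;
			return;
		}
	}
}

static void q_push_mru(int *q, size_t *n, int id)
{
	for (size_t i = *n; i > 0; i--)
		q[i] = q[i - 1];
	q[0] = id;
	(*n)++;
}

/* An FSAL call used the resource: bump to MRU of L1 (promoting from L2
 * if the entry had already been demoted). */
void fd_lru_use(struct fd_lru *lru, int id)
{
	q_remove(lru->l1, &lru->n1, id);
	q_remove(lru->l2, &lru->n2, id);
	q_push_mru(lru->l1, &lru->n1, id);
}

/* One pass of the LRU run thread: close the L2 tail, then demote the
 * L1 tail to MRU of L2. */
void fd_lru_run_once(struct fd_lru *lru)
{
	if (lru->n2 > 0)
		lru->closed[lru->nclosed++] = lru->l2[--lru->n2];
	if (lru->n1 > 0) {
		int id = lru->l1[--lru->n1];
		q_push_mru(lru->l2, &lru->n2, id);
	}
}
```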
>>
>> The idea here is to be constantly closing files that have not been
>> accessed recently, and also to better manage a count of the files for
>> which we are actually using the resource, and not keep a file open just
>> because for some reason we do lots of lookups or stats of it (we might
>> have to open it for getattrs, but then we might serve a bunch of cached
>> attrs, which doesn't go to disk, so we might as well close the fd).
>
>
> This sounds almost exactly like the existing LRU thread, except that it
> ignores refcount.  If you remove global FD from the obj_handle, then the LRU
> as it currently exists becomes unnecessary for MDCACHE entries, as they only
> need a simple, single-level LRU based only on initial refcounts.  The
> current, multi-level LRU only exists to close the global FD when
> transitioning LRU levels.
>
> So, what it sounds like to me is that you're splitting the LRU for entries
> from the LRU for global FDs.  Is this correct?  If so, I think this
> complicates the two sets of LRU transitions, but probably not insurmountably
> so.
>
>> I also propose making the limit for the resource configurable
>> independently of the ulimit for file descriptors, though if an FSAL is
>> loaded that actually uses file descriptors for open files, we should
>> check that the ulimit is big enough; that check should also cover the
>> limit on state_t. Of course it will be impossible to account for file
>> descriptors used for sockets, log files, config files, or random
>> libraries that like to open files...
>
>
> Hmmm... I don't think we can do any kind of checking, if we're not going to
> use ulimit by default, since it depends on which FSALs are in use at any
> given time.  I say we either default the limits to ulimit, or just ignore
> ulimit entirely and log an appropriate error when EMFILE is returned.
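
Defaulting the limit from ulimit could be as simple as the following sketch; the function name is hypothetical, and treating a configured value of 0 as "unset" is an assumption:

```c
#include <stdint.h>
#include <sys/resource.h>

/* Use the admin-configured limit if set, otherwise fall back to the
 * process's RLIMIT_NOFILE soft limit. */
uint64_t effective_fd_limit(uint64_t configured_limit)
{
	struct rlimit rl;

	if (configured_limit != 0)
		return configured_limit;
	if (getrlimit(RLIMIT_NOFILE, &rl) != 0)
		return 1024;	/* conservative fallback */
	return (uint64_t)rl.rlim_cur;
}
```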
>
> Daniel
>
>
> ------------------------------------------------------------------------------
> Check out the vibrant tech community on one of the world's most
> engaging tech sites, Slashdot.org! http://sdm.link/slashdot
> _______________________________________________
> Nfs-ganesha-devel mailing list
> Nfs-ganesha-devel@lists.sourceforge.net
> https://lists.sourceforge.net/lists/listinfo/nfs-ganesha-devel



-- 

Matt Benjamin
Red Hat, Inc.
315 West Huron Street, Suite 140A
Ann Arbor, Michigan 48103

http://www.redhat.com/en/technologies/storage

tel.  734-821-5101
fax.  734-769-8938
cel.  734-216-5309
