With respect to FSAL_GLUSTER, like FSAL_GPFS, I think we can make use of multiple-fd support (per OPEN) to support OPEN UPGRADE/OPEN_DOWNGRADE for specific NFS clients and to avoid the extra UNLOCK requests which you have mentioned at the end.
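For example, on an OPEN upgrade the fd tracked for that stateid could simply be reopened with the union of the old and new access modes. A rough sketch only; plain open(2) stands in for whatever gfapi call FSAL_GLUSTER would actually use, and all the names here are illustrative:

#include <fcntl.h>
#include <unistd.h>
#include <errno.h>

/* One fd tracked per OPEN stateid, reopened wider on OPEN upgrade. */
struct open_state_fd {
        int fd;         /* -1 while closed */
        int openflags;  /* current O_ACCMODE bits */
};

/* O_RDONLY/O_WRONLY/O_RDWR are not OR-able flags, so the union of two
 * access modes has to be computed explicitly. */
static int combine_access(int a, int b)
{
        if ((a & O_ACCMODE) == (b & O_ACCMODE))
                return a & O_ACCMODE;
        return O_RDWR;  /* any mix of read and write needs O_RDWR */
}

static int upgrade_open(struct open_state_fd *osf, const char *path,
                        int requested)
{
        int combined = combine_access(osf->openflags, requested);
        int newfd = open(path, combined);

        if (newfd < 0)
                return -errno;
        if (osf->fd >= 0)
                close(osf->fd); /* drop the narrower fd */
        osf->fd = newfd;
        osf->openflags = combined;
        return 0;
}

OPEN_DOWNGRADE would be the mirror image: reopen with the narrower mode, then close the wide fd.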
I could have missed the context. Could you please specify how the fd per lock owner per file would be used?

Thanks,
Soumya

On 07/27/2015 10:03 PM, Frank Filz wrote:
> We may need to devote a concall to discussing this work, but I'm going to
> try a modest-length discussion of the approach I'm taking and why.
>
> Ultimately, this effort got kicked off due to the stupid POSIX lock behavior
> which really only impacts FSAL_VFS, but then when we started talking about
> possibly not forcing a one-to-one mapping between file descriptors and
> object handles, Marc Eshel suggested he could benefit from file descriptors
> associated with individual NFS v4 OPENs so that GPFS's I/O prediction could
> do better. And then Open File Descriptor Locks came along, which open
> FSAL_VFS up to a much improved lock interface beyond just being able to
> dodge the stupid POSIX lock behavior.
>
> And as I got into thinking, I realized, hey, to get full share reservation
> and lock support in FSAL_PROXY, we need to be able to associate open and
> lock stateids with Ganesha stateids.
>
> So now we have:
>
> FSAL_GPFS would like to have a file descriptor per OPEN (well, really per
> OPEN stateid, tracking OPEN upgrade and OPEN_DOWNGRADE), and maybe one per
> NFS v3 client.
>
> FSAL_VFS would like to have a file descriptor per lock owner per file.
>
> FSAL_PROXY would like to have a stateid per OPEN stateid and LOCK stateid.
>
> FSAL_LUSTRE actually also uses fcntl to get POSIX locks, so it has the same
> issues as FSAL_VFS.
>
> Now I don't know about other FSALs, though I wouldn't be surprised if at
> least some other FSALs might benefit from some mechanism to associate
> SOMETHING with each lock owner per file.
>
> So I came up with the idea of the FSAL providing the size of the "thing"
> (which I have currently named fsal_fd, but we can change the name if it
> would make folks feel more comfortable) to be allocated with each Ganesha
> stateid. And then, to bring NFS v3 locks and share reservations into the
> picture, I've made them able to use stateids (well, state_t structures),
> which also cleaned up part of the SAL lock interface (since NFS v3 can hide
> its "state" value in the state_t and stop overloading state_t *state...).
>
> Now each FSAL gets to define exactly what an fsal_fd actually is. At the
> moment I have the open mode in the generic structure, but maybe it can move
> into the FSAL-private fsal_fd.
>
> Not all FSALs will use both open and lock fds, and we could provide separate
> sizes for them so space would only be allocated when necessary (and if zero,
> a NULL pointer is passed).
>
> For most operations, the object_handle and both a share_fd and a lock_fd are
> passed. This allows the FSAL to decide exactly which ones it needs (if it
> needs a generic "thing" for anonymous I/O, it can stash a generic fsal_fd in
> its object_handle).
>
> The benefit of this interface is that the FSAL can leverage the cache inode
> AVL tree and SAL hash tables without making upcalls (which would be fraught
> with locking issues) or duplicating hash tables in order to store
> information entirely separately. It also may open possibilities for better
> hinting between the layers so garbage collection can be improved.
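Just to check my understanding of the size-driven allocation: is it roughly like the sketch below? (All structure and function names here are my own illustration, not the actual patch.)

#include <stdlib.h>

/* Generic header every fsal_fd starts with; the FSAL-private bytes
 * follow it in the same allocation. */
struct fsal_fd {
        int openflags;          /* open mode kept in the generic part for now */
};

struct state_t {
        /* ... the existing SAL fields would go here ... */
        struct fsal_fd *fd;     /* NULL when the FSAL asked for size 0 */
};

/* fd_size comes from the FSAL, e.g. sizeof(its private fd struct), or 0
 * if it has no use for one. */
static struct state_t *alloc_state(size_t fd_size)
{
        struct state_t *s = calloc(1, sizeof(*s) + fd_size);

        if (s != NULL && fd_size != 0)
                s->fd = (struct fsal_fd *)(s + 1);
        return s;
}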
> In the meantime, the legacy interface is still available. If some FSALs
> truly will never need this function but want to benefit from the atomic
> open/create/setattr capability of the FSAL open_fd method, we can make a
> more generic version of that (well, actually, it just needs to be OK having
> NULL passed for the fsal_fd).
>
> I'm also thinking that eventually the cache inode content_lock will no
> longer be what protects the generic "thing". Already FSAL_VFS is using the
> object_handle lock to protect access to the fd associated with the
> object_handle. Removing the need to hold content_lock would mean FSAL_VFS
> could actually manage its number of open file descriptors and do some
> management of them (maybe still in conjunction with cache inode to benefit
> from the LRU table). But for example, a setattr or getattr call could result
> in an open fd that FSAL_VFS would not immediately close.
>
> There is an eventual goal of getting rid of the insane logic SAL has to
> manage lock state when the FSAL supports locks but does not support lock
> owners. This logic is not too bad on LOCK requests, but UNLOCK requires
> walking the entire lock list for a file to figure out what portions of the
> UNLOCK request are held by other lock owners. This can result in N+1 lock
> requests (and memory objects) to perform the unlock (where N is the number
> of locks held by other lock owners).
>
> Frank
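To make the N+1 UNLOCK cost above concrete (my toy illustration, not the actual SAL code): when the FSAL has no notion of lock owners, unlocking a range for one owner means subtracting every range still held by other owners and issuing one FSAL unlock per surviving fragment:

#include <stdint.h>
#include <stdio.h>

struct range { uint64_t start, end; }; /* half-open [start, end) */

/* 'other' holds the sorted, non-overlapping ranges still locked by
 * other owners.  N such ranges can split the request into up to N+1
 * pieces, hence up to N+1 FSAL unlock calls. */
static void unlock_fragments(struct range req, const struct range *other, int n)
{
        uint64_t pos = req.start;

        for (int i = 0; i < n; i++) {
                if (other[i].end <= pos || other[i].start >= req.end)
                        continue;       /* no overlap with what is left */
                if (other[i].start > pos)
                        printf("FSAL unlock [%llu, %llu)\n",
                               (unsigned long long)pos,
                               (unsigned long long)other[i].start);
                if (other[i].end > pos)
                        pos = other[i].end;
        }
        if (pos < req.end)
                printf("FSAL unlock [%llu, %llu)\n",
                       (unsigned long long)pos,
                       (unsigned long long)req.end);
}

/* Example: unlocking [0, 100) while other owners hold [10, 20) and
 * [40, 50) produces three unlocks: [0, 10), [20, 40), [50, 100). */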