On Tue, Apr 10, 2018 at 10:02 AM, riya khanna <riyakhanna1...@gmail.com> wrote:
> On Mon, Apr 9, 2018 at 10:42 PM, Raghavendra Gowdappa <rgowd...@redhat.com> wrote:
>
>> +Manoj.
>>
>> On Mon, Apr 9, 2018 at 10:18 PM, riya khanna <riyakhanna1...@gmail.com> wrote:
>>
>>> Hi All,
>>>
>>> I'm trying to use the new framework to speed up lookup/attr/xattr
>>> operations by splitting functionality between fast/slow execution
>>> paths. I'd highly appreciate it if you could suggest experiments to
>>> evaluate the performance improvement.

How about a software build workload, varying the number of source files?
Especially the case where nothing needs to be done because no files have
changed since the last build -- that case should be all metadata
operations.

-- Manoj

>> As you've pointed out already, this is a good place for read caches
>> (both data and metadata). While there is an overlap between things
>> cached by the kernel and things cached by glusterfs, there are some
>> things that are cached only by glusterfs and not by VFS/kernel. I think
>> this is the area we can explore to move these caches into the kernel.
>> Things I can think of:
>
> Even if things are cached by VFS (e.g., dir entries, attributes, etc.),
> the size of the VFS dcache is limited and can affect performance when
> under pressure. Have you ever experienced such a case? In any event, the
> new framework can help create your own dir/attr cache managed by the
> user-space daemon -- let's call it a self-managed dcache.
>
>> * xattr caching - done by md-cache in glusterfs. I am not sure whether
>> VFS caches xattrs. If not, this can yield good returns for workloads
>> involving xattrs (like POSIX ACLs etc.).
>
> Thanks! Similar to attrs, xattr caching should be doable as well. I can
> start by looking at the existing implementation in md-cache.
>
>> * A GET kind of interface for small files - done by quick-read in
>> glusterfs. Note that we fetch the file in lookup. If we couple this with
>> pushing open-behind into the kernel, we can avoid sending
>> open/readv/flush/release to glusterfs completely in suitable workloads
>> (we had earlier found that this boosts performance for webserver use
>> cases). I think we would have to populate the page cache in the lookup
>> response. Also, the lookup response signature doesn't provide for
>> holding this data. Not sure whether this can be done.
>
> This one is tricky. There are some limitations imposed by the framework.
> Let me think about it.
>
>> * Dirent prefetching for directories - done by readdir-ahead.
>
> The user-space daemon in readdir() can populate the self-managed dcache.
> Future lookups can then be served from this cache entirely within the
> kernel. What kind of workload would benefit from this?
>
>> * As you've already pointed out, we can improve on our invalidation
>> strategies.
>> * Since the page cache is already present in VFS, I don't think
>> read-ahead/io-cache would have any benefits.
>
> The framework can also bypass the fuse user-space daemon during data I/O
> (e.g., read, write) if the file is stored locally by the lower file
> system. This design is called pass-through I/O and has been discussed
> numerous times on the fuse-devel mailing list. Recent discussion:
> https://lwn.net/Articles/674286/
> Does this apply to glusterfs as well, perhaps when a file is cached
> locally by the client?
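To make the self-managed dcache idea concrete, here is roughly what I'd
imagine the kernel-side FUSE_LOOKUP fast path to look like. This is purely
a hypothetical sketch -- I haven't seen the patches, and every name in it
is made up (locking, insertion, and eviction are omitted for brevity):

    /* Hypothetical sketch: a kernel-side hash table maps
     * (parent nodeid, name) to a cached FUSE_LOOKUP reply, so
     * repeated lookups never leave the kernel. */
    #include <linux/hashtable.h>
    #include <linux/jhash.h>
    #include <linux/limits.h>
    #include <linux/string.h>
    #include <linux/fuse.h>

    struct fastpath_entry {
            struct hlist_node node;
            u64 parent;                 /* nodeid of parent directory */
            char name[NAME_MAX + 1];    /* (parent, name) is the key */
            struct fuse_entry_out out;  /* reply cached from the daemon */
    };

    static DEFINE_HASHTABLE(fastpath_table, 10);

    /* Called on FUSE_LOOKUP before queuing a request to the daemon.
     * Fills *out and returns true on a hit; on a miss the regular
     * (slow) path runs and its reply gets inserted into the table. */
    static bool fastpath_lookup(u64 parent, const char *name,
                                struct fuse_entry_out *out)
    {
            struct fastpath_entry *e;
            u32 key = jhash(name, strlen(name), (u32)parent);

            hash_for_each_possible(fastpath_table, e, node, key) {
                    if (e->parent == parent && !strcmp(e->name, name)) {
                            *out = e->out;   /* fast path: no upcall */
                            return true;
                    }
            }
            return false;                    /* slow path: ask the daemon */
    }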
>>> As I mentioned in my previous email, I'm caching replies from the fuse
>>> daemon (hashed key/value blobs) in the kernel, so that for the same key
>>> (e.g., <parent ino, child name> in the case of FUSE_LOOKUP), the reply
>>> (e.g., fuse_entry_out) is served from the kernel itself and no call is
>>> delivered to user-space.
>>>
>>> While this may seem redundant given the entry_timeout/attr_timeout
>>> caching that already exists in FUSE, this design gives the user-space
>>> daemon more control over when/what to invalidate. For instance,
>>> entry_timeout caching is only valid until the timeout expires or until
>>> the kernel evicts the dentry from its dcache.
>>>
>>> For invalidation, fuse_lowlevel_notify_inval_entry() can also remove
>>> entries from the hash table. Please refer to the figure attached to my
>>> last email.
>>>
>>> Thanks,
>>> Riya
>>>
>>> On Tue, Apr 3, 2018 at 1:45 PM, riya khanna <riyakhanna1...@gmail.com> wrote:
>>>
>>>> I'm attaching a figure that depicts the architecture of my optimized
>>>> fuse framework. Kindly let me know if you have any questions.
>>>>
>>>> On Mon, Apr 2, 2018 at 10:57 AM, riya khanna <riyakhanna1...@gmail.com> wrote:
>>>>
>>>>> Thanks Amar! Please see my answers inline.
>>>>>
>>>>> On Mon, Apr 2, 2018 at 5:41 AM, Amar Tumballi <atumb...@redhat.com> wrote:
>>>>>
>>>>>> Hi Riya,
>>>>>>
>>>>>> Thanks for writing to us. Some questions before we start on this.
>>>>>>
>>>>>> * Where can we see your work of modifying the fuse module to cache
>>>>>> the calls? Some reference would help us provide more specific
>>>>>> pointers (or ask better questions).
>>>>>
>>>>> I've created a fast-path framework for FUSE, where the user-space
>>>>> daemon can load a module and register handlers for file operations
>>>>> (lookup, open, r/w, etc.) that must be handled in the kernel itself,
>>>>> without an upcall to user space. I call these fast-path handlers. The
>>>>> design also retains the regular FUSE handlers for file system
>>>>> operations in user-space (the slow path). The fast path and slow path
>>>>> can communicate with each other over shared memory or via syscalls to
>>>>> enable/invalidate caching of data structs (e.g., the results of
>>>>> getattr, getxattr, etc.).
>>>>>
>>>>> There's a process I need to follow in order to make the code
>>>>> available publicly. I've already started it, but it will take some
>>>>> time. I will try to do this asap.
>>>>>
>>>>>> * If the caching happens in the fuse module, and it expects the
>>>>>> regular arguments as parameters, then there may not be any work
>>>>>> required in glusterfs at all, as it works on the low-level fuse API.
>>>>>
>>>>> The fast handlers expect the same interface and args (fuse_args) as
>>>>> the regular user-space daemon. The fast-handler code is fs-specific
>>>>> and therefore must come from glusterfs. Changes are also needed in
>>>>> the glusterfs code to communicate with the fast path for
>>>>> enabling/invalidating caching.
>>>>>
>>>>>> * Also, how do you invalidate caches from the user-space program?
>>>>>> GlusterFS can be accessed from multiple clients, so this becomes an
>>>>>> important piece to have.
>>>>>
>>>>> A server-side invalidation can trigger a system call into the
>>>>> fast-path framework to invalidate caches.
>>>>>
>>>>>> For looking at the integration with the fuse module in the codebase,
>>>>>> please check the directory 'xlators/mount/fuse/src/', mostly the
>>>>>> file 'fuse-bridge.c'.
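If I'm reading the fast-path handler description above correctly, the
glusterfs side would ship a small module that registers its handlers with
the modified fuse driver and falls back to the slow path on a miss. A
sketch under that assumption -- struct fuse_fastpath_ops,
fuse_fastpath_register(), and gf_fastpath_lookup_cached() are invented
names for illustration, not a real interface:

    /* Hypothetical sketch of a glusterfs-specific fast-path module. */
    #include <linux/module.h>
    #include <linux/errno.h>
    #include <linux/types.h>

    struct fuse_args;

    /* Invented API, standing in for whatever the framework exposes. */
    struct fuse_fastpath_ops {
            int (*lookup)(struct fuse_args *args);
    };
    extern int fuse_fastpath_register(const struct fuse_fastpath_ops *ops);
    extern bool gf_fastpath_lookup_cached(struct fuse_args *args);

    /* A fast handler sees the same fuse_args the daemon would see.
     * Returning -ENOSYS hands the request to the regular slow path. */
    static int gf_fast_lookup(struct fuse_args *args)
    {
            /* Consult the self-managed dcache that the glusterfs
             * daemon populates over shared memory. */
            if (gf_fastpath_lookup_cached(args))
                    return 0;    /* reply served from the kernel */
            return -ENOSYS;      /* miss: upcall to glusterfsd */
    }

    static const struct fuse_fastpath_ops gf_fastpath_ops = {
            .lookup = gf_fast_lookup,
            /* .getattr, .getxattr, ... would be wired up the same way */
    };

    static int __init gf_fastpath_init(void)
    {
            return fuse_fastpath_register(&gf_fastpath_ops);
    }
    module_init(gf_fastpath_init);
    MODULE_LICENSE("GPL");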
>>>>>>
>>>>>> Thanks for your interest in the project; it would be great to
>>>>>> collaborate on this effort, as it can enhance the performance of
>>>>>> glusterfs in many use cases.
>>>>>
>>>>> I'm still going through the gluster developer documentation, but it
>>>>> would be helpful if you could mention what kinds of use cases the
>>>>> fast/slow-split FUSE framework could enable. I've already applied the
>>>>> framework to accelerate multiple FUSE-based stackable file systems,
>>>>> but I want the interface to be generic enough for all FUSE file
>>>>> systems to take advantage of it.
>>>>>
>>>>>> Regards,
>>>>>> Amar
>>>>>>
>>>>>> On Mon, Apr 2, 2018 at 6:34 AM, riya khanna <riyakhanna1...@gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I've modified the FUSE framework to take part of the user-space
>>>>>>> daemon code and move it into the kernel fuse driver, to minimize
>>>>>>> user-kernel-user switches during file system operations. An example
>>>>>>> would be caching getattr/getxattr/lookup results, security checks,
>>>>>>> etc. This design therefore creates a fast (served directly from the
>>>>>>> kernel) and a slow (regular fuse) execution path. The fast and slow
>>>>>>> paths can also communicate with each other using shared memory.
>>>>>>>
>>>>>>> I was wondering if it is possible to accelerate glusterfs using
>>>>>>> this design. What pieces could (or should) easily be moved to
>>>>>>> kernel space? Any pointers would be highly appreciated. Thanks!
>>>>>>>
>>>>>>> -Riya
>>>>>>
>>>>>> --
>>>>>> Amar Tumballi (amarts)
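One more note on the invalidation point: with stock libfuse (the 2.x
low-level API), the upcall Riya mentions is fuse_lowlevel_notify_inval_entry().
A minimal sketch of the daemon side, assuming the fast-path patch also
drops its cached reply when the kernel processes the notification (note
that glusterfs builds the equivalent FUSE_NOTIFY_INVAL_ENTRY message
itself and writes it to /dev/fuse in fuse-bridge.c, rather than going
through libfuse):

    /* Minimal sketch, libfuse 2.x low-level API.  Called by the
     * daemon when the server reports that <parent, name> changed on
     * another client.  The kernel drops the dentry; with the
     * fast-path patch it would also drop the cached
     * (parent, name) -> fuse_entry_out reply. */
    #define FUSE_USE_VERSION 26
    #include <fuse/fuse_lowlevel.h>
    #include <string.h>

    static int invalidate_entry(struct fuse_chan *ch,
                                fuse_ino_t parent, const char *name)
    {
            return fuse_lowlevel_notify_inval_entry(ch, parent, name,
                                                    strlen(name));
    }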
_______________________________________________
Gluster-devel mailing list
Gluster-devel@gluster.org
http://lists.gluster.org/mailman/listinfo/gluster-devel