Re: [Gluster-users] Disabling read-ahead and io-cache for native fuse mounts
On Wed, Feb 13, 2019 at 10:51 AM Raghavendra Gowdappa wrote:
>
> On Tue, Feb 12, 2019 at 5:38 PM Raghavendra Gowdappa wrote:
>
>> All,
>>
>> We've found that the perf xlators io-cache and read-ahead do not add
>> any performance improvement. At best, read-ahead is redundant due to
>> kernel read-ahead,
>
> One thing we are still figuring out is whether kernel read-ahead is
> tunable. From what we've explored, it _looks_ like (may not be entirely
> correct) ra is capped at 128KB. If that's the case, I am interested in a
> few things:
> * Are there any real-world applications/use cases which would benefit
> from larger read-ahead (Manoj says block devices can do ra of 4MB)?

Kernel read-ahead is adaptive, but it is influenced by the read-ahead
setting on the block device (/sys/block/<device>/queue/read_ahead_kb),
which can be tuned. For RHEL specifically, the default is 128KB (last I
checked), but the default RHEL tuned profile, throughput-performance,
bumps that up to 4MB. It should be fairly easy to rig up a test where 4MB
read-ahead on the block device gives better performance than 128KB
read-ahead.

-- Manoj

> * Is the limit on the kernel ra tunable a hard one? IOW, what does it
> take to make it do higher ra? If it's difficult, can glusterfs
> read-ahead provide the expected performance improvement for those
> applications that would benefit from aggressive ra (as glusterfs can
> support larger ra sizes)?
>
> I am still inclined to prefer kernel ra, as I think it is more
> intelligent and can identify more sequential patterns than glusterfs
> read-ahead [1][2].
> [1] https://www.kernel.org/doc/ols/2007/ols2007v2-pages-273-284.pdf
> [2] https://lwn.net/Articles/155510/
>
>> and at worst io-cache degrades performance for workloads that don't
>> involve re-reads. Given that the VFS already has both of these
>> functionalities, I am proposing to turn these two translators off by
>> default for native fuse mounts.
>>
>> For non-native fuse mounts like gfapi (NFS-ganesha/samba) we can keep
>> these xlators on by having custom profiles. Comments?
>>
>> [1] https://bugzilla.redhat.com/show_bug.cgi?id=1665029
>>
>> regards,
>> Raghavendra
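A minimal sketch of exercising both knobs discussed above. The volume
name vol0, device sda, and mount path are assumptions; read_ahead_kb is
in KB:

  # disable the two gluster client-side xlators under discussion
  gluster volume set vol0 performance.read-ahead off
  gluster volume set vol0 performance.io-cache off

  # inspect and raise kernel read-ahead on the backing block device
  cat /sys/block/sda/queue/read_ahead_kb      # RHEL default: 128
  echo 4096 > /sys/block/sda/queue/read_ahead_kb

  # quick sequential-read check (drop caches first so read-ahead matters)
  echo 3 > /proc/sys/vm/drop_caches
  dd if=/mnt/glusterfs/bigfile of=/dev/null bs=1M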
Re: [Gluster-users] [ovirt-users] GlusterFS performance with only one drive per host?
My take is that unless you have loads of data and are trying to optimize
for cost/TB, HDDs are probably not the right choice. This is particularly
true for random I/O workloads, for which HDDs are really quite bad.

I'd recommend a recent gluster release, and some tuning, because the
default settings are not optimized for performance. Some options to
consider:

client.event-threads
server.event-threads
cluster.choose-local
performance.client-io-threads

You can toggle the last two and see what works for you. You'd probably
need to set event-threads to 4 or more. Ideally you'd tune some of the
thread pools based on observed bottlenecks in collected stats; top
(top -bHd 10 > top_threads.out.txt) is great for this. Using 6 small
drives/bricks instead of 3 is also a good idea, to reduce the likelihood
of rpc bottlenecks. (See the sketch after this thread for how these
options can be applied.)

There has been an effort to improve gluster performance over fast SSDs,
hence the recommendation to try a recent release. You can also check in
on some of the issues being worked on:
https://github.com/gluster/glusterfs/issues/412
https://github.com/gluster/glusterfs/issues/410

-- Manoj

On Sat, Mar 24, 2018 at 4:14 AM, Jayme <jay...@gmail.com> wrote:
> Do you feel that SSDs are worth the extra cost, or am I better off using
> regular HDDs? I'm looking for the best performance I can get with
> glusterFS.
>
> On Fri, Mar 23, 2018 at 12:03 AM, Manoj Pillai <mpil...@redhat.com> wrote:
>>
>> On Thu, Mar 22, 2018 at 3:31 PM, Sahina Bose <sab...@redhat.com> wrote:
>>>
>>> On Mon, Mar 19, 2018 at 5:57 PM, Jayme <jay...@gmail.com> wrote:
>>>
>>>> I'm spec'ing a new oVirt build using three Dell R720's w/ 256GB. I'm
>>>> considering storage options. I don't have a requirement for high
>>>> amounts of storage; I have a little over 1TB to store but want some
>>>> overhead, so I'm thinking 2TB of usable space would be sufficient.
>>>>
>>>> I've been doing some research on Micron 1100 2TB SSDs and they seem
>>>> to offer a lot of value for the money. I'm considering using smaller,
>>>> cheaper SSDs for boot drives and using one 2TB Micron SSD in each
>>>> host for a glusterFS replica 3 setup (on the fence about using an
>>>> arbiter; I like the extra redundancy replica 3 will give me).
>>>>
>>>> My question is: would I see a performance hit using only one drive
>>>> in each host with glusterFS, or should I try to add more physical
>>>> disks, such as 6 1TB drives instead of 3 2TB drives?
>>>
>> It is possible. With SSDs, the rpc layer can become the bottleneck with
>> some workloads, especially if there are not enough connections out to
>> the server side. We had experimented with a multi-connection model for
>> this reason: https://review.gluster.org/#/c/19133/.
>>
>> -- Manoj
>>
>>> [Adding gluster-users for inputs here]
>>>
>>>> Also one other question: I've read that gluster can only be done in
>>>> groups of three, meaning you need 3, 6, or 9 hosts. Is this true? If
>>>> I had an operational replica 3 glusterFS setup and wanted to add
>>>> more capacity, I would have to add 3 more hosts. Or is it possible
>>>> for me to add a 4th host into the mix for extra processing power
>>>> down the road?
>>>
>>> In oVirt, we support replica 3 or replica 3 with arbiter (where one
>>> of the 3 bricks is a low-storage arbiter brick). To expand storage,
>>> you would need to add bricks in multiples of 3. However, if you only
>>> want to expand compute capacity in your HC environment, you can add a
>>> 4th node.
>>>
>>>> Thanks!
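A minimal sketch of applying the options Manoj lists above. The volume
name vol0 and the thread counts are assumptions; measure before and
after each change:

  gluster volume set vol0 client.event-threads 4
  gluster volume set vol0 server.event-threads 4
  # toggle these two and compare results for your workload
  gluster volume set vol0 cluster.choose-local off
  gluster volume set vol0 performance.client-io-threads on

  # verify what is currently in effect
  gluster volume get vol0 all | grep -E 'event-threads|choose-local|client-io-threads'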
Re: [Gluster-users] [ovirt-users] GlusterFS performance with only one drive per host?
On Thu, Mar 22, 2018 at 3:31 PM, Sahina Bose wrote:
>
> On Mon, Mar 19, 2018 at 5:57 PM, Jayme wrote:
>
>> I'm spec'ing a new oVirt build using three Dell R720's w/ 256GB. I'm
>> considering storage options. I don't have a requirement for high
>> amounts of storage; I have a little over 1TB to store but want some
>> overhead, so I'm thinking 2TB of usable space would be sufficient.
>>
>> I've been doing some research on Micron 1100 2TB SSDs and they seem to
>> offer a lot of value for the money. I'm considering using smaller,
>> cheaper SSDs for boot drives and using one 2TB Micron SSD in each host
>> for a glusterFS replica 3 setup (on the fence about using an arbiter;
>> I like the extra redundancy replica 3 will give me).
>>
>> My question is: would I see a performance hit using only one drive in
>> each host with glusterFS, or should I try to add more physical disks,
>> such as 6 1TB drives instead of 3 2TB drives?

It is possible. With SSDs, the rpc layer can become the bottleneck with
some workloads, especially if there are not enough connections out to the
server side. We had experimented with a multi-connection model for this
reason: https://review.gluster.org/#/c/19133/.

-- Manoj

> [Adding gluster-users for inputs here]
>
>> Also one other question: I've read that gluster can only be done in
>> groups of three, meaning you need 3, 6, or 9 hosts. Is this true? If I
>> had an operational replica 3 glusterFS setup and wanted to add more
>> capacity, I would have to add 3 more hosts. Or is it possible for me
>> to add a 4th host into the mix for extra processing power down the
>> road?
>
> In oVirt, we support replica 3 or replica 3 with arbiter (where one of
> the 3 bricks is a low-storage arbiter brick). To expand storage, you
> would need to add bricks in multiples of 3. However, if you only want
> to expand compute capacity in your HC environment, you can add a 4th
> node.
>
>> Thanks!
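For reference, a sketch of creating the replica 3 arbiter 1 layout Sahina
describes. The volume name, host names, and brick paths are made up:

  gluster volume create gvol0 replica 3 arbiter 1 \
      host1:/bricks/brick1/gvol0 \
      host2:/bricks/brick1/gvol0 \
      host3:/bricks/brick1/gvol0   # arbiter brick: metadata only, low storage
  gluster volume start gvol0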
Re: [Gluster-users] gluster for home directories?
Hi Rik,

Nice clarity and detail in the description. Thanks! Inline...

On Wed, Mar 7, 2018 at 8:29 PM, Rik Theys wrote:
> Hi,
>
> We are looking into replacing our current storage solution and are
> evaluating gluster for this purpose. Our current solution uses a SAN
> with two servers attached that serve samba and NFS 4. Clients connect
> to those servers using NFS or SMB. All users' home directories live on
> this server.
>
> I would like to have some insight into who else is using gluster for
> home directories for about 500 users and what performance they get out
> of the solution. Which connectivity method are you using on the clients
> (gluster native, nfs, smb)? Which volume options do you have configured
> for your gluster volume? What hardware are you using? Are you using
> snapshots and/or quota? If so, any numbers on the performance impact?
>
> The solution I had in mind for our setup is multiple servers/bricks
> with a replica 3 arbiter 1 volume, where each server is also running
> nfs-ganesha and samba in HA. Clients would be connecting to one of the
> nfs servers (dns round robin). In this case the nfs servers would be
> the gluster clients. Gluster traffic would go over a dedicated network
> with 10G and jumbo frames.
>
> I'm currently testing gluster (3.12, now 3.13) on older machines[1] and
> have created a replica 3 arbiter 1 volume 2x(2+1). I seem to run into
> all sorts of (performance) problems. I must be doing something wrong,
> but I've tried all sorts of benchmarks and nothing seems to make my
> setup live up to what I would expect from this hardware.
>
> * I understand that gluster only starts to work well when multiple
> clients are connecting in parallel, but I did expect the single-client
> performance to be better.
>
> * Unpacking the linux-4.15.7.tar.xz file on the brick XFS filesystem
> followed by a sync takes about 1 minute. Doing the same on the gluster
> volume using the fuse client (the client is one of the brick servers)
> takes over 9 minutes, and neither disk nor cpu nor network is reaching
> its bottleneck. Doing the same over NFS-ganesha (the client is a
> workstation connected through gbit) takes even longer (more than
> 30 min!?).
>
> I understand that unpacking a lot of small files may be the worst
> workload for a distributed filesystem, but when I look at the file
> sizes of the files in our users' home directories, more than 90% are
> smaller than 1MB.
>
> * A file copy of a 300GB file over NFS 4 (nfs-ganesha) starts fast
> (90MB/s) and then drops to 20MB/s. When I look at the servers during
> the copy, I don't see where the bottleneck is, as the cpu, disk and
> network are not maxing out (on any of the bricks). When the same client
> copies the file to our current NFS storage, it is limited by the gbit
> network connection of the client.

Both untar and cp are single-threaded, which means throughput is mostly
dictated by latency. Latency is generally higher in a distributed FS;
nfs-ganesha has an extra hop to the backend, and hence higher latency for
most operations compared to glusterfs-fuse.

You don't necessarily need multiple clients for good performance with
gluster. Many multi-threaded benchmarks give good performance from a
single client. Here, for example, if you run multiple copy commands in
parallel from the same client, I'd expect your aggregate transfer rate to
improve.

It's been a long while since I looked at nfs-ganesha. But in terms of
upper bounds for throughput tests: data needs to flow over the
client->nfs-server link, and then, depending on which servers the file is
located on, either 1x (if the nfs-ganesha node is also hosting one copy
of the file, and neglecting the arbiter) or 2x over the server-to-server
link. With 1Gbps links, that means an upper bound between 125 MB/s and
62.5 MB/s in the steady state, unless I miscalculated.

-- Manoj

> * I had the 'cluster.optimize-lookup' option enabled but ran into all
> sorts of issues where ls is showing either the wrong files (the content
> of a different directory), or claiming a directory does not exist when
> mkdir says it already exists... I currently have the following options
> set:
>
> server.outstanding-rpc-limit: 256
> client.event-threads: 4
> performance.io-thread-count: 16
> performance.parallel-readdir: on
> server.event-threads: 4
> performance.cache-size: 2GB
> performance.rda-cache-limit: 128MB
> performance.write-behind-window-size: 8MB
> performance.md-cache-timeout: 600
> performance.cache-invalidation: on
> performance.stat-prefetch: on
> network.inode-lru-limit: 50
> performance.nl-cache-timeout: 600
> performance.nl-cache: on
> features.cache-invalidation-timeout: 600
> features.cache-invalidation: on
> transport.address-family: inet
> nfs.disable: on
> cluster.enable-shared-storage: enable
>
> The brick servers have 2 dual-core cpus, so I've set the client and
> server event threads to 4.
>
> * When using nfs-ganesha I run into bugs that make me wonder who is
> using nfs-ganesha with
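A rough way to test the single-client parallelism point Manoj makes
above. The paths and file count are examples; compare the aggregate MB/s
against a single cp of the same total size:

  for i in 1 2 3 4; do
      cp /mnt/glusterfs/src/file$i /mnt/glusterfs/dst/ &
  done
  wait   # all four copies run concurrently from the one client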
Re: [Gluster-users] [Gluster-devel] CFP for Gluster Developer Summit
Here's a proposal ...

Title: State of Gluster Performance

Theme: Stability and Performance

I hope to achieve the following in this talk:

* present a brief overview of current performance for the broad workload
classes: large-file sequential and random workloads, small-file and
metadata-intensive workloads.
* highlight some use cases where we are seeing really good performance.
* highlight some of the areas of concern, covering in some detail the
state of analysis and work in progress.

Regards,
Manoj

- Original Message -
> Hey All,
>
> Gluster Developer Summit 2016 is fast approaching [1]. We are looking
> to have talks and discussions related to the following themes at the
> summit:
>
> 1. Gluster.Next - focusing on features shaping the future of Gluster
>
> 2. Experience - description of real-world experience and feedback from:
>    a> Devops and users deploying Gluster in production
>    b> Developers integrating Gluster with other ecosystems
>
> 3. Use cases - focusing on key use cases that drive Gluster.today and
> Gluster.Next
>
> 4. Stability & Performance - focusing on current improvements to reduce
> our technical debt backlog
>
> 5. Process & infrastructure - focusing on improving current workflow
> and infrastructure to make life easier for all of us!
>
> If you have a talk/discussion proposal that fits these themes, please
> send out your proposal(s) by replying to this thread. Please clearly
> mention the theme for which your proposal is relevant when you do so.
> The CFP will close at 12 midnight PDT on August 31st, 2016.
>
> If you have other topics that do not fit the themes listed, please feel
> free to propose them, and we might be able to accommodate some of them
> as lightning talks or something similar.
>
> Please do reach out to me or Amye if you have any questions.
>
> Thanks!
> Vijay
>
> [1] https://www.gluster.org/events/summit2016/