Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Dear Rafi, all,

please find attached two profile files; both profile the same command:

```
time rsync -a $SRC root@172.23.187.207:/glusterfs
```

In both cases the target is an Ubuntu 16.04 VM mounting a pure distributed GlusterFS 7 filesystem on `/glusterfs`. The GlusterFS 7 cluster consists of 3 identical Ubuntu 16.04 VMs, each with one 200GB brick. I have turned ctime off, as suggested in a previous email.

In the case of `profile1.txt` (the larger transfer), $SRC is a directory tree containing ~94'500 files, collectively weighing ~376GB. The transfer takes between 250 and 300 minutes (I've made several attempts now), for an average bandwidth of ~21MB/s.

In the case of `profile3.txt`, $SRC is a single tar file weighing 72GB. It takes between 16 and 30 minutes to write it into the GlusterFS 7 filesystem; average bandwidth is ~60MB/s.

To me, this seems to indicate that, while write performance on data is good, metadata ops on GlusterFS 7 are rather slow, and much slower than in the 3.x series. Is there any other tweak that I may try to apply?

Thanks,
Riccardo

Brick: server001:/srv/glusterfs
---
Cumulative Stats:

   Block Size:        2b+       4b+       8b+
 No. of Reads:          0         0         0
No. of Writes:          4         2         9

   Block Size:       16b+      32b+      64b+
 No. of Reads:          0         0         0
No. of Writes:     322837

   Block Size:      128b+     256b+     512b+
 No. of Reads:          0         0         0
No. of Writes:        108       297       575

   Block Size:     1024b+    2048b+    4096b+
 No. of Reads:          0         0         0
No. of Writes:        984      1994      4334

   Block Size:     8192b+   16384b+   32768b+
 No. of Reads:          0         0         0
No. of Writes:       8304     16146     33163

   Block Size:    65536b+  131072b+  262144b+
 No. of Reads:          0         0         0
No. of Writes:      64431   4173687       207

 %-latency  Avg-latency  Min-Latency   Max-Latency  No. of calls  Fop
 ---------  -----------  -----------   -----------  ------------  ---
      0.00      0.00 us      0.00 us       0.00 us        114999  FORGET
      0.00      0.00 us      0.00 us       0.00 us        130690  RELEASE
      0.00      0.00 us      0.00 us       0.00 us          1886  RELEASEDIR
      0.00    449.84 us    222.38 us    2924.41 us            31  MKNOD
      0.01    177.17 us    104.16 us     436.34 us           281  LINK
      0.01    128.41 us     44.77 us    2512.16 us           414  SETXATTR
      0.01     82.37 us     29.26 us    4792.82 us          1319  FSTAT
      0.02    414.20 us    181.30 us   10818.54 us           414  MKDIR
      0.02    741.77 us    103.38 us  158364.81 us           281  UNLINK
      0.29     77.67 us     17.57 us    4887.93 us         34118  STATFS
      0.37     50.29 us     10.83 us    7341.33 us         67134  FLUSH
      0.73     46.83 us      9.41 us    7990.78 us        140764  ENTRYLK
      0.85    219.73 us     75.32 us  125748.37 us         35106  RENAME
      1.21     51.71 us     10.32 us   10816.56 us        211798  INODELK
      1.33    377.95 us     38.65 us  150160.77 us         31778  FTRUNCATE
      1.64     97.82 us     23.85 us    6386.67 us        151947  STAT
      3.25    123.40 us     35.48 us   43775.28 us        238449  SETATTR
      4.31    581.91 us    123.91 us  261522.99 us         67134  CREATE
      7.30    137.63 us     23.45 us  153825.73 us        480472  LOOKUP
     78.65    321.99 us     32.95 us  169393.74 us       2211891  WRITE

Duration: 156535 seconds
Data Read: 0 bytes
Data Written: 555615327151 bytes

Interval 2 Stats:

   Block Size:        2b+       8b+      16b+
 No. of Reads:          0         0         0
No. of Writes:      3 624

   Block Size:       32b+      64b+     128b+
 No. of Reads:          0         0         0
No. of Writes:      21277
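As a sanity check on the figures above, the two average bandwidths can be recomputed from the quoted sizes and durations (decimal GB assumed; durations picked from the quoted ranges):

```shell
# Directory tree: ~376 GB over ~300 minutes (slower end of the 250-300 min range)
awk 'BEGIN { printf "%.1f MB/s\n", 376000 / (300 * 60) }'   # -> 20.9 MB/s
# Single 72 GB tar file over ~20 minutes (mid-range of the 16-30 min observations)
awk 'BEGIN { printf "%.1f MB/s\n", 72000 / (20 * 60) }'     # -> 60.0 MB/s
```

Both match the ~21MB/s and ~60MB/s cited in the message.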
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
On 11/5/19 4:53 PM, Riccardo Murri wrote:
> > Is it possible for you to repeat the test by disabling ctime or increasing
> > the inode size to a higher value, say 1024?
>
> Sure! How do I disable ctime or increase the inode size?
>
> Would this suffice to disable `ctime`?
>
>     sudo gluster volume set glusterfs ctime off
>
> Can it be done on a running cluster? Do I need to unmount and remount for
> clients to see the effect?

This is good enough. There is no need to unmount.

Thanks,
R

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
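For readers following along, the exchange above boils down to two commands; a minimal sketch, assuming the volume is named `glusterfs` as elsewhere in this thread:

```shell
# Turn the ctime feature off on the live volume (no client remount needed)
sudo gluster volume set glusterfs ctime off
# Confirm the option's new value
sudo gluster volume get glusterfs ctime
```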
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
> > Is it possible for you to repeat the test by disabling ctime or increasing
> > the inode size to a higher value, say 1024?
>
> Sure! How do I disable ctime or increase the inode size?

Would this suffice to disable `ctime`?

    sudo gluster volume set glusterfs ctime off

Can it be done on a running cluster? Do I need to unmount and remount for clients to see the effect?

Thanks,
R
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Hello Rafi,

many thanks for looking into this!

> Is it possible for you to repeat the test by disabling ctime or increasing
> the inode size to a higher value, say 1024?

Sure! How do I disable ctime or increase the inode size?

Ciao,
R
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
On 11/4/19 2:41 PM, Riccardo Murri wrote:
> Hello Amar,
>
> > Can you please check the profile info [1]? That may give some hints.
>
> I am attaching the output of `sudo gluster volume profile info` as a text
> file to preserve formatting. This covers the time from Friday night to
> Monday morning; during this time the cluster has been the target of an
> `rsync` command copying a large directory tree (14TB; it's taking more
> than 2 weeks now...).
>
> My take from cursorily reading the profile:
>
> - metadata operations (opendir, create, rename) seem to have very high
>   latency (up to 25 seconds for a rename!)

We have identified a performance drop with ctime and rename operations. This is because ctime stores some metadata information as extended attributes, which sometimes exceed the default inode size. In such scenarios the additional xattrs won't fit into the default size; this results in additional blocks being used, which affects the latency.

Is it possible for you to repeat the test by disabling ctime or increasing the inode size to a higher value, say 1024?

> - large block sizes (>128kB) are rare; most blocks seem to be 8kB, 16kB,
>   or 128kB.
>
> I personally do not see any obvious culprit or optimization
> opportunities; is there something I'm missing that catches the eye of
> someone more experienced?
>
> Thanks,
> R
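For context on the inode-size half of the suggestion: XFS inode size is fixed at mkfs time, so testing with 1024-byte inodes means recreating the brick filesystem. A hedged sketch (the device name `/dev/vdb1` is a placeholder):

```shell
# Inspect the current inode size of an XFS brick filesystem
xfs_info /srv/glusterfs | grep -o 'isize=[0-9]*'
# Recreating the filesystem with larger inodes DESTROYS its contents;
# only do this on an empty or spare brick device:
# sudo mkfs.xfs -f -i size=1024 /dev/vdb1
```

With 1024-byte inodes, the ctime and other Gluster xattrs are more likely to fit inline in the inode, avoiding the extra block reads Rafi describes.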
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Hello Strahil,

> You can set your mounts with 'noatime,nodiratime' options for better
> performance.

Thanks for the suggestion! I'll try that eventually, but I don't think `noatime` will make much difference on a write-mostly workload.

Thanks,
R
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Hello Amar,

> Can you please check the profile info [1]? That may give some hints.

I am attaching the output of `sudo gluster volume profile info` as a text file to preserve formatting. This covers the time from Friday night to Monday morning; during this time the cluster has been the target of an `rsync` command copying a large directory tree (14TB; it's taking more than 2 weeks now...).

My take from cursorily reading the profile:

- metadata operations (opendir, create, rename) seem to have very high latency (up to 25 seconds for a rename!)
- large block sizes (>128kB) are rare; most blocks seem to be 8kB, 16kB, or 128kB.

I personally do not see any obvious culprit or optimization opportunities; is there something I'm missing that catches the eye of someone more experienced?

Thanks,
R

$ sudo gluster volume profile glusterfs info
Brick: glusterfs-server-001:/srv/glusterfs
--
Cumulative Stats:

   Block Size:        2b+       4b+       8b+
 No. of Reads:      21812
No. of Writes:     471522

   Block Size:       16b+      32b+      64b+
 No. of Reads:       3177       336
No. of Writes:         65      7869     47979

   Block Size:      128b+     256b+     512b+
 No. of Reads:        291       648      1125
No. of Writes:   8909125293   62952

   Block Size:     1024b+    2048b+    4096b+
 No. of Reads:       2054      3868      7903
No. of Writes:   11728293386  87850

   Block Size:     8192b+   16384b+   32768b+
 No. of Reads:      16084     35981     66646
No. of Writes:   10758668  6852707289622

   Block Size:    65536b+  131072b+  262144b+
 No. of Reads:     128659  15220392         0
No. of Writes:     239551  12723330       316

 %-latency  Avg-latency   Min-Latency    Max-Latency  No. of calls  Fop
 ---------  -----------   -----------    -----------  ------------  ---
      0.00      0.00 us       0.00 us        0.00 us        760721  FORGET
      0.00      0.00 us       0.00 us        0.00 us        877203  RELEASE
      0.00      0.00 us       0.00 us        0.00 us         16092  RELEASEDIR
      0.00  16174.60 us     554.21 us    31794.98 us             2  READDIRP
      0.00    185.11 us      82.75 us      660.68 us           377  OPEN
      0.00  95974.52 us   95974.52 us    95974.52 us             1  OPENDIR
      0.01  59611.16 us     444.39 us   160460.53 us             3  SYMLINK
      0.01    217.87 us      72.47 us    15655.59 us          1337  SETXATTR
      0.30     79.48 us      16.15 us     6242.48 us        123275  FLUSH
      0.31    120.49 us      30.68 us    10641.08 us         83225  FSTAT
      0.50  12161.43 us     273.52 us   298385.31 us          1337  MKDIR
      0.86    129.11 us      32.64 us     9029.93 us        216678  STATFS
      1.35     89.34 us      12.29 us    16695.73 us        491862  ENTRYLK
      2.09     91.83 us      11.85 us    20615.78 us        737946  INODELK
      2.38    628.28 us     115.01 us  25374396.01 us       122889  RENAME
      2.42   2080.96 us      36.39 us   403164.19 us         37810  READ
      2.74    179.20 us      49.63 us   1041137.91 us       497183  SETATTR
      6.34    154.95 us      33.60 us     17451.12 us      1329171  STAT
     11.35    201.91 us      31.14 us    181613.31 us      1824910  LOOKUP
     28.65   7568.96 us     169.02 us   1888964.31 us       122898  CREATE
     40.68    517.99 us      58.77 us   1076940.31 us      2549418  WRITE

Duration: 2936614 seconds
Data Read: 2012268179193 bytes
Data Written: 2005120204522 bytes

Interval 0 Stats:

   Block Size:        2b+       4b+       8b+
 No. of Reads:      21812
No. of Writes:     471522

   Block Size:       16b+      32b+      64b+
 No. of Reads:       3177       336
No. of Writes:         65      7869     47979

   Block Size:      128b+     256b+     512b+
 No. of Reads:        291       648
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Hm... this seems to be a cluster-wide effect rather than a single brick. In order to make things faster, can you remount (mount -o remount,noatime,nodiratime /gluster_brick/) all bricks in the same volume and take the test again? I think I saw your gluster bricks are mounted without these options. Also, are you using XFS as the brick FS?

Best Regards,
Strahil Nikolov

On Nov 1, 2019 21:21, Riccardo Murri wrote:
> Dear Strahil,
>
> > Have you noticed if slowness is only when accessing the files from
> > specific node ?
>
> I am copying a large set of image files into the GlusterFS volume --
> slowness is in the aggregated performance (e.g., it takes ~300 minutes
> to copy 376GB worth of files). Given the high number of files
> (O(100'000)), I guess they're +/- equally distributed across nodes.
> Report of `df -h` across server nodes shows no imbalance.
>
> Thanks,
> R
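Spelled out, the remount suggestion amounts to something like the following (brick mount point as used on these servers; a sketch, not verified on this cluster):

```shell
# Remount the brick filesystem without access-time updates (takes effect immediately)
sudo mount -o remount,noatime,nodiratime /srv/glusterfs
# To persist across reboots, add the options to the brick's /etc/fstab entry, e.g.:
#   /dev/vdb1  /srv/glusterfs  xfs  defaults,noatime,nodiratime  0 0
```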
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Dear Strahil,

> Have you noticed if slowness is only when accessing the files from
> specific node ?

I am copying a large set of image files into the GlusterFS volume -- slowness is in the aggregated performance (e.g., it takes ~300 minutes to copy 376GB worth of files). Given the high number of files (O(100'000)), I guess they're +/- equally distributed across nodes. Report of `df -h` across server nodes shows no imbalance.

Thanks,
R
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Dear Amar,

> Can you please check the profile info [1] ? That may give some hints.

I have started profiling, will check what info has been collected on Monday. Many thanks for the suggestion!

Riccardo
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Hi Riccardo,

Can you please check the profile info [1] ? That may give some hints.

[1] - https://docs.gluster.org/en/latest/Administrator%20Guide/Monitoring%20Workload/

On Fri, 1 Nov, 2019, 9:55 AM Riccardo Murri wrote:
> Hello all,
>
> I have done some further testing and found out that I get the bad
> performance with a freshly-installed cluster running 6.6. Also the
> performance drop is there with plain `rsync` into the GlusterFS
> mountpoint, so SAMBA plays no role in it. In other words, for my
> installations, performance of 6.5 and 6.6 is *half* of what 3.8 used
> to deliver.
>
> Was any default option changed (e.g., in the FUSE client) or in the
> xlator stack that I can start looking at as a potential culprit? Or
> any direction where to look at for debugging?
>
> Thanks for any help!
>
> Riccardo
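The profiling workflow from the linked Monitoring Workload guide is, in outline (volume name `glusterfs` as used in this thread):

```shell
sudo gluster volume profile glusterfs start   # begin collecting per-FOP latency stats on each brick
# ... run the workload under test, e.g. the rsync ...
sudo gluster volume profile glusterfs info    # print cumulative and per-interval stats
sudo gluster volume profile glusterfs stop    # stop collecting when done
```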
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
I'm using replicated volumes. In your case, you got a distributed volume. Have you noticed if slowness is only when accessing the files from a specific node?

Best Regards,
Strahil Nikolov

On Nov 1, 2019 17:28, Riccardo Murri wrote:
> Hello Strahil,
>
> > What options do you use in your cluster?
>
> I'm not sure what exact info you would like to see?
>
> Here's how clients mount the GlusterFS volume:
> ```
> $ fgrep gluster /proc/mounts
> tp-glusterfs5:/glusterfs /net/glusterfs fuse.glusterfs
> rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072
> 0 0
> ```
>
> Here's some server-side info:
>
> ```
> $ sudo gluster volume info glusterfs
>
> Volume Name: glusterfs
> Type: Distribute
> Volume ID: a3358ff6-5cec-4a65-9ecf-a63bbe56dfd9
> Status: Started
> Snapshot Count: 0
> Number of Bricks: 5
> Transport-type: tcp
> Bricks:
> Brick1: pelkmanslab-tp-glusterfs5:/srv/glusterfs
> Brick2: pelkmanslab-tp-glusterfs4:/srv/glusterfs
> Brick3: pelkmanslab-tp-glusterfs3:/srv/glusterfs
> Brick4: pelkmanslab-tp-glusterfs1:/srv/glusterfs
> Brick5: pelkmanslab-tp-glusterfs2:/srv/glusterfs
> Options Reconfigured:
> diagnostics.client-log-level: WARNING
> diagnostics.brick-log-level: INFO
> features.uss: disable
> features.barrier: disable
> performance.client-io-threads: on
> transport.address-family: inet
> nfs.disable: on
> snap-activate-on-create: enable
>
> $ sudo gluster volume get all all
> Option                             Value
> ------                             -----
> cluster.server-quorum-ratio        51
> cluster.enable-shared-storage      disable
> cluster.op-version                 6
> cluster.max-op-version             6
> cluster.brick-multiplex            disable
> cluster.max-bricks-per-process     250
> cluster.localtime-logging          disable
> cluster.daemon-log-level           INFO
>
> $ cat /etc/glusterfs/glusterd.vol
> volume management
>     type mgmt/glusterd
>     option working-directory /var/lib/glusterd
>     option transport-type socket,rdma
>     option transport.socket.keepalive-time 10
>     option transport.socket.keepalive-interval 2
>     option transport.socket.read-fail-log off
>     option transport.socket.listen-port 24007
>     option transport.rdma.listen-port 24008
>     option ping-timeout 0
>     option event-threads 1
>     # option lock-timer 180
>     # option transport.address-family inet6
>     # option base-port 49152
>     option max-port 60999
> end-volume
> ```
>
> Both server and clients are running v6.5 and op-version is 6 everywhere.
>
> Thanks,
> Riccardo
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Hello Strahil,

> What options do you use in your cluster?

I'm not sure what exact info you would like to see?

Here's how clients mount the GlusterFS volume:
```
$ fgrep gluster /proc/mounts
tp-glusterfs5:/glusterfs /net/glusterfs fuse.glusterfs
rw,nosuid,nodev,relatime,user_id=0,group_id=0,default_permissions,allow_other,max_read=131072
0 0
```

Here's some server-side info:

```
$ sudo gluster volume info glusterfs

Volume Name: glusterfs
Type: Distribute
Volume ID: a3358ff6-5cec-4a65-9ecf-a63bbe56dfd9
Status: Started
Snapshot Count: 0
Number of Bricks: 5
Transport-type: tcp
Bricks:
Brick1: pelkmanslab-tp-glusterfs5:/srv/glusterfs
Brick2: pelkmanslab-tp-glusterfs4:/srv/glusterfs
Brick3: pelkmanslab-tp-glusterfs3:/srv/glusterfs
Brick4: pelkmanslab-tp-glusterfs1:/srv/glusterfs
Brick5: pelkmanslab-tp-glusterfs2:/srv/glusterfs
Options Reconfigured:
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: INFO
features.uss: disable
features.barrier: disable
performance.client-io-threads: on
transport.address-family: inet
nfs.disable: on
snap-activate-on-create: enable

$ sudo gluster volume get all all
Option                             Value
------                             -----
cluster.server-quorum-ratio        51
cluster.enable-shared-storage      disable
cluster.op-version                 6
cluster.max-op-version             6
cluster.brick-multiplex            disable
cluster.max-bricks-per-process     250
cluster.localtime-logging          disable
cluster.daemon-log-level           INFO

$ cat /etc/glusterfs/glusterd.vol
volume management
    type mgmt/glusterd
    option working-directory /var/lib/glusterd
    option transport-type socket,rdma
    option transport.socket.keepalive-time 10
    option transport.socket.keepalive-interval 2
    option transport.socket.read-fail-log off
    option transport.socket.listen-port 24007
    option transport.rdma.listen-port 24008
    option ping-timeout 0
    option event-threads 1
    # option lock-timer 180
    # option transport.address-family inet6
    # option base-port 49152
    option max-port 60999
end-volume
```

Both server and clients are running v6.5 and op-version is 6 everywhere.

Thanks,
Riccardo
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
What options do you use in your cluster?

Best Regards,
Strahil Nikolov

On Nov 1, 2019 06:24, Riccardo Murri wrote:
> Hello all,
>
> I have done some further testing and found out that I get the bad
> performance with a freshly-installed cluster running 6.6. Also the
> performance drop is there with plain `rsync` into the GlusterFS
> mountpoint, so SAMBA plays no role in it. In other words, for my
> installations, performance of 6.5 and 6.6 is *half* of what 3.8 used
> to deliver.
>
> Was any default option changed (e.g., in the FUSE client) or in the
> xlator stack that I can start looking at as a potential culprit? Or
> any direction where to look at for debugging?
>
> Thanks for any help!
>
> Riccardo
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Hello all,

I have done some further testing and found out that I get the bad performance with a freshly-installed cluster running 6.6. Also, the performance drop is there with plain `rsync` into the GlusterFS mountpoint, so SAMBA plays no role in it. In other words, for my installations, performance of 6.5 and 6.6 is *half* of what 3.8 used to deliver.

Was any default option changed (e.g., in the FUSE client) or in the xlator stack that I can start looking at as a potential culprit? Or any direction where to look for debugging?

Thanks for any help!

Riccardo
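One way to chase changed defaults is to capture the effective option set on each version and diff them; a sketch, assuming a still-reachable 3.8 host (the hostname `old-server` is a placeholder):

```shell
# Dump every effective volume option on each cluster, then compare
ssh old-server 'sudo gluster volume get glusterfs all' > options-3.8.txt
sudo gluster volume get glusterfs all > options-6.6.txt
diff -u options-3.8.txt options-6.6.txt
```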
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
> In previous discussions it was confirmed by others that v5.5 is a little bit
> slower than v3.12, but I think that most of those issues were fixed in v6.
> What was the exact version you have?

6.5 according to the package version; op-version is 6.

Thanks,
Riccardo
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
In previous discussions it was confirmed by others that v5.5 is a little bit slower than v3.12 , but I think that most of those issues were fixed in v6 . What was the exact version you have? Best Regards, Strahil NikolovOn Oct 29, 2019 12:50, Riccardo Murri wrote: > > Hello Anoop, > > many thanks for your fast reply! My comments inline below: > > > > > [1]: I have tried both the config where SAMBA 4.8 is using the > > > vfs_glusterfs.so backend, and the one where `smbd` is just writing to > > > a locally-mounted directory. Doesn't seem to make a difference. > > > > Samba v4.8 is an EOL ed version. Please consider updating Samba to at > > least v4.9(rather v4.10) or higher. > > This is going to be tricky: I could find no backport package of recent > SAMBA to Ubuntu 16.04; I am using this one which has SAMBA 4.8 > https://launchpad.net/~mumblepins > > More recent packages from either the Ubuntu or Debian repositories do > not build on Ubuntu 16.04 because of changes in the packaging > infrastructure. > > Anyway, I was running SAMBA 4.8 before the upgrade and still getting > 40MB/s, so I don't think SAMBA is the core of the issue... > > > Can you paste the output of `testparm -s` along with the output of > > `gluster volume info ` ? > > Here's `testparm -s` on the server using `vfs_glusterfs` (the "active" > share is the one with the perf problems):: > > ``` > $ testparm -s > Load smb config files from /etc/samba/smb.conf > rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384) > WARNING: The "syslog only" option is deprecated > Processing section "[homes]" > Processing section "[active]" > Loaded services file OK. > WARNING: some services use vfs_fruit, others don't. Mounting them in > conjunction on OS X clients results in undefined behaviour. 
> > Server role: ROLE_STANDALONE > > # Global parameters > [global] > dns proxy = No > load printers = No > map to guest = Bad User > name resolve order = lmhosts > netbios name = REDACTED1 > obey pam restrictions = Yes > pam password change = Yes > passwd chat = *Enter\snew\s*\spassword:* %n\n > *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* . > passwd program = /usr/bin/passwd %u > printcap cache time = 0 > printcap name = /dev/null > security = USER > server role = standalone server > server string = SAMBA Server %v > syslog only = Yes > unix password sync = Yes > workgroup = REDACTED > idmap config * : backend = tdb > > > [homes] > browseable = No > comment = Work Directories > create mask = 0700 > directory mask = 0700 > read only = No > valid users = %S > vfs objects = fruit streams_xattr > > > [active] > create mask = 0775 > directory mask = 0775 > kernel share modes = No > path = /active > read only = No > vfs objects = glusterfs > glusterfs:volume = glusterfs > glusterfs:volfile_server = glusterfs5 glusterfs4 glusterfs3 > glusterfs2 glusterfs1 > glusterfs:logfile = /var/log/samba/glusterfs-vol-active.log > glusterfs:loglevel = 1 > ``` > > > Here's `testparm -s` on the server writing directly to the GlusterFS > mount point:: > > ``` > $ testparm -s > Load smb config files from /etc/samba/smb.conf > rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384) > WARNING: The "syslog only" option is deprecated > Processing section "[homes]" > Processing section "[active]" > Loaded services file OK. > WARNING: some services use vfs_fruit, others don't. Mounting them in > conjunction on OS X clients results in undefined behaviour. 
> > Server role: ROLE_STANDALONE > > # Global parameters > [global] > allow insecure wide links = Yes > dns proxy = No > load printers = No > map to guest = Bad User > name resolve order = lmhosts > netbios name = REDACTED2 > obey pam restrictions = Yes > pam password change = Yes > passwd chat = *Enter\snew\s*\spassword:* %n\n > *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* . > passwd program = /usr/bin/passwd %u > printcap cache time = 0 > printcap name = /dev/null > security = USER > server role = standalone server > server string = SAMBA Server %v > syslog only = Yes > unix password sync = Yes > workgroup = REDACTED > idmap config * : backend = tdb > > > [homes] > browseable = No > comment = Work Directories > create mask = 0700 > directory mask = 0700 > read only = No > valid users = %S > vfs objects = fruit streams_xattr > > > [active] > create mask = 0775 > directory mask = 0775 > path = /data/active > read only = No > wide links = Yes > ``` > > Here's the volume info: > ``` > $ sudo gluster volume info glusterfs > > Volume Name: glusterfs > Type: Distribute > Volume ID
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Hi Riccardo, You can set your mounts with 'noatime,nodiratime' options for better performance. Best Regards, Strahil NikolovOn Oct 29, 2019 12:50, Riccardo Murri wrote: > > Hello Anoop, > > many thanks for your fast reply! My comments inline below: > > > > > [1]: I have tried both the config where SAMBA 4.8 is using the > > > vfs_glusterfs.so backend, and the one where `smbd` is just writing to > > > a locally-mounted directory. Doesn't seem to make a difference. > > > > Samba v4.8 is an EOL ed version. Please consider updating Samba to at > > least v4.9(rather v4.10) or higher. > > This is going to be tricky: I could find no backport package of recent > SAMBA to Ubuntu 16.04; I am using this one which has SAMBA 4.8 > https://launchpad.net/~mumblepins > > More recent packages from either the Ubuntu or Debian repositories do > not build on Ubuntu 16.04 because of changes in the packaging > infrastructure. > > Anyway, I was running SAMBA 4.8 before the upgrade and still getting > 40MB/s, so I don't think SAMBA is the core of the issue... > > > Can you paste the output of `testparm -s` along with the output of > > `gluster volume info ` ? > > Here's `testparm -s` on the server using `vfs_glusterfs` (the "active" > share is the one with the perf problems):: > > ``` > $ testparm -s > Load smb config files from /etc/samba/smb.conf > rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384) > WARNING: The "syslog only" option is deprecated > Processing section "[homes]" > Processing section "[active]" > Loaded services file OK. > WARNING: some services use vfs_fruit, others don't. Mounting them in > conjunction on OS X clients results in undefined behaviour. 
> > Server role: ROLE_STANDALONE > > # Global parameters > [global] > dns proxy = No > load printers = No > map to guest = Bad User > name resolve order = lmhosts > netbios name = REDACTED1 > obey pam restrictions = Yes > pam password change = Yes > passwd chat = *Enter\snew\s*\spassword:* %n\n > *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* . > passwd program = /usr/bin/passwd %u > printcap cache time = 0 > printcap name = /dev/null > security = USER > server role = standalone server > server string = SAMBA Server %v > syslog only = Yes > unix password sync = Yes > workgroup = REDACTED > idmap config * : backend = tdb > > > [homes] > browseable = No > comment = Work Directories > create mask = 0700 > directory mask = 0700 > read only = No > valid users = %S > vfs objects = fruit streams_xattr > > > [active] > create mask = 0775 > directory mask = 0775 > kernel share modes = No > path = /active > read only = No > vfs objects = glusterfs > glusterfs:volume = glusterfs > glusterfs:volfile_server = glusterfs5 glusterfs4 glusterfs3 > glusterfs2 glusterfs1 > glusterfs:logfile = /var/log/samba/glusterfs-vol-active.log > glusterfs:loglevel = 1 > ``` > > > Here's `testparm -s` on the server writing directly to the GlusterFS > mount point:: > > ``` > $ testparm -s > Load smb config files from /etc/samba/smb.conf > rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384) > WARNING: The "syslog only" option is deprecated > Processing section "[homes]" > Processing section "[active]" > Loaded services file OK. > WARNING: some services use vfs_fruit, others don't. Mounting them in > conjunction on OS X clients results in undefined behaviour. 
> > Server role: ROLE_STANDALONE > > # Global parameters > [global] > allow insecure wide links = Yes > dns proxy = No > load printers = No > map to guest = Bad User > name resolve order = lmhosts > netbios name = REDACTED2 > obey pam restrictions = Yes > pam password change = Yes > passwd chat = *Enter\snew\s*\spassword:* %n\n > *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* . > passwd program = /usr/bin/passwd %u > printcap cache time = 0 > printcap name = /dev/null > security = USER > server role = standalone server > server string = SAMBA Server %v > syslog only = Yes > unix password sync = Yes > workgroup = REDACTED > idmap config * : backend = tdb > > > [homes] > browseable = No > comment = Work Directories > create mask = 0700 > directory mask = 0700 > read only = No > valid users = %S > vfs objects = fruit streams_xattr > > > [active] > create mask = 0775 > directory mask = 0775 > path = /data/active > read only = No > wide links = Yes > ``` > > Here's the volume info: > ``` > $ sudo gluster volume info glusterfs > > Volume Name: glusterfs > Type: Distribute > Volume ID: a3358ff6-5cec-4a65-9ecf-a63bbe56dfd9 > Status: Started > Snapshot Count: 0 > Number of Br
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Hello Anoop, many thanks for your fast reply! My comments inline below:

> > [1]: I have tried both the config where SAMBA 4.8 is using the
> > vfs_glusterfs.so backend, and the one where `smbd` is just writing to
> > a locally-mounted directory. Doesn't seem to make a difference.
>
> Samba v4.8 is an EOL'd version. Please consider updating Samba to at
> least v4.9 (rather v4.10) or higher.

This is going to be tricky: I could find no backport package of a recent SAMBA for Ubuntu 16.04; I am using this one, which ships SAMBA 4.8: https://launchpad.net/~mumblepins

More recent packages from either the Ubuntu or Debian repositories do not build on Ubuntu 16.04 because of changes in the packaging infrastructure. Anyway, I was running SAMBA 4.8 before the upgrade and was still getting 40MB/s, so I don't think SAMBA is the core of the issue...

> Can you paste the output of `testparm -s` along with the output of
> `gluster volume info`?

Here's `testparm -s` on the server using `vfs_glusterfs` (the "active" share is the one with the performance problems):

```
$ testparm -s
Load smb config files from /etc/samba/smb.conf
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
WARNING: The "syslog only" option is deprecated
Processing section "[homes]"
Processing section "[active]"
Loaded services file OK.
WARNING: some services use vfs_fruit, others don't. Mounting them in
conjunction on OS X clients results in undefined behaviour.

Server role: ROLE_STANDALONE

# Global parameters
[global]
	dns proxy = No
	load printers = No
	map to guest = Bad User
	name resolve order = lmhosts
	netbios name = REDACTED1
	obey pam restrictions = Yes
	pam password change = Yes
	passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* .
	passwd program = /usr/bin/passwd %u
	printcap cache time = 0
	printcap name = /dev/null
	security = USER
	server role = standalone server
	server string = SAMBA Server %v
	syslog only = Yes
	unix password sync = Yes
	workgroup = REDACTED
	idmap config * : backend = tdb


[homes]
	browseable = No
	comment = Work Directories
	create mask = 0700
	directory mask = 0700
	read only = No
	valid users = %S
	vfs objects = fruit streams_xattr


[active]
	create mask = 0775
	directory mask = 0775
	kernel share modes = No
	path = /active
	read only = No
	vfs objects = glusterfs
	glusterfs:volume = glusterfs
	glusterfs:volfile_server = glusterfs5 glusterfs4 glusterfs3 glusterfs2 glusterfs1
	glusterfs:logfile = /var/log/samba/glusterfs-vol-active.log
	glusterfs:loglevel = 1
```

Here's `testparm -s` on the server writing directly to the GlusterFS mount point:

```
$ testparm -s
Load smb config files from /etc/samba/smb.conf
rlimit_max: increasing rlimit_max (1024) to minimum Windows limit (16384)
WARNING: The "syslog only" option is deprecated
Processing section "[homes]"
Processing section "[active]"
Loaded services file OK.
WARNING: some services use vfs_fruit, others don't. Mounting them in
conjunction on OS X clients results in undefined behaviour.

Server role: ROLE_STANDALONE

# Global parameters
[global]
	allow insecure wide links = Yes
	dns proxy = No
	load printers = No
	map to guest = Bad User
	name resolve order = lmhosts
	netbios name = REDACTED2
	obey pam restrictions = Yes
	pam password change = Yes
	passwd chat = *Enter\snew\s*\spassword:* %n\n *Retype\snew\s*\spassword:* %n\n *password\supdated\ssuccessfully* .
	passwd program = /usr/bin/passwd %u
	printcap cache time = 0
	printcap name = /dev/null
	security = USER
	server role = standalone server
	server string = SAMBA Server %v
	syslog only = Yes
	unix password sync = Yes
	workgroup = REDACTED
	idmap config * : backend = tdb


[homes]
	browseable = No
	comment = Work Directories
	create mask = 0700
	directory mask = 0700
	read only = No
	valid users = %S
	vfs objects = fruit streams_xattr


[active]
	create mask = 0775
	directory mask = 0775
	path = /data/active
	read only = No
	wide links = Yes
```

Here's the volume info:

```
$ sudo gluster volume info glusterfs

Volume Name: glusterfs
Type: Distribute
Volume ID: a3358ff6-5cec-4a65-9ecf-a63bbe56dfd9
Status: Started
Snapshot Count: 0
Number of Bricks: 5
Transport-type: tcp
Bricks:
Brick1: glusterfs5:/srv/glusterfs
Brick2: glusterfs4:/srv/glusterfs
Brick3: glusterfs3:/srv/glusterfs
Brick4: glusterfs1:/srv/glusterfs
Brick5: glusterfs2:/srv/glusterfs
Options Reconfigured:
diagnostics.client-log-level: WARNING
diagnostics.brick-log-level: INFO
features.uss: disable
features.barrier: disable
performance.client-io-threads: on
transport.address-family: inet
nfs.disable: on
snap-activate-on-create: enable
```

> > [2]: Actually, since the servers are VMs on an OpenStack cloud, I
> > created new virtual machines, installed GlusterFS 6 fresh
Re: [Gluster-users] Performance drop when upgrading from 3.8 to 6.5
On Tue, 2019-10-29 at 10:59 +0100, Riccardo Murri wrote:
> Hello,
>
> I recently upgraded[2] our servers from GlusterFS 3.8 (old GlusterFS
> repo for Ubuntu 16.04) to 6.0 (gotten from the GlusterFS PPA for
> Ubuntu 16.04 "xenial").
>
> The sustained write performance dropped to nearly half of what it was
> before. We copy a large number (a few 10'000s) of image files (each 2
> to 10 MB in size) from the microscope where they were acquired to a
> SAMBA server which mounts[1] the GlusterFS volume; before the upgrade,
> we could write at about 40MB/s; after the upgrade, this dropped to
> 20MB/s.
>
> This is the version of server and client software installed:
> ```
> $ dpkg -l '*gluster*'
> Desired=Unknown/Install/Remove/Purge/Hold
> | Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
> |/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
> ||/ Name              Version              Architecture  Description
> +++-=================-====================-=============-=================
> ii  glusterfs-client  6.5-ubuntu1~xenial1  amd64         clustered file-system (client package)
> ii  glusterfs-common  6.5-ubuntu1~xenial1  amd64         GlusterFS common libraries and translator modules
> ii  glusterfs-server  6.5-ubuntu1~xenial1  amd64         clustered file-system (server package)
> ```
> Op version has been upped to 6:
> ```
> $ sudo gluster volume get all cluster.op-version
> Option                    Value
> ------                    -----
> cluster.op-version        6
>
> $ sudo gluster volume get all cluster.max-op-version
> Option                    Value
> ------                    -----
> cluster.max-op-version    6
> ```
>
> Running `sudo gluster volume status all clients` reports that all
> clients are on op-version 6, too.
>
> Any suggestions on what to look for or changes to try out?
>
> Thanks,
> Riccardo
>
> [1]: I have tried both the config where SAMBA 4.8 is using the
> vfs_glusterfs.so backend, and the one where `smbd` is just writing to
> a locally-mounted directory. Doesn't seem to make a difference.

Samba v4.8 is an EOL'd version. Please consider updating Samba to at least v4.9 (rather v4.10) or higher.

Can you paste the output of `testparm -s` along with the output of `gluster volume info`?

> [2]: Actually, since the servers are VMs on an OpenStack cloud, I
> created new virtual machines, installed GlusterFS 6 fresh, mounted
> the old bricks in the same brick locations,

How did you mount old bricks in the new location?

> and restarted the cluster. I had to fiddle a bit with the files in
> `/var/lib/glusterd` because the hostnames and IPs changed but did not
> do anything else than `sed -e s/old_hostname/new_hostname/` or
> similarly renaming files. In particular, I did not touch the extended
> attributes in the brick directory.

Community Meeting Calendar:

APAC Schedule -
Every 2nd and 4th Tuesday at 11:30 AM IST
Bridge: https://bluejeans.com/118564314

NA/EMEA Schedule -
Every 1st and 3rd Tuesday at 01:00 PM EDT
Bridge: https://bluejeans.com/118564314

Gluster-users mailing list
Gluster-users@gluster.org
https://lists.gluster.org/mailman/listinfo/gluster-users
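For reference, the per-brick profiles that appear later in this thread can be collected with `gluster volume profile`. Below is a minimal sketch wrapping the start/info/stop cycle around a single workload run; the volume name `glusterfs` is taken from this thread, while the `profile_run` helper and the `GLUSTER_CMD` override are illustrative, not part of the gluster CLI:

```shell
#!/bin/sh
# Capture a per-brick profile around one workload run.
# GLUSTER_CMD is an illustrative override so the sketch can be exercised
# without a live cluster; it defaults to the real CLI.
GLUSTER_CMD="${GLUSTER_CMD:-sudo gluster}"

profile_run() {
    vol="$1"; shift
    $GLUSTER_CMD volume profile "$vol" start
    "$@"                       # run the workload, e.g. the rsync transfer
    $GLUSTER_CMD volume profile "$vol" info > "profile-$vol.txt"
    $GLUSTER_CMD volume profile "$vol" stop
}

# On a live cluster this would look something like:
# profile_run glusterfs rsync -a "$SRC" target:/glusterfs
```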
[Gluster-users] Performance drop when upgrading from 3.8 to 6.5
Hello,

I recently upgraded[2] our servers from GlusterFS 3.8 (old GlusterFS repo for Ubuntu 16.04) to 6.0 (gotten from the GlusterFS PPA for Ubuntu 16.04 "xenial").

The sustained write performance dropped to nearly half of what it was before. We copy a large number (a few 10'000s) of image files (each 2 to 10 MB in size) from the microscope where they were acquired to a SAMBA server which mounts[1] the GlusterFS volume; before the upgrade, we could write at about 40MB/s; after the upgrade, this dropped to 20MB/s.

This is the version of server and client software installed:
```
$ dpkg -l '*gluster*'
Desired=Unknown/Install/Remove/Purge/Hold
| Status=Not/Inst/Conf-files/Unpacked/halF-conf/Half-inst/trig-aWait/Trig-pend
|/ Err?=(none)/Reinst-required (Status,Err: uppercase=bad)
||/ Name              Version              Architecture  Description
+++-=================-====================-=============-=================
ii  glusterfs-client  6.5-ubuntu1~xenial1  amd64         clustered file-system (client package)
ii  glusterfs-common  6.5-ubuntu1~xenial1  amd64         GlusterFS common libraries and translator modules
ii  glusterfs-server  6.5-ubuntu1~xenial1  amd64         clustered file-system (server package)
```

Op version has been upped to 6:
```
$ sudo gluster volume get all cluster.op-version
Option                    Value
------                    -----
cluster.op-version        6

$ sudo gluster volume get all cluster.max-op-version
Option                    Value
------                    -----
cluster.max-op-version    6
```

Running `sudo gluster volume status all clients` reports that all clients are on op-version 6, too.

Any suggestions on what to look for or changes to try out?

Thanks,
Riccardo

[1]: I have tried both the config where SAMBA 4.8 is using the vfs_glusterfs.so backend, and the one where `smbd` is just writing to a locally-mounted directory. Doesn't seem to make a difference.

[2]: Actually, since the servers are VMs on an OpenStack cloud, I created new virtual machines, installed GlusterFS 6 fresh, mounted the old bricks in the same brick locations, and restarted the cluster.
I had to fiddle a bit with the files in `/var/lib/glusterd` because the hostnames and IPs changed, but did not do anything else than `sed -e s/old_hostname/new_hostname/` or similarly renaming files. In particular, I did not touch the extended attributes in the brick directory.
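The hostname rewrite described above can be rehearsed on a throwaway copy of the state directory before touching a live `/var/lib/glusterd`. In this sketch the directory layout, peer file name, and hostnames are all illustrative:

```shell
#!/bin/sh
set -e
# Rehearse the hostname rewrite on a throwaway state directory.
# STATE_DIR, the peer file, and both hostnames are illustrative only.
STATE_DIR="$(mktemp -d)"
OLD=old_hostname
NEW=new_hostname

mkdir -p "$STATE_DIR/peers"
printf 'hostname1=%s\n' "$OLD" > "$STATE_DIR/peers/uuid-1"

# Rewrite every file under the state dir that mentions the old hostname:
grep -rl "$OLD" "$STATE_DIR" | while read -r f; do
    sed -i "s/$OLD/$NEW/g" "$f"
done

cat "$STATE_DIR/peers/uuid-1"    # now reads: hostname1=new_hostname
```

Note that this only rewrites file contents; as in the procedure above, any files whose *names* contain the old hostname would still need to be renamed separately, and brick-level extended attributes are left untouched.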