Hi, does the gluster team have any feedback about this? Resolving the "Found
anomalies" issues may be the key to fixing the slow directory listings.
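For reference, this is roughly how I'm counting those entries on my side,
assuming the default log location under /var/log/glusterfs (the exact log
file names depend on the mount point and brick paths, so adjust the globs
as needed):

# per-file counts of the "Found anomalies" messages in the client and brick logs
grep -c "Found anomalies" /var/log/glusterfs/*.log /var/log/glusterfs/bricks/*.log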
Sincerely,
Artem

--
Founder, Android Police <http://www.androidpolice.com>, APK Mirror
<http://www.apkmirror.com/>, Illogical Robot LLC
beerpla.net | @ArtemR <http://twitter.com/ArtemR>


On Thu, Apr 30, 2020 at 10:36 PM Strahil Nikolov <[email protected]> wrote:

> On April 30, 2020 9:05:19 PM GMT+03:00, Artem Russakovskii <
> [email protected]> wrote:
> >I did this on the same prod instance just now.
> >
> >'find' on a fuse gluster dir with 40k+ files:
> >1st run: 3m56.261s
> >2nd run: 0m24.970s
> >3rd run: 0m24.099s
> >
> >At this point, I killed all gluster services on one of the 4 servers
> >and verified that that brick went offline.
> >
> >1st run: 0m38.131s
> >2nd run: 0m19.369s
> >3rd run: 0m23.576s
> >
> >Nothing conclusive really, IMO.
> >
> >Sincerely,
> >Artem
> >
> >--
> >Founder, Android Police <http://www.androidpolice.com>, APK Mirror
> ><http://www.apkmirror.com/>, Illogical Robot LLC
> >beerpla.net | @ArtemR <http://twitter.com/ArtemR>
> >
> >
> >On Thu, Apr 30, 2020 at 9:55 AM Strahil Nikolov <[email protected]>
> >wrote:
> >
> >> On April 30, 2020 6:27:10 PM GMT+03:00, Artem Russakovskii <
> >> [email protected]> wrote:
> >> >Hi Strahil, in the original email I included the times for both the
> >> >first and subsequent reads on the fuse-mounted gluster volume, as
> >> >well as on the xfs filesystem the gluster data resides on (this is
> >> >the brick, right?).
> >> >
> >> >On Thu, Apr 30, 2020, 7:44 AM Strahil Nikolov <[email protected]>
> >> >wrote:
> >> >
> >> >> On April 30, 2020 4:24:23 AM GMT+03:00, Artem Russakovskii <
> >> >> [email protected]> wrote:
> >> >> >Hi all,
> >> >> >
> >> >> >We have 500GB and 10TB 4x1 replicate xfs-based gluster volumes,
> >> >> >and the 10TB one especially is extremely slow to do certain
> >> >> >things with (and has been since gluster 3.x when we started).
> >> >> >We're currently on 5.13.
> >> >> >
> >> >> >The number of files isn't even what I'd consider that great -
> >> >> >under 100k per dir.
> >> >> >
> >> >> >Here are some numbers to look at:
> >> >> >
> >> >> >On the gluster volume, in a dir of 45k files:
> >> >> >The first time:
> >> >> >
> >> >> >time find | wc -l
> >> >> >45423
> >> >> >real 8m44.819s
> >> >> >user 0m0.459s
> >> >> >sys 0m0.998s
> >> >> >
> >> >> >And again:
> >> >> >
> >> >> >time find | wc -l
> >> >> >45423
> >> >> >real 0m34.677s
> >> >> >user 0m0.291s
> >> >> >sys 0m0.754s
> >> >> >
> >> >> >
> >> >> >If I run the same operation on the xfs block device itself:
> >> >> >The first time:
> >> >> >
> >> >> >time find | wc -l
> >> >> >45423
> >> >> >real 0m13.514s
> >> >> >user 0m0.144s
> >> >> >sys 0m0.501s
> >> >> >
> >> >> >And again:
> >> >> >
> >> >> >time find | wc -l
> >> >> >45423
> >> >> >real 0m0.197s
> >> >> >user 0m0.088s
> >> >> >sys 0m0.106s
> >> >> >
> >> >> >
> >> >> >I'd expect a performance difference here, but just as it was
> >> >> >several years ago when we started with gluster, it's still huge,
> >> >> >and simple file listings are incredibly slow.
> >> >> >
> >> >> >At the time, the team was looking to do some optimizations, but
> >> >> >I'm not sure this has happened.
> >> >> >
> >> >> >What can we do to try to improve performance?
> >> >> >
> >> >> >Thank you.
> >> >> >
> >> >> >
> >> >> >Some setup values follow.
> >> >> >
> >> >> >xfs_info /mnt/SNIP_block1
> >> >> >meta-data=/dev/sdc               isize=512    agcount=103, agsize=26214400 blks
> >> >> >         =                       sectsz=512   attr=2, projid32bit=1
> >> >> >         =                       crc=1        finobt=1, sparse=0, rmapbt=0
> >> >> >         =                       reflink=0
> >> >> >data     =                       bsize=4096   blocks=2684354560, imaxpct=25
> >> >> >         =                       sunit=0      swidth=0 blks
> >> >> >naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
> >> >> >log      =internal log           bsize=4096   blocks=51200, version=2
> >> >> >         =                       sectsz=512   sunit=0 blks, lazy-count=1
> >> >> >realtime =none                   extsz=4096   blocks=0, rtextents=0
> >> >> >
> >> >> >Volume Name: SNIP_data1
> >> >> >Type: Replicate
> >> >> >Volume ID: SNIP
> >> >> >Status: Started
> >> >> >Snapshot Count: 0
> >> >> >Number of Bricks: 1 x 4 = 4
> >> >> >Transport-type: tcp
> >> >> >Bricks:
> >> >> >Brick1: nexus2:/mnt/SNIP_block1/SNIP_data1
> >> >> >Brick2: forge:/mnt/SNIP_block1/SNIP_data1
> >> >> >Brick3: hive:/mnt/SNIP_block1/SNIP_data1
> >> >> >Brick4: citadel:/mnt/SNIP_block1/SNIP_data1
> >> >> >Options Reconfigured:
> >> >> >cluster.quorum-count: 1
> >> >> >cluster.quorum-type: fixed
> >> >> >network.ping-timeout: 5
> >> >> >network.remote-dio: enable
> >> >> >performance.rda-cache-limit: 256MB
> >> >> >performance.readdir-ahead: on
> >> >> >performance.parallel-readdir: on
> >> >> >network.inode-lru-limit: 500000
> >> >> >performance.md-cache-timeout: 600
> >> >> >performance.cache-invalidation: on
> >> >> >performance.stat-prefetch: on
> >> >> >features.cache-invalidation-timeout: 600
> >> >> >features.cache-invalidation: on
> >> >> >cluster.readdir-optimize: on
> >> >> >performance.io-thread-count: 32
> >> >> >server.event-threads: 4
> >> >> >client.event-threads: 4
> >> >> >performance.read-ahead: off
> >> >> >cluster.lookup-optimize: on
> >> >> >performance.cache-size: 1GB
> >> >> >cluster.self-heal-daemon: enable
> >> >> >transport.address-family: inet
> >> >> >nfs.disable: on
> >> >> >performance.client-io-threads: on
> >> >> >cluster.granular-entry-heal: enable
> >> >> >cluster.data-self-heal-algorithm: full
> >> >> >
> >> >> >Sincerely,
> >> >> >Artem
> >> >> >
> >> >> >--
> >> >> >Founder, Android Police <http://www.androidpolice.com>, APK Mirror
> >> >> ><http://www.apkmirror.com/>, Illogical Robot LLC
> >> >> >beerpla.net | @ArtemR <http://twitter.com/ArtemR>
> >> >>
> >> >> Hi Artem,
> >> >>
> >> >> Have you checked the same on brick level? How big is the
> >> >> difference?
> >> >>
> >> >> Best Regards,
> >> >> Strahil Nikolov
> >> >>
> >>
> >> Hi Artem,
> >>
> >> My bad, I missed the 'xfs' word... Still, the difference is huge.
> >>
> >> May I ask you to do a test again (pure curiosity) as follows:
> >> 1. Repeat the test from before.
> >> 2. Stop 1 brick and test again.
> >>
> >>
> >> P.S.: You can try it on the test cluster.
> >>
> >> Best Regards,
> >> Strahil Nikolov
> >>
>
> Hi Artem,
>
> I was wondering if the 4th replica is adding additional overhead (another
> dir to check), but the test is not very conclusive.
>
>
> Actually, the 'anomalies' log entries in your pool could be a symptom of
> another problem (just like the long listing time).
>
> I will try to reproduce your setup (smaller scale - 1 brick, 50k files)
> and then will try with 3 bricks.
>
>
> Best Regards,
> Strahil Nikolov
>
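P.S. Strahil, in case it helps with the reproduction, this is roughly what
I run for the timings. The paths below are just placeholders (a fuse mount
at /mnt/testvol and a brick at /data/brick1/testvol), and dropping the
kernel caches needs root:

# create ~50k small files in a test dir on the fuse mount
mkdir -p /mnt/testvol/listing-test
for i in $(seq 1 50000); do : > /mnt/testvol/listing-test/f$i; done

# cold run: drop the kernel caches, then time the listing through the fuse mount
sync; echo 3 > /proc/sys/vm/drop_caches
time find /mnt/testvol/listing-test | wc -l

# warm run: repeat without dropping caches
time find /mnt/testvol/listing-test | wc -l

# same listing directly on one brick, for comparison
time find /data/brick1/testvol/listing-test | wc -l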
________

Community Meeting Calendar:

Schedule -
Every 2nd and 4th Tuesday at 14:30 IST / 09:00 UTC
Bridge: https://bluejeans.com/441850968

Gluster-users mailing list
[email protected]
https://lists.gluster.org/mailman/listinfo/gluster-users
