Oh, I misread your initial email and thought you were on hard drives. These do seem slow for SSDs.
You could try tracking down where the time is spent; perhaps run strace and
see which calls are taking a while, and go through the op tracker on the MDS
to see if anything is obviously taking a long time.
-Greg

On Wed, Nov 17, 2021 at 8:00 PM Sasha Litvak <alexander.v.lit...@gmail.com> wrote:
>
> Gregory,
> Thank you for your reply. I understand that a number of serialized lookups
> may take time. However, if 3.25 s is OK, 11.2 s sounds long, and I once
> removed a large subdirectory which took over 20 minutes to complete.
> I tried the nowsync mount option with kernel 5.15 and it seems to hide the
> latency (i.e. rm returns the prompt almost immediately after a recursive
> directory removal). However, I am not sure whether nowsync is safe to use
> with kernels >= 5.8. I also have kernel 5.3 on one of the client clusters,
> where nowsync is not supported, yet all rm operations there complete
> reasonably fast. So the second question is: does 5.3's libceph behave
> differently on recursive rm compared to 5.4 or 5.8?
>
> On Wed, Nov 17, 2021 at 9:52 AM Gregory Farnum <gfar...@redhat.com> wrote:
>>
>> On Sat, Nov 13, 2021 at 5:25 PM Sasha Litvak
>> <alexander.v.lit...@gmail.com> wrote:
>> >
>> > I continued looking into the issue and have no idea what hinders the
>> > performance yet. However:
>> >
>> > 1. A client running kernel 5.3.0-42 (Ubuntu 18.04) has no such
>> > problems. I delete a directory with hashed subdirs (00 - ff) and a
>> > total of ~707 MB spread across those 256 subdirs in 3.25 s.
>>
>> Recursive rm first requires the client to get capabilities on the
>> files in question, and the MDS to read that data off disk.
>> Newly-created directories will be cached, but old ones might not be.
>>
>> So this might just be the consequence of having to do 256 serialized
>> disk lookups on hard drives. 3.25 seconds seems plausible to me.
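Greg's two suggestions (strace the client command, check the MDS op tracker)
can be sketched roughly as below. The MDS name `mds.a` and the mount path are
placeholders, not taken from the thread; substitute your own:

```shell
# Client side: time the removal and summarize which syscalls dominate.
# strace -c prints a per-syscall time/count table; -f follows child processes.
strace -c -f rm -rf /mnt/cephfs/old-dir

# MDS side (on the MDS host, via the admin socket): what is running right now?
ceph daemon mds.a dump_ops_in_flight

# Recently completed slow ops, with per-event timestamps showing where each
# request spent its time.
ceph daemon mds.a dump_historic_ops
```

If an op in `dump_historic_ops` shows a long gap between events, that gap is
usually where to dig next (e.g. waiting on a journal flush or an OSD read).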
>>
>> The number of bytes isn't going to have any impact on how long the
>> deletion takes from the client side; the deletion just marks it in the
>> MDS, and then the MDS does the object removals in the background.
>> -Greg
>>
>> >
>> > 2. A client running kernel 5.8.0-53 (Ubuntu 20.04) processes a
>> > similar directory with less space taken (~530 MB across 256 subdirs)
>> > in 11.2 s.
>> >
>> > 3. Yet another client, with kernel 5.4.156, has latency similar to
>> > client 2 when removing directories.
>> >
>> > In all scenarios, the mounts use the same options, i.e.
>> > noatime,secretfile,acl.
>> >
>> > Client 1 has luminous, client 2 has octopus, client 3 has nautilus.
>> > While they are all on the same LAN, ceph -s returns in ~800 ms on
>> > clients 2 and 3 and in ~300 ms on client 1.
>> >
>> > Any ideas are appreciated.
>> >
>> > On Fri, Nov 12, 2021 at 8:44 PM Sasha Litvak
>> > <alexander.v.lit...@gmail.com> wrote:
>> >
>> > > The metadata pool is on the same type of drives as the other pools;
>> > > every node uses SATA SSDs. They are all read/write-mix DC types,
>> > > Intel and Seagate.
>> > >
>> > > On Fri, Nov 12, 2021 at 8:02 PM Anthony D'Atri
>> > > <anthony.da...@gmail.com> wrote:
>> > >
>> > >> MDS RAM cache vs going to the metadata pool? What type of drives
>> > >> is your metadata pool on?
>> > >>
>> > >> > On Nov 12, 2021, at 5:30 PM, Sasha Litvak
>> > >> > <alexander.v.lit...@gmail.com> wrote:
>> > >> >
>> > >> > I am running a Pacific 16.2.4 cluster and recently noticed that
>> > >> > rm -rf <dir-name> visibly hangs on old directories. The cluster
>> > >> > is healthy, has a light load, and any newly created directories
>> > >> > are deleted immediately (well, rm returns the command prompt
>> > >> > immediately). The directories in question have 10 - 20 small
>> > >> > text files each, so nothing should be slow when removing them.
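The test case described above (256 hashed subdirs, 00..ff, small files) is
easy to recreate, which makes it simple to compare clients with identical
data. A self-contained sketch; point WORKDIR at a CephFS mount to run the
same comparison the reporter did (here it defaults to a local temp dir):

```shell
# Build 256 hex-named subdirs, each with one small file, then time removal.
WORKDIR=$(mktemp -d)
for i in $(seq 0 255); do
  sub=$(printf '%s/%02x' "$WORKDIR" "$i")   # 00, 01, ... ff
  mkdir -p "$sub"
  printf 'x%.0s' $(seq 1 1024) > "$sub/file.txt"   # one ~1 KiB file
done
time rm -rf "$WORKDIR"
```

Running this twice back to back also shows the caching effect Greg describes:
the second removal (of a freshly created tree) avoids the cold-cache MDS
lookups that slow down old directories.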
>> > >> >
>> > >> > I wonder if someone can please give me a hint on where to start
>> > >> > troubleshooting, as I see no "big bad bear" yet.
>> > >> > _______________________________________________
>> > >> > ceph-users mailing list -- ceph-users@ceph.io
>> > >> > To unsubscribe send an email to ceph-users-le...@ceph.io
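For reference, the nowsync experiment mentioned earlier in the thread
corresponds to a mount along these lines. The monitor address, client name,
and secret path are placeholders; nowsync requires the async-dirops support
added around kernel 5.8, and it hides the unlink latency (the client returns
before the MDS finishes) rather than removing it, so check the current kernel
client documentation before relying on it:

```shell
# Thread's options (noatime, secretfile, acl) plus nowsync; values are placeholders.
mount -t ceph 10.0.0.1:6789:/ /mnt/cephfs \
  -o name=client1,secretfile=/etc/ceph/client1.secret,noatime,acl,nowsync
```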