Hi Paul,

On Wed, Dec 13, 2023 at 9:50 PM Paul Mezzanini <pfm...@rit.edu> wrote:
>
> Long story short, we've got a lot of empty directories that I'm working
> on removing. While removing directories, using "perf top -g" we can
> watch the MDS daemon go to 100% CPU usage in "SnapRealm::split_at" and
> "CInode::is_ancestor_of".
>
> It's this two-year-old bug, which is still around:
> https://tracker.ceph.com/issues/53192
Unfortunately the fix isn't straightforward (one was attempted), so lately
we've been working around this issue by pinning to-be-deleted directories
to a (separate) active MDS. This might need some tuning at the application
level to move stuff into this "special" pinned directory and then delete
it there. HTH.

> To help combat this, we've moved our snapshot schedule down the tree one
> level so the snaprealm is significantly smaller. Our luck with multiple
> active MDSs hasn't been great, so we are still on a single MDS. To help
> split the load, I'm working on moving different workloads to different
> filesystems within Ceph.
>
> A user can still fairly easily overwhelm the MDS's finisher thread and
> basically stop all CephFS I/O through that MDS. I'm hoping we can get
> some other people chiming in with "Me too!" so there can be some
> traction behind fixing this.
>
> It's a longstanding bug, so the version is less important, but we are on
> 17.2.7.
>
> Thoughts?
> -paul
>
> --
>
> Paul Mezzanini
> Platform Engineer III
> Research Computing
>
> Rochester Institute of Technology
>
> “End users is a description, not a goal.”
>
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io

--
Cheers,
Venky
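[Editor's note: the pinning workaround described above uses CephFS's documented `ceph.dir.pin` extended attribute. A rough sketch follows; the mount point, directory names, and rank are hypothetical examples, and pinning to rank 1 assumes the filesystem runs with max_mds >= 2 and the `attr` tools are installed.]

```shell
# Create a dedicated "trash" directory on the CephFS mount
# (paths here are examples, not from the original thread).
mkdir -p /mnt/cephfs/.trash

# Pin the directory (and, by inheritance, everything under it)
# to MDS rank 1, keeping the deletion load off rank 0:
setfattr -n ceph.dir.pin -v 1 /mnt/cephfs/.trash

# Verify the pin took effect:
getfattr -n ceph.dir.pin /mnt/cephfs/.trash

# Application-level step Venky mentions: move doomed trees into the
# pinned directory first, then delete them there.
mv /mnt/cephfs/data/old_project /mnt/cephfs/.trash/
rm -rf /mnt/cephfs/.trash/old_project
```

A pin value of -1 restores the default (unpinned) behavior.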