We have a cluster running CephFS with metadata on SSDs and data split between SSD and HDD OSDs (the main data pool is on HDDs, with some subtrees placed on an SSD pool).

We're seeing quite poor deletion performance, especially for directories. Directories that have always been empty are usually deleted quickly, but unlinkat() on any directory that used to contain data often takes upwards of a second. Stracing a simple `rm -r`:

unlinkat(7, "dbox-Mails", AT_REMOVEDIR) = 0 <0.002668>
unlinkat(6, "INBOX", AT_REMOVEDIR)      = 0 <2.045551>
unlinkat(7, "dbox-Mails", AT_REMOVEDIR) = 0 <0.005872>
unlinkat(6, "Trash", AT_REMOVEDIR)      = 0 <1.918497>
unlinkat(7, "dbox-Mails", AT_REMOVEDIR) = 0 <0.012609>
unlinkat(6, "Spam", AT_REMOVEDIR)       = 0 <1.743648>
unlinkat(7, "dbox-Mails", AT_REMOVEDIR) = 0 <0.016548>
unlinkat(6, "Sent", AT_REMOVEDIR)       = 0 <2.295136>
unlinkat(5, "mailboxes", AT_REMOVEDIR)  = 0 <0.735630>
unlinkat(4, "mdbox", AT_REMOVEDIR)      = 0 <0.686786>

(all those dbox-Mails subdirectories are empty children of the folder-name directories)
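
For reference, those timings are strace's per-syscall times; the invocation was something like the following (the path is illustrative):

strace -T -e trace=unlinkat rm -r /mnt/cephfs/mail/someuser/mdbox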

These deletions also seem to have a huge impact on cluster performance, across hosts. This is the global MDS op latency while doing first 1, then 6 parallel 'rm -r' instances from a host that is otherwise idle:

https://mrcn.st/t/Screenshot_20190913_161500.png

(I had to stop the 6-parallel run because it was completely trashing cluster performance for the machines serving live traffic; I ended up with a load average >900 on one of them.)
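
(The parallel run was just several independent 'rm -r' instances over different mail directories, along these lines; the paths are illustrative:)

for user in user1 user2 user3 user4 user5 user6; do
    rm -r "/mnt/cephfs/mail/$user/mdbox" &
done
wait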

The OSD SSDs/HDDs are not significantly busier during the deletions, nor is CPU usage on the MDS particularly high at that time, so I'm not sure what the bottleneck is.
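
(For what it's worth, "not significantly busier" is based on the usual checks, roughly the following; device and daemon names are whatever applies locally:)

# disk utilization on the OSD hosts during a deletion run
iostat -x 5

# per-OSD commit/apply latency as reported by the cluster
ceph osd perf

# CPU usage of the active MDS daemon
top -p "$(pidof ceph-mds)"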

Is this expected for CephFS? I know data deletions are asynchronous, but not being able to delete metadata/directories without an undue impact on overall filesystem performance is somewhat problematic.
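
(In case it helps with diagnosis: my understanding is that the stray/purge state can be inspected via the MDS admin socket, along these lines; the exact counter sections and names may vary by release:)

# stray dentries waiting to be purged
ceph daemon mds.<id> perf dump mds_cache | grep -i stray

# purge queue backlog/throughput
ceph daemon mds.<id> perf dump purge_queue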


--
Hector Martin (hec...@marcansoft.com)
Public Key: https://mrcn.st/pub