Re: [ceph-users] cephfs deleting files No space left on device
Hey Kenneth,

We encountered this when the number of strays (unlinked files yet to be purged) reached 1 million, which was the result of many file removals happening on the fs repeatedly. It can also happen when a single directory holds more than 100k files with default settings.

You can tune the limit via the 'mds_bal_fragment_size_max' setting on the MDS, either temporarily while you rm the files or permanently. Beware of setting it too high.

Check the number of strays in the MDS cache by running `ceph daemon mds.{mds name} perf dump` and inspecting the mds_cache section for num_strays. The 1 million limit is a function of mds_bal_fragment_size_max (10x its value).

Raf

On Fri, May 10, 2019, 9:03 PM Kenneth Waegeman wrote:

> Hi all,
>
> I am seeing issues on cephfs running 13.2.5 when deleting files:
>
> [root@osd006 ~]# rm /mnt/ceph/backups/osd006.gigalith.os-2b5a3740.1326700
> rm: remove regular empty file
> ‘/mnt/ceph/backups/osd006.gigalith.os-2b5a3740.1326700’? y
> rm: cannot remove
> ‘/mnt/ceph/backups/osd006.gigalith.os-2b5a3740.1326700’: No space left
> on device
>
> few minutes later, I can remove it without problem. This happens
> especially when there are a lot of files deleted somewhere on the
> filesystem around the same time.
>
> We already have tuned our mds config:
>
> [mds]
> mds_cache_memory_limit=10737418240
> mds_log_max_expiring=200
> mds_log_max_segments=200
> mds_max_purge_files=2560
> mds_max_purge_ops=327600
> mds_max_purge_ops_per_pg=20
>
> ceph -s is reporting everything clean, and the file system space usage
> is less than 50%, also no full osds or anything.
>
> Is there a way to further debug what the bottleneck is when removing
> files that gives this 'no space left on device' error?
>
> Thank you very much!
>
> Kenneth

___
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
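The check Raf describes (read num_strays from the perf dump and compare it with 10x mds_bal_fragment_size_max) could be scripted roughly as follows. This is only a sketch: the helper name `stray_headroom`, the sample JSON, and the assumed default of 100000 for mds_bal_fragment_size_max (inferred from the "100k files in a dir" remark) are illustrative assumptions, not something confirmed in this thread.

```python
import json

# Per the thread: the stray limit is 10x mds_bal_fragment_size_max.
STRAY_LIMIT_FACTOR = 10
# Assumption: default fragment size max, inferred from "100k files in a dir".
ASSUMED_FRAGMENT_SIZE_MAX = 100_000


def stray_headroom(perf_dump_json, fragment_size_max=ASSUMED_FRAGMENT_SIZE_MAX):
    """Return (num_strays, limit, fraction_used) from `perf dump` JSON text.

    perf_dump_json is the text printed by
    `ceph daemon mds.{mds name} perf dump`.
    """
    perf = json.loads(perf_dump_json)
    num_strays = perf["mds_cache"]["num_strays"]
    limit = STRAY_LIMIT_FACTOR * fragment_size_max
    return num_strays, limit, num_strays / limit


if __name__ == "__main__":
    # Made-up sample; a real dump contains many more sections and counters.
    sample = '{"mds_cache": {"num_strays": 950000}}'
    strays, limit, used = stray_headroom(sample)
    print(f"{strays} strays of {limit} allowed ({used:.0%} used)")
```

In practice the JSON would come from the perf dump command run on the MDS host; if mds_bal_fragment_size_max has been changed, pass the configured value as the second argument.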
Re: [ceph-users] cephfs deleting files No space left on device
About how many files are we talking here?

An implementation detail of file deletion helps explain why this might happen: deletion is asynchronous. Deleting a file inserts it into the purge queue, and the actual data is removed in the background.

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90
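Because purging happens in the background, one way to see whether it keeps up with deletions is to sample num_strays twice and look at the net change. The helpers below are a hypothetical sketch of that arithmetic; the function names and the 1,000,000 default limit (taken from the figure mentioned earlier in the thread) are assumptions for illustration only.

```python
def stray_growth_rate(strays_before, strays_after, interval_seconds):
    """Net change in stray count per second between two samples.

    Positive: deletions outpace the background purge.
    Negative: the backlog is shrinking.
    """
    if interval_seconds <= 0:
        raise ValueError("interval_seconds must be positive")
    return (strays_after - strays_before) / interval_seconds


def seconds_until_limit(strays_now, rate, limit=1_000_000):
    """Seconds until the stray limit is hit at the current growth rate.

    Returns None when the backlog is not growing.
    """
    if rate <= 0:
        return None
    return max(limit - strays_now, 0) / rate


if __name__ == "__main__":
    # Made-up samples taken 60 seconds apart.
    rate = stray_growth_rate(900_000, 906_000, 60)
    print("growth per second:", rate)
    print("seconds until limit:", seconds_until_limit(906_000, rate))
```

A rate that stays positive during bulk removals would suggest that files are being unlinked faster than they are purged, which fits the intermittent errors described in the original message.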
[ceph-users] cephfs deleting files No space left on device
Hi all,

I am seeing issues on cephfs running 13.2.5 when deleting files:

[root@osd006 ~]# rm /mnt/ceph/backups/osd006.gigalith.os-2b5a3740.1326700
rm: remove regular empty file ‘/mnt/ceph/backups/osd006.gigalith.os-2b5a3740.1326700’? y
rm: cannot remove ‘/mnt/ceph/backups/osd006.gigalith.os-2b5a3740.1326700’: No space left on device

A few minutes later, I can remove it without a problem. This happens especially when a lot of files are being deleted somewhere on the filesystem around the same time.

We have already tuned our mds config:

[mds]
mds_cache_memory_limit=10737418240
mds_log_max_expiring=200
mds_log_max_segments=200
mds_max_purge_files=2560
mds_max_purge_ops=327600
mds_max_purge_ops_per_pg=20

ceph -s reports everything clean, file system space usage is below 50%, and there are no full OSDs or anything similar.

Is there a way to further debug what the bottleneck is when removing files fails with this 'No space left on device' error?

Thank you very much!

Kenneth