[ceph-users] Re: CephFS space usage

2024-03-26 Thread Thorne Lawler
Hi everyone! Just thought I would let everyone know: The issue appears to have been the Ceph NFS service associated with the filesystem. I removed all the files, waited a while, disconnected all the clients, waited a while, then deleted the NFS shares - the disk space and objects abruptly
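For reference, the NFS export management side of this looks roughly like the following; the cluster id and pseudo path are placeholders, not values from this thread:

  # List NFS clusters and the exports they serve
  ceph nfs cluster ls
  ceph nfs export ls <cluster_id>
  # Inspect a specific export, then remove it
  ceph nfs export info <cluster_id> /pseudo/path
  ceph nfs export rm <cluster_id> /pseudo/path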

[ceph-users] Re: CephFS space usage

2024-03-20 Thread Anthony D'Atri
Grep through the ls output for ‘rados bench’ leftovers; it’s easy to leave them behind. > On Mar 20, 2024, at 5:28 PM, Igor Fedotov wrote: > > Hi Thorne, > > unfortunately I'm unaware of any tools high level enough to easily map files > to rados objects without a deep understanding of how this
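That check might look something like this; the pool name is taken from later in the thread:

  # rados bench objects are named benchmark_data_<hostname>_<pid>_object<N>
  rados -p cephfs.shared.data ls | grep benchmark_data | head
  # If leftovers turn up, rados can remove them by prefix
  rados -p cephfs.shared.data cleanup --prefix benchmark_data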

[ceph-users] Re: CephFS space usage

2024-03-20 Thread Igor Fedotov
Thorne, if that's a bug in Ceph which causes space leakage, you might be unable to reclaim the space without a total purge of the pool. The problem is that we are still uncertain whether this is a leak or something else. Hence the need for more thorough research. Thanks, Igor On 3/20/2024 9:13

[ceph-users] Re: CephFS space usage

2024-03-20 Thread Igor Fedotov
Hi Thorne, unfortunately I'm unaware of any tools high level enough to easily map files to rados objects without a deep understanding of how this works. You might want to try the "rados ls" command to get the list of all the objects in the cephfs data pool. And then learn how that mapping is performed
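As a sketch of that mapping: CephFS names its data objects <inode-in-hex>.<block-index-in-hex>, so the objects behind a single file can be found roughly like this (mount point and pool name assumed):

  # Inode of a file on the CephFS mount, converted to hex
  ino_hex=$(printf '%x' "$(stat -c %i /mnt/cephfs/some/file)")
  # RADOS objects backing that file
  rados -p cephfs.shared.data ls | grep "^${ino_hex}\." | head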

[ceph-users] Re: CephFS space usage

2024-03-20 Thread Thorne Lawler
Alexander, Thanks for explaining this. As I suspected, this is a highly abstract pursuit of what caused the problem, and while I'm sure this makes sense for Ceph developers, it isn't going to happen in this case. I don't care how it got this way - the tools used to create this pool will never

[ceph-users] Re: CephFS space usage

2024-03-20 Thread Alexander E. Patrakov
Hi Thorne, The idea is quite simple. By retesting the leak with a separate pool, used by nobody except you, in case the leak exists and is reproducible (which is not a given), you can definitely pinpoint it without giving any chance to the alternate hypothesis "somebody wrote some data in
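In practice that retest could look roughly like this; pool, filesystem and directory names are placeholders:

  # Create an extra data pool and attach it to the existing filesystem
  ceph osd pool create cephfs.test.data 32
  ceph fs add_data_pool <fs_name> cephfs.test.data
  # Pin a test directory to the new pool via its layout, then write only there
  mkdir /mnt/cephfs/leak-test
  setfattr -n ceph.dir.layout.pool -v cephfs.test.data /mnt/cephfs/leak-test
  # Watch per-pool usage and object counts while the test runs
  ceph df detail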

[ceph-users] Re: CephFS space usage

2024-03-20 Thread Thorne Lawler
Alexander, I'm happy to create a new pool if it will help, but I don't presently see how creating a new pool will help us to identify the source of the 10TB discrepancy in this original cephfs pool. Please help me to understand what you are hoping to find...? On 20/03/2024 6:35 pm,

[ceph-users] Re: CephFS space usage

2024-03-20 Thread Alexander E. Patrakov
Thorne, That's why I asked you to create a separate pool. All writes go to the original pool, and it is possible to see object counts per-pool. On Wed, Mar 20, 2024 at 6:32 AM Thorne Lawler wrote: > Alexander, > > Thank you, but as I said to Igor: The 5.5TB of files on this filesystem > are

[ceph-users] Re: CephFS space usage

2024-03-19 Thread Anthony D'Atri
> Those files are VM disk images, and they're under constant heavy use, so yes- > there /is/ constant severe write load against this disk. Why are you using CephFS for an RBD application?

[ceph-users] Re: CephFS space usage

2024-03-19 Thread Thorne Lawler
Alexander, Thank you, but as I said to Igor: The 5.5TB of files on this filesystem are virtual machine disks. They are under constant, heavy write load. There is no way to turn this off. On 19/03/2024 9:36 pm, Alexander E. Patrakov wrote: Hello Thorne, Here is one more suggestion on how to

[ceph-users] Re: CephFS space usage

2024-03-19 Thread Thorne Lawler
Igor, Those files are VM disk images, and they're under constant heavy use, so yes - there /is/ constant severe write load against this disk. Apart from writing more test files into the filesystems, there must be Ceph diagnostic tools to describe what those objects are being used for, surely?

[ceph-users] Re: CephFS space usage

2024-03-19 Thread Alexander E. Patrakov
Hello Thorne, Here is one more suggestion on how to debug this. Right now, there is uncertainty on whether there is really a disk space leak or if something simply wrote new data during the test. If you have at least three OSDs you can reassign, please set their CRUSH device class to something
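Sketched out, that isolation might be done along these lines; the OSD ids, device class and rule name here are made up for illustration:

  # Move three spare OSDs into their own device class
  for osd in osd.10 osd.11 osd.12; do
      ceph osd crush rm-device-class $osd
      ceph osd crush set-device-class leaktest $osd
  done
  # Build a replicated rule restricted to that class and point a test pool at it
  ceph osd crush rule create-replicated leaktest_rule default host leaktest
  ceph osd pool create cephfs.leaktest.data 32
  ceph osd pool set cephfs.leaktest.data crush_rule leaktest_rule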

[ceph-users] Re: CephFS space usage

2024-03-19 Thread Igor Fedotov
Hi Thorn, given the number of files on the CephFS volume I presume you don't have severe write load against it. Is that correct? If so, we can assume that the numbers you're sharing mostly refer to your experiment. At peak I can see a bytes_used increase of 629,461,893,120 bytes (45978612027392

[ceph-users] Re: CephFS space usage

2024-03-19 Thread Eugen Block
It's your pool replication (size = 3): 3886733 (number of objects) * 3 = 11660199 Quoting Thorne Lawler: Can anyone please tell me what "COPIES" means in this context? [ceph: root@san2 /]# rados df -p cephfs.shared.data POOL_NAME USED OBJECTS CLONES COPIES
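A quick way to confirm that reading against the pool itself (pool name from the output above):

  # COPIES is simply objects multiplied by the pool's replica count
  ceph osd pool get cephfs.shared.data size   # expected: size: 3
  # 3886733 objects * 3 replicas = 11660199 copies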

[ceph-users] Re: CephFS space usage

2024-03-18 Thread Thorne Lawler
Can anyone please tell me what "COPIES" means in this context? [ceph: root@san2 /]# rados df -p cephfs.shared.data POOL_NAME USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS RD WR_OPS WR USED COMPR UNDER COMPR cephfs.shared.data 41

[ceph-users] Re: CephFS space usage

2024-03-17 Thread Thorne Lawler
Thanks Igor, I have tried that, and the number of objects and bytes_used took a long time to drop, but they seem to have dropped back to almost the original level:
 * Before creating the file:
   o 3885835 objects
   o 45349150134272 bytes_used
 * After creating the file:
   o 3931663
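One way to capture those before/after numbers precisely rather than by eye (pool name assumed; requires jq, and JSON field names may vary slightly between releases):

  ceph df detail --format json | \
    jq '.pools[] | select(.name == "cephfs.shared.data") | {objects: .stats.objects, bytes_used: .stats.bytes_used}'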

[ceph-users] Re: CephFS space usage

2024-03-15 Thread Igor Fedotov
Hi Thorn, so the problem is apparently bound to huge file sizes. I presume they're split into multiple chunks at the Ceph side, hence producing millions of objects. And possibly something is wrong with this mapping. If this pool has no write load at the moment you might want to run the following
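For context on the chunking: with the default CephFS layout each file is striped into 4 MiB objects, so roughly 5.5 TiB of image data alone works out to about 5.5 TiB / 4 MiB ≈ 1.44 million objects. The actual layout of a given image can be checked like this (the path is only an example):

  # Show stripe unit, stripe count, object size and target pool for one file
  getfattr -n ceph.file.layout /mnt/pve/iso/xcp_nfs_sr/path/to/disk-image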

[ceph-users] Re: CephFS space usage

2024-03-14 Thread Thorne Lawler
Also, before anyone asks- I have just gone over every client attached to this filesystem through native CephFS or NFS and checked for deleted files. There are a total of three deleted files, amounting to about 200G. On 15/03/2024 10:05 am, Thorne Lawler wrote: Igor, Yes. Just a bit.
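Worth noting that a file deleted while a client still holds it open keeps consuming space until the handle is closed; that can be checked per client with something like:

  # Native CephFS client: open handles to already-deleted files (link count 0)
  lsof +L1 /mnt/pve/iso 2>/dev/null
  # NFS clients silly-rename deleted-but-open files, so they show up on the share as .nfs*
  find /mnt/pve/iso -name '.nfs*' 2>/dev/null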

[ceph-users] Re: CephFS space usage

2024-03-14 Thread Thorne Lawler
-----Original Message----- From: Igor Fedotov Sent: March 14, 2024 1:37 PM To: Thorne Lawler; ceph-users@ceph.io; etienne.men...@ubisoft.com; vbog...@gmail.com Subject: [ceph-users] Re: CephFS space usage Thorn, you might want to assess the number of files on the mounted fs by running "du -h | wc"

[ceph-users] Re: CephFS space usage

2024-03-14 Thread Thorne Lawler
Igor, Yes. Just a bit.
root@pmx101:/mnt/pve/iso# du -h | wc -l
10
root@pmx101:/mnt/pve/iso# du -h
0       ./snippets
0       ./tmp
257M    ./xcp_nfs_sr/2ba36cf5-291a-17d2-b510-db1a295ce0c2
5.5T    ./xcp_nfs_sr/5aacaebb-4469-96f9-729e-fe45eef06a14
5.5T    ./xcp_nfs_sr
0       ./failover_test
11G

[ceph-users] Re: CephFS space usage

2024-03-14 Thread Bailey Allison
...@ubisoft.com; vbog...@gmail.com > Subject: [ceph-users] Re: CephFS space usage > > Thorn, > > you might want to assess the number of files on the mounted fs by running "du > -h | wc". Does it differ drastically from the number of objects in the pool = ~3.8 > M? >

[ceph-users] Re: CephFS space usage

2024-03-14 Thread Igor Fedotov
Thorn, you might want to assess the number of files on the mounted fs by running "du -h | wc". Does it differ drastically from the number of objects in the pool (~3.8 M)? And just in case - please run "rados lssnap -p cephfs.shared.data". Thanks, Igor On 3/14/2024 1:42 AM, Thorne Lawler wrote:
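Spelled out, the comparison being asked for is roughly this (pool name and mount point as used elsewhere in the thread):

  # Files on the mounted filesystem vs. objects in the data pool
  find /mnt/pve/iso -type f | wc -l
  rados -p cephfs.shared.data ls | wc -l
  # And rule out pool-level snapshots holding on to space
  rados lssnap -p cephfs.shared.data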

[ceph-users] Re: CephFS space usage

2024-03-13 Thread Thorne Lawler
Igor, Etienne, Bogdan, The system is a four node cluster. Each node has 12 3.8TB SSDs, and each SSD is an OSD. I have not defined any separate DB / WAL devices - this cluster is mostly at cephadm defaults. Everything is currently configured to have x3 replicas. The system also does
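For reference, that layout works out to roughly 4 nodes * 12 OSDs * 3.8 TB ≈ 182 TB raw, or about 60 TB usable at 3x replication before any overhead, which is the baseline the reported usage figures should be read against.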

[ceph-users] Re: CephFS space usage

2024-03-13 Thread Igor Fedotov
Hi Thorn, could you please share the output of the "ceph df detail" command representing the problem? And please give an overview of your OSD layout - number of OSDs, shared or dedicated DB/WAL, main and DB volume sizes. Thanks, Igor On 3/13/2024 5:58 AM, Thorne Lawler wrote: Hi
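The overview being requested typically comes from something like:

  # Cluster-wide and per-pool usage, raw vs. stored
  ceph df detail
  # Per-OSD sizes, utilisation and placement in the CRUSH tree
  ceph osd df tree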

[ceph-users] Re: CephFS space usage

2024-03-13 Thread Bogdan Adrian Velica
Hi, Not sure if it was mentioned, but you could also check the following: 1. Snapshots Snapshots can consume a significant amount of space without being immediately obvious. They preserve the state of the filesystem at various points in time. List Snapshots: Use the "ceph fs subvolume snapshot
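Those checks look roughly like this; filesystem and subvolume names are placeholders:

  # Subvolume-level snapshots
  ceph fs subvolume snapshot ls <fs_name> <subvolume_name>
  # Plain CephFS snapshots appear as .snap directories anywhere in the tree
  ls /mnt/pve/iso/.snap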

[ceph-users] Re: CephFS space usage

2024-03-13 Thread Etienne Menguy
Hi, Check your replication/EC configuration. How do you get your different sizes/usages? Étienne From: Thorne Lawler Sent: Wednesday, 13 March 2024 03:58 To: ceph-users@ceph.io Subject: [ceph-users] CephFS space usage
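Those settings, and the usual sources of the differing numbers, can be read off roughly like this (pool name and mount point assumed from elsewhere in the thread):

  # Replica count / EC profile per pool
  ceph osd pool ls detail | grep cephfs
  # Raw pool usage vs. the filesystem's own view of itself
  ceph df
  df -h /mnt/pve/iso
  du -sh /mnt/pve/iso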