[ceph-users] How to see bucket usage when user is suspended ?
Hello to everyone,

When I use this command to see bucket usage:

radosgw-admin bucket stats --bucket=

it works only when the owner of the bucket is active. How can I see the usage even when the owner is suspended?

Here are two examples, one with the owner active and the other with the owner suspended:

radosgw-admin bucket stats --bucket=bonjour
{
    "bucket": "bonjour",
    "num_shards": 11,
    "tenant": "",
    "zonegroup": "46d4ba06-76ff-44b4-a441-54197517ded2",
    "placement_rule": "default-placement",
    "explicit_placement": {
        "data_pool": "",
        "data_extra_pool": "",
        "index_pool": ""
    },
    "id": "f8c2e3e2-da22-4c80-b330-466db13bbf6a.204114.85",
    "marker": "f8c2e3e2-da22-4c80-b330-466db13bbf6a.204114.85",
    "index_type": "Normal",
    "owner": "identifiant_leviia_GB6mSIAmTt48cY5O",
    "ver": "0#148,1#124,2#134,3#155,4#199,5#123,6#165,7#141,8#133,9#154,10#137",
    "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0",
    "mtime": "0.00",
    "creation_time": "2023-02-24T16:16:14.196314Z",
    "max_marker": "0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#",
    "usage": {
        "rgw.main": {
            "size": 532572233,
            "size_actual": 535318528,
            "size_utilized": 532572233,
            "size_kb": 520091,
            "size_kb_actual": 522772,
            "size_kb_utilized": 520091,
            "num_objects": 1486
        },
        "rgw.multimeta": {
            "size": 0,
            "size_actual": 0,
            "size_utilized": 0,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 0,
            "num_objects": 0
        }
    },
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    }
}

radosgw-admin bucket stats --bucket=locking4
{
    "bucket": "locking4",
    "num_shards": 11,
    "tenant": "",
    "zonegroup": "46d4ba06-76ff-44b4-a441-54197517ded2",
    "placement_rule": "default-placement",
    "explicit_placement": {
        "data_pool": "",
        "data_extra_pool": "",
        "index_pool": ""
    },
    "id": "f8c2e3e2-da22-4c80-b330-466db13bbf6a.204114.80",
    "marker": "f8c2e3e2-da22-4c80-b330-466db13bbf6a.204114.80",
    "index_type": "Normal",
    "owner": "identifiant_leviia_xf4q139fq1",
    "ver": "0#1,1#1,2#1,3#1,4#1,5#1,6#1,7#1,8#1,9#1,10#1",
    "master_ver": "0#0,1#0,2#0,3#0,4#0,5#0,6#0,7#0,8#0,9#0,10#0",
    "mtime": "0.00",
    "creation_time": "2023-02-23T12:49:24.089538Z",
    "max_marker": "0#,1#,2#,3#,4#,5#,6#,7#,8#,9#,10#",
    "usage": {},
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1,
        "max_size_kb": 0,
        "max_objects": -1
    }
}

As you can see, for the bucket whose owner is suspended (locking4), the output lacks this part:

"usage": {
    "rgw.main": {
        "size": 532572233,
        "size_actual": 535318528,
        "size_utilized": 532572233,
        "size_kb": 520091,
        "size_kb_actual": 522772,
        "size_kb_utilized": 520091,
        "num_objects": 1486
    },
    "rgw.multimeta": {
        "size": 0,
        "size_actual": 0,
        "size_utilized": 0,
        "size_kb": 0,
        "size_kb_actual": 0,
        "size_kb_utilized": 0,
        "num_objects": 0
    }
},

How can I get this part even when the owner is suspended? Is it possible via the API?

All the best
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
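Whatever the answer for suspended owners turns out to be, summing the per-category usage from the `bucket stats` JSON is easy to script. Below is a minimal sketch (the helper name `bucket_usage_totals` is mine, not a radosgw-admin feature); it treats an empty "usage" section, as seen for the suspended owner above, as zero:

```python
import json

def bucket_usage_totals(stats: dict) -> dict:
    """Sum usage across all rgw categories (rgw.main, rgw.multimeta, ...).

    Returns zeros when the "usage" section is absent or empty, as happens
    here when the bucket owner is suspended."""
    totals = {"size_actual": 0, "num_objects": 0}
    for category in stats.get("usage", {}).values():
        totals["size_actual"] += category.get("size_actual", 0)
        totals["num_objects"] += category.get("num_objects", 0)
    return totals

# Abridged versions of the two outputs from this thread:
active = json.loads('{"bucket": "bonjour", "usage": '
                    '{"rgw.main": {"size_actual": 535318528, "num_objects": 1486}}}')
suspended = json.loads('{"bucket": "locking4", "usage": {}}')
print(bucket_usage_totals(active))     # size_actual 535318528, 1486 objects
print(bucket_usage_totals(suspended))  # all zeros
```

Feed it `radosgw-admin bucket stats --bucket=... --format json` output; just be aware that, per the thread, the suspended-owner case yields zeros rather than the real usage.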
[ceph-users] Best value for "mds_cache_memory_limit" for a large (more than 10 PB) CephFS
Hello to everyone,

I have a Ceph cluster currently serving CephFS. The size of the Ceph filesystem is around 1 PB, with 1 active MDS and 1 standby-replay. I do not have a lot of CephFS clients for now (5), but that may increase to 20 or 30.

Here is some output:

Rank | State          | Daemon                | Activity     | Dentries | Inodes  | Dirs    | Caps
0    | active         | ceph-g-ssd-4-2.mxwjvd | Reqs: 130 /s | 10.2 M   | 10.1 M  | 356.8 k | 707.6 k
0-s  | standby-replay | ceph-g-ssd-4-1.ixqewp | Evts: 0 /s   | 156.5 k  | 127.7 k | 47.4 k  | 0

It is working really well. I plan to grow this CephFS cluster up to 10 PB (for now) and even more.

What would be a good value for "mds_cache_memory_limit"? I have set it to 80 GB because I have enough RAM on my server to do so. Was it a good idea? Or is it counter-productive?

All the best

Arnaud
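One way to sanity-check a limit like 80 GB is to work backwards from the working set the MDS actually caches. The sketch below assumes a rough per-inode cache cost of ~2.5 KiB; that figure is my assumption, not an official number, so measure your own MDS (e.g. via "perf dump") before relying on it:

```python
# Rough MDS cache sizing sketch. The ~2.5 KiB per cached inode below is an
# assumption for illustration only; actual cost varies by workload and release.
KIB = 1024
GIB = 1024 ** 3

def suggested_mds_cache_limit(expected_cached_inodes: int,
                              bytes_per_inode: int = int(2.5 * KIB)) -> int:
    """Return a cache limit in bytes for a target working set of inodes."""
    return expected_cached_inodes * bytes_per_inode

# The 10.2 M cached inodes shown above would then need roughly:
limit = suggested_mds_cache_limit(10_200_000)
print(f"{limit / GIB:.1f} GiB")  # ~24.3 GiB
```

By that (assumed) rule of thumb, 80 GB leaves generous headroom for the current working set; note also that mds_cache_memory_limit is a target, and leaving slack between the limit and physical RAM for the MDS process overhead is generally prudent.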
[ceph-users] Re: What is the max size of cephfs (filesystem)
Yes, I know that for ZFS such an amount of data is impossible; I am not trying to store that much data. My question is really pure curiosity: what is the theoretical maximum size of a Ceph filesystem? For example, is it theoretically possible to store 1 exabyte, 1 zettabyte, or more in CephFS? Let's suppose I have all the servers/OSDs I want (again, just theoretically): would I be able to store 1 zettabyte? More? Or is there a hardcoded limit to the maximum size of a CephFS cluster?

Just for curiosity.

All the best

Arnaud

On Mon, 20 Jun 2022 at 10:26, Robert Sander wrote:
> On 20.06.22 at 09:45, Arnaud M wrote:
>
> > A ZFS file system can store up to *256 quadrillion zettabytes* (ZB).
>
> How would a storage system look in reality that could hold such an
> amount of data?
>
> Regards
> --
> Robert Sander
> Heinlein Consulting GmbH
> Schwedter Str. 8/9b, 10119 Berlin
>
> http://www.heinlein-support.de
>
> Tel: 030 / 405051-43
> Fax: 030 / 405051-19
>
> Zwangsangaben lt. §35a GmbHG:
> HRB 220009 B / Amtsgericht Berlin-Charlottenburg,
> Geschäftsführer: Peer Heinlein -- Sitz: Berlin
[ceph-users] Re: What is the max size of cephfs (filesystem)
Hello,

Thanks for the answer. But is there any hard-coded limit, like in ZFS? Maybe a limit on the maximum number of files a CephFS can have?

All the best

Arnaud

On Mon, 20 Jun 2022 at 10:18, Serkan Çoban wrote:
> Currently the biggest HDD is 20TB. 1 exabyte means a 50,000-OSD
> cluster (without replication or EC).
> AFAIK CERN did some tests using 5,000 OSDs. I don't know any larger
> clusters than CERN's.
> So I am not saying it is impossible, but it is very unlikely to grow a
> single Ceph cluster to that size.
> Maybe you should search for alternatives, like HDFS, which I
> know/have worked with at more than 50,000 HDDs without problems.
>
> On Mon, Jun 20, 2022 at 10:46 AM Arnaud M wrote:
> >
> > Hello to everyone
> >
> > I have looked on the internet but couldn't find an answer.
> > Do you know the maximum size of a ceph filesystem ? Not the max size of a
> > single file but the limit of the whole filesystem ?
> >
> > For example a quick search on zfs on google output :
> > A ZFS file system can store up to *256 quadrillion zettabytes* (ZB).
> >
> > I would like to have the same answer with cephfs.
> >
> > And if there is a limit, where is this limit coded ? Is it hard-coded or is
> > it configurable ?
> >
> > Let's say someone wants to have a cephfs up to ExaByte, would it be
> > completely foolish or would the system, given enough mds and servers and
> > everything needed, be usable ?
> >
> > Is there any other limit to a ceph filesystem ?
> >
> > All the best
> >
> > Arnaud
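Serkan's 50,000-OSD figure is easy to verify with back-of-the-envelope arithmetic; the sketch below only counts raw capacity and ignores replication/EC overhead, which would multiply the OSD count further:

```python
# Back-of-the-envelope: how many 20 TB OSDs a raw capacity target implies.
# No replication or EC overhead is included (replica 3 would triple the count).
TB = 10 ** 12
EB = 10 ** 18
ZB = 10 ** 21

def osds_needed(target_bytes: int, osd_bytes: int = 20 * TB) -> int:
    """Ceiling of target / per-OSD capacity."""
    return -(-target_bytes // osd_bytes)  # ceiling division

print(osds_needed(1 * EB))  # 50_000 twenty-TB OSDs for 1 EB raw
print(osds_needed(1 * ZB))  # 50_000_000 for 1 ZB raw
```

So the barrier at those scales is operational (cluster size, failure handling), not an on-disk format limit like ZFS's.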
[ceph-users] What is the max size of cephfs (filesystem)
Hello to everyone,

I have looked on the internet but couldn't find an answer. Do you know the maximum size of a Ceph filesystem? Not the max size of a single file, but the limit of the whole filesystem?

For example, a quick search on ZFS on Google outputs: "A ZFS file system can store up to *256 quadrillion zettabytes* (ZB)."

I would like to have the same answer for CephFS. And if there is a limit, where is this limit coded? Is it hard-coded or is it configurable?

Let's say someone wants to grow a CephFS up to an exabyte: would it be completely foolish, or would the system, given enough MDS daemons and servers and everything needed, be usable?

Is there any other limit to a Ceph filesystem?

All the best

Arnaud
[ceph-users] Re: Ceph remote disaster recovery at PB scale
Hello,

I will speak about CephFS because it is what I am working on. Of course you can do some kind of rsync or rclone between two CephFS clusters, but at petabyte scale it will be really slow and cost a lot!

There is another approach that we tested successfully (only in test, not in prod). We created a replicated CephFS data pool (replica 3) and spread it over 3 datacenters: Beauharnois (Canada), Strasbourg (France) and Warsaw (Poland). So we had 1 replica per datacenter. Only the CephFS metadata pool was on SSD (NVMe) close to the end users (in Strasbourg, France); same for the mons and mgrs (also in Strasbourg). In fact, only the CephFS data was spread geographically. We had high bandwidth and high latency (of course) between the datacenters, but it worked surprisingly well.

This way you can lose up to two datacenters without losing any data (more if you use more replicas). You just have to back up the mons and the CephFS metadata, which are never a lot of data.

This strategy is only feasible for CephFS, as it is the least IOPS-demanding. If you need more IOPS, you should isolate the IOPS-heavy folders and run them on a separate pool stored locally on SSD.

All the best

Arnaud
Leviia
https://leviia.com/en

On Fri, 1 Apr 2022 at 10:57, huxia...@horebdata.cn wrote:
> Dear Ceph experts,
>
> We are operating some ceph clusters (both L and N versions) at PB scale,
> and now planning remote disaster recovery solutions. Among these clusters,
> most are rbd volumes for Openstack and K8s, a few are for S3 object
> storage, and very few are cephfs clusters.
>
> For rbd volumes, we are planning to use rbd mirroring, and the data volume
> will reach several PBs. My questions are:
> 1) Is rbd mirroring with petabytes of data doable or not? Are there any
> practical limits on the size of the total data?
> 2) Should I use parallel rbd-mirror daemons to speed up the sync
> process? Or would a single daemon be sufficient?
> 3) What could be the lagging time at the remote site? At most 1 minute or
> 10 minutes?
>
> For the S3 object store, we plan to use multisite replication, and thus:
> 4) are there any practical limits on the size of the total data for S3
> multisite replication?
>
> and for CephFS data, I have no idea.
> 5) what could be the best practice for a CephFS disaster recovery scheme?
>
> thanks a lot in advance for suggestions,
>
> Samuel
>
> huxia...@horebdata.cn
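The one-replica-per-datacenter layout described above could be expressed with a CRUSH rule along these lines. This is a hypothetical fragment: it assumes your CRUSH map already defines `datacenter` buckets under `default`, and the rule id is arbitrary:

```
# Hypothetical CRUSH rule for the stretched layout described above:
# replica 3 with one copy per datacenter. Assumes the CRUSH map defines
# datacenter-type buckets (e.g. for Beauharnois, Strasbourg, Warsaw).
rule replicated_3dc {
    id 10
    type replicated
    step take default
    step chooseleaf firstn 0 type datacenter
    step emit
}
```

With `chooseleaf firstn 0 type datacenter`, CRUSH picks one OSD under each distinct datacenter bucket, which is what gives the "lose two datacenters without losing data" property for a replica-3 pool.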
[ceph-users] Re: Laggy OSDs
Hello,

Is swap enabled on your hosts? Is swap used? For our cluster we tend to allocate enough RAM and disable swap. Maybe the reboot of your hosts re-activated swap? Try disabling swap and see if it helps.

All the best

Arnaud

On Tue, 29 Mar 2022 at 23:41, David Orman wrote:
> We're definitely dealing with something that sounds similar, but hard to
> state definitively without more detail. Do you have object lock/versioned
> buckets in use (especially if one started being used around the time of the
> slowdown)? Was this cluster always 16.2.7?
>
> What is your pool configuration (EC k+m or replicated X setup), and do you
> use the same pool for indexes and data? I'm assuming this is RGW usage via
> the S3 API, let us know if this is not correct.
>
> On Tue, Mar 29, 2022 at 4:13 PM Alex Closs wrote:
>
> > Hey folks,
> >
> > We have a 16.2.7 cephadm cluster that's had slow ops and several
> > (constantly changing) laggy PGs. The set of OSDs with slow ops seems to
> > change at random, among all 6 OSD hosts in the cluster. All drives are
> > enterprise SATA SSDs, by either Intel or Micron. We're still not ruling out
> > a network issue, but wanted to troubleshoot from the Ceph side in case
> > something broke there.
> >
> > ceph -s:
> >
> >   health: HEALTH_WARN
> >           3 slow ops, oldest one blocked for 246 sec, daemons
> >           [osd.124,osd.130,osd.141,osd.152,osd.27] have slow ops.
> >
> >   services:
> >     mon: 5 daemons, quorum
> >          ceph-osd10,ceph-mon0,ceph-mon1,ceph-osd9,ceph-osd11 (age 28h)
> >     mgr: ceph-mon0.sckxhj(active, since 25m), standbys: ceph-osd10.xmdwfh,
> >          ceph-mon1.iogajr
> >     osd: 143 osds: 143 up (since 92m), 143 in (since 2w)
> >     rgw: 3 daemons active (3 hosts, 1 zones)
> >
> >   data:
> >     pools:   26 pools, 3936 pgs
> >     objects: 33.14M objects, 144 TiB
> >     usage:   338 TiB used, 162 TiB / 500 TiB avail
> >     pgs:     3916 active+clean
> >              19   active+clean+laggy
> >              1    active+clean+scrubbing+deep
> >
> >   io:
> >     client: 59 MiB/s rd, 98 MiB/s wr, 1.66k op/s rd, 1.68k op/s wr
> >
> > This is actually much faster than it's been for much of the past hour;
> > it's been as low as 50 kb/s and dozens of iops in both directions (where
> > the cluster typically does 300MB to a few gigs, and ~4k iops).
> >
> > The cluster has been on 16.2.7 since a few days after release without
> > issue. The only recent change was an apt upgrade and reboot on the hosts
> > (which was last Friday and didn't show signs of problems).
> >
> > Happy to provide logs, let me know what would be useful. Thanks for
> > reading this wall :)
> >
> > -Alex
> >
> > MIT CSAIL
> > he/they
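To check the swap suggestion quickly on each OSD host, /proc/meminfo is enough; here is a small sketch (the helper name is mine) that parses it and reports how much swap is in use:

```python
def swap_in_use(meminfo_text: str) -> int:
    """Return swap bytes in use, parsed from /proc/meminfo content."""
    fields = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            fields[key] = int(parts[0]) * 1024  # /proc/meminfo values are in kB
    return fields.get("SwapTotal", 0) - fields.get("SwapFree", 0)

# On an OSD host you would read the real file:
# with open("/proc/meminfo") as f:
#     print(swap_in_use(f.read()))
sample = "SwapTotal:       8388604 kB\nSwapFree:        8388604 kB\n"
print(swap_in_use(sample))  # 0 -> no swap in use
```

A non-zero result on a host with laggy OSDs would support the swap theory; `swapoff -a` (and removing the swap entry from /etc/fstab so it survives reboots) is the usual remedy.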
[ceph-users] Re: CephFS snaptrim bug?
Hello Linkriver,

I might have an issue close to yours. Can you tell us if your stray dirs are full? What does this command output for you?

ceph tell mds.0 perf dump | grep strays

Does the value change over time?

All the best

Arnaud

On Wed, 16 Mar 2022 at 15:35, Linkriver Technology <technol...@linkriver-capital.com> wrote:
> Hi,
>
> Has anyone figured out whether those "lost" snaps are rediscoverable /
> trimmable?
> All pgs in the cluster have been deep scrubbed since my previous email and I'm
> not seeing any of that wasted space being recovered.
>
> Regards,
>
> LRT
>
> -----Original Message-----
> From: Dan van der Ster
> To: technol...@linkriver-capital.com
> Cc: Ceph Users , Neha Ojha
> Subject: Re: [ceph-users] CephFS snaptrim bug?
> Date: Thu, 24 Feb 2022 09:48:04 +0100
>
> See https://tracker.ceph.com/issues/54396
>
> I don't know how to tell the osds to rediscover those trimmed snaps.
> Neha, is that possible?
>
> Cheers, Dan
>
> On Thu, Feb 24, 2022 at 9:27 AM Dan van der Ster wrote:
> >
> > Hi,
> >
> > I had a look at the code -- looks like there's a flaw in the logic:
> > the snaptrim queue is cleared if osd_pg_max_concurrent_snap_trims = 0.
> >
> > I'll open a tracker and send a PR to restrict
> > osd_pg_max_concurrent_snap_trims to >= 1.
> >
> > Cheers, Dan
> >
> > On Wed, Feb 23, 2022 at 9:44 PM Linkriver Technology wrote:
> > >
> > > Hello,
> > >
> > > I have upgraded our Ceph cluster from Nautilus to Octopus (15.2.15) over the
> > > weekend. The upgrade went well as far as I can tell.
> > >
> > > Earlier today, noticing that our CephFS data pool was approaching capacity, I
> > > removed some old CephFS snapshots (taken weekly at the root of the filesystem),
> > > keeping only the most recent one (created today, 2022-02-21). As expected, a
> > > good fraction of the PGs transitioned from active+clean to active+clean+snaptrim
> > > or active+clean+snaptrim_wait.
In previous occasions when I removed a > snapshot > > > it took a few days for snaptrimming to complete. This would happen > without > > > noticeably impacting other workloads, and would also free up an > appreciable > > > amount of disk space. > > > > > > This time around, after a few hours of snaptrimming, users complained > of high IO > > > latency, and indeed Ceph reported "slow ops" on a number of OSDs and > on the > > > active MDS. I attributed this to the snaptrimming and decided to > reduce it by > > > initially setting osd_pg_max_concurrent_snap_trims to 1, which didn't > seem to > > > help much, so I then set it to 0, which had the surprising effect of > > > transitioning all PGs back to active+clean (is this intended?). I also > restarted > > > the MDS which seemed to be struggling. IO latency went back to normal > > > immediately. > > > > > > Outside of users' working hours, I decided to resume snaptrimming by > setting > > > osd_pg_max_concurrent_snap_trims back to 1. Much to my surprise, > nothing > > > happened. All PGs remained (and still remain at time of writing) in > the state > > > active+clean, even after restarting some of them. This definitely seems > > > abnormal, as I mentioned earlier, snaptrimming this FS previously > would take in > > > the order of multiple days. Moreover, if snaptrim were truly complete, > I would > > > expect pool usage to have dropped by appreciable amounts (at least a > dozen > > > terabytes), but that doesn't seem to be the case. > > > > > > A du on the CephFS root gives: > > > > > > # du -sh /mnt/pve/cephfs > > > 31T/mnt/pve/cephfs > > > > > > But: > > > > > > # ceph df > > > > > > --- POOLS --- > > > POOL ID PGS STORED OBJECTS USED %USED MAX > AVAIL > > > cephfs_data 7 512 43 TiB 190.83M 147 TiB 93.22 > 3.6 TiB > > > cephfs_metadata 832 89 GiB 694.60k 266 GiB 1.32 > 6.4 TiB > > > > > > > > > ceph pg dump reports a SNAPTRIMQ_LEN of 0 on all PGs. 
> > > > > > Did CephFS just leak a massive 12 TiB worth of objects...? It seems to > me that > > > the snaptrim operation did not complete at all. > > > > > > Perhaps relatedly: > > > > > > # ceph daemon mds.choi dump snaps > > > { > > > "last_created": 93, > > > "last_destroyed": 94, > > > "snaps": [ > > > { > > > "snapid": 93, > > > "ino": 1, > > > "stamp": "2022-02-21T00:00:01.245459+0800", > > > "name": "2022-02-21" > > > } > > > ] > > > } > > > > > > How can last_destroyed > last_created? The last snapshot to have been > taken on > > > this FS is indeed #93, and the removed snapshots were all created on > previous > > > weeks. > > > > > > Could someone shed some light please? Assuming that snaptrim didn't > run to > > > completion, how can I manually delete objects from now-removed > snapshots? I > > > believe this is what the Ceph documentation calls a "backwards scrub" > - but I > > > didn't find anything in the Ceph suite that can run such a scrub. Thi
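One way to confirm whether any snap trimming is still queued after playing with osd_pg_max_concurrent_snap_trims is to total SNAPTRIMQ_LEN across PGs. Here is a sketch over `ceph pg dump --format json` output; the JSON layout (pg_map -> pg_stats -> snaptrimq_len) is assumed from recent releases, so adjust the keys if yours differ:

```python
import json

def total_snaptrimq(pg_dump: dict) -> int:
    """Sum snaptrimq_len over all PGs from `ceph pg dump --format json`.

    Assumed layout: {"pg_map": {"pg_stats": [{"snaptrimq_len": N, ...}, ...]}};
    falls back to a top-level "pg_stats" key if "pg_map" is absent."""
    stats = pg_dump.get("pg_map", pg_dump).get("pg_stats", [])
    return sum(pg.get("snaptrimq_len", 0) for pg in stats)

dump = json.loads('{"pg_map": {"pg_stats": ['
                  '{"pgid": "7.0", "snaptrimq_len": 0},'
                  '{"pgid": "7.1", "snaptrimq_len": 12}]}}')
print(total_snaptrimq(dump))  # 12
```

A total of 0 together with pool usage that never drops, as LRT describes, would be consistent with the queue having been cleared rather than trimmed.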
[ceph-users] Re: How often should I scrub the filesystem ?
Thanks a lot!! Your answer will help a lot of admins (myself included). I will study your answer, implement your suggestions, and let you know.

All the best

Arnaud

On Fri, 11 Mar 2022 at 13:25, Milind Changire wrote:
> Here's some answers to your questions:
>
> On Sun, Mar 6, 2022 at 3:57 AM Arnaud M wrote:
>
>> Hello to everyone :)
>>
>> Just some questions about filesystem scrubbing.
>>
>> In this documentation it is said that scrub will help admins check
>> consistency of the filesystem:
>>
>> https://docs.ceph.com/en/latest/cephfs/scrub/
>>
>> So my questions are:
>>
>> Is filesystem scrubbing mandatory?
>> How often should I scrub the whole filesystem (i.e. start at /)?
>> How often should I scrub ~mdsdir?
>> Should I set up a cronjob?
>> Is filesystem scrubbing considered harmless, even with recursive force
>> repair?
>> Is there any chance for scrubbing to overload the MDS on a big filesystem
>> (i.e. like find . -ls)?
>> What is the difference between "recursive repair" and "recursive force
>> repair"? Is "force" harmless?
>> Is there any way to see which file/folder the scrub operation is at? In
>> fact, any better way to see scrub progress than "scrub status", which
>> doesn't say much?
>>
>> Sorry for all the questions, but there is not that much documentation
>> about filesystem scrubbing. And I do think the answers will help a lot of
>> CephFS administrators :)
>>
>> Thanks to all
>>
>> All the best
>>
>> Arnaud
>>
>
> 1.
>
> Is filesystem scrubbing mandatory?
> As a routine system administration practice, it is good to ensure that
> your file-system is always in a good state. To avoid getting the
> file-system into a bottleneck state during work hours, it's always a good
> idea to reserve some time to run a recursive forward scrub and use the
> in-built scrub automation to fix such issues.
Although you can run the >scrub at any directory of your choice, it's always a good practice to start >the scrub at the file-system root once in a while. > > So file-system scrubbing is not mandatory but highly recommended. > > Filesystem scrubbing is designed to read CephFS’ metadata and detect > inconsistencies or issues that are generated by bitrot or bugs, just as > RADOS’ pg scrubbing is. In a perfect world without bugs or bit flips it > would be unnecessary, but we don’t live in that world — so a scrub can > detect small issues before they turn into big ones, and the mere act of > reading data can keep it fresh and give storage devices a chance to correct > any media errors while that’s still possible. > > We don’t have a specific recommended schedule and scrub takes up cluster > IO and compute resources so its frequency should be tailored to your > workload. > > >1. > >How often should I scrub the whole filesystem (ie start at /) >Since you'd always want to have a consistent file-system, it would >good to run scrubbing: >1. > > before taking a snapshot of the entire file-system OR > 2. > > before taking a backup of the entire file-system OR > 3. > > after significant metadata activity eg. after creating files, > renaming files, deleting files, changing file attributes, etc. > > > There's no one-rule-fixes-all scenario. So, you'll need to follow a > heuristic approach. The type of devices (HDD or SSD), the amount of > activity wearing the device are the typical factors involved when deciding > to scrub a file-system. If you have some window dedicated for backup > activity, then you’d want to run a recursive forward scrub with repair on > the entire file-system before it is snapshotted and used for backup. > Although you can run a scrub along with active use of the file-system, it > is always recommended that you run the scrub on a quiet file-system so that > neither of the activities get in each other’s way. 
This also helps in > completing the scrub task quicker. > > >1. > >How often should I scrub ~mdsdir ? >~mdsdir is used to collect deleted (stray) entries. So, the number of >file/dir unlinks in a typical workload should be used to come up with a >heuristic to scrub the file-system. This activity can be taken up >separately from scrubbing the file-system root. > > > >1. > >Should I set up a cr
[ceph-users] Re: How often should I scrub the filesystem ?
Hello,

Does anyone have any info about filesystem scrubbing? As this is a generic thread (not specific to my cluster), I think the answers can help a lot of admins. :) It would be awesome if someone could answer some of these questions, and maybe complete the doc.

All the best

Arnaud

On Sat, 5 Mar 2022 at 23:26, Arnaud M wrote:
> Hello to everyone :)
>
> Just some questions about filesystem scrubbing.
>
> In this documentation it is said that scrub will help admins check
> consistency of the filesystem:
>
> https://docs.ceph.com/en/latest/cephfs/scrub/
>
> So my questions are:
>
> Is filesystem scrubbing mandatory?
> How often should I scrub the whole filesystem (i.e. start at /)?
> How often should I scrub ~mdsdir?
> Should I set up a cronjob?
> Is filesystem scrubbing considered harmless, even with recursive force
> repair?
> Is there any chance for scrubbing to overload the MDS on a big filesystem
> (i.e. like find . -ls)?
> What is the difference between "recursive repair" and "recursive force
> repair"? Is "force" harmless?
> Is there any way to see which file/folder the scrub operation is at? In
> fact, any better way to see scrub progress than "scrub status", which
> doesn't say much?
>
> Sorry for all the questions, but there is not that much documentation
> about filesystem scrubbing. And I do think the answers will help a lot of
> CephFS administrators :)
>
> Thanks to all
>
> All the best
>
> Arnaud
[ceph-users] Re: Errors when scrub ~mdsdir and lots of num_strays
Thanks to all.

So I will wait for the update and see if that helps to resolve the issue.

All the best

Arnaud

On Tue, 1 Mar 2022 at 11:39, Dan van der Ster wrote:
> Hi,
>
> There was a recent (long) thread about this. It might give you some hints:
> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/thread/2NT55RUMD33KLGQCDZ74WINPPQ6WN6CW/
>
> And about the crash, it could be related to
> https://tracker.ceph.com/issues/51824
>
> Cheers, dan
>
> On Tue, Mar 1, 2022 at 11:30 AM Arnaud M wrote:
> >
> > Hello Dan
> >
> > Thanks a lot for the answer
> >
> > I do remove the snaps every day (I keep them for one month),
> > but "num_strays" never seems to reduce.
> >
> > I know I can do a listing of the folder with "find . -ls".
> >
> > So my question is: is there a way to find the directory causing the
> > strays so I can "find . -ls" them? I would prefer not to do it on my whole
> > cluster as it will take time (several days, and more if I need to do it
> > also on every snap) and will certainly overload the mds.
> >
> > Please let me know if there is a way to spot the source of strays, so I
> > can find the folder/snap with the biggest strays.
> >
> > And what about the scrub of ~mdsdir, which crashes every time with the
> > error:
> >
> > {
> >     "damage_type": "dir_frag",
> >     "id": 3776355973,
> >     "ino": 1099567262916,
> >     "frag": "*",
> >     "path": "~mds0/stray3/1000350ecc4"
> > },
> >
> > Again, thanks for your help, that is really appreciated
> >
> > All the best
> >
> > Arnaud
> >
> > On Tue, 1 Mar 2022 at 11:02, Dan van der Ster wrote:
> >>
> >> Hi,
> >>
> >> stray files are created when you have hardlinks to deleted files, or
> >> snapshots of deleted files.
> >> You need to delete the snapshots, or "reintegrate" the hardlinks by
> >> recursively listing the relevant files.
> >>
> >> BTW, in pacific there isn't a big problem with accumulating lots of
> >> stray files.
(Before pacific there was a default limit of 1M strays, > >> but that is now removed). > >> > >> Cheers, dan > >> > >> On Tue, Mar 1, 2022 at 1:04 AM Arnaud M > wrote: > >> > > >> > Hello to everyone > >> > > >> > Our ceph cluster is healthy and everything seems to go well but we > have a > >> > lot of num_strays > >> > > >> > ceph tell mds.0 perf dump | grep stray > >> > "num_strays": 1990574, > >> > "num_strays_delayed": 0, > >> > "num_strays_enqueuing": 0, > >> > "strays_created": 3, > >> > "strays_enqueued": 17, > >> > "strays_reintegrated": 0, > >> > "strays_migrated": 0, > >> > > >> > And num_strays doesn't seems to reduce whatever we do (scrub / or > scrub > >> > ~mdsdir) > >> > And when we scrub ~mdsdir (force,recursive,repair) we get thoses error > >> > > >> > { > >> > "damage_type": "dir_frag", > >> > "id": 3775653237, > >> > "ino": 1099569233128, > >> > "frag": "*", > >> > "path": "~mds0/stray3/100036efce8" > >> > }, > >> > { > >> > "damage_type": "dir_frag", > >> > "id": 3776355973, > >> > "ino": 1099567262916, > >> > "frag": "*", > >> > "path": "~mds0/stray3/1000350ecc4" > >> > }, > >> > { > >> > "damage_type": "dir_frag", > >> > "id": 3776485071, > >> > "ino": 1099559071399, > >> > "frag": "*", > >> > "path": "~mds0/stray4/10002d3eea7" > >> > }, > >> > > >> > And just before the end of the ~mdsdir scrub the mds crashes and I > have to > >> > do a > >> > > >> > ceph mds repaired 0 to have the filesystem back onl
[ceph-users] How often should I scrub the filesystem ?
Hello to everyone :)

Just some questions about filesystem scrubbing.

In this documentation it is said that scrub will help admins check the consistency of the filesystem:

https://docs.ceph.com/en/latest/cephfs/scrub/

So my questions are:

Is filesystem scrubbing mandatory?
How often should I scrub the whole filesystem (i.e. start at /)?
How often should I scrub ~mdsdir?
Should I set up a cronjob?
Is filesystem scrubbing considered harmless, even with recursive force repair?
Is there any chance for scrubbing to overload the MDS on a big filesystem (i.e. like find . -ls)?
What is the difference between "recursive repair" and "recursive force repair"? Is "force" harmless?
Is there any way to see which file/folder the scrub operation is at? In fact, any better way to see scrub progress than "scrub status", which doesn't say much?

Sorry for all the questions, but there is not that much documentation about filesystem scrubbing. And I do think the answers will help a lot of CephFS administrators :)

Thanks to all

All the best

Arnaud
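On the "scrub status doesn't say much" point: the JSON from `ceph tell mds.0 scrub status` can at least be condensed into a one-line progress report. The key names below ("status", "scrubs", "path") are assumptions based on recent releases, so check your own output first; the helper is mine, not part of Ceph:

```python
import json

def scrub_summary(status: dict) -> str:
    """Condense assumed `ceph tell mds.0 scrub status` JSON into one line."""
    scrubs = status.get("scrubs", {})
    if not scrubs:
        return status.get("status", "no scrub active")
    paths = ", ".join(s.get("path", "?") for s in scrubs.values())
    return f"{len(scrubs)} scrub(s) active on: {paths}"

sample = json.loads('{"status": "scrub active (1 inodes in the stack)",'
                    ' "scrubs": {"abcd": {"path": "/", "options": "recursive,repair"}}}')
print(scrub_summary(sample))  # 1 scrub(s) active on: /
```

This still only shows the scrub root, not the current file; as far as I know a per-file progress indicator is exactly what the question is asking for and is not exposed.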
[ceph-users] Re: Errors when scrub ~mdsdir and lots of num_strays
Hello Dan,

Thanks a lot for the answer.

I do remove the snaps every day (I keep them for one month), but "num_strays" never seems to reduce.

I know I can do a listing of the folder with "find . -ls".

So my question is: is there a way to find the directory causing the strays so I can "find . -ls" it? I would prefer not to do it on my whole cluster, as it will take time (several days, and more if I need to do it also on every snap) and will certainly overload the MDS.

Please let me know if there is a way to spot the source of the strays, so I can find the folder/snap with the biggest strays.

And what about the scrub of ~mdsdir, which crashes every time with the error:

{
    "damage_type": "dir_frag",
    "id": 3776355973,
    "ino": 1099567262916,
    "frag": "*",
    "path": "~mds0/stray3/1000350ecc4"
},

Again, thanks for your help, it is really appreciated.

All the best

Arnaud

On Tue, 1 Mar 2022 at 11:02, Dan van der Ster wrote:
> Hi,
>
> stray files are created when you have hardlinks to deleted files, or
> snapshots of deleted files.
> You need to delete the snapshots, or "reintegrate" the hardlinks by
> recursively listing the relevant files.
>
> BTW, in pacific there isn't a big problem with accumulating lots of
> stray files. (Before pacific there was a default limit of 1M strays,
> but that is now removed).
> > Cheers, dan > > On Tue, Mar 1, 2022 at 1:04 AM Arnaud M > wrote: > > > > Hello to everyone > > > > Our ceph cluster is healthy and everything seems to go well but we have a > > lot of num_strays > > > > ceph tell mds.0 perf dump | grep stray > > "num_strays": 1990574, > > "num_strays_delayed": 0, > > "num_strays_enqueuing": 0, > > "strays_created": 3, > > "strays_enqueued": 17, > > "strays_reintegrated": 0, > > "strays_migrated": 0, > > > > And num_strays doesn't seems to reduce whatever we do (scrub / or scrub > > ~mdsdir) > > And when we scrub ~mdsdir (force,recursive,repair) we get thoses error > > > > { > > "damage_type": "dir_frag", > > "id": 3775653237, > > "ino": 1099569233128, > > "frag": "*", > > "path": "~mds0/stray3/100036efce8" > > }, > > { > > "damage_type": "dir_frag", > > "id": 3776355973, > > "ino": 1099567262916, > > "frag": "*", > > "path": "~mds0/stray3/1000350ecc4" > > }, > > { > > "damage_type": "dir_frag", > > "id": 3776485071, > > "ino": 1099559071399, > > "frag": "*", > > "path": "~mds0/stray4/10002d3eea7" > > }, > > > > And just before the end of the ~mdsdir scrub the mds crashes and I have > to > > do a > > > > ceph mds repaired 0 to have the filesystem back online > > > > A lot of them. Do you have any ideas of what those errors are and how > > should I handle them ? 
> > > > We have a lot of data in our cephfs cluster 350 TB+ and we takes snapshot > > everyday of / and keep them for 1 month (rolling) > > > > here is our cluster state > > > > ceph -s > > cluster: > > id: 817b5736-84ae-11eb-bf7b-c9513f2d60a9 > > health: HEALTH_WARN > > 78 pgs not deep-scrubbed in time > > 70 pgs not scrubbed in time > > > > services: > > mon: 3 daemons, quorum ceph-r-112-1,ceph-g-112-3,ceph-g-112-2 (age > 10d) > > mgr: ceph-g-112-2.ghcodb(active, since 4d), standbys: > > ceph-g-112-1.ksojnh > > mds: 1/1 daemons up, 1 standby > > osd: 67 osds: 67 up (since 14m), 67 in (since 7d) > > > > data: > > volumes: 1/1 healthy > > pools: 5 pools, 609 pgs > > objects: 186.86M objects, 231 TiB > > usage: 351 TiB used, 465 TiB / 816 TiB avail > > pgs: 502 active+clean > > 82 active+clean+snaptrim_wait > > 20 active+clean+snaptrim > > 4 active+clean+scrubbing+deep > > 1 active+clean+scrubbing+deep+snaptrim_wait > > > > io: > > client: 8.8 MiB/s rd, 39 MiB/s wr, 25 op/s rd, 54 op/s wr > > > > My questions are about the damage found on the ~mdsdir scrub, should I > > worry about it ? What does it mean ? It seems to be linked with my issue > of > > the high number of strays, is it right ? How to fix it and how to reduce > > num_stray ? > > > > Thank for all > > > > All the best > > > > Arnaud > > ___ > > ceph-users mailing list -- ceph-users@ceph.io > > To unsubscribe send an email to ceph-users-le...@ceph.io > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
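To watch whether num_strays moves over time without grepping by hand, the perf dump JSON can be filtered programmatically. A minimal sketch over `ceph tell mds.0 perf dump --format json` output; the counters live under an "mds_cache" section on the releases I have seen, but that layout is an assumption, so the code falls back to a flat dict:

```python
import json

def stray_counters(perf_dump: dict) -> dict:
    """Pull stray-related counters out of `ceph tell mds.0 perf dump` JSON.

    Assumes the counters sit under "mds_cache"; falls back to the top
    level if that section is missing."""
    section = perf_dump.get("mds_cache", perf_dump)
    return {k: v for k, v in section.items() if "stray" in k}

dump = json.loads('{"mds_cache": {"num_strays": 1990574, "strays_created": 3,'
                  ' "strays_enqueued": 17, "num_inodes": 100}}')
print(stray_counters(dump))  # only the three stray_* / num_strays counters
```

Logging this output periodically (e.g. from cron) makes it easy to see whether snapshot removal or hardlink reintegration is actually draining the stray directories.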
[ceph-users] Re: Errors when scrub ~mdsdir and lots of num_strays
I am using ceph pacific (16.2.5).

Does anyone have an idea about my issues?

Thanks again to everyone

All the best

Arnaud

On Tue, Mar 1, 2022 at 1:04 AM, Arnaud M wrote:
> [...]
[ceph-users] Errors when scrub ~mdsdir and lots of num_strays
Hello to everyone

Our ceph cluster is healthy and everything seems to go well, but we have a
lot of num_strays:

ceph tell mds.0 perf dump | grep stray
    "num_strays": 1990574,
    "num_strays_delayed": 0,
    "num_strays_enqueuing": 0,
    "strays_created": 3,
    "strays_enqueued": 17,
    "strays_reintegrated": 0,
    "strays_migrated": 0,

And num_strays doesn't seem to reduce whatever we do (scrub / or scrub
~mdsdir). And when we scrub ~mdsdir (force,recursive,repair) we get those
errors, a lot of them:

    {
        "damage_type": "dir_frag",
        "id": 3775653237,
        "ino": 1099569233128,
        "frag": "*",
        "path": "~mds0/stray3/100036efce8"
    },
    {
        "damage_type": "dir_frag",
        "id": 3776355973,
        "ino": 1099567262916,
        "frag": "*",
        "path": "~mds0/stray3/1000350ecc4"
    },
    {
        "damage_type": "dir_frag",
        "id": 3776485071,
        "ino": 1099559071399,
        "frag": "*",
        "path": "~mds0/stray4/10002d3eea7"
    },

And just before the end of the ~mdsdir scrub the mds crashes and I have to
run

    ceph mds repaired 0

to have the filesystem back online.

Do you have any ideas of what those errors are and how I should handle them?
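As an aside, grep on the raw perf dump can match unrelated keys; a minimal
sketch of pulling just the stray counters out of the JSON instead (assuming
they sit under the "mds_cache" section of `ceph tell mds.0 perf dump`, as in
recent releases):

```python
import json

def stray_counters(perf_dump_json: str) -> dict:
    """Extract the stray-related counters from a `ceph tell mds.N perf dump`.

    Assumes the counters live under the "mds_cache" section; filtering on
    the key name avoids matching unrelated counters elsewhere in the dump.
    """
    perf = json.loads(perf_dump_json)
    cache = perf.get("mds_cache", {})
    return {k: v for k, v in cache.items() if "stray" in k}

# Example with a trimmed dump (values taken from the message above):
dump = '{"mds_cache": {"num_strays": 1990574, "strays_created": 3, "ino": 12}}'
print(stray_counters(dump))  # {'num_strays': 1990574, 'strays_created': 3}
```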
We have a lot of data in our cephfs cluster (350 TB+) and we take a snapshot
of / every day, keeping them for 1 month (rolling).

Here is our cluster state:

ceph -s
  cluster:
    id:     817b5736-84ae-11eb-bf7b-c9513f2d60a9
    health: HEALTH_WARN
            78 pgs not deep-scrubbed in time
            70 pgs not scrubbed in time

  services:
    mon: 3 daemons, quorum ceph-r-112-1,ceph-g-112-3,ceph-g-112-2 (age 10d)
    mgr: ceph-g-112-2.ghcodb(active, since 4d), standbys: ceph-g-112-1.ksojnh
    mds: 1/1 daemons up, 1 standby
    osd: 67 osds: 67 up (since 14m), 67 in (since 7d)

  data:
    volumes: 1/1 healthy
    pools:   5 pools, 609 pgs
    objects: 186.86M objects, 231 TiB
    usage:   351 TiB used, 465 TiB / 816 TiB avail
    pgs:     502 active+clean
             82  active+clean+snaptrim_wait
             20  active+clean+snaptrim
             4   active+clean+scrubbing+deep
             1   active+clean+scrubbing+deep+snaptrim_wait

  io:
    client: 8.8 MiB/s rd, 39 MiB/s wr, 25 op/s rd, 54 op/s wr

My questions are about the damage found on the ~mdsdir scrub: should I worry
about it? What does it mean? It seems to be linked with my issue of the high
number of strays, is that right? How can I fix it and reduce num_strays?

Thanks to all

All the best

Arnaud
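Since the damage entries all point into ~mds0/stray3 and ~mds0/stray4, it can
help to inspect the stray directory objects directly in the metadata pool. A
minimal sketch, assuming the inode layout from the MDS source (stray dirs
start at inode 0x600, ten per rank), of computing the RADOS object names to
look at:

```python
# Constants as defined in the MDS source (mdstypes.h); treat as an assumption
# and verify against your Ceph release.
MDS_INO_STRAY_OFFSET = 0x600
NUM_STRAY = 10

def stray_dir_objects(rank: int) -> list:
    """RADOS object names of the stray directory fragments for one MDS rank.

    ~mdsN/strayM is inode 0x600 + rank*10 + M; its (unfragmented) dirfrag is
    stored in the metadata pool as "<ino in hex>.00000000".
    """
    return [
        f"{MDS_INO_STRAY_OFFSET + rank * NUM_STRAY + i:x}.00000000"
        for i in range(NUM_STRAY)
    ]

print(stray_dir_objects(0))
# ['600.00000000', '601.00000000', ..., '609.00000000']
```

Each returned name can then be fed to `rados -p <your metadata pool>
listomapkeys <obj> | wc -l` to count the entries sitting in that stray
directory (the pool name varies per deployment).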