[ceph-users] Re: Unbalanced Cluster
Thanks for the list of parameters. I suppose I was just asking whether the docs saying "planned carefully" meant something more than being ready to moderate the rebuild process etc. I think that clears things up for me.

Sincerely
-Dave

On 2022-05-05 1:15 p.m., Anthony D'Atri wrote:
>> The balancer was driving all the weights to 1.0 so I turned it off.
> Which weights (CRUSH or reweight?) And which balancer?
>
> Assuming the ceph-mgr balancer module in upmap mode, you’d want the reweight
> values to be 1.000 since it uses the newer pg-upmap functionality to
> distribute capacity. Lower reweight values have a way of confusing the
> balancer and preventing good uniformity. If you had a bunch of significantly
> adjusted reweight values, e.g. from prior runs of reweight-by-utilization,
> that could contribute to suboptimal balancing.
>
>> You mentioned that all solutions would cause data migration and would need
>> to be planned carefully. I've seen that language in the docs and other
>> messages but what I can't find is what is meant by "planned carefully".
> There are many ways to proceed; documenting them all might be a bit of a
> rabbit-hole.
>
>> Doing any of these will cause data migration like crazy but it's not
>> avoidable other than to change the number of max backfills etc. but the
>> systems should still be accessible during this time but with reduced
>> bandwidth and higher latency. Is it just a warning that the system could be
>> degraded for a long period of time or is it suggesting that users should
>> take an outage while the rebuild happens?
> Throttling recovery/backfill can reduce the impact of big data migrations, at
> the expense of increased elapsed time to complete.
>
> osd_max_backfills=1
> osd_recovery_max_active=1
> osd_recovery_op_priority=1
> osd_recovery_max_single_start=1
> osd_scrub_during_recovery=false
>
> Also, ensure that
>
> osd_op_queue_cut_off = high
>
> This will help ensure that recovery / backfill doesn’t DoS client traffic.
> I’m not sure if this is default in your release. If changed, I believe that
> OSDs would need to be restarted for the new value to take effect.
>
> PGs:
>
> pg_num = ( #OSDs * ratio ) / replication
> ratio = pg_num * replication / #OSDs
>
> On clusters with multiple pools this can get a bit complicated when more than
> one pool has significant numbers of PGs; the end goal is the total number of
> PGs on a given OSD, which `ceph osd df` reports.
>
> Your OSDs look to have ~190 PGs each on average, which is probably ok given
> your media. If you do have big empty pools, deleting them would show more
> indicative numbers. PG ratio targets are somewhat controversial, but
> depending on your media and RAM an aggregate around this range is reasonable;
> you can go higher with flash.
>
> This calculator can help when you have multiple pools:
>
> https://old.ceph.com/pgcalc/
>
> If you need to bump pg_num for a pool, you don’t have to do it in one step.
> You can increase it by, say, 32 at a time.
>
>> Thanks for your guidance.
>>
>> -Dave
>>
>> On 2022-05-05 2:33 a.m., Erdem Agaoglu wrote:
>> Hi David,
>>
>> I think you're right with your option 2. 512 pgs is just too few. You're also
>> right with the "inflation" but you should add your erasure bits to the
>> calculation, so 9x512=4608. With 144 OSDs, you would average 32 pgs per OSD.
>> Some old advice for that number was around 100.
>>
>> But your current PGs per OSD is around 180-190 according to the df output. This
>> is probably because of your empty pool 4 fsdata, having 4096 pgs with size 5,
>> and adding 5x4096=20480, 20480/144=142 more pgs per OSD.
>>
>> I'm not really sure how empty/unused PGs would affect OSDs, but I think it will
>> affect the balancer, which tries to balance the number of PGs, which might
>> explain things getting worse. Also your df output shows several modifications in
>> weights/reweights but I'm not sure if they're manual or balancer adjusted.
>>
>> I would first delete that empty pool to have a clearer picture of PGs on
>> OSDs. Then I would increase the pg_num for pool 6 to 2048. And after everything
>> settles, if it's still too unbalanced I'd go for the upmap balancer. Needless to
>> say, all these would cause major data migration so it should be planned
>> carefully.
>>
>> Best,
>>
>> On Thu, May 5, 2022 at 12:02 AM David Schulz <dsch...@ucalgary.ca> wrote:
>> Hi Josh,
>>
>> We do have an old pool that is empty so there's 4611 empty PGs but the
>> rest seem fairly close:
[ceph-users] Re: Unbalanced Cluster
WHA? Mind blown. I hadn't noticed that you can reduce PG counts now! Thanks Richard for pointing that out.

I've already reduced the pgs in that unused pool to half of what it was, but I think the other backfill operations have blocked that. For the moment I think the system is ok, at least for the weekend.

Thanks for pointing that out.

-Dave

On 2022-05-05 1:09 p.m., Richard Bade wrote:
Hi David,
Something else you could try with that other pool, if it contains little or no data, is to reduce the PG number. This does cause some backfill operations as it does a pg merge, but this doesn't take long if the pg is virtually empty.

The autoscaler has a mode where it can make recommendations for you without actually doing anything, if you want some advice on a suitable number. Then you can set it manually.

If the empty pg's are a factor in the balance issues then this will help.

Also, the upmap mode on the balancer is far more effective than reweight. It has an option where you can control the max deviation. I have this set to one and it achieves a 5% spread for my EC cluster. Note, you'll need to reweight everything back to 1, which will cause backfill to occur. If you have your backfill_full level set to default, this should stop any osd's over 85% doing any backfill.

Rich
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
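For reference, the PG-count reduction Dave describes is a single pool setting on Nautilus, which performs the decrease gradually through PG merges (a sketch; the pool name and target come from the thread, where pool 4 "fsdata" had 4096 PGs and was halved):

```shell
# Halve the PG count of the mostly-empty EC pool; Ceph merges PGs in the
# background until pg_num reaches the target.
ceph osd pool set fsdata pg_num 2048

# Watch the current value converge toward the target over time.
ceph osd pool get fsdata pg_num
```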
[ceph-users] Re: Unbalanced Cluster
On Thu, May 5, 2022 at 11:15 AM Anthony D'Atri wrote:
>
> This calculator can help when you have multiple pools:
>
> https://old.ceph.com/pgcalc/

Did an EC-aware version of this calculator ever escape the Red Hat paywall?

Thanks,
--
Jeremy Austin
jhaus...@gmail.com
[ceph-users] Re: Unbalanced Cluster
> The balancer was driving all the weights to 1.0 so I turned it off.

Which weights (CRUSH or reweight?) And which balancer?

Assuming the ceph-mgr balancer module in upmap mode, you’d want the reweight
values to be 1.000 since it uses the newer pg-upmap functionality to
distribute capacity. Lower reweight values have a way of confusing the
balancer and preventing good uniformity. If you had a bunch of significantly
adjusted reweight values, e.g. from prior runs of reweight-by-utilization,
that could contribute to suboptimal balancing.

> You mentioned that all solutions would cause data migration and would need to
> be planned carefully. I've seen that language in the docs and other messages
> but what I can't find is what is meant by "planned carefully".

There are many ways to proceed; documenting them all might be a bit of a
rabbit-hole.

> Doing any of these will cause data migration like crazy but it's not
> avoidable other than to change the number of max backfills etc. but the
> systems should still be accessible during this time but with reduced
> bandwidth and higher latency. Is it just a warning that the system could be
> degraded for a long period of time or is it suggesting that users should take
> an outage while the rebuild happens?

Throttling recovery/backfill can reduce the impact of big data migrations, at
the expense of increased elapsed time to complete.

osd_max_backfills=1
osd_recovery_max_active=1
osd_recovery_op_priority=1
osd_recovery_max_single_start=1
osd_scrub_during_recovery=false

Also, ensure that

osd_op_queue_cut_off = high

This will help ensure that recovery / backfill doesn’t DoS client traffic.
I’m not sure if this is default in your release. If changed, I believe that
OSDs would need to be restarted for the new value to take effect.
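As a sketch, those settings can be applied cluster-wide at runtime with `ceph config set` on Nautilus; as noted, osd_op_queue_cut_off is the one that may not take effect until the OSDs are restarted:

```shell
# Throttle recovery/backfill to minimize impact on client I/O.
ceph config set osd osd_max_backfills 1
ceph config set osd osd_recovery_max_active 1
ceph config set osd osd_recovery_op_priority 1
ceph config set osd osd_recovery_max_single_start 1
ceph config set osd osd_scrub_during_recovery false

# Prioritize client ops over recovery ops; may require OSD restarts.
ceph config set osd osd_op_queue_cut_off high
```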
PGs:

pg_num = ( #OSDs * ratio ) / replication
ratio = pg_num * replication / #OSDs

On clusters with multiple pools this can get a bit complicated when more than
one pool has significant numbers of PGs; the end goal is the total number of
PGs on a given OSD, which `ceph osd df` reports.

Your OSDs look to have ~190 PGs each on average, which is probably ok given
your media. If you do have big empty pools, deleting them would show more
indicative numbers. PG ratio targets are somewhat controversial, but
depending on your media and RAM an aggregate around this range is reasonable;
you can go higher with flash.

This calculator can help when you have multiple pools:

https://old.ceph.com/pgcalc/

If you need to bump pg_num for a pool, you don’t have to do it in one step.
You can increase it by, say, 32 at a time.

> Thanks for your guidance.
>
> -Dave
>
> On 2022-05-05 2:33 a.m., Erdem Agaoglu wrote:
> Hi David,
>
> I think you're right with your option 2. 512 pgs is just too few. You're also
> right with the "inflation" but you should add your erasure bits to the
> calculation, so 9x512=4608. With 144 OSDs, you would average 32 pgs per OSD.
> Some old advice for that number was around 100.
>
> But your current PGs per OSD is around 180-190 according to the df output. This
> is probably because of your empty pool 4 fsdata, having 4096 pgs with size 5,
> and adding 5x4096=20480, 20480/144=142 more pgs per OSD.
>
> I'm not really sure how empty/unused PGs would affect OSDs, but I think it will
> affect the balancer, which tries to balance the number of PGs, which might
> explain things getting worse. Also your df output shows several modifications in
> weights/reweights but I'm not sure if they're manual or balancer adjusted.
>
> I would first delete that empty pool to have a clearer picture of PGs on
> OSDs. Then I would increase the pg_num for pool 6 to 2048.
> And after everything
> settles, if it's still too unbalanced I'd go for the upmap balancer. Needless to
> say, all these would cause major data migration so it should be planned
> carefully.
>
> Best,
>
> On Thu, May 5, 2022 at 12:02 AM David Schulz <dsch...@ucalgary.ca> wrote:
> Hi Josh,
>
> We do have an old pool that is empty so there's 4611 empty PGs but the
> rest seem fairly close:
>
> # ceph pg ls|awk '{print $7/1024/1024/10}'|cut -d "." -f 1|sed -e
> 's/$/0/'|sort -n|uniq -c
>    4611 00
>       1 1170
>       8 1180
>      10 1190
>      28 1200
>      51 1210
>      54 1220
>      52 1230
>      32 1240
>      13 1250
>       7 1260
>
> Hmm, that's interesting, adding up the first column except the 4611
> gives 256 but there are 512 PGs in the main data pool.
>
> Here are our pool settings:
>
> pool 3 'fsmeta' replicated size 3 min_size 1 crush_rule 0 object_hash
> rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 35490
> flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16
> recovery_priority 5 application cephfs
> pool 4 'fsdata' erasure size 5 min_size 4 crush_rule 1 object_hash
> rjenkins pg_num 4096 pgp_num 4096 autoscale_mode warn
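Anthony's pg_num rule of thumb above can be sanity-checked against this cluster's numbers. The OSD count (144) and the k=7,m=2 pool (9 shards) come from the thread; the target ratio of 100 is my assumption, taken from the "old advice" figure Erdem mentions:

```shell
# pg_num = ( #OSDs * ratio ) / replication
osds=144
ratio=100        # target PG replicas per OSD (assumed from the "old advice")
replication=9    # k=7,m=2 erasure pool: 9 shards per PG
echo $(( osds * ratio / replication ))   # 1600
```

Rounding up to the next power of two gives 2048, which matches the pg_num Erdem suggests for pool 6.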
[ceph-users] Re: Unbalanced Cluster
Hi David,
Something else you could try with that other pool, if it contains little or no data, is to reduce the PG number. This does cause some backfill operations as it does a pg merge, but this doesn't take long if the pg is virtually empty.

The autoscaler has a mode where it can make recommendations for you without actually doing anything, if you want some advice on a suitable number. Then you can set it manually.

If the empty pg's are a factor in the balance issues then this will help.

Also, the upmap mode on the balancer is far more effective than reweight. It has an option where you can control the max deviation. I have this set to one and it achieves a 5% spread for my EC cluster. Note, you'll need to reweight everything back to 1, which will cause backfill to occur. If you have your backfill_full level set to default, this should stop any osd's over 85% doing any backfill.

Rich
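The two suggestions above look roughly like this on Nautilus (a sketch; verify option names against your release before running):

```shell
# Autoscaler recommendations only: with autoscale_mode "warn" (as in this
# cluster's pools), this reports suggested pg_num without changing anything.
ceph osd pool autoscale-status

# Upmap balancer with a max deviation of 1 PG, as Richard describes.
ceph osd set-require-min-compat-client luminous   # upmap needs Luminous+ clients
ceph balancer mode upmap
ceph config set mgr mgr/balancer/upmap_max_deviation 1
ceph balancer on
```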
[ceph-users] Re: Unbalanced Cluster
Hi Erdem,

The balancer was driving all the weights to 1.0 so I turned it off. The OSDs were creeping up to the 90% full threshold with it turned on. I've been playing whack-a-mole with the OSDs for a week trying to keep the cluster from locking all writes when a single OSD goes over 90%.

I had a look at deleting that pool. I think it was there and still required to keep the filesystem happy, and I'm a bit anxious about deleting it. It's been a long time since the new fsdatak7m2 was created and my memory is getting foggy about how it was done. I think the new pool was created as a tier and then data was migrated to it. I don't really think it's safe to delete the pool as I think it is still in use:

# ceph osd pool stats fsdata
pool fsdata id 4
  227323/3383377685 objects degraded (0.007%)
  94063915/3383377685 objects misplaced (2.780%)
  recovery io 0 B/s, 71 objects/s

The filesystem has 1.4B files.

You mentioned that all solutions would cause data migration and would need to be planned carefully. I've seen that language in the docs and other messages but what I can't find is what is meant by "planned carefully". Doing any of these will cause data migration like crazy but it's not avoidable other than to change the number of max backfills etc., but the systems should still be accessible during this time, with reduced bandwidth and higher latency. Is it just a warning that the system could be degraded for a long period of time, or is it suggesting that users should take an outage while the rebuild happens?

Thanks for your guidance.

-Dave

On 2022-05-05 2:33 a.m., Erdem Agaoglu wrote:
Hi David,

I think you're right with your option 2. 512 pgs is just too few. You're also right with the "inflation" but you should add your erasure bits to the calculation, so 9x512=4608. With 144 OSDs, you would average 32 pgs per OSD. Some old advice for that number was around 100.

But your current PGs per OSD is around 180-190 according to the df output.
This is probably because of your empty pool 4 fsdata, having 4096 pgs with size 5, and adding 5x4096=20480, 20480/144=142 more pgs per OSD.

I'm not really sure how empty/unused PGs would affect OSDs, but I think it will affect the balancer, which tries to balance the number of PGs, which might explain things getting worse. Also your df output shows several modifications in weights/reweights but I'm not sure if they're manual or balancer adjusted.

I would first delete that empty pool to have a clearer picture of PGs on OSDs. Then I would increase the pg_num for pool 6 to 2048. And after everything settles, if it's still too unbalanced I'd go for the upmap balancer. Needless to say, all these would cause major data migration so it should be planned carefully.

Best,

On Thu, May 5, 2022 at 12:02 AM David Schulz <dsch...@ucalgary.ca> wrote:
Hi Josh,

We do have an old pool that is empty so there's 4611 empty PGs but the rest seem fairly close:

# ceph pg ls|awk '{print $7/1024/1024/10}'|cut -d "." -f 1|sed -e 's/$/0/'|sort -n|uniq -c
   4611 00
      1 1170
      8 1180
     10 1190
     28 1200
     51 1210
     54 1220
     52 1230
     32 1240
     13 1250
      7 1260

Hmm, that's interesting, adding up the first column except the 4611 gives 256 but there are 512 PGs in the main data pool.
Here are our pool settings:

pool 3 'fsmeta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 35490 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 4 'fsdata' erasure size 5 min_size 4 crush_rule 1 object_hash rjenkins pg_num 4096 pgp_num 4096 autoscale_mode warn last_change 35490 lfor 0/0/4742 flags hashpspool,ec_overwrites stripe_width 12288 application cephfs
pool 6 'fsdatak7m2' erasure size 9 min_size 8 crush_rule 3 object_hash rjenkins pg_num 512 pgp_num 512 autoscale_mode warn last_change 35490 flags hashpspool,ec_overwrites stripe_width 28672 application cephfs

The fsdata was originally created with very safe erasure coding that wasted too much space, then the fsdatak7m2 was created and everything was migrated to it. This is why there's at least 4096 pgs with 0 bytes.

-Dave

On 2022-05-04 2:08 p.m., Josh Baergen wrote:
> Hi Dave,
>
>> This cluster was upgraded from 13.x to 14.2.9 some time ago. The entire
>> cluster was installed at the 13.x time and was upgraded together so all
>> OSDs should have the same formatting etc.
> OK, thanks, that should rule out a difference in bluestore
> min_alloc_size, for example.
>
>> Below is pasted the ceph osd df tree output.
> It looks like there is some pretty significant skew in terms of the
> amount of bytes per active PG. If you issue "ceph pg ls", are you able
> to find any PGs with a significantly higher byte count?
>
> Josh
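Erdem's per-OSD PG arithmetic above can be reproduced directly from the pool settings (shard counts and PG numbers are from the thread; integer division truncates, as in his figures):

```shell
osds=144
# pool 6 fsdatak7m2: 512 PGs x 9 shards (k=7,m=2) spread over 144 OSDs
echo $(( 512 * 9 / osds ))     # 32 PG shards per OSD
# pool 4 fsdata (empty): 4096 PGs x size 5 spread over 144 OSDs
echo $(( 4096 * 5 / osds ))    # 142 additional PG shards per OSD
```

32 + 142 lands right in the 180-190 range that `ceph osd df` reports once fsmeta's 256 replicated PGs are added.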
[ceph-users] Re: Unbalanced Cluster
Hi Richard,

Thanks for that. It never occurred to me that we'd need at least 10 servers for that shape of EC. We will certainly push to get that new server in now.

-Dave

On 2022-05-04 5:07 p.m., Richard Bade wrote:
> Hi David,
> I think that part of the problem with unbalanced osds is that your EC
> rule k=7,m=2 gives 9 total chunks and you have 9 total servers. This
> is essentially tying Ceph's hands as it has no choice where to put the
> pg's. Assuming a failure domain of host, each EC shard needs to be
> on a different host.
> Therefore adding another host would help, but you're still going to be
> limited. For balance, a rule of k=4,m=2 would work better on 9 servers.
> I had a similar experience with 6 hosts and a 4,2 EC rule.
> This may also cause you other problems if you have a host failure, as
> you wouldn't be able to balance out the host due to 8 hosts being less
> than 9 shards. You would stay degraded. At least that is my
> understanding. I'm happy to be corrected if I've put you wrong.
>
> Rich
[ceph-users] Re: Unbalanced Cluster
Hi David,
I think that part of the problem with unbalanced osds is that your EC rule k=7,m=2 gives 9 total chunks and you have 9 total servers. This is essentially tying Ceph's hands as it has no choice where to put the pg's. Assuming a failure domain of host, each EC shard needs to be on a different host.

Therefore adding another host would help, but you're still going to be limited. For balance, a rule of k=4,m=2 would work better on 9 servers. I had a similar experience with 6 hosts and a 4,2 EC rule.

This may also cause you other problems if you have a host failure, as you wouldn't be able to balance out the host due to 8 hosts being less than 9 shards. You would stay degraded. At least that is my understanding. I'm happy to be corrected if I've put you wrong.

Rich
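Richard's constraint can be stated numerically: with a host failure domain, each of the k+m shards needs its own host, so 9 hosts leave zero placement choice, and losing one host leaves fewer hosts than shards. A sketch with the thread's numbers:

```shell
k=7; m=2; hosts=9
shards=$(( k + m ))
echo $(( hosts - shards ))         # 0: every PG must touch every host
echo $(( (hosts - 1) - shards ))   # -1: after one host fails, a shard has nowhere to go
```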
[ceph-users] Re: Unbalanced Cluster
Hi Josh,

We do have an old pool that is empty so there's 4611 empty PGs but the rest seem fairly close:

# ceph pg ls|awk '{print $7/1024/1024/10}'|cut -d "." -f 1|sed -e 's/$/0/'|sort -n|uniq -c
   4611 00
      1 1170
      8 1180
     10 1190
     28 1200
     51 1210
     54 1220
     52 1230
     32 1240
     13 1250
      7 1260

Hmm, that's interesting, adding up the first column except the 4611 gives 256 but there are 512 PGs in the main data pool.

Here are our pool settings:

pool 3 'fsmeta' replicated size 3 min_size 1 crush_rule 0 object_hash rjenkins pg_num 256 pgp_num 256 autoscale_mode warn last_change 35490 flags hashpspool stripe_width 0 pg_autoscale_bias 4 pg_num_min 16 recovery_priority 5 application cephfs
pool 4 'fsdata' erasure size 5 min_size 4 crush_rule 1 object_hash rjenkins pg_num 4096 pgp_num 4096 autoscale_mode warn last_change 35490 lfor 0/0/4742 flags hashpspool,ec_overwrites stripe_width 12288 application cephfs
pool 6 'fsdatak7m2' erasure size 9 min_size 8 crush_rule 3 object_hash rjenkins pg_num 512 pgp_num 512 autoscale_mode warn last_change 35490 flags hashpspool,ec_overwrites stripe_width 28672 application cephfs

The fsdata was originally created with very safe erasure coding that wasted too much space, then the fsdatak7m2 was created and everything was migrated to it. This is why there's at least 4096 pgs with 0 bytes.

-Dave

On 2022-05-04 2:08 p.m., Josh Baergen wrote:
> Hi Dave,
>
>> This cluster was upgraded from 13.x to 14.2.9 some time ago. The entire
>> cluster was installed at the 13.x time and was upgraded together so all
>> OSDs should have the same formatting etc.
> OK, thanks, that should rule out a difference in bluestore
> min_alloc_size, for example.
>
>> Below is pasted the ceph osd df tree output.
> It looks like there is some pretty significant skew in terms of the
> amount of bytes per active PG. If you issue "ceph pg ls", are you able
> to find any PGs with a significantly higher byte count?
>
> Josh
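To surface the outlier PGs Josh asks about, the same bytes field that Dave's histogram uses (field 7 of `ceph pg ls` on this release) can be sorted directly. A sketch:

```shell
# Top 20 PGs by size; field 7 is raw BYTES in this release's `ceph pg ls`
# output, and `NR>1` drops the header row.
ceph pg ls | awk 'NR>1' | sort -k7 -rn | head -20
```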
[ceph-users] Re: Unbalanced Cluster
Hi Josh,

Thanks for getting back to me so soon!

This cluster was upgraded from 13.x to 14.2.9 some time ago. The entire cluster was installed at the 13.x time and was upgraded together so all OSDs should have the same formatting etc.

Below is pasted the ceph osd df tree output.

Sincerely
-Dave

# ceph osd df tree
ID CLASS WEIGHT REWEIGHT SIZE RAW USE DATA OMAP META AVAIL %USE VAR PGS STATUS TYPE NAME
-1 1773.46399 - 1.8 PiB 1.2 PiB 1.2 PiB 1.1 TiB 11 TiB 570 TiB 68.71 1.00 - root default
-3 202.48462 - 202 TiB 135 TiB 131 TiB 122 GiB 1.2 TiB 68 TiB 66.50 0.97 - host trex-ceph1
 0 hdd 12.65529 1.0 13 TiB 7.7 TiB 7.5 TiB 4.4 GiB 73 GiB 5.0 TiB 60.73 0.88 186 up osd.0
 1 hdd 12.65529 1.0 13 TiB 8.3 TiB 8.1 TiB 7.2 GiB 77 GiB 4.3 TiB 65.64 0.96 185 up osd.1
 2 hdd 12.65529 1.0 13 TiB 6.8 TiB 6.6 TiB 6.1 GiB 67 GiB 5.9 TiB 53.51 0.78 183 up osd.2
 3 hdd 12.65529 1.0 13 TiB 8.6 TiB 8.3 TiB 7.2 GiB 76 GiB 4.1 TiB 67.57 0.98 182 up osd.3
 4 hdd 12.65529 1.0 13 TiB 8.5 TiB 8.3 TiB 8.6 GiB 76 GiB 4.2 TiB 67.19 0.98 178 up osd.4
 5 hdd 12.65529 1.0 13 TiB 10 TiB 10 TiB 13 GiB 86 GiB 2.3 TiB 81.88 1.19 180 up osd.5
 6 hdd 12.65529 1.0 13 TiB 6.8 TiB 6.6 TiB 10 GiB 67 GiB 5.9 TiB 53.47 0.78 183 up osd.6
 7 hdd 12.65529 1.0 13 TiB 10 TiB 9.9 TiB 8.7 GiB 86 GiB 2.6 TiB 79.82 1.16 184 up osd.7
 8 hdd 12.65529 1.0 13 TiB 7.8 TiB 7.6 TiB 7.2 GiB 72 GiB 4.9 TiB 61.60 0.90 181 up osd.8
 9 hdd 12.65529 1.0 13 TiB 7.7 TiB 7.5 TiB 8.4 GiB 73 GiB 4.9 TiB 60.93 0.89 186 up osd.9
10 hdd 12.65529 1.0 13 TiB 8.4 TiB 8.2 TiB 5.8 GiB 75 GiB 4.3 TiB 66.30 0.96 179 up osd.10
11 hdd 12.65529 1.0 13 TiB 7.3 TiB 7.1 TiB 6.0 GiB 69 GiB 5.3 TiB 57.86 0.84 182 up osd.11
12 hdd 12.65529 1.0 13 TiB 9.0 TiB 8.8 TiB 10 GiB 78 GiB 3.6 TiB 71.41 1.04 183 up osd.12
13 hdd 12.65529 1.0 13 TiB 7.8 TiB 7.6 TiB 2.8 GiB 74 GiB 4.9 TiB 61.52 0.90 188 up osd.13
14 hdd 12.65529 1.0 13 TiB 10 TiB 10 TiB 5.9 GiB 87 GiB 2.3 TiB 81.61 1.19 185 up osd.14
15 hdd 12.65529 1.0 13 TiB 9.2 TiB 9.0 TiB 10 GiB 81 GiB 3.4 TiB 72.99 1.06 184 up osd.15
-5 197.82933 - 202 TiB 140 TiB 136 TiB 130 GiB 1.2 TiB 63 TiB 68.97 1.00 - host trex-ceph2
16 hdd 12.65529 1.0 13 TiB 7.6 TiB 7.4 TiB 9.9 GiB 73 GiB 5.0 TiB 60.42 0.88 191 up osd.16
17 hdd 12.65529 1.0 13 TiB 8.6 TiB 8.4 TiB 18 GiB 78 GiB 4.0 TiB 68.29 0.99 191 up osd.17
18 hdd 12.65529 1.0 13 TiB 8.5 TiB 8.3 TiB 8.7 GiB 79 GiB 4.2 TiB 67.01 0.98 194 up osd.18
19 hdd 12.65529 1.0 13 TiB 9.9 TiB 9.6 TiB 9.0 GiB 85 GiB 2.8 TiB 78.02 1.14 188 up osd.19
20 hdd 12.65529 1.0 13 TiB 8.7 TiB 8.5 TiB 10 GiB 79 GiB 3.9 TiB 69.02 1.00 191 up osd.20
21 hdd 12.65529 0.79160 13 TiB 8.4 TiB 8.2 TiB 4.5 GiB 71 GiB 4.2 TiB 66.56 0.97 139 up osd.21
22 hdd 12.65529 0.80670 13 TiB 9.9 TiB 9.7 TiB 2.7 GiB 83 GiB 2.8 TiB 78.26 1.14 155 up osd.22
23 hdd 12.65529 1.0 13 TiB 8.8 TiB 8.6 TiB 12 GiB 80 GiB 3.9 TiB 69.54 1.01 194 up osd.23
24 hdd 12.65529 1.0 13 TiB 8.0 TiB 7.8 TiB 7.3 GiB 76 GiB 4.7 TiB 63.13 0.92 190 up osd.24
25 hdd 8.0 0.42892 13 TiB 11 TiB 10 TiB 6.9 GiB 77 GiB 2.0 TiB 83.82 1.22 86 up osd.25
26 hdd 12.65529 1.0 13 TiB 7.2 TiB 7.0 TiB 10 GiB 70 GiB 5.4 TiB 57.28 0.83 187 up osd.26
27 hdd 12.65529 1.0 13 TiB 10 TiB 9.9 TiB 1.5 GiB 86 GiB 2.5 TiB 79.90 1.16 182 up osd.27
28 hdd 12.65529 1.0 13 TiB 9.2 TiB 9.0 TiB 8.8 GiB 81 GiB 3.4 TiB 72.79 1.06 190 up osd.28
29 hdd 12.65529 1.0 13 TiB 6.8 TiB 6.6 TiB 7.2 GiB 69 GiB 5.9 TiB 53.53 0.78 194 up osd.29
30 hdd 12.65529 1.0 13 TiB 9.0 TiB 8.8 TiB 4.5 GiB 80 GiB 3.7 TiB 71.05 1.03 186 up osd.30
31 hdd 12.65529 1.0 13 TiB 8.2 TiB 8.0 TiB 8.8 GiB 76 GiB 4.4 TiB 64.92 0.94 191 up osd.31
-7 202.48462 - 202 TiB 134 TiB 131 TiB 146 GiB 1.2 TiB 68 TiB 66.33 0.97 - host trex-ceph3
32 hdd 12.65529 1.0 13 TiB 9.1 TiB 8.8 TiB 10 GiB 80 GiB 3.6 TiB 71.63 1.04 182 up osd.32
33 hdd 12.65529 1.0 13 TiB 8.9 TiB 8.7 TiB 4.3 GiB 80 GiB 3.8 TiB 70.15 1.02 180 up osd.33
34 hdd 12.65529 1.0 13 TiB 6.4 TiB 6.2 TiB 13 GiB 67 GiB 6.3 TiB 50.43 0.73 183 up osd.34