Hi
A few years ago we were really strapped for space, so we tweaked pg_num
for some pools to keep all PGs as close to the same size as possible,
while still observing the power-of-2 rule, in order to get the most
mileage space-wise. We set the autoscaler to off for the tweaked pools
to get rid of the warnings.
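For reference, the per-pool tweak back then was something along these
lines (pool name and pg_num are placeholders here, not the exact values
we used):

ceph osd pool set <pool> pg_num 2048
ceph osd pool set <pool> pg_autoscale_mode off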
We now have a lot more free space, so I flipped the autoscaler to warn
for all pools and set the bulk flag on the pools expected to be data
pools, which led to this:
"
[WRN] POOL_TOO_FEW_PGS: 4 pools have too few placement groups
Pool rbd has 512 placement groups, should have 2048
Pool rbd_internal has 1024 placement groups, should have 2048
Pool cephfs.nvme.data has 32 placement groups, should have 4096
Pool cephfs.ssd.data has 32 placement groups, should have 1024
[WRN] POOL_TOO_MANY_PGS: 4 pools have too many placement groups
Pool libvirt has 256 placement groups, should have 32
Pool cephfs.cephfs.data has 512 placement groups, should have 32
Pool rbd_ec_data has 4096 placement groups, should have 1024
Pool cephfs.hdd.data has 2048 placement groups, should have 1024
"
That's a lot of warnings *ponder*
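The change itself was roughly this, per pool:

ceph osd pool set <pool> pg_autoscale_mode warn
ceph osd pool set <pool> bulk true   # only for the expected data pools

Autoscale status after the change: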
"
# ceph osd pool autoscale-status
POOL                SIZE    TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  EFFECTIVE RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE  BULK
libvirt             2567G                3.0   3031T         0.0025                                 1.0   256                 warn       False
.mgr                807.5M               2.0   6520G         0.0002                                 1.0   1                   warn       False
rbd_ec              9168k                3.0   6520G         0.0000                                 1.0   32                  warn       False
nvme                31708G               2.0   209.5T        0.2955                                 1.0   2048                warn       False
.nfs                36864                3.0   6520G         0.0000                                 1.0   32                  warn       False
cephfs.cephfs.meta  24914M               3.0   6520G         0.0112                                 4.0   32                  warn       False
cephfs.cephfs.data  16384                3.0   6520G         0.0000                                 1.0   512                 warn       False
rbd.ssd.data        798.1G               2.25  6520G         0.2754                                 1.0   64                  warn       False
rbd_ec_data         609.2T               1.5   3031T         0.3014                                 1.0   4096                warn       True
rbd                 68170G               3.0   3031T         0.0659                                 1.0   512                 warn       True
rbd_internal        69553G               3.0   3031T         0.0672                                 1.0   1024                warn       True
cephfs.nvme.data    0                    2.0   209.5T        0.0000                                 1.0   32                  warn       True
cephfs.ssd.data     68609M               2.0   6520G         0.0206                                 1.0   32                  warn       True
cephfs.hdd.data     111.0T               2.25  3031T         0.0824                                 1.0   2048                warn       True
"
"
# ceph df
--- RAW STORAGE ---
CLASS SIZE AVAIL USED RAW USED %RAW USED
hdd 3.0 PiB 1.3 PiB 1.6 PiB 1.6 PiB 54.69
nvme 210 TiB 146 TiB 63 TiB 63 TiB 30.21
ssd 6.4 TiB 4.0 TiB 2.4 TiB 2.4 TiB 37.69
TOTAL 3.2 PiB 1.5 PiB 1.7 PiB 1.7 PiB 53.07
--- POOLS ---
POOL ID PGS STORED OBJECTS USED %USED MAX AVAIL
rbd 4 512 80 TiB 21.35M 200 TiB 19.31 278 TiB
libvirt 5 256 3.0 TiB 810.89k 7.5 TiB 0.89 278 TiB
rbd_internal 6 1024 86 TiB 28.22M 204 TiB 19.62 278 TiB
.mgr 8 1 4.3 GiB 1.06k 1.6 GiB 0.07 1.0 TiB
rbd_ec 10 32 55 MiB 25 27 MiB 0 708 GiB
rbd_ec_data 11 4096 683 TiB 180.52M 914 TiB 52.26 556 TiB
nvme 23 2048 46 TiB 25.18M 62 TiB 31.62 67 TiB
.nfs 25 32 4.6 KiB 10 108 KiB 0 708 GiB
cephfs.cephfs.meta 31 32 25 GiB 1.66M 73 GiB 3.32 708 GiB
cephfs.cephfs.data 32 679 489 B 40.41M 48 KiB 0 708 GiB
cephfs.nvme.data 34 32 0 B 0 0 B 0 67 TiB
cephfs.ssd.data 35 32 77 GiB 425.03k 134 GiB 5.94 1.0 TiB
cephfs.hdd.data 37 2048 121 TiB 68.42M 250 TiB 23.03 371 TiB
rbd.ssd.data 38 64 934 GiB 239.94k 1.8 TiB 45.82 944 GiB
"
The weirdest ones:
Pool rbd_ec_data stores 683 TiB in 4096 PGs -> warning says it should
have 1024
Pool rbd_internal stores 86 TiB in 1024 PGs -> warning says it should
have 2048
That makes no sense to me based on the amount of data stored. Is this a
bug, or what am I missing? Ceph version is 17.2.7.
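To put a number on it: both pools sit on the same 3031T (hdd) root with
bias 1.0, and going by the RATIO column above rbd_ec_data takes up about
4.5 times the capacity share that rbd_internal does, yet the autoscaler
wants to shrink the former and grow the latter. Quick arithmetic on
those two values:

# just dividing the RATIO values from autoscale-status above
awk 'BEGIN {
    ec  = 0.3014   # rbd_ec_data share of the hdd raw capacity
    rep = 0.0672   # rbd_internal share of the hdd raw capacity
    printf "rbd_ec_data uses %.1fx the capacity share of rbd_internal\n", ec / rep
}'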
Best regards,
Torkil
--
Torkil Svensgaard
Systems Administrator
Danish Research Centre for Magnetic Resonance DRCMR, Section 714
Copenhagen University Hospital Amager and Hvidovre
Kettegaard Allé 30, 2650 Hvidovre, Denmark