[ceph-users] Re: pg_num != pgp_num - and unable to change.

2023-07-06 Thread Anthony D'Atri
Indeed.  For clarity, this process is not the same as the pg_autoscaler.  It's 
really easy to conflate the two, along with the balancer module, so I like to 
call that out to reduce confusion.
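
If you want to check what the other two are up to on your own cluster, the
stock CLI covers it:

  ceph osd pool autoscale-status   # what the pg_autoscaler wants each pool to have
  ceph balancer status             # whether the balancer module is on, and in which mode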

> On Jul 6, 2023, at 18:01, Dan van der Ster  wrote:
> 
> Since nautilus, pgp_num (and pg_num) will be increased by the mgr
> automatically to reach your pg_num_target over time. (If you're a source
> code reader check DaemonServer::adjust_pgs for how this works).



[ceph-users] Re: pg_num != pgp_num - and unable to change.

2023-07-06 Thread Dan van der Ster
Hi Jesper,

> In earlier versions of ceph (without the autoscaler) I have only experienced
> that setting pg_num and pgp_num took immediate effect?

That's correct -- in recent Ceph (since nautilus) you cannot manipulate
pgp_num directly anymore. There is a backdoor setting (set pgp_num_actual
...) but I don't really recommend that.
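
For the record, the backdoor is something along the lines of

  ceph osd pool set <pool> pgp_num_actual <n>

but again, it's better to let the mgr ramp it for you.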

Since nautilus, pgp_num (and pg_num) will be increased by the mgr
automatically to reach your pg_num_target over time. (If you're a source
code reader check DaemonServer::adjust_pgs for how this works).
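
You can watch it converge with the same command you already used, e.g.

  ceph osd pool ls detail | grep <pool>   # pg_num/pgp_num creep toward pg_num_target/pgp_num_target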

In short, the mgr is throttled by the target_max_misplaced_ratio, which
defaults to 5%.

So if you want to split more aggressively,
increase target_max_misplaced_ratio.
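
For example (0.10 is just an illustrative value, pick what your cluster can
tolerate):

  ceph config get mgr target_max_misplaced_ratio        # defaults to 0.05
  ceph config set mgr target_max_misplaced_ratio 0.10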

Cheers, Dan

__
Clyso GmbH | Ceph Support and Consulting | https://www.clyso.com



On Wed, Jul 5, 2023 at 9:41 PM Jesper Krogh  wrote:

> Hi.
>
> Fresh cluster - after a dance where the autoscaler did not work
> (returned blank) as described in the doc - I now seemingly have it
> working. It has bumped the target to something reasonable -- and is slowly
> incrementing pg_num and pgp_num by 2 over time (hope this is correct?)
>
> But ...
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool ls detail | grep 62
> pool 22 'cephfs.archive.ec62data' erasure profile ecprof62 size 8
> min_size 7 crush_rule 3 object_hash rjenkins pg_num 150 pgp_num 22
> pg_num_target 512 pgp_num_target 512 autoscale_mode on last_change 9159
> lfor 0/0/9147 flags hashpspool,ec_overwrites,selfmanaged_snaps,bulk
> stripe_width 24576 pg_num_min 128 target_size_ratio 0.4 application
> cephfs
>
> pg_num = 150
> pgp_num = 22
>
> and setting pgp_num seemingly has zero effect on the system ... not even
> with autoscaling set to off.
>
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool set cephfs.archive.ec62data
> pg_autoscale_mode off
> set pool 22 pg_autoscale_mode to off
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool set cephfs.archive.ec62data
> pgp_num 150
> set pool 22 pgp_num to 150
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool set cephfs.archive.ec62data
> pg_num_min 128
> set pool 22 pg_num_min to 128
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool set cephfs.archive.ec62data
> pg_num 150
> set pool 22 pg_num to 150
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool set cephfs.archive.ec62data
> pg_autoscale_mode on
> set pool 22 pg_autoscale_mode to on
> jskr@dkcphhpcmgt028:/$ sudo ceph progress
> PG autoscaler increasing pool 22 PGs from 150 to 512 (14s)
>  []
> jskr@dkcphhpcmgt028:/$ sudo ceph osd pool ls detail | grep 62
> pool 22 'cephfs.archive.ec62data' erasure profile ecprof62 size 8
> min_size 7 crush_rule 3 object_hash rjenkins pg_num 150 pgp_num 22
> pg_num_target 512 pgp_num_target 512 autoscale_mode on last_change 9159
> lfor 0/0/9147 flags hashpspool,ec_overwrites,selfmanaged_snaps,bulk
> stripe_width 24576 pg_num_min 128 target_size_ratio 0.4 application
> cephfs
>
> pgp_num != pg_num ?
>
> In earlier versions of ceph (without the autoscaler) I have only experienced
> that setting pg_num and pgp_num took immediate effect?
>
> Jesper
>
> jskr@dkcphhpcmgt028:/$ sudo ceph version
> ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy
> (stable)
> jskr@dkcphhpcmgt028:/$ sudo ceph health
> HEALTH_OK
> jskr@dkcphhpcmgt028:/$ sudo ceph status
> cluster:
>   id:     5c384430-da91-11ed-af9c-c780a5227aff
>   health: HEALTH_OK
>
> services:
>   mon: 3 daemons, quorum dkcphhpcmgt031,dkcphhpcmgt029,dkcphhpcmgt028 (age 15h)
>   mgr: dkcphhpcmgt031.afbgjx(active, since 32h), standbys:
>        dkcphhpcmgt029.bnsegi, dkcphhpcmgt028.bxxkqd
>   mds: 2/2 daemons up, 1 standby
>   osd: 40 osds: 40 up (since 44h), 40 in (since 39h); 33 remapped pgs
>
> data:
>   volumes: 2/2 healthy
>   pools:   9 pools, 495 pgs
>   objects: 24.85M objects, 60 TiB
>   usage:   117 TiB used, 158 TiB / 276 TiB avail
>   pgs:     13494029/145763897 objects misplaced (9.257%)
>            462 active+clean
>            23  active+remapped+backfilling
>            10  active+remapped+backfill_wait
>
> io:
>   client:   0 B/s rd, 1.1 MiB/s wr, 0 op/s rd, 94 op/s wr
>   recovery: 705 MiB/s, 208 objects/s
>
> progress:
>
> --
> Jesper Krogh
>
___
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io