Last summer we increased an EC 8+3 pool from 1024 to 2048 PGs on our ~1500 OSD 
(Kraken) cluster. This pool contained ~2 petabytes of data at the time.

We did a fair amount of testing on a throwaway pool on the same cluster 
beforehand, starting with small increases (16/32/64).

The main observation was that the act of splitting the PGs causes issues, not 
the resulting data movement, assuming your backfills are tuned to a level where 
they don’t affect client IO.
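For what it’s worth, the usual knobs for throttling backfill/recovery can be 
injected at runtime; the values below are purely illustrative, not a 
recommendation for any particular cluster:

    ceph tell osd.* injectargs '--osd_max_backfills 1 --osd_recovery_max_active 1'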

As pg_num and pgp_num increases (PG splitting and peering) are a) not 
reversible and b) take effect immediately, an overly large increase can leave 
you with an unhappy mess of excessive storage node load, flapping OSDs and 
blocked requests.

We ended up doing increases of 128 PGs at a time.
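In practice each step looked roughly like this (pool name and target numbers 
are just placeholders), waiting for things to settle before moving on:

    ceph osd pool set <pool> pg_num 1152
    # wait until splitting/peering has settled (ceph -s is clean)
    ceph osd pool set <pool> pgp_num 1152
    # wait for backfill to complete, then repeat with the next +128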

I’d hazard a guess that you will be fine going straight to 512 PGs, but the 
only way to be sure of the correct increase size for your cluster is to test it.

Cheers
Tom

From: ceph-users [mailto:ceph-users-boun...@lists.ceph.com] On Behalf Of Karun 
Josy
Sent: 02 January 2018 16:23
To: Hans van den Bogert <hansbog...@gmail.com>
Cc: ceph-users <ceph-users@lists.ceph.com>
Subject: Re: [ceph-users] Increasing PG number

https://access.redhat.com/solutions/2457321

It says it is a very intensive process and can affect cluster performance.

Our version is Luminous 12.2.2.
We are using an erasure coding profile (k=5, m=3) for a pool 'ecpool'.
The current PG number is 256 and the pool holds about 20 TB of data.
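For reference, the current values can be confirmed with:

    ceph osd pool get ecpool pg_num
    ceph osd pool get ecpool pgp_num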

Should I increase it gradually, or set pg_num to 512 in one step?




Karun Josy

On Tue, Jan 2, 2018 at 9:26 PM, Hans van den Bogert 
<hansbog...@gmail.com> wrote:
Please refer to standard documentation as much as possible,

    
http://docs.ceph.com/docs/jewel/rados/operations/placement-groups/#set-the-number-of-placement-groups

Han’s post is also incomplete, since you need to change ‘pgp_num’ as well.
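That is, something like the following (the pool name is just a placeholder):

    ceph osd pool set <pool> pg_num 512
    ceph osd pool set <pool> pgp_num 512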

Regards,

Hans

On Jan 2, 2018, at 4:41 PM, Vladimir Prokofev 
<v...@prokofev.me> wrote:

I increased the number of PGs in multiple pools in a production cluster on 
12.2.2 recently - zero issues.
Ceph claims that increasing pg_num and pgp_num is a safe operation, essential 
for its ability to scale, and this sounds pretty reasonable to me. [1]


[1] 
https://www.sebastien-han.fr/blog/2013/03/12/ceph-change-pg-number-on-the-fly/

2018-01-02 18:21 GMT+03:00 Karun Josy 
<karunjo...@gmail.com>:
Hi,

The initial PG count was not properly planned while setting up the cluster, so 
now there are fewer than 50 PGs per OSD.
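For reference, the rule of thumb from the placement groups documentation is 
roughly:

    total PGs ~= (number of OSDs * 100) / pool size, rounded up to a power of two
    e.g. (hypothetical) 12 OSDs with a size-3 replicated pool: 12 * 100 / 3 = 400 -> 512 PGs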

What are the best practices to increase the PG number of a pool?
We have replicated pools as well as EC pools.

Or is it better to create a new pool with a higher PG number?


Karun




_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
