Re: [ceph-users] How does Ceph calculate PGs per OSD for erasure-coded (EC) pools?
On Sun, 28 Apr 2019 at 21:45, Igor Podlesny wrote:
> On Sun, 28 Apr 2019 at 16:14, Paul Emmerich wrote:
> > Use k+m for PG calculation, that value also shows up as "erasure size"
> > in ceph osd pool ls detail
>
> So does it mean that for PG calculation those 2 pools are equivalent:
>
> 1) EC(4, 2)
> 2) replicated, size 6

Correct.

> ? Sounds weird, to be honest. Replicated with size 6 means each logical
> data unit is stored 6 times, so what needed a single PG now takes 6 PG
> copies. With EC(4, 2) there's only 1.5x overhead in terms of raw
> occupied space -- so how come the PG calculation needs to use 6 instead
> of 1.5?

A single logical data unit (an object, in Ceph terms) is allocated to a
single PG. For a replicated pool of size n, this PG is simply stored on n
OSDs. For an EC(k+m) pool, this PG is stored on k+m OSDs, with the
difference that the single PG contains different parts of the data on the
different OSDs.
http://docs.ceph.com/docs/master/architecture/#erasure-coding provides a
good overview of how this is actually achieved.

> Also, why does the Ceph documentation say "It is equivalent to a
> replicated pool of size __two__" when describing the EC(2, 1) example?

This relates to fault tolerance: a replicated pool of size 2 can lose one
OSD without data loss, and so can an EC(2+1) pool.
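To make the distinction concrete, here is a rough Python sketch (illustrative
only, not Ceph code; the pool parameters are the ones discussed in this
thread). The PG-per-OSD accounting cares about how many OSDs each PG lands
on, while the space overhead is a separate ratio:

    # Illustrative sketch only, not Ceph code.
    def pg_fanout_replicated(size):
        return size            # each PG is stored on `size` OSDs

    def pg_fanout_ec(k, m):
        return k + m           # each PG is stored on k+m OSDs (data + coding chunks)

    def overhead_replicated(size):
        return float(size)     # raw bytes written per logical byte

    def overhead_ec(k, m):
        return (k + m) / k     # raw bytes written per logical byte

    print(pg_fanout_replicated(6), overhead_replicated(6))  # 6 OSDs per PG, 6.0x space
    print(pg_fanout_ec(4, 2), overhead_ec(4, 2))            # 6 OSDs per PG, 1.5x space

Both pools place each PG on 6 OSDs, which is why they are equivalent for
PG-per-OSD purposes even though their raw-space overheads differ (6x vs 1.5x).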
Re: [ceph-users] How does Ceph calculate PGs per OSD for erasure-coded (EC) pools?
On Sun, 28 Apr 2019 at 16:14, Paul Emmerich wrote:
> Use k+m for PG calculation, that value also shows up as "erasure size"
> in ceph osd pool ls detail

So does it mean that for PG calculation those 2 pools are equivalent:

1) EC(4, 2)
2) replicated, size 6

? Sounds weird, to be honest. Replicated with size 6 means each logical
data unit is stored 6 times, so what needed a single PG now takes 6 PG
copies. With EC(4, 2) there's only 1.5x overhead in terms of raw
occupied space -- so how come the PG calculation needs to use 6 instead
of 1.5?

Also, why does the Ceph documentation say "It is equivalent to a
replicated pool of size __two__" when describing the EC(2, 1) example?

--
End of message. Next message?
Re: [ceph-users] How does Ceph calculate PGs per OSD for erasure-coded (EC) pools?
Use k+m for PG calculation, that value also shows up as "erasure size"
in ceph osd pool ls detail.

The important thing here is how many OSDs the PG shows up on, and an EC
PG shows up on all k+m OSDs.

Paul

--
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Sun, Apr 28, 2019 at 9:41 AM Igor Podlesny wrote:
>
> For replicated pools (without rounding to the nearest power of two) the
> overall number of PGs is calculated as:
>
>     Pools_PGs = 100 * (OSDs / Pool_Size),
>
> where
>     100       -- target number of PGs per single OSD for that pool,
>     Pool_Size -- a factor showing how much raw storage is in fact used
>                  to store one logical data unit.
>
> By analogy, I suppose that for EC pools the corresponding Pool_Size can
> be calculated as:
>
>     Raw_Storage_Use / Logical_Storage_Use
>
> or, using EC semantics, (k + m) / k. For EC (k=2, m=1) that gives:
>
>     Raw_Storage_Use     = 3
>     Logical_Storage_Use = 2
>
> -- hence, Pool_Size should be 1.5.
>
> OTOH, the Ceph documentation says the following about the same EC pool
> (underline is mine):
>
>     "It is equivalent to a replicated pool of size __two__ but
>     requires 1.5TB instead of 2TB to store 1TB of data"
>
> So how does Ceph calculate the PG distribution per OSD for it?
> Using (k + m) / k? Or just k? Or differently at all?
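As a back-of-the-envelope illustration of "use k+m", a Python sketch with a
hypothetical 12-OSD cluster and a target of 100 PGs per OSD (example values
only, not Ceph code):

    # Hypothetical example values; illustrative sketch only, not Ceph code.
    def suggested_pg_count(num_osds, target_pgs_per_osd, pg_fanout):
        # pg_fanout = replica size for replicated pools, k+m for EC pools
        return num_osds * target_pgs_per_osd / pg_fanout

    num_osds = 12
    target_pgs_per_osd = 100

    print(suggested_pg_count(num_osds, target_pgs_per_osd, 6))      # replicated size 6 -> 200.0
    print(suggested_pg_count(num_osds, target_pgs_per_osd, 4 + 2))  # EC(4, 2)          -> 200.0

The two results are the same because what matters for PG distribution is how
many OSDs each PG materializes on, not how much raw space it consumes.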
[ceph-users] How does Ceph calculate PGs per OSD for erasure-coded (EC) pools?
For replicated pools (without rounding to the nearest power of two) the
overall number of PGs is calculated as:

    Pools_PGs = 100 * (OSDs / Pool_Size),

where
    100       -- target number of PGs per single OSD for that pool,
    Pool_Size -- a factor showing how much raw storage is in fact used
                 to store one logical data unit.

By analogy, I suppose that for EC pools the corresponding Pool_Size can
be calculated as:

    Raw_Storage_Use / Logical_Storage_Use

or, using EC semantics, (k + m) / k. For EC (k=2, m=1) that gives:

    Raw_Storage_Use     = 3
    Logical_Storage_Use = 2

-- hence, Pool_Size should be 1.5.

OTOH, the Ceph documentation says the following about the same EC pool
(underline is mine):

    "It is equivalent to a replicated pool of size __two__ but
    requires 1.5TB instead of 2TB to store 1TB of data"

So how does Ceph calculate the PG distribution per OSD for it?
Using (k + m) / k? Or just k? Or differently at all?

--
End of message. Next message?
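For concreteness, the two candidate interpretations of Pool_Size for EC(2, 1)
worked through in a small Python sketch (the 12-OSD cluster size is made up
for illustration; the replies above settle on the second interpretation, k+m):

    # Hypothetical 12-OSD cluster, 100 target PGs per OSD; sketch only.
    osds = 12
    target_pgs_per_osd = 100

    pool_size_overhead = (2 + 1) / 2   # (k+m)/k = 1.5, the raw-space ratio
    pool_size_fanout   = 2 + 1         # k+m = 3, OSDs each PG is stored on

    print(target_pgs_per_osd * osds / pool_size_overhead)  # 800.0 PGs
    print(target_pgs_per_osd * osds / pool_size_fanout)    # 400.0 PGs

Only the second figure keeps the per-OSD PG count at the target, because each
of those PGs occupies 3 OSDs, not 1.5.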