Re: [ceph-users] How does CEPH calculate PGs per OSD for erasure coded (EC) pools?

2019-04-29 Thread Christian Wuerdig
On Sun, 28 Apr 2019 at 21:45, Igor Podlesny  wrote:

> On Sun, 28 Apr 2019 at 16:14, Paul Emmerich 
> wrote:
> > Use k+m for the PG calculation; that value also shows up as "erasure size"
> > in ceph osd pool ls detail.
>
> So does that mean that, for PG calculation, these two pools are equivalent:
>
> 1) EC(4, 2)
> 2) replicated, size 6
>

Correct


>
> ? That sounds weird, to be honest. Replicated with size 6 means each
> logical data unit is stored 6 times, so what needed a single PG's worth
> of raw space now requires 6 times as much. With EC(4, 2) the raw-space
> overhead is only 1.5x -- so why does the PG calculation need to divide
> by 6 instead of 1.5?
>

A single logical data unit (an object in Ceph terms) is allocated to a
single PG. For a replicated pool of size n this PG is simply stored in full
on n OSDs. For an EC(k+m) pool this PG is also stored on k+m OSDs, the
difference being that each of those OSDs holds a different part (chunk) of
the PG's data.
http://docs.ceph.com/docs/master/architecture/#erasure-coding provides a
good overview of how this is actually achieved.
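To make that concrete, here is a minimal Python sketch (my own illustration,
not Ceph code) of what a single PG costs in the two cases -- the same number
of OSD "slots", but very different raw-space overhead:

# Illustration only -- not Ceph code. It compares how many OSDs a single
# PG occupies and how much raw space one unit of logical data consumes.

def replicated_pg(size, logical_bytes):
    # Every OSD in the PG's acting set stores a full copy of the data.
    osds_used = size
    raw_bytes = size * logical_bytes
    return osds_used, raw_bytes

def ec_pg(k, m, logical_bytes):
    # The data is split into k data chunks plus m coding chunks,
    # and each chunk lives on a different OSD.
    osds_used = k + m
    raw_bytes = (k + m) / k * logical_bytes
    return osds_used, raw_bytes

print(replicated_pg(6, 1.0))  # (6, 6.0) -> 6 OSDs, 6x raw space
print(ec_pg(4, 2, 1.0))       # (6, 1.5) -> 6 OSDs, but only 1.5x raw space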


> Also, why does the CEPH documentation say "It is equivalent to a
> replicated pool of size __two__" when describing the EC(2, 1) example?
>

This relates to fault tolerance. A replicated pool of size 2 can lose one
OSD without data loss, and so can an EC(2+1) pool.
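Put another way (my own summary, not a quote from the docs): a replicated
pool of size n can lose n - 1 OSDs before data is gone, while an EC(k+m)
pool can lose m, because any k of the k+m chunks are enough to reconstruct
an object. A tiny sketch:

# Illustration only: how many OSDs each pool type can lose without data loss.
def tolerated_failures_replicated(size):
    return size - 1        # at least one full copy must survive

def tolerated_failures_ec(k, m):
    return m               # any k of the k+m chunks can rebuild the data

print(tolerated_failures_replicated(2))  # 1
print(tolerated_failures_ec(2, 1))       # 1 -> equivalent fault tolerance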




Re: [ceph-users] How does CEPH calculate PGs per OSD for erasure coded (EC) pools?

2019-04-28 Thread Igor Podlesny
On Sun, 28 Apr 2019 at 16:14, Paul Emmerich  wrote:
> Use k+m for the PG calculation; that value also shows up as "erasure size"
> in ceph osd pool ls detail.

So does that mean that, for PG calculation, these two pools are equivalent:

1) EC(4, 2)
2) replicated, size 6

? That sounds weird, to be honest. Replicated with size 6 means each
logical data unit is stored 6 times, so what needed a single PG's worth
of raw space now requires 6 times as much. With EC(4, 2) the raw-space
overhead is only 1.5x -- so why does the PG calculation need to divide
by 6 instead of 1.5?

Also, why does the CEPH documentation say "It is equivalent to a
replicated pool of size __two__" when describing the EC(2, 1) example?

-- 
End of message. Next message?


Re: [ceph-users] How does CEPH calculate PGs per OSD for erasure coded (EC) pools?

2019-04-28 Thread Paul Emmerich
Use k+m for the PG calculation; that value also shows up as "erasure size"
in ceph osd pool ls detail.

The important thing here is how many OSDs the PG shows up on, and an
EC PG shows up on all k+m OSDs.
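
As a concrete illustration (hypothetical numbers: 60 OSDs and a target of
100 PGs per OSD -- neither comes from this thread), a replicated pool of
size 6 and an EC(4+2) pool would be sized identically:

# Hypothetical cluster: 60 OSDs, target of 100 PGs per OSD.
osds, target_pgs_per_osd = 60, 100

def suggested_pgs(osds, target, pool_size):
    # Same formula as for replicated pools; for EC use pool_size = k + m.
    return osds * target / pool_size

print(suggested_pgs(osds, target_pgs_per_osd, 6))      # replicated, size 6 -> 1000.0
print(suggested_pgs(osds, target_pgs_per_osd, 4 + 2))  # EC(4+2)            -> 1000.0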

Paul

-- 
Paul Emmerich

Looking for help with your Ceph cluster? Contact us at https://croit.io

croit GmbH
Freseniusstr. 31h
81247 München
www.croit.io
Tel: +49 89 1896585 90

On Sun, Apr 28, 2019 at 9:41 AM Igor Podlesny  wrote:
>
> For replicated pools (w/o rounding to the nearest power of two) the overall
> number of PGs is calculated as:
>
> Pools_PGs = 100 * (OSDs / Pool_Size),
>
> where
> 100 -- the target number of PGs per single OSD for that pool,
> Pool_Size -- a factor showing how much raw storage is in fact used
> to store one logical data unit.
>
> By analogy I would suppose that for EC pools the corresponding Pool_Size
> can be calculated as:
>
> Raw_Storage_Use / Logical_Storage_Use
>
> or, using EC semantics, (k + m) / k. For EC (k=2, m=1) this gives:
>
> Raw_Storage_Use = 3
> Logical_Storage_Use = 2
>
> -- Hence, Pool_Size should be 1.5.
>
> OTOH, the CEPH documentation says this about the same EC pool (underline is mine):
>
> "It is equivalent to a replicated pool of size __two__ but
> requires 1.5TB instead of 2TB to store 1TB of data"
>
> So how does CEPH calculate the PG distribution per OSD for it?
> Using (k + m) / k? Or just k? Or something else entirely?


[ceph-users] How does CEPH calculate PGs per OSD for erasure coded (EC) pools?

2019-04-28 Thread Igor Podlesny
For replicated pools (w/o rounding to the nearest power of two) the overall
number of PGs is calculated as:

Pools_PGs = 100 * (OSDs / Pool_Size),

where
100 -- the target number of PGs per single OSD for that pool,
Pool_Size -- a factor showing how much raw storage is in fact used
to store one logical data unit.

By analogy I would suppose that for EC pools the corresponding Pool_Size
can be calculated as:

Raw_Storage_Use / Logical_Storage_Use

or, using EC semantics, (k + m) / k. For EC (k=2, m=1) this gives:

Raw_Storage_Use = 3
Logical_Storage_Use = 2

-- Hence, Pool_Size should be 1.5.

OTOH, the CEPH documentation says this about the same EC pool (underline is mine):

"It is equivalent to a replicated pool of size __two__ but
requires 1.5TB instead of 2TB to store 1TB of data"

So how does CEPH calculate the PG distribution per OSD for it?
Using (k + m) / k? Or just k? Or something else entirely?
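
For reference, a small Python sketch of how much the candidate
interpretations differ (hypothetical numbers: 60 OSDs, target of 100 PGs
per OSD):

# Hypothetical cluster: 60 OSDs, target of 100 PGs per OSD, EC(k=2, m=1).
k, m = 2, 1
osds, target = 60, 100

def pool_pgs(pool_size):
    return target * osds / pool_size

print(pool_pgs((k + m) / k))  # Pool_Size = 1.5 -> 4000.0 PGs
print(pool_pgs(k))            # Pool_Size = 2   -> 3000.0 PGs
print(pool_pgs(k + m))        # Pool_Size = 3   -> 2000.0 PGs (the k + m reading)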

-- 
End of message. Next message?