Oh dear. Every occurrence of stripe_* is wrong :)

It should be stripe_count (option --stripe-count in rbd create) everywhere in 
my text.

What choices are legal depends on the restrictions on stripe_count*stripe_unit 
(=stripe_size=stripe_width?) imposed by ceph. I believe all of this ends up 
being powers of 2.

Yes, the 6+2 result is a bit surprising, and I have no explanation for the 
observation. It seems a good argument for "do not trust what you believe, 
gather facts", and for trying things that seem non-obvious, just to be sure.

Best regards,

=================
Frank Schilder
AIT Risø Campus
Bygning 109, rum S14

________________________________________
From: ceph-users <ceph-users-boun...@lists.ceph.com> on behalf of Lars 
Marowsky-Bree <l...@suse.com>
Sent: 11 July 2019 12:17:37
To: ceph-users@lists.ceph.com
Subject: Re: [ceph-users] What's the best practice for Erasure Coding

On 2019-07-11T09:46:47, Frank Schilder <fr...@dtu.dk> wrote:

> Striping with stripe units other than 1 is something I also tested. I found 
> that with EC pools non-trivial striping should be avoided. Firstly, EC is 
> already a striped format and, secondly, striping on top of that with 
> stripe_unit>1 will make every write an ec_overwrite, because now shards are 
> rarely if ever written as a whole.

That's why I said that rbd's stripe_unit should match the EC pool's
stripe_width, or be a 2^n multiple of it. (Not sure what stripe_count
should be set to, probably also a small power of two.)

> The native striping in EC pools comes from k, data is striped over k disks. 
> The higher k the more throughput at the expense of cpu and network.

Increasing k also increases stripe_width though; this leads to more IO
suffering from the ec_overwrite penalty.
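
A minimal sketch of this effect, under loudly stated assumptions: stripe_width is taken to be k times a per-shard chunk size, the 4 KiB chunk size is only an illustrative value, and the helper name split_write is hypothetical (it is not part of any Ceph API). It splits a write into the bytes that cover whole stripes and the bytes that touch a stripe only partially, the latter being the part that would go through the overwrite path described above.

```python
def split_write(offset, length, stripe_width):
    """Return (partial_bytes, full_stripe_bytes) for one write.

    Hypothetical helper: bytes covering whole stripes are counted as
    full-stripe bytes, everything else as partial-stripe bytes.
    """
    end = offset + length
    # First stripe boundary at or after the start of the write.
    first_full = -(-offset // stripe_width) * stripe_width
    # Last stripe boundary at or before the end of the write.
    last_full = (end // stripe_width) * stripe_width
    full = max(0, last_full - first_full)
    return length - full, full


CHUNK = 4096  # assumed per-shard chunk size, for illustration only

if __name__ == "__main__":
    # A 64 KiB write at offset 0 against increasing k.
    for k in (4, 6, 8, 16, 32):
        partial, full = split_write(0, 64 * 1024, k * CHUNK)
        print(f"k={k:2d} stripe_width={k * CHUNK:6d} "
              f"full={full:6d} partial={partial:6d}")
```

With these assumed numbers, a fixed-size write stops covering any whole stripe once stripe_width exceeds the write size, and a k that is not a power of two leaves a partial tail even for an aligned write.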

> In my long list, this should actually be point
>
> 6) Use stripe_unit=1 (default).

You mean stripe-count?

> To get back to your question, this is another argument for k=power-of-two. 
> Object sizes in ceph are always powers of 2 and stripe sizes contain k as a 
> factor. Hence, any prime factor other than 2 in k will imply a mismatch. How 
> badly a mismatch affects performance should be tested.

Yes, of course. Depending on the IO pattern, this means more IO will be
misaligned or have non-stripe_width portions. (Most IO patterns, if they
strive for alignment, aim for a power of two alignment, obviously.)
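
The arithmetic behind the mismatch can be sketched as follows. This is an illustration only: it assumes stripe_width = k * chunk size, uses a 4 KiB chunk and a 4 MiB object purely as example values, and the helper name leftover_bytes is hypothetical.

```python
def leftover_bytes(object_size, k, chunk_size):
    """Bytes of an object that do not fill a whole stripe of width k * chunk_size."""
    return object_size % (k * chunk_size)


OBJECT_SIZE = 4 * 1024 * 1024  # example power-of-two object size (4 MiB)
CHUNK = 4096                   # assumed per-shard chunk size

if __name__ == "__main__":
    for k in (4, 6, 8):
        rest = leftover_bytes(OBJECT_SIZE, k, CHUNK)
        print(f"k={k} stripe_width={k * CHUNK} leftover={rest}")
```

For k=4 and k=8 the object divides evenly into stripes; for k=6 the factor 3 in k leaves a remainder, so the last stripe of every object is only partially filled. How much that costs in practice still has to be measured, as noted above.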

> Results with non-trivial striping (stripe_size>1) were so poor, I did not 
> even include them in my report.

stripe_size?

> We use the 8+2 pool for ceph fs, where throughput is important. The 6+2 pool 
> is used for VMs (RBD images), where IOP/s are more important. It also offers 
> a higher redundancy level. It's an acceptable compromise for us.

Especially with RBDs, I'm surprised that k=6 works well for you. Block
device IO is most commonly aligned on power-of-two boundaries.


Regards,
    Lars

--
SUSE Linux GmbH, GF: Felix Imendörffer, Mary Higgins, Sri Rasiah, HRB 21284 (AG 
Nürnberg)
"Architects should open possibilities and not determine everything." (Ueli 
Zbinden)
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com