On Wed, Mar 22, 2017 at 6:05 AM Peter Maloney <
peter.malo...@brockmann-consult.de> wrote:

> Does iostat (e.g. iostat -xmy 1 /dev/sd[a-z]) show high util% or await
> during these problems?
>

It does, from watching atop.
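
For reference, this is roughly the kind of thing one can run next to atop to
watch it (the device list and interval are just examples, not necessarily what
we used):

    # extended stats in MB, one-second samples, skipping the since-boot
    # summary (-y); the interesting columns are await/w_await (ms) and %util
    iostat -xmy 1 /dev/sd[a-z]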


>
> Ceph filestore requires a lot of metadata writing (directory splitting, for
> example), xattrs, leveldb, etc., which are small sync writes that HDDs are
> bad at (100-300 IOPS) and SSDs are good at (a cheap one does around 6k IOPS,
> and a not-so-crazy DC/NVMe drive 20-200k IOPS or more). So in theory these
> things are mitigated by using an SSD, for example bcache in front of your
> OSD device. You could try something like that, at least to test.
>

That explains our previous performance gains with Areca HBAs in NVRAM /
supercap-backed write cache mode.  We moved to an SSD journal design to be
more resilient under sustained write workloads, but that added latency on
small/random write IO.
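
For reference, if we do end up testing the bcache route, my understanding is
that the setup would look roughly like this (the device names /dev/nvme0n1 and
/dev/sdb are placeholders, and make-bcache destroys existing data on them):

    # create the cache device and the backing device in one step
    make-bcache -C /dev/nvme0n1 -B /dev/sdb
    # bcache defaults to writethrough; switch to writeback caching
    echo writeback > /sys/block/bcache0/bcache/cache_mode
    # then build the OSD (journal and data) on /dev/bcache0 as usual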


>
> I have tested bcache in writeback mode and found hugely obvious differences
> in iostat; for example, here is my before and after (the heavier load around
> week 49-50 is from converting, and the highest spikes are the scrub
> infinite-loop bug in 10.2.3):
>
>
> http://www.brockmann-consult.de/ganglia/graph.php?cs=10%2F25%2F2016+10%3A27&ce=03%2F09%2F2017+17%3A26&z=xlarge&hreg[]=ceph.*&mreg[]=sd[c-z]_await&glegend=show&aggregate=1&x=100
>
> But when you share a cache device, you get a single point of failure (and
> bcache, like all software, can be assumed to have bugs too). I recommend a
> vanilla 4.9 or later kernel, which has many bcache fixes, or Ubuntu's 4.4
> kernel, which has the specific fixes I checked for.
>

Yep, I am scared of that, and would therefore prefer either a vendor-based
solid-state design (e.g. Areca), all-SSD OSDs whenever they become affordable,
or experimenting with cache pools. SSDs do not seem to be getting any cheaper;
just new technologies like 3DXP keep showing up.
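
If we do get to the cache pool experiment, my understanding of the tiering
setup on Hammer is roughly the following (the "cache" pool name and the sizing
values are made up; an SSD-backed pool would need its own CRUSH rule first):

    # put an SSD pool in front of the rbd pool as a writeback tier
    ceph osd tier add rbd cache
    ceph osd tier cache-mode cache writeback
    ceph osd tier set-overlay rbd cache
    # hit set and sizing parameters the tiering agent needs
    ceph osd pool set cache hit_set_type bloom
    ceph osd pool set cache hit_set_count 1
    ceph osd pool set cache hit_set_period 3600
    ceph osd pool set cache target_max_bytes 1099511627776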


>
> On 03/21/17 23:22, Alex Gorbachev wrote:
>
> I wanted to share a recent experience in which a few RBD volumes,
> formatted as XFS and exported via the Ubuntu NFS kernel server, performed
> poorly and even generated "out of space" warnings on a nearly empty
> filesystem.  I tried a variety of hacks and fixes to no effect, until
> things started magically working just after some dd write testing.
>
> The only explanation I can come up with is that preconditioning, or
> thickening, the images with this benchmarking is what caused the
> improvement.
>
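
To be concrete about the dd write testing mentioned above: a sequential write
into the exported filesystem along these lines is the sort of thing that
seemed to help (the path and size here are illustrative, not the exact
commands we ran):

    # thicken the image by forcing allocation of the underlying RBD objects
    dd if=/dev/zero of=/srv/nfs/export/precondition.img bs=1M count=102400 oflag=direct
    # the file can be removed afterwards; without discard/fstrim the
    # allocated RADOS objects stay in place
    rm /srv/nfs/export/precondition.img
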
> Ceph is Hammer 0.94.7 running on Ubuntu 14.04, kernel 4.10 on OSD nodes
> and 4.4 on NFS nodes.
>
> Regards,
> Alex
> Storcium
>
>
>
>
>
> --
>
> --------------------------------------------
> Peter Maloney
> Brockmann Consult
> Max-Planck-Str. 2
> 21502 Geesthacht
> Germany
> Tel: +49 4152 889 300
> Fax: +49 4152 889 333
> E-mail: peter.malo...@brockmann-consult.de
> Internet: http://www.brockmann-consult.de
> --------------------------------------------
>
> --
--
Alex Gorbachev
Storcium
_______________________________________________
ceph-users mailing list
ceph-users@lists.ceph.com
http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
