The biggest improvement would be to put all of the OSDs on SSDs with PLP.  Next would be to put the WAL/DB on drives with PLP.  If price is a concern, you can sometimes find really good older drives like Intel P4510s on eBay for reasonable prices.  Just watch out for how much write wear they have on them.

I had an experimental PR that I was playing with to see if I could queue up more IO at once in the bstore_kv_sync thread here:

https://github.com/ceph/ceph/pull/50610


I didn't have the proper gear to test it, though, so it just kind of languished and was closed by the bot.  The idea was a proof of concept to see whether we could reduce the number of fdatasync calls by deliberately introducing a little latency and letting more IOs accumulate before doing each flush.
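
The gist, in a standalone toy form (this is not the PR code; the file name, batch size, and 4K write size are made up purely for illustration), looks something like this:

#include <fcntl.h>
#include <unistd.h>
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
    int fd = ::open("batch_test.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    const int total_writes = 1000;
    const int batch_size   = 32;          // hypothetical batching window
    std::vector<char> buf(4096, 'x');

    int pending = 0;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < total_writes; ++i) {
        if (::write(fd, buf.data(), buf.size()) < 0) { perror("write"); return 1; }
        ++pending;
        // Instead of flushing after every write, let IO pile up and
        // issue one fdatasync per batch.  An artificial sleep here
        // would let even more IO accumulate before the flush.
        if (pending == batch_size) {
            if (::fdatasync(fd) < 0) { perror("fdatasync"); return 1; }
            pending = 0;
        }
    }
    if (pending > 0) ::fdatasync(fd);     // flush the tail
    auto end = std::chrono::steady_clock::now();
    double ms = std::chrono::duration<double, std::milli>(end - start).count();
    std::printf("%d writes, one fdatasync per %d writes: %.1f ms total\n",
                total_writes, batch_size, ms);
    ::close(fd);
    return 0;
}

In the PR this applied to bstore_kv_sync's flush path rather than a plain file, but the tradeoff is the same: fewer fdatasync calls in exchange for a little added latency on each transaction.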


Mark


On 2/22/24 11:29, Work Ceph wrote:
Thanks for the prompt response!

I see, and indeed some of them are consumer SSD disks. Is there any parameter that we can change/tune to better handle the fdatasync calls?

Maybe by using NVMe drives for RocksDB?

On Thu, Feb 22, 2024 at 2:24 PM Mark Nelson <mark.nel...@clyso.com> wrote:

    Most likely you are seeing time spent waiting on fdatasync in
    bstore_kv_sync if the drives you are using don't have power loss
    protection and can't perform flushes quickly.  Some consumer-grade
    drives are actually slower at this than HDDs.
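
    One way to check whether a drive falls into that camp is to time
    write+fdatasync pairs directly on it.  A minimal C++ sketch along
    these lines (the test path, write size, and iteration count are
    just placeholders) usually makes the gap obvious:

    #include <fcntl.h>
    #include <unistd.h>
    #include <chrono>
    #include <cstdio>
    #include <vector>

    int main() {
        // Put this file on the drive you want to test (example path).
        int fd = ::open("/mnt/testdrive/sync_test.bin",
                        O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        std::vector<char> buf(4096, 'x');
        const int iters = 200;
        auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < iters; ++i) {
            if (::write(fd, buf.data(), buf.size()) < 0) { perror("write"); return 1; }
            ::fdatasync(fd);              // one flush per 4K write
        }
        auto end = std::chrono::steady_clock::now();
        double us = std::chrono::duration<double, std::micro>(end - start).count();
        std::printf("avg %.0f us per write+fdatasync\n", us / iters);
        ::close(fd);
        return 0;
    }

    Drives with PLP can acknowledge the flush out of their protected
    cache, so the per-call latency is usually far lower than on
    consumer drives that have to push data all the way to flash on
    every flush.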


    Mark


    On 2/22/24 11:04, Work Ceph wrote:
    > Hello guys,
    > We are running Ceph Octopus on Ubuntu 18.04, and we are noticing
    > spikes of IO utilization for the bstore_kv_sync thread during
    > processes such as adding a new pool and increasing/reducing the
    > number of PGs in a pool.
    >
    > It is funny, though, that the IO utilization (reported with iotop)
    > is 99.99%, but the reported R/W speeds are low. The devices where
    > we are seeing these issues are all SSD systems. We are not using
    > high-end SSD devices, though.
    >
    > Have you guys seen such behavior?
    >
    > Also, do you guys have any clues on why the IO utilization would
    > be high when there is such a small amount of data being read and
    > written to the OSD/disks?


--
Best Regards,
Mark Nelson
Head of Research and Development

Clyso GmbH
p: +49 89 21552391 12 | a: Minnesota, USA
w: https://clyso.com | e: mark.nel...@clyso.com

We are hiring: https://www.clyso.com/jobs/
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
