The biggest improvement would be to put all of the OSDs on SSDs with PLP.  Next would be to put the WAL/DB on drives with PLP.  If price is a concern, you can sometimes find really good older drives like Intel P4510s on eBay for reasonable prices.  Just watch out for how much write wear they have on them.

I had an experimental PR that I was playing with to see if I could queue up more IO at once in the bstore_kv_sync thread here:

https://github.com/ceph/ceph/pull/50610


I didn't have the proper gear to test it, though, so it just kind of languished and was closed by the bot.  The idea was a proof of concept to see whether we could reduce the number of fdatasync calls by deliberately introducing a little latency and letting more IOs accumulate before doing each flush.
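
The gist, in a standalone toy form (this is not the PR code; the file name, batch size, and 4K write size are made up purely for illustration), looks something like this:

#include <fcntl.h>
#include <unistd.h>
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
    int fd = ::open("batch_test.bin", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    const int total_writes = 1000;
    const int batch_size   = 32;          // hypothetical batching window
    std::vector<char> buf(4096, 'x');

    int pending = 0;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < total_writes; ++i) {
        if (::write(fd, buf.data(), buf.size()) < 0) { perror("write"); return 1; }
        ++pending;
        // Instead of flushing after every write, let IO pile up and
        // issue one fdatasync per batch.  An artificial sleep here
        // would let even more IO accumulate before the flush.
        if (pending == batch_size) {
            if (::fdatasync(fd) < 0) { perror("fdatasync"); return 1; }
            pending = 0;
        }
    }
    if (pending > 0) ::fdatasync(fd);     // flush the tail
    auto end = std::chrono::steady_clock::now();
    double ms = std::chrono::duration<double, std::milli>(end - start).count();
    std::printf("%d writes, one fdatasync per %d writes: %.1f ms total\n",
                total_writes, batch_size, ms);
    ::close(fd);
    return 0;
}

In the PR this applied to bstore_kv_sync's flush path rather than a plain file, but the tradeoff is the same: fewer fdatasync calls in exchange for a little added latency on each transaction.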


Mark


On 2/22/24 11:29, Work Ceph wrote:
Thanks for the prompt response!

I see, and indeed some of them are consumer SSD disks. Is there any parameter that we can change/tune to better handle the fdatasync calls?

Maybe by using NVMe drives for RocksDB?

On Thu, Feb 22, 2024 at 2:24 PM Mark Nelson <mark.nel...@clyso.com> wrote:

    Most likely you are seeing time spent waiting on fdatasync in
    bstore_kv_sync if the drives you are using don't have power loss
    protection and can't perform flushes quickly.  Some consumer-grade
    drives are actually slower at this than HDDs.
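
    One way to check whether a drive falls into that camp is to time
    write+fdatasync pairs directly on it.  A minimal C++ sketch along
    these lines (the test path, write size, and iteration count are
    just placeholders) usually makes the gap obvious:

    #include <fcntl.h>
    #include <unistd.h>
    #include <chrono>
    #include <cstdio>
    #include <vector>

    int main() {
        // Put this file on the drive you want to test (example path).
        int fd = ::open("/mnt/testdrive/sync_test.bin",
                        O_WRONLY | O_CREAT | O_TRUNC, 0644);
        if (fd < 0) { perror("open"); return 1; }

        std::vector<char> buf(4096, 'x');
        const int iters = 200;
        auto start = std::chrono::steady_clock::now();
        for (int i = 0; i < iters; ++i) {
            if (::write(fd, buf.data(), buf.size()) < 0) { perror("write"); return 1; }
            ::fdatasync(fd);              // one flush per 4K write
        }
        auto end = std::chrono::steady_clock::now();
        double us = std::chrono::duration<double, std::micro>(end - start).count();
        std::printf("avg %.0f us per write+fdatasync\n", us / iters);
        ::close(fd);
        return 0;
    }

    Drives with PLP can acknowledge the flush out of their protected
    cache, so the per-call latency is usually far lower than on
    consumer drives that have to push data all the way to flash on
    every flush.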


    Mark


    On 2/22/24 11:04, Work Ceph wrote:
    > Hello guys,
    > We are running Ceph Octopus on Ubuntu 18.04, and we are noticing
    > spikes of IO utilization for the bstore_kv_sync thread during
    > processes such as adding a new pool and increasing/reducing the
    > number of PGs in a pool.
    >
    > It is funny, though, that the IO utilization (reported with iotop)
    > is 99.99%, but the reported R/W speeds are low. The devices where
    > we are seeing these issues are all SSD systems. We are not using
    > high-end SSD devices, though.
    >
    > Have you guys seen such behavior?
    >
    > Also, do you guys have any clues on why the IO utilization would
    > be high when there is such a small amount of data being read and
    > written to the OSD/disks?


--
Best Regards,
Mark Nelson
Head of Research and Development

Clyso GmbH
p: +49 89 21552391 12 | a: Minnesota, USA
w: https://clyso.com | e: mark.nel...@clyso.com

We are hiring: https://www.clyso.com/jobs/
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-le...@ceph.io
