[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync

Igor Fedotov Fri, 20 Oct 2023 06:13:07 -0700

Zakhar,

my general concern about downgrading to previous versions is that this procedure is generally neither assumed nor tested by the dev team, although it is possible most of the time. But in this specific case it is not doable due to (at least) https://github.com/ceph/ceph/pull/52212, which enables 4K bluefs allocation unit support: once a daemon gets it, there is no way back.

I still think that setting "fit_to_fast" mode without enabling dynamic compaction levels is quite safe, but it is definitely better to test it in a real environment and under actual workload first. Also, you might want to apply such a workaround gradually: one daemon first, bake it for a while, then apply it to the full node, bake a bit more, and finally go forward and update the remaining ones. Or even better, bake it in a test cluster first.
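
The gradual rollout described above could be planned roughly as follows. This is only a sketch under stated assumptions: it builds and prints the commands for each stage rather than running them, the OSD IDs are hypothetical, the restart command assumes a cephadm-managed cluster, and it assumes the OSD must be restarted for the new policy to take effect. The helper names (osd_commands, staged_plan, render) are made up for illustration.

```python
import shlex

POLICY_OPTION = "bluestore_volume_selection_policy"  # option named in this thread
POLICY_VALUE = "fit_to_fast"                          # mode suggested in this thread


def osd_commands(osd_id):
    """Return the commands that switch one OSD to fit_to_fast and restart it."""
    who = f"osd.{osd_id}"
    return [
        ["ceph", "config", "set", who, POLICY_OPTION, POLICY_VALUE],
        # Assumption: cephadm deployment, and a restart is needed to apply
        # the policy. Use your own restart method on other deployments.
        ["ceph", "orch", "daemon", "restart", who],
    ]


def staged_plan(canary_osd, node_osds, remaining_osds):
    """Build three stages: one daemon, the rest of its node, everything else."""
    stages = [
        ("canary", [canary_osd]),
        ("node", [o for o in node_osds if o != canary_osd]),
        ("remaining", list(remaining_osds)),
    ]
    return [
        (name, [cmd for osd in osds for cmd in osd_commands(osd)])
        for name, osds in stages
    ]


def render(plan):
    """Format the plan as a reviewable list of shell command lines."""
    lines = []
    for name, cmds in plan:
        lines.append(f"# stage: {name} (bake and watch for crashes before continuing)")
        lines.extend(shlex.join(cmd) for cmd in cmds)
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical IDs: osd.6 as the first daemon, 6-8 on its node, 0-2 elsewhere.
    print(render(staged_plan(6, [6, 7, 8], [0, 1, 2])))
```

Printing the plan instead of executing it leaves room to review each stage and to decide how long to bake before moving on.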

Alternatively, you might consider building the updated code yourself and making patched binaries on top of .14...



Thanks,

Igor


On 20/10/2023 15:10, Zakhar Kirpichenko wrote:

Thank you, Igor.

It is somewhat disappointing that fixing this bug in Pacific has such a low priority, considering its impact on existing clusters.

The document attached to the PR explicitly says about `level_compaction_dynamic_level_bytes` that "enabling it on an existing DB requires special caution". We'd rather not experiment with something that has the potential to cause data corruption or loss in a production cluster. Perhaps a downgrade to the previous version, 16.2.13, which worked for us without any issues, is an option, or would you advise against such a downgrade from 16.2.14?


/Z

On Fri, 20 Oct 2023 at 14:46, Igor Fedotov <igor.fedo...@croit.io> wrote:

    Hi Zakhar,

    We definitely expect one more (and apparently the last) Pacific
    minor release. There is no specific date yet, though: the plan
    is to release Quincy and Reef minor releases prior to it.
    Hopefully it will be done before Christmas/New Year.

    Meanwhile you might want to work around the issue by tuning
    bluestore_volume_selection_policy. Unfortunately, my original
    proposal to set it to rocksdb_original most likely wouldn't work in
    this case, so you'd better try "fit_to_fast" mode. This should be
    coupled with enabling 'level_compaction_dynamic_level_bytes' mode
    in RocksDB; there is a pretty good spec on applying this mode to
    BlueStore attached to https://github.com/ceph/ceph/pull/37156.


    Thanks,

    Igor

    On 20/10/2023 06:03, Zakhar Kirpichenko wrote:

    Igor, I noticed that there's no roadmap for the next 16.2.x
    release. May I ask what time frame we are looking at with regards
    to a possible fix?

    We're experiencing several OSD crashes caused by this issue per day.

    /Z

    On Mon, 16 Oct 2023 at 14:19, Igor Fedotov
    <igor.fedo...@croit.io> wrote:

        That's true.

        On 16/10/2023 14:13, Zakhar Kirpichenko wrote:

        Many thanks, Igor. I found previously submitted bug reports
        and subscribed to them. My understanding is that the issue
        is going to be fixed in the next Pacific minor release.

        /Z

        On Mon, 16 Oct 2023 at 14:03, Igor Fedotov
        <igor.fedo...@croit.io> wrote:

            Hi Zakhar,

            please see my reply to the post on a similar issue at:
            https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/


            Thanks,

            Igor

            On 16/10/2023 09:26, Zakhar Kirpichenko wrote:
            > Hi,
            >
            > After upgrading to Ceph 16.2.14 we had several OSD crashes
            > in bstore_kv_sync thread:
            >
            >
            >     "assert_thread_name": "bstore_kv_sync",
            >     "backtrace": [
            >         "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]",
            >         "gsignal()",
            >         "abort()",
            >         "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x564dc5f87d0b]",
            >         "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]",
            >         "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x15e) [0x564dc6604a9e]",
            >         "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x77d) [0x564dc66951cd]",
            >         "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) [0x564dc6695670]",
            >         "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]",
            >         "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]",
            >         "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]",
            >         "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) [0x564dc6c761c2]",
            >         "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]",
            >         "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x309) [0x564dc6b780c9]",
            >         "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]",
            >         "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]",
            >         "(RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x84) [0x564dc6b1f644]",
            >         "(RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9a) [0x564dc6b2004a]",
            >         "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]",
            >         "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]",
            >         "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]",
            >         "clone()"
            >     ],
            >
            >
            > I am attaching two instances of crash info for further reference:
            > https://pastebin.com/E6myaHNU
            >
            > OSD configuration is rather simple and close to default:
            >
            > osd.6  dev       bluestore_cache_size_hdd  4294967296
            > osd.6  dev       bluestore_cache_size_ssd  4294967296
            > osd    advanced  debug_rocksdb             1/5
            > osd    advanced  osd_max_backfills         2
            > osd    basic     osd_memory_target         17179869184
            > osd    advanced  osd_recovery_max_active   2
            > osd    advanced  osd_scrub_sleep           0.100000
            > osd    advanced  rbd_balance_parent_reads  false
            > debug_rocksdb is a recent change, otherwise this configuration has been
            > running without issues for months. The crashes happened on two different
            > hosts with identical hardware, the hosts and storage (NVME DB/WAL, HDD
            > block) don't exhibit any issues. We have not experienced such crashes with
            > Ceph < 16.2.14.
            >
            > Is this a known issue, or should I open a bug report?
            >
            > Best regards,
            > Zakhar
            > _______________________________________________
            > ceph-users mailing list -- ceph-users@ceph.io
            > To unsubscribe send an email to ceph-users-le...@ceph.io

