Many thanks, Igor. I found previously submitted bug reports
and subscribed to them. My understanding is that the issue
is going to be fixed in the next Pacific minor release.
/Z
On Mon, 16 Oct 2023 at 14:03, Igor Fedotov
<igor.fedo...@croit.io> wrote:
Hi Zakhar,
please see my reply to the post on a similar issue at:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/
Thanks,
Igor
On 16/10/2023 09:26, Zakhar Kirpichenko wrote:
> Hi,
>
> After upgrading to Ceph 16.2.14 we had several OSD crashes
> in the bstore_kv_sync thread:
>
>
> "assert_thread_name": "bstore_kv_sync",
> "backtrace": [
>     "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]",
>     "gsignal()",
>     "abort()",
>     "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x564dc5f87d0b]",
>     "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]",
>     "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x15e) [0x564dc6604a9e]",
>     "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x77d) [0x564dc66951cd]",
>     "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) [0x564dc6695670]",
>     "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]",
>     "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]",
>     "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]",
>     "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) [0x564dc6c761c2]",
>     "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]",
>     "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x309) [0x564dc6b780c9]",
>     "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]",
>     "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]",
>     "(RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr<KeyValueDB::TransactionImpl>)+0x84) [0x564dc6b1f644]",
>     "(RocksDBStore::submit_transaction_sync(std::shared_ptr<KeyValueDB::TransactionImpl>)+0x9a) [0x564dc6b2004a]",
>     "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]",
>     "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]",
>     "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]",
>     "clone()"
> ],
>
>
> I am attaching two instances of crash info for further reference:
> https://pastebin.com/E6myaHNU
>
> OSD configuration is rather simple and close to default:
>
> osd.6  dev       bluestore_cache_size_hdd  4294967296
> osd.6  dev       bluestore_cache_size_ssd  4294967296
> osd    advanced  debug_rocksdb             1/5
> osd    advanced  osd_max_backfills         2
> osd    basic     osd_memory_target         17179869184
> osd    advanced  osd_recovery_max_active   2
> osd    advanced  osd_scrub_sleep           0.100000
> osd    advanced  rbd_balance_parent_reads  false
>
> debug_rocksdb is a recent change; otherwise this configuration has been
> running without issues for months. The crashes happened on two different
> hosts with identical hardware, and the hosts and storage (NVMe DB/WAL, HDD
> block) don't exhibit any issues. We have not experienced such crashes with
> Ceph < 16.2.14.
>
> Is this a known issue, or should I open a bug report?
>
> Best regards,
> Zakhar
> _______________________________________________
> ceph-users mailing list -- ceph-users@ceph.io
> To unsubscribe send an email to ceph-users-le...@ceph.io
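
For anyone comparing their own crashes against the one reported above, here is a minimal sketch in Python. It assumes the crash report is available as JSON (for example, saved from `ceph crash info <crash-id>`) and that it carries the same "assert_thread_name" and "backtrace" fields quoted in the original message. The function name `matches_signature` and the constants are hypothetical helpers, not part of Ceph, and a match only suggests the same crash signature; it does not confirm the same root cause.

```python
# Signature taken from the backtrace quoted in this thread.
THREAD = "bstore_kv_sync"
FRAME = "RocksDBBlueFSVolumeSelector::sub_usage"


def matches_signature(crash: dict) -> bool:
    """Return True if the report has the thread name and frame seen in this thread."""
    if crash.get("assert_thread_name") != THREAD:
        return False
    # Each backtrace entry is a plain string; a substring check is enough here.
    return any(FRAME in frame for frame in crash.get("backtrace", []))


# Illustrative report, trimmed to two frames from the backtrace above.
sample = {
    "assert_thread_name": "bstore_kv_sync",
    "backtrace": [
        "abort()",
        "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x15e) [0x564dc6604a9e]",
    ],
}

print(matches_signature(sample))  # prints True
```

To check a real report, load it with `json.load` and pass the resulting dict to `matches_signature`.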