[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
Thank you, Igor. I was just reading the detailed list of changes for 16.2.14, as I suspected we might not be able to go back to the previous minor release :-) Thanks again for the suggestions; we'll consider our options.

/Z
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
Zakhar,

My general concern about downgrading to previous versions is that this procedure is generally neither supported nor tested by the dev team, although it is possible most of the time. In this specific case, however, it is not doable due to (at least) https://github.com/ceph/ceph/pull/52212, which enables 4K BlueFS allocation unit support - once a daemon starts using it, there is no way back.

I still think that setting the "fit_to_fast" mode without enabling dynamic compaction levels is quite safe, but it is definitely better to test it in the real environment and under the actual payload first. You might also want to apply such a workaround gradually: one daemon first, bake it for a while, then apply it to a full node, bake a bit more, and finally update the remaining daemons. Or, even better, bake it in a test cluster first.

Alternatively, you might consider building the updated code yourself and making patched binaries on top of .14...

Thanks,

Igor
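Igor's staged rollout could be sketched as a dry-run helper along these lines. This is not from the thread: the helper name is hypothetical, it only prints the commands for review instead of executing them, and the restart command assumes a cephadm-managed cluster.

```shell
# Hypothetical dry-run helper: print the commands that would apply the
# "fit_to_fast" workaround to a single OSD, so the change can be baked
# on one daemon before a wider rollout. Review before piping to sh.
apply_fit_to_fast() {
  osd_id="$1"
  echo "ceph config set osd.${osd_id} bluestore_volume_selection_policy fit_to_fast"
  # The OSD must be restarted to pick up the new policy
  # (assumes a cephadm-managed cluster).
  echo "ceph orch daemon restart osd.${osd_id}"
}

apply_fit_to_fast 12
```

"Baking" here would mean watching that one OSD for crashes or performance regressions for a while before repeating the same two commands for the next daemon or node.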
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
On Fri, Oct 20, 2023, 8:51 AM Zakhar Kirpichenko wrote:
> We would consider upgrading, but unfortunately our OpenStack Wallaby is
> holding us back, as its Cinder doesn't support Ceph 17.x, so we're stuck
> with having to find a solution for Ceph 16.x.

Wallaby is also quite old at this time... are you aware that the W release of Cinder did not receive a backport of the fix for the critical, high-profile CVE-2023-2088 due to its age?

https://github.com/openstack/cinder/commit/2fef6c41fa8c5ea772cde227a119dcf22ce7a07d

There was some tension over this at the OpenInfra Summit this past year between some of the operators and developers. Wallaby is still marked EM (Extended Maintenance) upstream, but even so it did not get this patch. The story is, unfortunately, the same here: the only way out of some of these holes is to upgrade...

Regards,
Tyler
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
Thanks, Tyler. I appreciate what you're saying, though I can't fully agree: 16.2.13 didn't have crashing OSDs, so the crashes in 16.2.14 look like a regression - please correct me if I'm wrong. If it is indeed a regression, then I'm not sure that suggesting an upgrade is the right thing to do in this case.

We would consider upgrading, but unfortunately our OpenStack Wallaby is holding us back, as its Cinder doesn't support Ceph 17.x, so we're stuck with having to find a solution for Ceph 16.x.

/Z
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
On Fri, Oct 20, 2023, 8:11 AM Zakhar Kirpichenko wrote:
> Thank you, Igor.
>
> It is somewhat disappointing that fixing this bug in Pacific has such a
> low priority, considering its impact on existing clusters.

Unfortunately, the hard truth here is that Pacific (stable) was released over 30 months ago. It has had a good run for a freely distributed product, and there's only so much time you can dedicate to backporting bugfixes -- it claws time away from other forward-thinking initiatives.

Speaking as someone who's been at the helm of production clusters, I know Ceph upgrades can be an experience, and it's frustrating to hear, but you have to jump sometime...

Regards,
Tyler
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
Thank you, Igor.

It is somewhat disappointing that fixing this bug in Pacific has such a low priority, considering its impact on existing clusters.

The document attached to the PR explicitly says about `level_compaction_dynamic_level_bytes` that "enabling it on an existing DB requires special caution", so we'd rather not experiment with something that has the potential to cause data corruption or loss in a production cluster. Perhaps a downgrade to the previous version, 16.2.13, which worked for us without any issues, is an option - or would you advise against such a downgrade from 16.2.14?

/Z
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
Hi Zakhar,

We definitely expect one more (and apparently the last) Pacific minor release. There is no specific date yet, though - the plan is to release Quincy and Reef minor releases prior to it, hopefully before Christmas/New Year.

Meanwhile, you might want to work around the issue by tuning bluestore_volume_selection_policy. Unfortunately, my original proposal to set it to rocksdb_original most likely wouldn't work in this case, so you should try the "fit_to_fast" mode instead. This should be coupled with enabling the 'level_compaction_dynamic_level_bytes' mode in RocksDB - there is a pretty good spec on applying this mode to BlueStore attached to https://github.com/ceph/ceph/pull/37156.

Thanks,

Igor

On 20/10/2023 06:03, Zakhar Kirpichenko wrote:

Igor, I noticed that there's no roadmap for the next 16.2.x release. May I ask what time frame we are looking at with regard to a possible fix? We're experiencing several OSD crashes caused by this issue per day.

/Z

On Mon, 16 Oct 2023 at 14:19, Igor Fedotov wrote:

That's true.

On 16/10/2023 14:13, Zakhar Kirpichenko wrote:

Many thanks, Igor. I found the previously submitted bug reports and subscribed to them. My understanding is that the issue is going to be fixed in the next Pacific minor release.

/Z

On Mon, 16 Oct 2023 at 14:03, Igor Fedotov wrote:

Hi Zakhar, please see my reply to the post on the similar issue at:
https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/

Thanks,
Igor

On 16/10/2023 09:26, Zakhar Kirpichenko wrote:
> Hi,
>
> After upgrading to Ceph 16.2.14 we had several OSD crashes
> in the bstore_kv_sync thread:
>
> 1. "assert_thread_name": "bstore_kv_sync",
> 2. "backtrace": [
> 3. "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]",
> 4. "gsignal()",
> 5. "abort()",
> 6. "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x564dc5f87d0b]",
> 7. "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]",
> 8. "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x15e) [0x564dc6604a9e]",
> 9. "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x77d) [0x564dc66951cd]",
> 10. "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) [0x564dc6695670]",
> 11. "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]",
> 12. "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]",
> 13. "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]",
> 14. "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) [0x564dc6c761c2]",
> 15. "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]",
> 16. "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x309) [0x564dc6b780c9]",
> 17. "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]",
> 18. "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]",
> 19. "(RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr)+0x84) [0x564dc6b1f644]",
> 20. "(RocksDBStore::submit_transaction_sync(std::shared_ptr)+0x9a) [0x564dc6b2004a]",
> 21. "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]",
> 22. "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]",
> 23. "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]",
> 24. "clone()"
> 25. ],
>
> I am attaching two instances of crash info for further reference:
> https://pastebin.com/E6myaHNU
>
> OSD configuration is rather simple and close to default:
>
> osd.6 dev bluestore_cache_size_hdd 4294967296
> osd.6 dev bluestore_cache_size_ssd 4294967296
> osd advanced debug_rocksdb 1/5
> osd advanced osd_max_backfills 2
> osd
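The combined workaround Igor describes (the "fit_to_fast" policy plus level_compaction_dynamic_level_bytes) could be sketched as follows. This is a dry-run sketch, not from the thread: it only prints the commands, osd.6 is just the example daemon from the report, and whether bluestore_rocksdb_options_annex (which appends to the default RocksDB option string) is available in a given 16.2.x build should be verified first.

```shell
# Dry-run sketch of the combined workaround: print the commands for
# review rather than executing them. Verify that your build supports
# bluestore_rocksdb_options_annex before using it; osd.6 is an example.
target="osd.6"
cmds="ceph config set ${target} bluestore_volume_selection_policy fit_to_fast
ceph config set ${target} bluestore_rocksdb_options_annex level_compaction_dynamic_level_bytes=true"
echo "$cmds"
```

Per the caution quoted from the PR, enabling level_compaction_dynamic_level_bytes on an existing DB requires special care, so this belongs in a test cluster first.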
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
Igor, I noticed that there's no roadmap for the next 16.2.x release. May I ask what time frame we are looking at with regards to a possible fix? We're experiencing several OSD crashes caused by this issue per day. /Z On Mon, 16 Oct 2023 at 14:19, Igor Fedotov wrote: > That's true. > On 16/10/2023 14:13, Zakhar Kirpichenko wrote: > > Many thanks, Igor. I found previously submitted bug reports and subscribed > to them. My understanding is that the issue is going to be fixed in the > next Pacific minor release. > > /Z > > On Mon, 16 Oct 2023 at 14:03, Igor Fedotov wrote: > >> Hi Zakhar, >> >> please see my reply for the post on the similar issue at: >> >> https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/ >> >> >> Thanks, >> >> Igor >> >> On 16/10/2023 09:26, Zakhar Kirpichenko wrote: >> > Hi, >> > >> > After upgrading to Ceph 16.2.14 we had several OSD crashes >> > in bstore_kv_sync thread: >> > >> > >> > 1. "assert_thread_name": "bstore_kv_sync", >> > 2. "backtrace": [ >> > 3. "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]", >> > 4. "gsignal()", >> > 5. "abort()", >> > 6. "(ceph::__ceph_assert_fail(char const*, char const*, int, char >> > const*)+0x1a9) [0x564dc5f87d0b]", >> > 7. "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]", >> > 8. "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t >> > const&)+0x15e) [0x564dc6604a9e]", >> > 9. "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, >> unsigned >> > long)+0x77d) [0x564dc66951cd]", >> > 10. "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) >> > [0x564dc6695670]", >> > 11. "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]", >> > 12. "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]", >> > 13. "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions >> > const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]", >> > 14. "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) >> > [0x564dc6c761c2]", >> > 15. 
"(rocksdb::WritableFileWriter::Sync(bool)+0x88) >> [0x564dc6c77808]", >> > 16. "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup >> > const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned >> > long)+0x309) [0x564dc6b780c9]", >> > 17. "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, >> > rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, >> unsigned >> > long, bool, unsigned long*, unsigned long, >> > rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]", >> > 18. "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, >> > rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]", >> > 19. "(RocksDBStore::submit_common(rocksdb::WriteOptions&, >> > std::shared_ptr)+0x84) >> [0x564dc6b1f644]", >> > 20. >> "(RocksDBStore::submit_transaction_sync(std::shared_ptr)+0x9a) >> > [0x564dc6b2004a]", >> > 21. "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]", >> > 22. "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]", >> > 23. "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]", >> > 24. "clone()" >> > 25. ], >> > >> > >> > I am attaching two instances of crash info for further reference: >> > https://pastebin.com/E6myaHNU >> > >> > OSD configuration is rather simple and close to default: >> > >> > osd.6 dev bluestore_cache_size_hdd4294967296 >> >osd.6 dev >> > bluestore_cache_size_ssd4294967296 >> >osd advanced debug_rocksdb >> >1/5 >> osd >> > advanced osd_max_backfills 2 >> > osd basic >> > osd_memory_target 17179869184 >> > osd advanced osd_recovery_max_active >> > 2 osd >> > advanced osd_scrub_sleep 0.10 >> >osd advanced >> > rbd_balance_parent_readsfalse >> > >> > debug_rocksdb is a recent change, otherwise this configuration has been >> > running without issues for months. The crashes happened on two different >> > hosts with identical hardware, the hosts and storage (NVME DB/WAL, HDD >> > block) don't exhibit any issues. We have not experienced such crashes >> with >> > Ceph < 16.2.14. 
>> > >> > Is this a known issue, or should I open a bug report? >> > >> > Best regards, >> > Zakhar >> > ___ >> > ceph-users mailing list -- ceph-users@ceph.io >> > To unsubscribe send an email to ceph-users-le...@ceph.io >> > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
That's true. On 16/10/2023 14:13, Zakhar Kirpichenko wrote: Many thanks, Igor. I found previously submitted bug reports and subscribed to them. My understanding is that the issue is going to be fixed in the next Pacific minor release. /Z On Mon, 16 Oct 2023 at 14:03, Igor Fedotov wrote: Hi Zakhar, please see my reply for the post on the similar issue at: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/ Thanks, Igor On 16/10/2023 09:26, Zakhar Kirpichenko wrote: > Hi, > > After upgrading to Ceph 16.2.14 we had several OSD crashes > in bstore_kv_sync thread: > > > 1. "assert_thread_name": "bstore_kv_sync", > 2. "backtrace": [ > 3. "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]", > 4. "gsignal()", > 5. "abort()", > 6. "(ceph::__ceph_assert_fail(char const*, char const*, int, char > const*)+0x1a9) [0x564dc5f87d0b]", > 7. "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]", > 8. "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t > const&)+0x15e) [0x564dc6604a9e]", > 9. "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned > long)+0x77d) [0x564dc66951cd]", > 10. "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) > [0x564dc6695670]", > 11. "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]", > 12. "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]", > 13. "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions > const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]", > 14. "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) > [0x564dc6c761c2]", > 15. "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]", > 16. "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup > const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned > long)+0x309) [0x564dc6b780c9]", > 17. 
"(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, > rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned > long, bool, unsigned long*, unsigned long, > rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]", > 18. "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, > rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]", > 19. "(RocksDBStore::submit_common(rocksdb::WriteOptions&, > std::shared_ptr)+0x84) [0x564dc6b1f644]", > 20. "(RocksDBStore::submit_transaction_sync(std::shared_ptr)+0x9a) > [0x564dc6b2004a]", > 21. "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]", > 22. "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]", > 23. "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]", > 24. "clone()" > 25. ], > > > I am attaching two instances of crash info for further reference: > https://pastebin.com/E6myaHNU > > OSD configuration is rather simple and close to default: > > osd.6 dev bluestore_cache_size_hdd 4294967296 > osd.6 dev > bluestore_cache_size_ssd 4294967296 > osd advanced debug_rocksdb > 1/5 osd > advanced osd_max_backfills 2 > osd basic > osd_memory_target 17179869184 > osd advanced osd_recovery_max_active > 2 osd > advanced osd_scrub_sleep 0.10 > osd advanced > rbd_balance_parent_reads false > > debug_rocksdb is a recent change, otherwise this configuration has been > running without issues for months. The crashes happened on two different > hosts with identical hardware, the hosts and storage (NVME DB/WAL, HDD > block) don't exhibit any issues. We have not experienced such crashes with > Ceph < 16.2.14. > > Is this a known issue, or should I open a bug report? > > Best regards, > Zakhar > ___ > ceph-users mailing list -- ceph-users@ceph.io > To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
Many thanks, Igor. I found previously submitted bug reports and subscribed to them. My understanding is that the issue is going to be fixed in the next Pacific minor release. /Z On Mon, 16 Oct 2023 at 14:03, Igor Fedotov wrote: > Hi Zakhar, > > please see my reply for the post on the similar issue at: > > https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/ > > > Thanks, > > Igor > > On 16/10/2023 09:26, Zakhar Kirpichenko wrote: > > Hi, > > > > After upgrading to Ceph 16.2.14 we had several OSD crashes > > in bstore_kv_sync thread: > > > > > > 1. "assert_thread_name": "bstore_kv_sync", > > 2. "backtrace": [ > > 3. "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]", > > 4. "gsignal()", > > 5. "abort()", > > 6. "(ceph::__ceph_assert_fail(char const*, char const*, int, char > > const*)+0x1a9) [0x564dc5f87d0b]", > > 7. "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]", > > 8. "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t > > const&)+0x15e) [0x564dc6604a9e]", > > 9. "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, > unsigned > > long)+0x77d) [0x564dc66951cd]", > > 10. "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) > > [0x564dc6695670]", > > 11. "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]", > > 12. "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]", > > 13. "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions > > const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]", > > 14. "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) > > [0x564dc6c761c2]", > > 15. "(rocksdb::WritableFileWriter::Sync(bool)+0x88) > [0x564dc6c77808]", > > 16. "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup > > const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned > > long)+0x309) [0x564dc6b780c9]", > > 17. 
"(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, > > rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, > unsigned > > long, bool, unsigned long*, unsigned long, > > rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]", > > 18. "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, > > rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]", > > 19. "(RocksDBStore::submit_common(rocksdb::WriteOptions&, > > std::shared_ptr)+0x84) > [0x564dc6b1f644]", > > 20. > "(RocksDBStore::submit_transaction_sync(std::shared_ptr)+0x9a) > > [0x564dc6b2004a]", > > 21. "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]", > > 22. "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]", > > 23. "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]", > > 24. "clone()" > > 25. ], > > > > > > I am attaching two instances of crash info for further reference: > > https://pastebin.com/E6myaHNU > > > > OSD configuration is rather simple and close to default: > > > > osd.6 dev bluestore_cache_size_hdd4294967296 > >osd.6 dev > > bluestore_cache_size_ssd4294967296 > >osd advanced debug_rocksdb > >1/5 > osd > > advanced osd_max_backfills 2 > > osd basic > > osd_memory_target 17179869184 > > osd advanced osd_recovery_max_active > > 2 osd > > advanced osd_scrub_sleep 0.10 > >osd advanced > > rbd_balance_parent_readsfalse > > > > debug_rocksdb is a recent change, otherwise this configuration has been > > running without issues for months. The crashes happened on two different > > hosts with identical hardware, the hosts and storage (NVME DB/WAL, HDD > > block) don't exhibit any issues. We have not experienced such crashes > with > > Ceph < 16.2.14. > > > > Is this a known issue, or should I open a bug report? > > > > Best regards, > > Zakhar > > ___ > > ceph-users mailing list -- ceph-users@ceph.io > > To unsubscribe send an email to ceph-users-le...@ceph.io > ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
Hi Zakhar, please see my reply for the post on the similar issue at: https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/YNJ35HXN4HXF4XWB6IOZ2RKXX7EQCEIY/ Thanks, Igor On 16/10/2023 09:26, Zakhar Kirpichenko wrote: Hi, After upgrading to Ceph 16.2.14 we had several OSD crashes in bstore_kv_sync thread: 1. "assert_thread_name": "bstore_kv_sync", 2. "backtrace": [ 3. "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]", 4. "gsignal()", 5. "abort()", 6. "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x564dc5f87d0b]", 7. "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]", 8. "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t const&)+0x15e) [0x564dc6604a9e]", 9. "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, unsigned long)+0x77d) [0x564dc66951cd]", 10. "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) [0x564dc6695670]", 11. "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]", 12. "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]", 13. "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]", 14. "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) [0x564dc6c761c2]", 15. "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]", 16. "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned long)+0x309) [0x564dc6b780c9]", 17. "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]", 18. "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]", 19. "(RocksDBStore::submit_common(rocksdb::WriteOptions&, std::shared_ptr)+0x84) [0x564dc6b1f644]", 20. 
"(RocksDBStore::submit_transaction_sync(std::shared_ptr)+0x9a) [0x564dc6b2004a]", 21. "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]", 22. "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]", 23. "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]", 24. "clone()" 25. ], I am attaching two instances of crash info for further reference: https://pastebin.com/E6myaHNU OSD configuration is rather simple and close to default: osd.6 dev bluestore_cache_size_hdd4294967296 osd.6 dev bluestore_cache_size_ssd4294967296 osd advanced debug_rocksdb 1/5 osd advanced osd_max_backfills 2 osd basic osd_memory_target 17179869184 osd advanced osd_recovery_max_active 2 osd advanced osd_scrub_sleep 0.10 osd advanced rbd_balance_parent_readsfalse debug_rocksdb is a recent change, otherwise this configuration has been running without issues for months. The crashes happened on two different hosts with identical hardware, the hosts and storage (NVME DB/WAL, HDD block) don't exhibit any issues. We have not experienced such crashes with Ceph < 16.2.14. Is this a known issue, or should I open a bug report? Best regards, Zakhar ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io ___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
Unfortunately, the OSD log from the earlier crash is not available. I have extracted the OSD log, including the recent events, from the latest crash: https://www.dropbox.com/scl/fi/1ne8h85iuc5vx78qm1t93/20231016_osd6.zip?rlkey=fxyn242q7c69ec5lkv29csx13=0 I hope this helps to identify the crash reason. The log entries that I find suspicious are these right before the crash: debug -1726> 2023-10-15T22:31:21.575+ 7f961ccb8700 5 prioritycache tune_memory target: 17179869184 mapped: 17024319488 unmapped: 4164763648 heap: 21189083136 old mem: 13797582406 new mem: 13797582406 ... debug -1723> 2023-10-15T22:31:22.579+ 7f961ccb8700 5 prioritycache tune_memory target: 17179869184 mapped: 17024589824 unmapped: 4164493312 heap: 21189083136 old mem: 13797582406 new mem: 13797582406 ... debug -1718> 2023-10-15T22:31:23.579+ 7f961ccb8700 5 prioritycache tune_memory target: 17179869184 mapped: 17027031040 unmapped: 4162052096 heap: 21189083136 old mem: 13797582406 new mem: 13797582406 ... debug -1714> 2023-10-15T22:31:24.579+ 7f961ccb8700 5 prioritycache tune_memory target: 17179869184 mapped: 17026301952 unmapped: 4162781184 heap: 21189083136 old mem: 13797582406 new mem: 13797582406 debug -1713> 2023-10-15T22:31:25.383+ 7f961ccb8700 5 bluestore.MempoolThread(0x55c5bee8cb98) _resize_shards cache_size: 13797582406 kv_alloc: 8321499136 kv_used: 8245313424 kv_onode_alloc: 4697620480 kv_onode_used: 4690617424 meta_alloc: 469762048 meta_used: 371122625 data_alloc: 134217728 data_used: 44314624 ... debug -1710> 2023-10-15T22:31:25.583+ 7f961ccb8700 5 prioritycache tune_memory target: 17179869184 mapped: 17026367488 unmapped: 4162715648 heap: 21189083136 old mem: 13797582406 new mem: 13797582406 ... debug -1707> 2023-10-15T22:31:26.583+ 7f961ccb8700 5 prioritycache tune_memory target: 17179869184 mapped: 17026211840 unmapped: 4162871296 heap: 21189083136 old mem: 13797582406 new mem: 13797582406 ... 
debug -1704> 2023-10-15T22:31:27.583+ 7f961ccb8700 5 prioritycache tune_memory target: 17179869184 mapped: 17024548864 unmapped: 4164534272 heap: 21189083136 old mem: 13797582406 new mem: 13797582406 There's plenty of RAM in the system, about 120 GB free and used for cache. /Z On Mon, 16 Oct 2023 at 09:26, Zakhar Kirpichenko wrote: > Hi, > > After upgrading to Ceph 16.2.14 we had several OSD crashes > in bstore_kv_sync thread: > > >1. "assert_thread_name": "bstore_kv_sync", >2. "backtrace": [ >3. "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]", >4. "gsignal()", >5. "abort()", >6. "(ceph::__ceph_assert_fail(char const*, char const*, int, char >const*)+0x1a9) [0x564dc5f87d0b]", >7. "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]", >8. "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t >const&)+0x15e) [0x564dc6604a9e]", >9. "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, >unsigned long)+0x77d) [0x564dc66951cd]", >10. "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) >[0x564dc6695670]", >11. "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]", >12. "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]", >13. "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions >const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]", >14. "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) >[0x564dc6c761c2]", >15. "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]", >16. "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup >const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned >long)+0x309) [0x564dc6b780c9]", >17. "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, >rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned >long, bool, unsigned long*, unsigned long, >rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]", >18. "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, >rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]", >19. 
"(RocksDBStore::submit_common(rocksdb::WriteOptions&, >std::shared_ptr)+0x84) [0x564dc6b1f644]", >20. > "(RocksDBStore::submit_transaction_sync(std::shared_ptr)+0x9a) >[0x564dc6b2004a]", >21. "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]", >22. "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]", >23. "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]", >24. "clone()" >25. ], > > > I am attaching two instances of crash info for further reference: > https://pastebin.com/E6myaHNU > > OSD configuration is rather simple and close to default: > > osd.6 dev bluestore_cache_size_hdd4294967296 > osd.6 dev > bluestore_cache_size_ssd4294967296 > osd advanced debug_rocksdb > 1/5 osd > advanced
[ceph-users] Re: Ceph 16.2.14: OSDs randomly crash in bstore_kv_sync
Not sure how it managed to screw up formatting, OSD configuration in a more readable form: https://pastebin.com/mrC6UdzN /Z On Mon, 16 Oct 2023 at 09:26, Zakhar Kirpichenko wrote: > Hi, > > After upgrading to Ceph 16.2.14 we had several OSD crashes > in bstore_kv_sync thread: > > >1. "assert_thread_name": "bstore_kv_sync", >2. "backtrace": [ >3. "/lib64/libpthread.so.0(+0x12cf0) [0x7ff2f6750cf0]", >4. "gsignal()", >5. "abort()", >6. "(ceph::__ceph_assert_fail(char const*, char const*, int, char >const*)+0x1a9) [0x564dc5f87d0b]", >7. "/usr/bin/ceph-osd(+0x584ed4) [0x564dc5f87ed4]", >8. "(RocksDBBlueFSVolumeSelector::sub_usage(void*, bluefs_fnode_t >const&)+0x15e) [0x564dc6604a9e]", >9. "(BlueFS::_flush_range_F(BlueFS::FileWriter*, unsigned long, >unsigned long)+0x77d) [0x564dc66951cd]", >10. "(BlueFS::_flush_F(BlueFS::FileWriter*, bool, bool*)+0x90) >[0x564dc6695670]", >11. "(BlueFS::fsync(BlueFS::FileWriter*)+0x18b) [0x564dc66b1a6b]", >12. "(BlueRocksWritableFile::Sync()+0x18) [0x564dc66c1768]", >13. "(rocksdb::LegacyWritableFileWrapper::Sync(rocksdb::IOOptions >const&, rocksdb::IODebugContext*)+0x1f) [0x564dc6b6496f]", >14. "(rocksdb::WritableFileWriter::SyncInternal(bool)+0x402) >[0x564dc6c761c2]", >15. "(rocksdb::WritableFileWriter::Sync(bool)+0x88) [0x564dc6c77808]", >16. "(rocksdb::DBImpl::WriteToWAL(rocksdb::WriteThread::WriteGroup >const&, rocksdb::log::Writer*, unsigned long*, bool, bool, unsigned >long)+0x309) [0x564dc6b780c9]", >17. "(rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, >rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned >long, bool, unsigned long*, unsigned long, >rocksdb::PreReleaseCallback*)+0x2629) [0x564dc6b80c69]", >18. "(rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, >rocksdb::WriteBatch*)+0x21) [0x564dc6b80e61]", >19. "(RocksDBStore::submit_common(rocksdb::WriteOptions&, >std::shared_ptr)+0x84) [0x564dc6b1f644]", >20. 
> "(RocksDBStore::submit_transaction_sync(std::shared_ptr)+0x9a) [0x564dc6b2004a]",
> 21. "(BlueStore::_kv_sync_thread()+0x30d8) [0x564dc6602ec8]",
> 22. "(BlueStore::KVSyncThread::entry()+0x11) [0x564dc662ab61]",
> 23. "/lib64/libpthread.so.0(+0x81ca) [0x7ff2f67461ca]",
> 24. "clone()"
> 25. ],
>
> I am attaching two instances of crash info for further reference:
> https://pastebin.com/E6myaHNU
>
> OSD configuration is rather simple and close to default:
>
> osd.6  dev       bluestore_cache_size_hdd  4294967296
> osd.6  dev       bluestore_cache_size_ssd  4294967296
> osd    advanced  debug_rocksdb             1/5
> osd    advanced  osd_max_backfills         2
> osd    basic     osd_memory_target         17179869184
> osd    advanced  osd_recovery_max_active   2
> osd    advanced  osd_scrub_sleep           0.10
> osd    advanced  rbd_balance_parent_reads  false
>
> debug_rocksdb is a recent change, otherwise this configuration has been running without issues for months. The crashes happened on two different hosts with identical hardware, the hosts and storage (NVMe DB/WAL, HDD block) don't exhibit any issues. We have not experienced such crashes with Ceph < 16.2.14.
>
> Is this a known issue, or should I open a bug report?
>
> Best regards,
> Zakhar
___ ceph-users mailing list -- ceph-users@ceph.io To unsubscribe send an email to ceph-users-le...@ceph.io